Big Data for Big Insights: Quantifying the Adverse Effect of Air Pollution on the Tourism Industry in China

Adverse meteorological conditions and air pollution resulting from human activities, such as extreme weather and smog, adversely affect the global tourism industry. However, such impacts are difficult to quantify. This study strives to quantify the adverse impact of air pollution on foreign tourists’ revisiting behaviors to China by analyzing a large number of TripAdvisor reviews. The study first identifies travelers affected by air pollution by analyzing their reviews. It then employs a propensity score matching technique to identify a matched group of travelers with identical characteristics who did not report air-pollution-related issues in their reviews. By estimating the respective likelihoods of revisiting for the two groups, our results indicate that travelers who encountered air pollution during their trips are 92.857% less likely to revisit a specific city and 93.421% less likely to revisit China. Our study enriches the tourism literature by quantifying the adverse impact of air pollution on a country’s inbound tourism using big data.


Introduction
Understanding the impacts of meteorological conditions on tourists' behavior is an important topic in tourism research. Studies have revealed how meteorological factors affect tourist behavior in areas such as travel and activity participation (Becken and Wilson 2013), travel destination preferences (Førland et al. 2013), tourism experience (Jeuring and Peters 2013), and trip satisfaction (Coghlan and Prideaux 2009;Jeuring 2017). Meteorological factors, such as sunlight, temperature, and air quality, constitute the aesthetical aspects of climate, affecting the attractiveness of a travel location to tourists (Goh 2012). Therefore, maintaining a good tourism environment with favorable meteorological conditions is critical for the development of the tourism industry, even if it is not possible to control all meteorological conditions.
Recent years have seen a rise in several adverse meteorological conditions, such as extreme weather, El Niño, and air pollution caused by human activities (Abraham 2018;Anas Baig 2017;Vose et al. 2014). As has been widely discussed, increasing occurrences of these adverse meteorological factors and air pollution, especially extreme weather and toxic smog, may suffocate the future development of the tourism sector (Parkin 2019;Ross 2019;Saksornchai 2019). Many tourism destinations are afflicted by increasing air pollution such as smog. Pollutant emissions such as greenhouse gas emissions are a by-product of modern economic development (Nepal, al Irsyad, and Nepal 2019) and are often considered an inevitable outcome pertinent to the development of heavy industry and increasing use of automobiles (Guan, Zheng, and Zhong 2017;Guttikunda 2017), even at the cost of human health, the environment, and the future of tourism (Omoju 2014).
The prevalence of air pollution in recent years offered an unprecedented challenge to tourism research because recent statistics have violated the long-held assumption that air pollution decreases inbound tourists (Becken et al. 2017). In particular, based on the international tourism data from the World Bank (2019), countries with the worst air pollution are found to have the most promising tourism market with increasing numbers of inbound tourists in recent years, implying a possible spurious correlation between air pollution and tourism volume. Specifically, countries including Bangladesh, Mongolia, India, Indonesia, Bahrain, and China have been criticized as the most polluted countries (IQAir 2020), but the numbers of their inbound tourists have constantly increased in recent years, as shown in Appendix A1. In other words, the presence of air pollution coincides with an increase in inbound tourists.
This pattern contradicts the long-held assumption about the effect of air pollution on the tourism sector, and there is a paucity of empirical evidence to justify such an effect. In this study, we attempt to investigate the impact of air pollution on the revisiting behavior of individual tourists through the use of a large sample of customer reviews. Through this study, we aim to provide a more complete understanding of the relationship between air pollution and the revisiting behavior of travelers. To the best of our knowledge, there remains a research void with regard to quantifying the impact of air pollution on the revisiting behaviors of tourists.
Furthermore, through incorporating theories of destination image and country image, we investigated a possible extending effect of air pollution: the trip experience to a city may affect tourists' future visiting behavior to other cities of the country. Such an extending effect, to the best of our knowledge, has been seldom investigated in tourism literature. Understanding this extending effect would enrich country image literature in the tourism domain and unveil novel implications for the tourism industry.
Specifically, this study strives to understand how air pollution, especially smog, deters tourists from revisiting a city or a country. Smog has frequently plagued many Chinese cities and has been the prevalent air pollution issue, attracting wide public concern (Peng and Xiao 2018). In the current study, we analyzed the likelihood of international travelers revisiting China by delving into their reviews of Chinese hotels. A large-scale dataset encapsulating 269,847 TripAdvisor reviews of 5,142 hotels in 15 major Chinese cities posted by 181,698 travelers was collected and analyzed. We postulated that if tourists specifically mention air pollution in their review, their likelihood of revisiting the same country and city will drop significantly.
The remainder of the paper is structured as follows. In the next section, we review the literature on tourism experience and revisiting behavior. Next, we set hypotheses about the effects of air pollution on tourists' revisiting behavior. We then outline the methodological procedures and techniques for validating our hypothesized relationships. We conclude by presenting the results, highlighting both implications for theory and practice, and discussing the limitations of this study and future research directions.

Meteorological Factors and Tourism
Climate and weather, manifested through various meteorological factors, are interconnected with the tourism industry (Matzarakis 2006). Climate represents the average weather for a particular region over a time period, usually taken over 30 years, while weather or meteorological conditions are normally measured in terms of a specific day, hour, or minute (Shepherd, Shindell, and O'Carroll 2005). Ample studies have explained the vital role of meteorological factors, such as rainfall, sunshine, and temperature, in affecting tourism in several ways (Agnew and Palutikof 2006;Álvarez-Díaz and Rosselló-Nadal 2010;Rosselló-Nadal, Riera-Font, and Cárdenas 2011).
Special meteorological features are essential natural resources of a location promoting tourism (Smith 1993), and they define the "tourism potential" of the location (de Freitas 2003). For instance, warm and sunny weather is normally favorable for beach tourism (Moreno, Amelung, and Santamarta 2008;Rutty and Scott 2016). Adequate snow is mandatory for ski resorts (Gorman-Murray 2008;Hopkins 2015;Williams, Dossa, and Hunt 1997). Weather variables like temperature, wind, and snow depth were found to significantly affect various tourism outcomes, such as visitation to different tourism places (Becken 2013;Nicholls 2011, 2012;Shih, Nicholls, and Holecek 2009), tourist satisfaction with a destination (Vojtko et al. 2020), and tourism spending (Wilkins et al. 2018), even though urban tourists are more weather resilient (McKercher et al. 2015). For example, people travel to warm destinations to escape the cold of winter (Becken and Wilson 2013;Wall 2007). Therefore, climate, to a large extent, determines the attractiveness of a travel destination (Hu and Ritchie 1993).
Tourists often make their travel decisions based on the climatic conditions of a particular travel destination (Becken and Hay 2007). In a study by Hamilton and Lau (2005), most tourists who were surveyed accentuated climate as one of the most important factors when deciding on a travel destination. For example, nearly 60% of travelers tracked the weather in their travel destinations before departure. Esthetic aspects of climate and scenery also contribute to tourism experiences (Becken and Hay 2007). Specific weather conditions can add to the "uniqueness" of a tourism experience (Jeuring and Peters 2013). Keller et al. (2005) measured the association between weather and human psychological changes and found that pleasant meteorological factors improve people's mood and broaden their cognition. Likewise, meteorological factors should affect traveler mood. Damm et al. (2017, 31) stated that "under +2°C warming, the weather-induced risk of losses in winter overnight stays related to skiing tourism in Europe amounts to up to 10.1 million nights per winter season." Enjoyable climatic conditions are often used in advertisements to lure visitors (Gómez Martín 2005). The Cayman Islands claim a "perpetual summer," Florida is "The Sunshine State," and Barbados even offered a money-back "perfect weather guarantee" in 2009 to attract tourists (Scott, Lemieux, and Malone 2011, 116). These examples substantiate the importance of meteorological conditions in shaping the attractiveness of a tourism destination (Lohmann and Kaim 1999). While the benefits of favorable weather have been well documented, prior tourism literature has paid little attention to the potential deterrent effect of adverse meteorological conditions on travelers' decisions (Buckley 2012).
Given travelers' sensitivity to climate factors and the worldwide surge of adverse meteorological conditions, it is plausible to postulate that the emergence of unfavorable meteorological conditions can have a negative effect on the attractiveness of a tourism destination. We have summarized the reported effects of climatic or meteorological factors on tourism in Table A2.1.
Concerns regarding environmental factors have been well discussed in the tourism literature. In a well-cited work by Buckley (2012), population, peace, prosperity, pollution, and protection were identified as the key angles for understanding the sustainability of tourism. Williams and Ponsford (2009, 396) noted that tourism "depends on the protection of the ecological integrity of these features [environmental resources] for sustained competitiveness." In this vein, many past studies have focused on strategies for using extant resources effectively and on the industry's efforts to reduce pollution (e.g., hotel solid waste and wastewater) (Mai and Smith 2018;Nepal, al Irsyad, and Nepal 2019). Air quality, as an integral aspect of weather and climate, exerts a strong influence on the tourism industry. More recently, because the proliferation of air pollution has threatened the development of many tourism destinations, air pollution has attracted more attention. It has been shown that air pollution has a push effect on the outbound tourism in the local city (Wang, Fang, and Law 2018), adversely affects tourist arrivals (Churchill, Pan, and Paramati 2020), and even magnifies tourists' suspicion of service providers. The studies on the effects of air pollution on tourism are summarized in Table A2.2.
It is worth noting that the tourism sector is a victim of environmental pollution resulting from economic activities. Adverse weather conditions, disease outbreaks, and various forms of environmental pollution cumulatively underscore the importance of understanding these factors to achieve sustainable development of tourism (Williams and Ponsford 2009). A study of geotagged social media data from Beijing in 2013 found that tourists express fewer positive sentiments and mention more health issues in social media posts when air pollution increases. It has also been reported that perceived air pollution increases tourists' feelings of pessimism, which in turn brings about greater social suspicion of local service providers.
Evidently, quantifying the adverse effects of air pollution can offer important information to help government agencies understand the gains and losses of environmental pollution, which often results from the development of heavy industry and the transportation sector (e.g., Guan, Zheng, and Zhong 2017), albeit at the cost of the tourism industry. To the best of our knowledge, such studies are elusive. In this vein, a recent study by Zhang et al. (2020, 14) called for an effort to "monitor international tourists' experiences amidst air pollution to explore how such pollution influences tourists' destination loyalty and electronic word of mouth."

Air Pollution as a Risk to Tourist Safety
As a result of human activities, multiple adverse meteorological factors and even extreme weather conditions have surfaced and become prevalent, affecting humans, the environment, and the tourism experience (Jeuring and Becken 2013;Wang, Fang, and Law 2018). Human activity is a major cause of air pollution, which consists of harmful chemicals or particles in the air. Air pollutants take many forms, which can be gases, liquid droplets, or solid particles. Although not all pollutants in the air are perceptible, sometimes meteorological conditions interact with air pollutants to generate perceptible conditions. Smog, the so-called "smoky fog," is a portmanteau of "smoke" and "fog" (Allaby 2003). It has manifested as a severe environmental problem, especially for countries such as China and India (World Health Organization 2016).
According to a global assessment of ambient air pollution by the World Health Organization (WHO), air pollution has been identified as the biggest environmental risk to health, and it continues to rise at an alarming rate (WHO 2016). Air pollution is a health hazard that adversely affects people's health (Hughes 2012). Globally, three million deaths each year were attributable solely to outdoor ambient air pollution, mainly through non-communicable diseases (WHO 2016).
Worry about safety has been found to play a key role in choosing a tourist destination (Jeuring and Becken 2013;Larsen, Brun, and Øgaard 2009). Air pollution, as a hazardous meteorological condition, may stimulate negative affective responses such as uncertainty, fear, or worry (Griffin et al. 2004). Air pollution, like smog, may also cause respiratory and cardiac problems (Davis, Bell, and Fletcher 2002;Nemery, Hoet, and Nemmar 2001;WHO 2016), resulting in worries about health. Consequently, air pollution might affect tourists' destination choices.
After the Fukushima disaster, it was reported that the perception of physical risks, such as natural disasters and radioactive contamination of food and the environment, deterred repeat tourists from returning to Japan (Chew and Jahari 2014). Wang et al. (2018) found that poor local air quality pushes residents to outbound tours in pursuit of clean air. Tourists worry about the environmental deterioration of travel destinations. Pollution in travel attractions and unsatisfactory previous trips have been identified as deterrents to repeat tourism (Rittichainuwat and Chakraborty 2009). Chew and Jahari (2014) revealed that worries about health risks have a negative effect on tourists' intentions to revisit a travel destination.
Apart from health and safety concerns, smog and associated air pollution have become major hurdles to enticing visitors because smog could compromise tourists' travel experience. Air pollution may adversely affect tourists' travel experiences by reducing visibility. Denstadli and Jacobsen (2014) found that a reduction in visibility caused by weather elements negatively impacts tourists' intention to revisit. Poor weather was also found to negatively affect travel experience and tourist satisfaction due to unrealized tour expectations (Coghlan and Prideaux 2009) and travel changes (Becken and Wilson 2013). Therefore, air pollution may not only raise travelers' concerns about health, uncertainty, and worry but also deteriorate the travel experience and travel satisfaction, which may discourage tourists from revisiting a destination.
Even though tourism practitioners have frequently expressed concerns about the adverse influence of air pollution, this influence has remained difficult to quantify. For instance, India's toxic air was claimed to prompt visitors to defer or cancel trips to destinations such as Delhi, Agra, and Varanasi (Parkin 2019). Tourism practitioners have alleged that toxic air might have turned a large number of tourists away from Thailand (Saksornchai 2019), Indonesia, and Singapore (Ross 2019). However, the number of inbound tourists has continued to increase in these countries (World Bank 2019).
Economic growth may act as a confounding factor in the relationship between air pollution and tourism sector development. On the one hand, a country's economic development (e.g., in China and India) may be associated with greater pollution, such as air pollutant emissions produced by heavy industry and the transportation sector (Hao et al. 2018;Guan, Zheng, and Zhong 2017). On the other hand, economic growth also attracts more inbound business visitors, as it raises the fame of the country on a global scale and leads to better infrastructure (e.g., transport connectivity, travel facilities, etc.), thus attracting the attention of global tourists. As a result, quantifying the adverse effect of air pollution on the tourism sector is a challenging topic in the field.

Repeat Visit of Travelers and Air Pollution
Repeat visitors have been widely acknowledged as an appealing market segment for tourism practitioners. Not only are repeat visitors more habitual in visits (Oppermann 1998), but they are also more destination loyal (Oppermann 2000). Marketing costs for repeat patrons are six times lower than those for pursuing new customers, making repeat visitors particularly important for the tourism sector (Rosenberg and Czepiel 1984). Thus, repeat visitations are a desirable phenomenon for mature travel destinations (Huang and Hsu 2009). More than just a reliable revenue stream, repeat visitors also act as word-of-mouth channels that can attract potential tourists (Reid and Reid 1994). Losing repeat visitors can cause a severe negative chain effect not only on revenue but also on future development of the tourism industry. Although the importance of tourists' future behaviors has been indicated in a significant body of literature, studies in the field mainly rely on survey data with small sample sizes (Hu et al. 2019). Even though a handful of studies have investigated the association between air pollution and inbound tourist volume (Churchill, Pan, and Paramati 2020;Wang and Chen 2021), little is known about how air pollution affects the loyalty of foreign tourists, such as their revisiting behaviors to travel destinations. In other words, though big social data have been applied in research on various travel-related topics, there is a dearth of research that leverages the massive social and behavioral traces revealed by tourists online to probe their revisit behaviors.

Hypothesis Development
Frequent occurrences of air pollution can damage the image of a tourism location. Generally speaking, "destination image" refers to the holistic impression that an individual holds of a particular destination (Baloglu and McCleary 1999). In this study, we consider Chinese cities as destinations and view destination image at the city level, which is in line with extant literature (Becken 2013;Wang, Fang, and Law 2018). Destination image and an intention to revisit are very much linked (Baloglu and McCleary 1999;Li et al. 2010;Wang and Hsu 2010), indicating the positive impact a pleasing destination image has on tourist revisit intention. The quality of tourists' prior experience also influences their decision on revisiting particular attractions (Lehto, O'Leary, and Morrison 2004). Destinations delivering a satisfactory and memorable tourism experience can attract more repeated tourist patronage (Assaker, Vinzi, and O'Connor 2011;Kim, Ritchie, and Tung 2010;Tsai 2016;Zhang, Wu, and Buhalis 2018).
Meteorological conditions have been described as core attributes of tourism locations that contribute to the destination image, such as the image of a specific city (Gómez Martín 2005;Lohmann and Kaim 1999). Air pollution, as an ambient factor, may deteriorate the travel experience and damage the image of the destination. Air and water quality are among the factors deciding travelers' choices of destination (Jang and Wu 2006). Air pollution was also found to significantly reduce international inbound tourism demands as well as domestic tourist arrivals in the local city (Dong, Xu, and Wong 2019;Zhou et al. 2019). Anaman and Looi (2000) claimed that air pollution decreased the number of tourists to Brunei Darussalam. Thus, it is likely that if a tourist's visit to a travel destination (such as a city) is adversely affected by air pollution, the experience will also deteriorate the perceived image of the travel destination and, ultimately, reduce the likelihood that the tourist revisits that destination. Thus, we propose that:

Hypothesis 1: Air pollution negatively influences tourists' revisit behavior toward a travel destination.
Country image and travel destination image are concepts with substantial overlap (Mossberg and Kleppe 2005). "Travel destination image" refers to the image of a specific travel destination such as an attraction or city. Compared to the image of a travel destination, "country image" is a more comprehensive image that is placed on the highest level of the hierarchy and includes tourists' perception and evaluation of various aspects of a country, including history, geography, culture, resident hospitality, political maturity, economic and technological development, and environmental management (Zhang et al. 2016;Zhang, Wu, and Buhalis 2018). In marketing and consumer behavior literature, country image is usually considered the sum of beliefs and impressions people hold about a given country (Roth and Diamantopoulos 2009). Changes in individuals' perception of a city (e.g., for tourism) also alter their perception of the country (see Cubillo-Pinilla et al. 2017).
Damage to destination image may spread to the country level. Suffering from toxic smog may generate a negative attitude among tourists, which adversely affects their perceived destination image and country image. Psychological studies have also revealed a "horn effect" (also called the "reverse halo effect" or "devil effect") wherein an unfavorable reputation often invites further image damage through negative assumptions (Coombs and Holladay 2002;MacDougall et al. 2008) because of a tendency to maintain cognitive consistency (Freedman 1968;Holbrook 1983). Due to the horn effect, a negative first impression of a certain entity (e.g., a product, brand, or destination) can affect the evaluation of or attitude toward similar or associated entities and eclipse the excellence of other attributes (Dodds 2017;Nicolau, Mellinas, and Martín-Fuentes 2020). The horn effect or halo effect has long been an important theoretical basis to understand the development and outcome of country image (Han 1989).
Evidently, individuals' perception of a country may be largely attributable to their past tourism experiences in the cities that they have visited (see Martin and Eroglu 1993). Suffering from toxic smog may contribute to a persistent memory of visiting China for an individual. Such a memory may surface to affect decision-making when the individual is considering the destination for the next trip. In this vein, a deteriorated perception of a city would negatively affect an individual's image of the country, therefore reducing the chance for them to revisit the country, including other cities of the country.
The above-proposed effect resonates with marketing research on a mutual influence between product/destination experience and country image (e.g., Nebenzahl, Jaffe, and Lampert 1997). Country image can be established through a direct experience, such as visiting the country, or through an indirect experience like opinions gained from using products originating in a specific country (Nebenzahl, Jaffe, and Lampert 1997). A multitude of studies has documented that "consumers use the country images as a halo to infer their product evaluation" (Tse and Lee 1993, 27). "Consumers form images of countries that in turn influence their beliefs, and willingness to purchase products made in these countries" (Lala, Allred, and Chakraborty 2009, 51), including tourism products. In tourism research, destination image has been conceptualized with a strong association of country image, which affects tourists' intention to revisit (Mossberg and Kleppe 2005;Nadeau et al. 2008). In line with the above studies, we argue that the experience of visiting a city affects individuals' country image, which in turn affects the purchase of products from the country, including tourism services.
Furthermore, it is worth noting that smog often surfaces across a large region, affecting many cities simultaneously. Tourists who suffered from toxic air may take smog into account when determining the next tourism destination, such as by studying the air quality information of a specific Chinese city. In this vein, tourists who take air pollution into account are less likely to revisit China in comparison to those who have not experienced smoggy weather. Taken together, we assume that a negative perception of a city due to air pollution will likely introduce a negative impression of the country to which the city belongs. We postulate that:

Hypothesis 2: Tourists who are affected by air pollution during a prior trip are less likely to revisit China.

Data and Variables
To test the proposed hypotheses, we drew on a dataset of online hotel reviews from TripAdvisor generated before December 2019, comprising 269,847 reported trip experiences posted by 181,698 travelers on 5,142 hotels in 15 major Chinese cities: Beijing, Chengdu, Haikou, Hangzhou, Hefei, Jinan, Kunshan, Nanjing, Ningbo, Sanya, Shanghai, Shenzhen, Suzhou, Wuhan, and Wuxi. The selected cities have a relatively high level of economic development and geographically represent different regions of China. Evidently, economically developed cities normally have better tourism-related infrastructure and are more likely to attract international travelers for a number of reasons, such as business or leisure. In addition, to ensure the representativeness of the sample, we also considered cities of different sizes to reach a balance between big cities and small but rapidly developing cities. In line with past tourism studies (e.g., Shin, Perdue, and Pandelaere 2020;Stamolampros et al. 2019;Toral, Martínez-Torres, and Gonzalez-Rodriguez 2018), we utilized online customer reviews as a reliable source of data to investigate travelers' trip experience. Although travelers normally may not mention air pollution as an issue if it does not affect their trip experience, they would likely point it out in their review when air pollution, like smog, emerges to adversely affect their trip experience. For each hotel review, the ratings of both hotels and hotel attributes were collected. We also collected information about the reviewers, such as gender, age, level, contribution count, and review count. These variables are presented in Table 1.
We began by identifying English-speaking travelers who explicitly mentioned air pollution in their reviews to construct a treatment group. The rationale behind analyzing English reviews of Chinese hotels is that these reviews posted on TripAdvisor are mainly from international visitors. On the one hand, most Chinese living in China do not speak English, especially in daily life. On the other hand, TripAdvisor is not popular among Chinese domestic travelers: less than one percent of its total Internet traffic comes from China, and the site is mostly used by travelers from the USA, the UK, Poland, Canada, Germany, and other countries (Similarweb 2021). In contrast, Chinese travelers prefer using local platforms to make hotel bookings and post comments (Kapadia 2019). We queried the database with keywords in English, including "smog," "smoggy," "haze," "pollution," and "air quality," and identified an initial collection of 2,211 air-pollution-relevant reviews, excluding the reviews posted by anonymous users (n = 12). We used the shorter keyword "pollution" rather than "air pollution" to ensure the coverage of the extracted sample for further processing. One of the authors manually checked all 2,211 reviews to ensure that each retained review was related to air pollution. This resulted in 1,820 air-pollution-relevant reviews retained in the treatment group, including only the reviews that actually reported an adverse experience due to air pollution. A few travelers mentioned that they fortunately did not experience smoggy weather during the trip. For instance, one review read that "[. . .] we hit a great blue sky period with almost no smog [. . .]." These reviews were not labeled as air-pollution-relevant reviews. Figure 1 shows the number of posted reviews per year.
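The keyword screening described above can be sketched as follows. This is an illustrative Python sketch rather than the procedure actually run on the database: the keyword list is taken from the text, while the whole-word, case-insensitive matching rule, the function names, and the sample reviews are assumptions introduced for illustration.

```python
import re

# Keywords listed in the text; the matching rule below is an assumption.
KEYWORDS = ["smog", "smoggy", "haze", "pollution", "air quality"]

# Case-insensitive, whole-word pattern built from the keyword list.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def find_keywords(review_text):
    """Return the sorted, lowercased keywords found in one review."""
    return sorted({m.group(1).lower() for m in PATTERN.finditer(review_text)})


def candidate_reviews(reviews):
    """Keep reviews mentioning at least one keyword (candidates only)."""
    return [r for r in reviews if find_keywords(r["text"])]


# Hypothetical sample reviews, invented for illustration.
sample = [
    {"id": 1, "text": "Great hotel, but the smog was terrible all week."},
    {"id": 2, "text": "We hit a great blue sky period with almost no smog."},
    {"id": 3, "text": "Lovely breakfast and friendly staff."},
    {"id": 4, "text": "Air quality was poor; noise pollution too."},
]

candidates = candidate_reviews(sample)
print([r["id"] for r in candidates])
```

In this toy sample, reviews 2 and 4 show why a keyword hit is only a candidate: a review may report the absence of smog, and the broad keyword "pollution" also catches other kinds of pollution. Such cases are what the manual check is meant to resolve.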
Next, we quantified the likelihood of revisiting the same city among the 15 major Chinese cities by using travelers' later reviews as a proxy variable of revisit behavior. In the literature, ample studies have demonstrated the applicability of customer reviews, tweets, online orders, and payment card transactions in analyzing tourism demand and mobility patterns because obtaining actual data for these variables from individuals is difficult (Granados, Gupta, and Kauffman 2012;Hawelka et al. 2014;Hu et al. 2019;Sobolevsky et al. 2014;Wang, Fang, and Law 2018). Specifically, the existence of a later review has been used as a reliable proxy of revisit behavior (e.g., Hu et al. 2019). We conducted the analyses both at the country level and at the city level. For the country-level analysis of the likelihood that a traveler revisits China, a revisit was tied to the same person posting a later review of any hotel in China. At the city level, a revisit to a certain city was tied to the same user posting a later review of any hotel in the same city. Given the difficulty of collecting TripAdvisor reviews pertaining to all Chinese cities, we limited the reviews to those related to 15 major Chinese cities to quantify the likelihood of revisiting the country. The selected 15 Chinese cities are economically developed and/or famous tourist destinations. Tourists who visited these cities should represent an important portion of tourists visiting China.
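The revisit proxy can be illustrated with a short Python sketch. It assumes that each review carries a user identifier, a city, and a date, and that any review by the same user with a strictly later date counts as a later review; the field names, the function name, and the sample records are hypothetical, and how reviews from one trip are grouped is not specified here.

```python
from collections import defaultdict
from datetime import date


def revisit_flags(reviews):
    """Flag, for each review, whether the same user posted a later review
    in the same city (city level) or in any covered city (country level)."""
    by_user = defaultdict(list)
    for r in reviews:
        by_user[r["user"]].append(r)

    flags = {}
    for user_reviews in by_user.values():
        for r in user_reviews:
            # Assumption: "later" means a strictly later review date.
            later = [o for o in user_reviews if o["date"] > r["date"]]
            flags[r["id"]] = {
                "revisit_city": any(o["city"] == r["city"] for o in later),
                "revisit_country": bool(later),
            }
    return flags


# Hypothetical records, invented for illustration.
reviews = [
    {"id": 1, "user": "a", "city": "Beijing", "date": date(2015, 3, 1)},
    {"id": 2, "user": "a", "city": "Shanghai", "date": date(2016, 5, 1)},
    {"id": 3, "user": "a", "city": "Beijing", "date": date(2017, 1, 10)},
    {"id": 4, "user": "b", "city": "Sanya", "date": date(2018, 7, 4)},
]

flags = revisit_flags(reviews)
for review_id in sorted(flags):
    print(review_id, flags[review_id])
```

Here review 1 counts as followed by both a city-level and a country-level revisit, review 2 only by a country-level revisit, and reviews 3 and 4 by neither.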
Table 1 (excerpt). [Variable name missing]: The official class rating of a hotel. Travel type: The type of visit selected by users when posting reviews. Time of visit: The sequential order of a visit made by a traveler. *Natural logarithmic transformation was conducted to normalize the distribution (Greene 2003).

We contrasted the attributes of travelers who mentioned air pollution in their reviews with those who did not. Descriptive statistics and comparisons of all focal variables between the two groups are detailed in Table 2, including the travelers' demographics (e.g., gender and age), user profile (e.g., user level, hotel review count, etc.), visited city (15 major cities in China), and review characteristics (i.e., review length, days of availability, and overall rating). As shown in Table 2, almost all major attributes between travelers who mentioned air pollution or smog and those who did not are significantly different, indicating a risk of selection bias when travelers belonging to the two groups are compared directly. It is possible that travelers' personal characteristics and socioeconomic features of the destination, rather than air pollution, can determine their revisit decision. For example, economically developed cities tend to have more revisiting travelers and more salient air pollution, implying a selection bias. Therefore, in order to reliably construct control (i.e., travelers who did not mention air pollution issues) and treatment (i.e., travelers mentioning air pollution issues) groups, controlling for potential self-selection and endogeneity when exploring the effect of air pollution on tourists' revisit behaviors was necessary.

Propensity Score Matching
We employed propensity score matching (PSM) to control for potential selection bias. Briefly, PSM is a widely used statistical method that enables scholars to control the impact of selection bias and endogeneity by creating a statistical equivalence between the treatment and control groups by using observational data (Andrews et al. 2016; Austin 2007; Rishika et al. 2013). The method has been widely used across different disciplines such as economics (Lechner 2002), biology (Rosenbaum and Rubin 1983), medicine (Gum et al. 2001), information systems (Ma et al. 2014; Rishika et al. 2013; Susarla and Barua 2011), marketing (Andrews et al. 2016; Xu et al. 2017), and tourism research (Disegna, D'Urso, and Massari 2018; Falk 2017; Yang, Tan, and Li 2019). In the current study, we have 1,820 air-pollution-relevant reviews in comparison to a large share of 264,574 non-air-pollution-relevant reviews. In other words, air-pollution-relevant reviews make up only 0.683% of the whole sample. With PSM, the observational data becomes a quasi-experimental sample, mimicking controlled random experiments (Huang et al. 2012; Shadish, Cook, and Campbell 2002).
To generate comparable samples and improve the robustness of the subsequent analysis, we adopted PSM based on nearest-neighbor one-to-one matching without replacement (Caliendo and Kopeinig 2008). Because travelers' trip experience in the previous year (e.g., frequency of traveling last year) should not affect whether they experience air pollution on a trip, we adopted a static matching approach to calculate the propensity score (Xu et al. 2017), in line with past studies (Susarla and Barua 2011; Yang, Tan, and Li 2019). The analysis takes overall traveling experience and platform usage experience into account, such as the total number of cities visited and the number of reviews posted, which prevents the analysis from comparing new travelers with experienced ones. These procedures support the causal inference that the hypothesized differences in traveler revisit behavior were driven by the "smoggy experience" rather than by heterogeneity in travelers' attributes. The MatchIt package (Ho et al. 2011) for the R statistical software was used to perform PSM. After the effects of other observable covariates were accounted for using PSM, it was possible to test the effect of air pollution on revisit behavior. As shown in Table 3, the differences in the distribution of major attributes between the travelers in the control and treatment groups were effectively controlled. All major attributes were approximately identical after performing PSM, which implies successful control of selection bias. It is worth noting that, for both groups, we also controlled for the timing of the review being posted. Therefore, travelers in both groups had the same time span in which a revisit could be observed. Consequently, the pseudo-treatment, encountering air pollution issues during a visit, can be treated as exogenous, so that the effect on revisit behaviors can be attributed to air pollution (Rosenbaum and Rubin 1983; Rubin 2006).
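To make the matching step concrete, the following Python sketch illustrates greedy one-to-one nearest-neighbor matching without replacement on propensity scores. It is not the MatchIt implementation used in the study. It assumes the propensity scores have already been estimated (for instance, by a logistic regression of the treatment indicator on the covariates), and the function name, the optional `caliper` argument, the descending processing order, and the toy scores are all assumptions made for illustration.

```python
def match_nearest_neighbor(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a unit id to its propensity score.
    caliper: optional maximum score distance for an acceptable match
             (an illustrative option; the text does not mention one).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a match
    pairs = []
    # Treated units are processed from the highest score downward,
    # which is one common convention for greedy matching.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable control for this treated unit
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement: each control is used once
    return pairs


# Hypothetical propensity scores, for illustration only.
treated_scores = {"t1": 0.30, "t2": 0.62}
control_scores = {"c1": 0.10, "c2": 0.28, "c3": 0.35, "c4": 0.60, "c5": 0.90}
pairs = match_nearest_neighbor(treated_scores, control_scores)
print(pairs)
```

After matching, covariate balance between the matched groups would still need to be checked, as Table 3 does for the study's sample.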

Data Analysis and Results
Based on the matched samples, a t-test was conducted to compare the likelihood of travelers from the control and treatment groups to revisit the same travel destination. Among the 1,820 travelers who did not report air-pollution-related issues, 252 (13.846%) revisited the same city, while only 18 travelers (0.989%) in the treatment group showed revisit behaviors for the same city. The difference in the ratio of repeat visitors between the two groups is 12.857 percentage points (p < .001, see Table 4). By reducing the ratio of repeat visitors from 13.846% to 0.989%, air pollution would lead to a 92.857% loss of repeat visitors. This supports hypothesis 1, indicating that air pollution reduces a tourist's likelihood of revisiting a city.
We also conducted a t-test for revisiting travelers to the 15 focal Chinese cities and examined the influence of air pollution on tourist revisit behavior at the country level. In all, 532 out of 1,820 travelers (29.231%) who did not encounter air pollution issues revisited the country. By contrast, only 35 travelers (1.923%) visited the country again after encountering air pollution problems during a previous trip, 27.308 percentage points fewer than among those who did not encounter air pollution issues (p < .001, see Table 4). Of the 532 travelers who revisited China, 497 would not have been repeat visitors had they been affected by air pollution, a 93.421% loss of revisiting travelers. Thus, hypothesis 2 is also supported, confirming a negative impact of air pollution on tourists' revisit behavior to a country. As shown in Figure 2, travelers who were affected by air pollution during their previous trip were 92.857% and 93.421% less likely to revisit the city and the country, respectively, compared to those with smog-free experiences. The results strengthen the notion that air pollution deters tourists from revisiting a city and a country.
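The reported percentages follow directly from the matched counts, as the Python sketch below shows. It recomputes the revisit rates, their difference, and the relative loss of repeat visitors, and adds a two-sample t statistic. The text does not state which t-test variant was used, so the unequal-variance (Welch) form here is an assumption, as are the function and key names.

```python
import math


def compare_revisit_rates(n_control, revisits_control, n_treated, revisits_treated):
    """Compare revisit rates of a control group and a treatment group."""
    p_c = revisits_control / n_control
    p_t = revisits_treated / n_treated
    # Unbiased sample variance of a 0/1 revisit indicator.
    var_c = p_c * (1 - p_c) * n_control / (n_control - 1)
    var_t = p_t * (1 - p_t) * n_treated / (n_treated - 1)
    # Standard error of the difference in means (Welch form, assumed).
    se = math.sqrt(var_c / n_control + var_t / n_treated)
    return {
        "control_rate": p_c,
        "treated_rate": p_t,
        "difference": p_c - p_t,               # in proportion units
        "relative_loss": (p_c - p_t) / p_c,    # share of repeat visitors lost
        "t_statistic": (p_c - p_t) / se,
    }


# Counts reported in the text for the matched samples.
city = compare_revisit_rates(1820, 252, 1820, 18)
country = compare_revisit_rates(1820, 532, 1820, 35)

for label, res in (("city", city), ("country", country)):
    print(label,
          f"difference={100 * res['difference']:.3f} points",
          f"relative loss={100 * res['relative_loss']:.3f}%")
```

The relative loss equals (252 - 18) / 252 at the city level and (532 - 35) / 532 at the country level, which is where the 92.857% and 93.421% figures come from.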

Robustness Check
Post hoc robustness checks were conducted to evaluate whether our major results would change when applying alternative sample coverage and further analysis. The analysis above is based on comparing customers of different hotels. Even though we tried to control for major differences between hotels, there might have been factors that we failed to take into account. For instance, hotels whose customers wrote air-pollution-relevant reviews may happen to be those that were less interested than others in implementing customer loyalty programs. As a result, these hotels would have fewer revisiting customers. Another possibility is that these hotels may be more popular among travelers from a particular country who are less interested in revisiting a hotel.
To address these alternative explanations, we performed a robustness check by restricting our analysis to the travelers of those hotels that received air-pollution-related reviews. Because both groups are then drawn from the same hotels, such an analysis rules out the hotel-level explanations described above. In this vein, we identified 514 hotels where customers wrote about air pollution. A total of 137,048 travelers visited these hotels, and 135,228 of them did not mention air pollution in their reviews of the lodging experience. Among these 135,228 travelers, 27,821 (20.573%) visited the country later, whereas only 35 of the 1,820 travelers (1.923%) who reported air pollution issues demonstrated revisiting behavior (p < .001, see Table 5). These results are consistent with the findings obtained from our original approach, indicating the robustness of our results.
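The restriction used in this robustness check can be expressed as a simple filter, sketched below in Python. The record layout, field names, and toy records are hypothetical; only the idea of keeping travelers of hotels that received at least one air-pollution-related review comes from the text.

```python
def restricted_comparison(travelers):
    """Compare revisit rates within hotels that received pollution reviews.

    `travelers` is a list of dicts with the hypothetical keys "hotel",
    "mentions_pollution" (bool), and "revisited" (bool).
    """
    flagged_hotels = {t["hotel"] for t in travelers if t["mentions_pollution"]}
    # Keep only travelers who stayed at a flagged hotel.
    subset = [t for t in travelers if t["hotel"] in flagged_hotels]
    treated = [t for t in subset if t["mentions_pollution"]]
    control = [t for t in subset if not t["mentions_pollution"]]

    def rate(rows):
        return sum(r["revisited"] for r in rows) / len(rows) if rows else float("nan")

    return {
        "n_hotels": len(flagged_hotels),
        "n_control": len(control),
        "n_treated": len(treated),
        "control_rate": rate(control),
        "treated_rate": rate(treated),
    }


# Toy records for illustration only; hotel "h2" has no pollution review
# and is therefore excluded from the comparison.
travelers = [
    {"hotel": "h1", "mentions_pollution": True, "revisited": False},
    {"hotel": "h1", "mentions_pollution": False, "revisited": True},
    {"hotel": "h1", "mentions_pollution": False, "revisited": False},
    {"hotel": "h2", "mentions_pollution": False, "revisited": True},
]
result = restricted_comparison(travelers)
print(result)
```

Applied to the counts reported above, the same rate calculation gives 27,821 / 135,228 for the control group and 35 / 1,820 for the treatment group.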
Finally, we conducted a post hoc analysis to examine the impact of the purpose of the trip, especially business trips, on the likelihood of revisiting. Business travelers may have little choice but to return. However, it is still possible that a company assigns different employees to visit China, and an employee who had been in China during smoggy weather may skip the trip by letting a colleague travel instead. Chi-squared test results show that there is no significant difference among travel types (see Table 6).
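A chi-squared test of independence of the kind mentioned above can be sketched as follows. The travel-type labels and all counts are hypothetical placeholders chosen for illustration; they are not the values in Table 6, and the exact layout of the contingency table used in the study is an assumption.

```python
def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. one row per
    travel type with columns (revisited, did not revisit).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts (revisited, did not revisit) per travel type.
# These are NOT the values reported in Table 6.
observed = [
    [50, 1950],  # business
    [50, 2450],  # couples
    [30, 1470],  # family
    [20, 980],   # friends
    [10, 490],   # solo
]
stat, df = chi_squared_independence(observed)
CRITICAL_05_DF4 = 9.488  # chi-squared critical value for df = 4 at alpha = .05
print(f"chi2={stat:.3f}, df={df}, significant={stat > CRITICAL_05_DF4}")
```

With these placeholder counts the statistic falls below the critical value, which corresponds to the kind of non-significant result described in the text.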

Discussion and Implications
While past research has emphasized that revisiting or loyal customers offer much more business value than new customers do (Oppermann 1998, 2000), there is a lack of studies on the impact of air pollution on customers' revisiting behavior. In addition, even though the impact of environmental factors on the tourism industry has been well acknowledged, little is known about how air pollution, as an important environmental factor, influences people's traveling habits (cf. . The current research contributes to filling this gap through the analysis of large-scale customer review data. Specifically, our results indicate that travelers who encountered air pollution issues during their previous trips are 92.857% less likely to revisit a specific city and 93.421% less likely to revisit China during the studied period. In other words, the emergence of air pollution has a significant deterrent effect on customers' revisiting tendency. While the results are obtained by analyzing data collected from the Chinese tourism market, we believe that the findings are likely to generalize to other countries and cities with similar air pollution problems. Although we believe air pollution exerts a negative effect on revisit behavior, the degree of its adverse effects may vary in other tourism markets. For instance, the effect may be stronger or weaker at destinations with correspondingly more or less air pollution. Because many countries face the problem of air pollution, this research is widely applicable and of ongoing importance.
The novel coronavirus (COVID-19) pandemic has halted international travel and tourism, which may have long-term and profound influences that can alter the traveling habits of tourists in the coming decades. Nonetheless, we argue that the findings of this study would still hold after the pandemic, albeit derived from analyzing pre-COVID-19 data. Air pollution may have an intricate relationship with the impact of the pandemic, because air pollution may intensify the effect of the COVID-19 pandemic, due to a possible positive association between long-term exposure to ambient air pollution and COVID-19 mortality (Barnett-Itzhaki and Levi 2021). In this vein, air pollution may be a more long-standing issue than a pandemic. When making travel decisions, travelers' concerns over air pollution should remain even in the post-COVID-19 travel and tourism world.

Implications to Travel and Tourism Research
Our study contributes novel insights to the literature on several fronts. First, our study addresses the knowledge void regarding the effect of air pollution on travelers' revisiting behavior. While previous research has shown that air pollution hinders travelers from forming an intention to visit a country (e.g., Becken et al. 2017), our study demonstrates that those who have actually visited a country but experienced air pollution are substantially less likely to revisit the country, despite an increasing number of inbound travelers (see Appendix A1).
Second, our study contributes to a better understanding of air pollution's impact on the actual behavior of travelers. Whereas previous studies on the impact of environmental factors on the tourism industry have focused on surveying a limited number of travelers regarding perceptions and self-reported intentions, this study employed online reviews as a proxy to study the revisiting behavior of tourists. To the best of our knowledge, this study is among the first attempts to analyze big data of user-generated reviews pertaining to actual travel behavior in non-laboratory settings to understand the influence of environmental factors. Such an effort responds to the call for the use of big data to gain new insights into the tourism industry, where studies using big data analysis are relatively few (Bramwell et al. 2017). In addition, by applying PSM to big data, the study offers an example of how analytic methods such as PSM can be combined with big data to elicit new insights for travel and tourism research that traditional methods may not offer.
Third, our analysis (see Appendix A1) reveals a possible confounding effect of economic growth on both air pollution and inbound tourist volume that future research should pay attention to. In this vein, including the economic growth rate as a control variable in the analysis, together with methods like PSM, might address this confounding effect.

Implications for Practitioners
This study yields several practical implications for tourism practitioners. The empirical results highlight the negative impact of air pollution on inbound tourism and offer evidence that air quality is an important condition for the development of the tourism industry. Policymakers should be aware that, although pollution-associated economic development may bring about a short-term increase in inbound tourists, these tourists are much less likely to revisit a city or a country, leading to a long-term loss for the tourism industry. The findings encourage regulators to take an environment-friendly approach to economic development that complies with China's "Carbon Neutrality Target" (The State Council of The P.R.China 2020), which in turn can boost tourism economies in both the short and the long term.
Furthermore, hotel managers should note that those who have visited a hotel during a period of heavy air pollution are much less likely to revisit the hotel as well as the country in which the hotel is located. Losing repeat visitors results in not only the loss of a reliable revenue stream but also the loss of word-of-mouth channels that attract new tourists (Reid and Reid 1994). Adaptation measures, such as reduced prices and other interventions that encourage revisits, must be taken to mitigate the impact of adverse environmental problems, such as air pollution, on tourists' revisiting intentions (Atzori, Fyall, and Miller 2018). In daily operation, hotels should inform tourists about air quality and offer advice that helps customers organize their local itinerary so as to avoid a bad travel experience with smog. In addition, hotel managers may advise customers who visit in a season with bad air quality on the best time to revisit the city, when the air quality is good.
Moreover, hotel operators should also provide enhanced indoor air quality, for example by installing air purifiers, which customers may expect during smoggy days. If this expectation is not fulfilled, customers may complain. Such complaints were found in our reading of the customer reviews. For instance, a traveler wrote in a hotel review that "[. . .] it is sad that the management of this hotel don't look at the facility equipment to make sure more purified air flows to the rooms knowing very well the level of pollution in the city [. . .]." Offering more indoor entertainment activities and facilities can also be a good option for boosting customer satisfaction during periods of smoggy weather.

Limitations and Future Research
Despite the comprehensive analysis conducted in this study, it still has some noteworthy limitations that offer opportunities for future tourism studies. First, the study discerned whether travelers encountered air pollution issues during their visits by keyword matching. As a result, it is possible that travelers who experienced smog but did not mention it in their online reviews were included in the control group, thereby potentially reducing the difference between the control and treatment groups. Therefore, one can consider our result a relatively conservative estimate. The actual effect of air pollution on tourism may be even more severe than the reported results suggest. Second, the study chose to analyze hotel reviews rather than reviews of outdoor attractions, because the number of English reviews of Chinese outdoor attractions is much smaller than the number of English reviews of Chinese hotels. Furthermore, given the difficulty in collecting data, only 15 major Chinese cities were studied, limiting our view of the entire tourism market of China, which includes hundreds of cities. To the best of our efforts, we tried to cover the major Chinese cities while balancing big cities and small but rapidly developing ones and, at the same time, providing comprehensive geographical coverage. The dataset analyzed in the current investigation is sufficient to generate meaningful findings. Nonetheless, enlarging the sample by including more cities and outdoor attractions is encouraged to obtain improved results. Finally, the findings of this study are derived from pre-COVID-19 data. The result should still hold, as air pollution may remain a more persistent issue than the COVID-19 pandemic. However, concerns over air pollution and epidemic disease may jointly affect people's travel behavior. Therefore, a possible future direction would be to quantify the effects of COVID-19 and air pollution and investigate the roles they play in shaping the post-COVID-19 travel and tourism world. Nevertheless, the proposed approach provides quantified insights into the behavior of a larger number of travelers than prior work relying on surveys. Future studies can apply the proposed method to examine various types of factors that influence travelers' behavior in different countries.

Appendix A1
Appendix A2

Studies | Type of Effect | Major Findings
Agnew and Palutikof (2006) | Pre-trip effect | "Outbound flows of tourists are more responsive to climate variability of the preceding year, whereas domestic tourism is more responsive to variability within the year of travel" (p. 109).
Álvarez-Díaz and Rosselló-Nadal (2010) | Pre-trip effect | Incorporating meteorological variables can increase the predictive power of the model to estimate the number of tourist arrivals in the Balearic Islands by air from the UK.
Moreno, Amelung, and Santamarta (2008) | Pre-trip and during trip effect | High temperatures bring about higher beach visitation.
Rosselló-Nadal, Riera-Font, and Cárdenas (2011) | Pre-trip effect | "Meanwhile, more hours of sunshine duration in the last 2 months discouraged the British to travel abroad. Moreover, the days of air frost provoked an increased number of British passengers going abroad" (p. 287).
Becken and Wilson (2013) | During trip effect | The study "show a generally high level of changes made to trips, particularly in the less settled early summer season, and an interesting link with satisfaction" (p. 620).
Rutty and Scott (2016) | During trip effect | International tourists using beaches are more resilient to a broader range of weather conditions than are domestic beach users.
Damm et al. (2017) a | Pre-trip and during trip effect | "Under +2 °C warming, the weather-induced risk of losses in winter overnight stays related to skiing tourism in Europe amounts to up to 10.1 million nights per winter season" (p. 31).
a Temperature is expected to affect both travelers' destination decision and skiing experience during the trip. For instance, a traveler may decide to end the trip earlier than planned due to a lack of snow. In other words, the reported effect may arise both before and during the trip. A similar argument can be applied to the study of Moreno, Amelung, and Santamarta (2008).

Studies | Type of Effect | Major Findings
Deng, Li, and Ma (2017) | Pre-trip effect | Air pollution has significant negative impact on international tourists' arrivals in China. Pollution in neighboring regions also negatively affect international tourists' visit to the local provinces.
Wang, Fang, and Law (2018) | Pre-trip effect | The air quality has a push effect on the outbound tourism in the local city, the demand for outbound tourism with the diminishing air quality. The relationship between air quality and demand for outbound tourism is moderated by disposable income, the air quality has a lower impact on the outbound tourism demand for the people with a higher disposable income than ones with the lower disposable income.
Dong, Xu, and Wong (2019) | Pre-trip effect | Air pollution significantly decreases international inbound tourism. Specifically, an increase of PM10 concentration by 0.1 mg/m³ will cause a decline in the tourism receipts-to-local GDP ratio by 0.45% points.
[study not recoverable] | Pre-trip effect | Air pollution significantly decreases domestic arrivals in the local city. Specifically, an increase of PM2.5 concentration by one-unit in a city, the number of domestic tourists to the city declines by 0.7%.
Tang et al. (2019) | Pre-trip effect | In short-run, air pollution did not decrease in inbound tourist arrival in Beijing. However, in the longer run, air pollution was negatively related to the inbound tourist arrival in Beijing. "A 1% increase in AQI, on average, would decrease tourist arrivals a 2.7% from Japan, 2.8% from South Korea, 1.5% from Russia, 1.6% from Germany, 2.6% from the UK, and 2.1% from the USA" (p. 601).
Zhou et al. (2019) | Pre-trip effect | "Air pollution has a negative influence on tourism flows and that this effect is more pronounced for inbound than for domestic tourism" (p. 747). If PM10 concentration increases by one-unit, the domestic and inbound tourism arrivals decline by 0.19% and 0.33%, respectively.
Churchill, Pan, and Paramati (2020) | Pre-trip effect | Increased CO2 and PM2.5 have adverse effects on tourist arrival in both developed and developing economies. Specifically, a 1% increase in CO2 and PM2.5 emissions are associated with a 0.69% and 0.5% decrease in tourist arrivals, respectively. While the effect of CO2 on tourist arrival is stronger for developed countries, PM2.5 strongly affects the number of tourist arrivals in developing countries.
[study not recoverable] | During trip effect | Poor air quality increases the feeling of pessimism among the tourists. When the tourists perceived higher air pollution in the destination, they are more likely to be suspicious toward local service providers in comparison to the destinations where perceived pollution is lower.
Wang and Chen (2021) | Pre-trip effect | Deteriorating air quality, as measured by PM2.5, significantly reduces the arrival of both domestic and inbound tourists while affecting the arrival of inbound tourists more than the domestic ones.
a We considered a loss of inbound tourists as a pre-trip effect, as tourists decide not to visit a specific destination.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs
Wenjie Fan https://orcid.org/0000-0002-4525-1466
Wei Fan https://orcid.org/0000-0002-7415-0496