HOW AIR QUALITY AFFECTS HUMAN MOBILITY PATTERNS: AN EXPLORATORY ANALYSIS

ABSTRACT: Air quality is an important factor that people may consider when deciding when and where to go. To assess how much air quality affects human mobility patterns, air quality was measured using the air quality index (AQI), and human mobility patterns were measured by the travel volume and travel distance of shared bikes. Their correlation on weekdays and weekends, as well as in different administrative districts, was investigated using Spearman correlation analysis. A case study was conducted in Beijing, China using bike sharing data and air quality data from May 10 to 16, 2017. The results show that travel distance is more sensitive to air quality on weekdays, in districts such as Changping District (-0.20), Haidian District (-0.13) and Shunyi District (-0.12). The travel volume on weekdays is less sensitive to air quality due to commuting. The travel volume has a negative relationship with AQI on weekends. Fengtai District, Huairou District and Pinggu District are more susceptible to severe air quality, which leads to a reduction in bike travel distance. This work sheds light on the human-environment coupling mechanism and on promoting sustainable urban development.


INTRODUCTION
Understanding human mobility patterns plays a key role in a wide range of fields, such as urban planning (Kang et al., 2012; Liu et al., 2012), traffic forecasting (Peng et al., 2012), disease spreading (Bagrow et al., 2012), and location-based recommendation systems (Cheng et al., 2011). Air quality is a concern for all countries, as severe air pollution has adverse effects on human health. It is reported that only one percent of 500 cities in China meet the air quality standard set by the World Health Organization (WHO) (Peng et al., 2014). Some students were absent from school and workers stopped working due to deteriorating air conditions (Ma et al., 2019). To minimize the risk of diseases caused by severe air pollution, people are likely to consider air quality when engaging in outdoor activities. For example, if there is no hazardous alert for a week or more, the good air condition has little impact on human movement. In contrast, if bad air pollution happens frequently, people are more likely to participate in indoor activities instead of going outside for exercise or entertainment. Therefore, air quality can be considered one of the factors influencing human mobility patterns, especially in cities with severe air pollution.
Some existing studies explore the influence of external factors (e.g., weather) on human mobility patterns using logistic regression models (e.g., the Logit model) based on questionnaire data, which is time consuming and inefficient (Campbell et al., 2016). With the development of mobile phone and Internet technologies, a number of existing studies have investigated the impact of air quality on human mobility patterns using social media check-ins (Yan et al., 2019), social sensor data (Sagl et al., 2012), public transportation smart card data (Ma et al., 2017), subway passenger data (Wu et al., 2020), taxi data (Kang et al., 2019), mobile positioning records (Xu et al., 2021) and docked bike sharing data (Cao et al., 2019; Gebhart et al., 2014), whereas dockless bike sharing data has seldom been used for this purpose.
To understand how much air quality affects human mobility patterns, air quality was measured using the air quality index (AQI), and human mobility patterns were measured by the travel volume and travel distance of shared bikes. The correlation between them was quantified with the Spearman Correlation Coefficient (SCC). A case study was conducted in Beijing, China to investigate the impact of air quality on human mobility patterns from spatial and temporal perspectives.
The rest of the paper is organized as follows. Section 2 introduces the study area and data used in this study. In Section 3, the main methods used are presented. Section 4 analyzes and discusses the results, and some conclusions are drawn in Section 5.

Study Area
Beijing was the first city in China where shared bicycles appeared, and it started its bike-sharing program in 2014. By 2017, there were 15 shared bicycle companies operating in Beijing, with the peak number of bicycles reaching 2.35 million and the number of registered users reaching 11 million (Li, 2018). As shown in Figure 1, Beijing comprises 16 administrative districts, i.e., Dongcheng District, Xicheng District, Chaoyang District, Fengtai District, Shijingshan District, Haidian District, Shunyi District, Tongzhou District, Daxing District, Fangshan District, Mentougou District, Changping District, Pinggu District, Miyun District, Huairou District and Yanqing District. Yanqing District was excluded from the study area because the data available there is too limited for analysis.

Bike sharing data
The bike sharing data used in this study was obtained from the 2017 Mobike algorithm challenge (1). The data covers May 10 to May 16, 2017 and includes 1.8 million records, 430,000 bicycles, and 310,000 users. It contains fields such as order ID, user ID, bicycle ID, bicycle type, start time, start location, and destination location (see Table 1), providing useful information for the spatiotemporal analysis of human mobility patterns. The start location and destination location are originally encoded using Geohash, which can be converted to latitude and longitude automatically. More details regarding this conversion process are given in Section 3.1. The kernel density of the travel volume was calculated and visualized in Figure 1, showing a tendency to spread from the central area to the rural area. The darker the region, the more bike-sharing orders it has. Fengtai District, Chaoyang District, Haidian District, Dongcheng District, and Xicheng District are central areas where dense human movement occurs, and they have higher travel volume.
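To make the Geohash conversion mentioned above concrete, the following is a minimal, standalone sketch of a decoder. It is written for illustration only and is not the implementation used in the study; the function name `geohash_decode_center` is an assumption, and it returns the center of the Geohash cell as (longitude, latitude).

```python
# Standard Geohash base-32 alphabet (digits plus letters without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_decode_center(code):
    """Return the (longitude, latitude) of the center of a Geohash cell.

    Illustrative helper (hypothetical name), not part of any library.
    """
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    is_lon = True  # bits alternate between longitude and latitude, longitude first
    for ch in code.lower():
        value = _BASE32.index(ch)  # raises ValueError for invalid characters
        for shift in range(4, -1, -1):  # each character carries 5 bits
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (lon_lo + lon_hi) / 2, (lat_lo + lat_hi) / 2


# The three-character prefix "wx4" covers a cell in the Beijing area.
lon, lat = geohash_decode_center("wx4")
```

Longer codes narrow the cell further, so the decoded point is only as precise as the code length allows.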

Air quality data
There are 35 monitoring stations across Beijing, which are divided into 12 urban environmental monitoring stations, 11 suburban environmental monitoring stations, 7 control and regional stations, and 5 traffic pollution monitoring points. In this study, a spatial join was used to map the monitoring stations to each district. There are one or two monitoring stations located in each district (i.e., the red points in Figure 1), so the air quality condition of every district is monitored. The air quality data was collected from the Beijing Air Quality Historical Data website (2), and includes the name of each monitoring station, a pair of coordinates, and hourly concentrations of PM2.5, PM10, SO2, NO2 and O3, together with the AQI. As AQI is determined by the concentration values and sub-indices of various pollutants, we selected AQI as the indicator of air quality in this study. AQI ranging from 0 to 50 represents good air quality with little or no health impact. AQI ranging from 51 to 100 represents moderate air quality with acceptable health impact for most people. AQI ranging from 101 to 150 represents unhealthy air quality for sensitive groups such as children, the elderly, and people with respiratory diseases. AQI ranging from 151 to 200 represents unhealthy air quality for everyone, and everyone may begin to experience some adverse health problems. AQI ranging from 201 to 300 represents very unhealthy air quality, and everyone may experience more serious health effects. AQI over 300 represents hazardous air quality, and everyone should avoid all outdoor activities.
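The AQI bands listed above can be expressed as a small lookup. This is only an illustrative sketch: the function name `aqi_category` and the label strings are assumptions introduced here, while the thresholds follow the ranges stated in the text.

```python
def aqi_category(aqi):
    """Map an AQI value to the category described in the text.

    Hypothetical helper; label strings are chosen for illustration.
    """
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    # (upper bound inclusive, label) pairs in ascending order.
    bands = [
        (50, "good"),
        (100, "moderate"),
        (150, "unhealthy for sensitive groups"),
        (200, "unhealthy"),
        (300, "very unhealthy"),
    ]
    for upper, label in bands:
        if aqi <= upper:
            return label
    return "hazardous"


labels = [aqi_category(v) for v in (35, 120, 310)]
```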

METHODOLOGY
In this study, the travel volume and travel distance were calculated to indicate the human mobility patterns, and the average hourly AQI value of the monitoring stations within the district was used to represent the hourly air quality of the district. In order to explore how air quality affects human mobility patterns, a quantitative correlation analysis was conducted between AQI and travel volume as well as travel distance through computing the SCC.

Measuring human mobility patterns
As indicated above, human mobility patterns can be measured by travel volume and travel distance. The travel volume was computed by counting the number of orders that take place within a certain space and time. To obtain the actual locations of bike sharing orders, the start and destination locations, given as Geohash codes (see Table 1), were converted to pairs of longitude and latitude using the "geohash_decode" function in the Transbigdata Python package (1). To compute the travel distance, the longitude and latitude were further converted to X and Y in meters using the "get_distance" function in the Transbigdata Python package. The travel distance can then be computed from the returned coordinates as follows,

$$ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} $$

where $(x_1, y_1)$ is the coordinate of the start location, and $(x_2, y_2)$ is the coordinate of the destination location. The air quality was measured by the AQI value, which is commonly used as a comprehensive indicator that takes various pollutants into account.
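The two mobility measures described in this section can be sketched in a few lines of plain Python. The sketch assumes the coordinates have already been projected to meters, and the function names `travel_distance` and `hourly_travel_volume` are hypothetical helpers, not part of Transbigdata.

```python
import math
from collections import Counter
from datetime import datetime


def travel_distance(start_xy, end_xy):
    """Straight-line distance in meters between two projected points (x, y)."""
    return math.hypot(end_xy[0] - start_xy[0], end_xy[1] - start_xy[1])


def hourly_travel_volume(start_times):
    """Count orders per hour, keyed by the start time truncated to the hour."""
    return Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in start_times
    )


# Made-up example values, for illustration only.
d = travel_distance((0.0, 0.0), (300.0, 400.0))
volume = hourly_travel_volume([
    datetime(2017, 5, 10, 7, 5),
    datetime(2017, 5, 10, 7, 40),
    datetime(2017, 5, 10, 8, 1),
])
```

To obtain volumes per district, the same count can be taken separately for the orders whose start location falls in each district.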

Measuring air quality
The hourly air quality of a certain district was measured by averaging the AQI values collected by the monitoring stations within this district. Assuming there are M monitoring stations within district D, the air quality of the district in a given hour, $AQI_D$, can be computed as follows,

$$ AQI_D = \frac{1}{M} \sum_{m=1}^{M} a_m $$

where $a_1, a_2, \ldots, a_M$ are the AQI values collected by the 1st, 2nd, ..., M-th monitoring station, respectively.
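The district-level averaging described above can be sketched as a simple group-and-mean step. The input layout, the function name `district_hourly_aqi`, and the station and district labels in the example are assumptions made for illustration; they are not taken from the dataset.

```python
from collections import defaultdict


def district_hourly_aqi(readings, station_to_district):
    """Average station AQI values per (district, hour).

    readings: iterable of (station, hour, aqi) tuples.
    station_to_district: dict mapping each station name to its district.
    """
    groups = defaultdict(list)
    for station, hour, aqi in readings:
        groups[(station_to_district[station], hour)].append(aqi)
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical stations and values, for illustration only.
mapping = {"station_A": "district_X", "station_B": "district_X",
           "station_C": "district_Y"}
readings = [
    ("station_A", 8, 80.0),
    ("station_B", 8, 100.0),
    ("station_C", 8, 60.0),
]
hourly = district_hourly_aqi(readings, mapping)
```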

Computing SCC for spatiotemporal correlation analysis
The hourly travel volume and hourly travel distance were used as independent variables, and the AQI was used as the dependent variable, to quantitatively investigate the correlation between them by computing the SCC. SCC is a popular indicator for correlation analysis and is a non-parametric measure of the dependence between two variables. It is usually denoted by $\rho$, and its value ranges from -1 to 1, where 1 means that the two variables are perfectly positively correlated, -1 means that the two variables are perfectly negatively correlated, and 0 means that there is no correlation between the two variables. The SCC is calculated from the rank order of the original variables. Rank order refers to the position of each value after sorting the values from low to high. It is calculated as follows,

$$ \rho = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2 \sum_{i} (y_i - \bar{y})^2}} \tag{2} $$

where $x_i$ is the rank order of variable $X$ (i.e., travel distance and travel volume), $y_i$ is the rank order of variable $Y$ (i.e., AQI values), $\bar{x}$ and $\bar{y}$ are the means of $x$ and $y$, respectively, and $\rho$ is the correlation coefficient between variable $X$ and variable $Y$.
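As a minimal sketch of the rank-based calculation described above, the following pure-Python code ranks both series and then applies the correlation formula to the ranks. Assigning tied values the average of their ranks is an assumption made here, since the text does not say how ties were handled, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from low to high (1-based); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one series is constant
    return cov / math.sqrt(var_x * var_y)


# Made-up series, for illustration only.
rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

In practice, a library routine such as `scipy.stats.spearmanr` computes the same quantity and also reports a p-value.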
(1) https://transbigdata.readthedocs.io/

In this work, we use equation 2 to calculate the SCC. To better identify the difference between weekdays and weekends, we use equation 3 to calculate the SCC for weekdays and for weekends. It is calculated as follows,

$$ \bar{\rho} = \frac{1}{N} \sum_{i=1}^{N} \rho_i \tag{3} $$

where $\rho_i$ represents the SCC value of the i-th day and N is the number of days. N equals five for weekdays and two for weekends.
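The weekday and weekend averaging can be sketched as follows. The daily SCC values in the example are placeholders invented for illustration and are not results from this study; the function name `mean_scc_by_day_type` is likewise hypothetical. The dates match the study period, in which May 13 and 14, 2017 fall on a weekend.

```python
from datetime import date


def mean_scc_by_day_type(daily_scc):
    """Average daily SCC values separately for weekdays and weekends.

    daily_scc: dict mapping a datetime.date to that day's SCC value.
    """
    groups = {"weekday": [], "weekend": []}
    for day, scc in daily_scc.items():
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        key = "weekend" if day.weekday() >= 5 else "weekday"
        groups[key].append(scc)
    return {k: sum(v) / len(v) for k, v in groups.items() if v}


# Placeholder SCC values for May 10 to 16, 2017 (not the paper's results).
daily = {
    date(2017, 5, 10): 0.1,
    date(2017, 5, 11): 0.2,
    date(2017, 5, 12): 0.3,
    date(2017, 5, 13): -0.2,
    date(2017, 5, 14): -0.4,
    date(2017, 5, 15): 0.4,
    date(2017, 5, 16): 0.5,
}
means = mean_scc_by_day_type(daily)
```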

Spatiotemporal human mobility patterns
In this section, we analyzed human mobility patterns using the bike sharing data from spatial and temporal perspectives. With regard to the travel distance, we first carried out a statistical analysis, shown in Figure 2, to illustrate the overall distribution of the travel distance of all orders. It shows that the majority of the travel distances are within 5,000 meters. Additionally, the data was divided into weekdays and weekends to reflect the different spatiotemporal human mobility patterns during different time periods.
The hourly travel volume and hourly travel distance on weekdays and weekends are summarized using histograms in Figure 3. Figure 3 (a) reveals that the travel volume is very low from 11pm to 5am on weekdays, because people are resting during these hours. The travel volume increases rapidly between 7am and 8am and between 5pm and 6pm, aligning with the morning and evening peak hours. There is also a local peak at 12pm, which is potentially related to people going out for lunch or leisure activities. The peak-hour patterns do not exist on weekends, when the overall travel volume is significantly lower than on weekdays, which indicates that bicycle travel largely serves workday commuting activities. With regard to the travel distance on weekdays and weekends shown in Figure 3 (b), there is no significant difference among hours, except for two peaks from 7am to 8am and from 5pm to 6pm on weekdays, which coincide with the higher travel volume. This is consistent with the times at which people travel to and from work.
Furthermore, we investigated the spatial heterogeneity of human mobility patterns during morning peak hours (i.e., 7am to 9am on weekdays), evening peak hours (i.e., 5pm to 8pm on weekdays) and non-peak hours. The kernel density of travel volume and travel distance were computed and visualized in Figure 4. As shown in Figure 4 (d), (e) and (f), most of the travel distances in the central districts are less than two kilometers, which reflects that the bike sharing system provides a solution to the "first and last mile" problem. Longer travel distances during the morning peak hours appear in Daxing District and Tongzhou District, with an average distance of 3,374 meters. However, during the evening peak hours, the districts with longer travel distances shift to Chaoyang District and Haidian District, with an average distance of 3,402 meters, which is slightly more than that in the morning.

Air quality
Following the methodology in Section 3.2, we visualized the air quality, measured by the average AQI values by district, in Figure 5. It reveals that Tongzhou District, Daxing District, Fangshan District, Fengtai District and Xicheng District have higher AQI and poorer air quality during the morning peak hours on weekdays. During the evening peak hours on weekdays, the AQI is higher in Fangshan District and Mentougou District. During the non-peak hours, the AQI is the highest in Fangshan District. Overall, the air quality in Fangshan District is the poorest, regardless of the time period.

Correlation between spatiotemporal travel patterns and air quality
By computing the SCC values, we quantitatively analyzed the correlation between air quality and travel volume as well as travel distance from spatial and temporal perspectives. The results are summarized in Table 2. They show that on weekdays, the travel distance was negatively correlated with AQI in areas such as Changping District (-0.20), Haidian District (-0.13), Shunyi District (-0.12), Chaoyang District (-0.10), Fangshan District (-0.10) and Tongzhou District (-0.05). Most of these areas have a high demand for shared bikes. However, the travel volume on weekdays shows no consistent correlation with air quality across the administrative districts of Beijing. This is because people have to go to work regardless of the air quality on weekdays.
The travel volume has a negative relationship with AQI on weekends. The districts most affected by air quality on weekends are Changping District, Daxing District, Tongzhou District, Xicheng District, Chaoyang District, and Shijingshan District, with SCC values of -0.66, -0.58, -0.54, -0.42, -0.39, and -0.39, respectively. The negative correlation between AQI and travel volume may be due to people's reluctance to engage in outdoor activities during poor air quality. Huairou District and Pinggu District, two popular tourist destinations in Beijing where people tend to ride in spring, also follow this pattern, with SCC values of -0.17 and -0.16, respectively. In summary, the quantitative correlation analysis results suggest that air quality does influence people's activity patterns. As the air quality worsens and the AQI value rises, the travel volume decreases and the travel distance shortens, as people tend to shorten their bike trips or avoid going outside.

CONCLUSIONS
The travel distance and travel volume retrieved from bike sharing data were used to model human mobility patterns, and the AQI values were used to represent the air quality of a certain district during a specific time period. The SCC values were then computed to quantitatively analyze the impact of air quality on human mobility patterns across different administrative districts of Beijing on weekdays and weekends.
The results show that on weekdays, air quality has less effect on the travel volume, but worse air quality usually leads to shorter travel distance. This is because people have to follow their daily routes due to commuting demand. On weekends, people are more flexible in choosing their travel modes. Thus, the air quality significantly affects the travel volume and travel distance, especially in rural districts, e.g., Changping District and Huairou District. This work helps to understand the human-environment coupling mechanism and to promote the harmonious development of urban systems. In the future, multiple types of crowd-sourced data (e.g., social media and taxi data) can be fused to characterize human mobility patterns, and a longer series of data can be analyzed to improve the performance.

Figure 1. The study area of Beijing with the density distribution of travel volume.

Figure 4 shows that the travel volume density remains high in the central districts, including Chaoyang District, Haidian District, Xicheng District, Dongcheng District, and Fengtai District (see Figure 4 (a), (b) and (c)).

Figure 4. Distribution of travel volume density (a-c) and travel distance (d-f) during different time intervals.

Figure 5. Distribution of AQI during different time intervals.

Table 1. Samples of bike sharing data.

Table 2. SCC values measuring the impact of air quality on travel distance and travel volume.