Urban Impedance Computing Based on Check-In Records

Urban impedance is an important consideration in assessments of transportation and land-use systems. This work leverages check-in records obtained from mobile social networks to build a fine-grained but inexpensive urban impedance model. Check-in records and road networks are collected and used to calculate and adjust the various parameters of the model, including path length, number and angle of turns, number and direction of junctions, and population density. Check-in records can filter functional locations and supply the time factor, thereby providing excellent advantages over traditional models that do not employ this data type. The proposed model is more accurate than traditional impedance models, as verified by experiments using Sina Weibo data in Tianjin City.


Introduction
The rapid progress of urbanization has modernized many people's lives but also engendered several important issues, such as traffic congestion, high housing prices, and low neighborhood cohesion. Urban accessibility [1] is a fundamental measure of the benefits of urban life and answers how many destinations can be accessed within a given time. It is widely used to evaluate transportation and land-use systems. Increasing accessibility is essential for urban planning. Impedance, which is defined as the cost between the origin and destination locations, is an important indicator of urban accessibility.
Traditional impedance models require a large amount of supporting data, including the geographical distribution of public facilities, the traffic volume in different time periods, and the population within the study range [2]. These data are difficult to collect in many cities of developing countries. As the use of mobile social networks has grown immensely in recent years, a large amount of user mobility data has been generated. Mobile social networks allow a user to "check in" at POIs (points of interest), which creates an online record describing his/her current physical location, and to share this record with his/her friends. This new data source can be leveraged to build a fine-grained but inexpensive urban impedance model. The present paper explores this idea and makes the following main contributions:
(1) Selection and preparation of appropriate data sources for the impedance model: we extract city blocks from road networks, collect POIs and check-ins from mobile social networks, and match them correspondingly.
(2) Proposal and implementation of the impedance model: we use check-in records to filter POIs and supply the time factor and then calculate distances, turns, junctions, and population parameters after path planning.
(3) Visualization of impedance and its parameters in a real city.

International Journal of Distributed Sensor Networks

Related Work
In recent years, researchers have studied a number of urban accessibility models. Handy and Niemeier [3] proposed a utility model based on the discrete-choice model. The model assigns a value to each candidate destination in the region, and individuals are more likely to visit destinations with larger utilities. Wachs and Kumagai [4] proposed a cumulative opportunity model that considers the number of destinations in a certain range; unfortunately, this model ignores differences in opportunity points. Hansen [5] proposed a gravity model that considers the distance between centers and opportunity points and could also be used to analyze the market potential [6]. Accessibility is heavily influenced by the selection of the impedance function. Common impedance functions include exponential functions, Gaussian functions, and negative power functions. The choice of impedance function mainly depends on the applications of the model and the characteristics of the data used. Since their functions and parameters are simplified, these models tend to ignore the complexity of city roads and, therefore, show limited accuracy.
In research on impedance functions, the United States BPR (Bureau of Public Roads) developed a speed-flow model through regression analysis [7]. To overcome the large deviations that the BPR model produces when the traffic volume reaches its peak, Spiess [8] and Wang et al. [9] improved the impedance function. Based on queuing theory, Davidson [10] proposed a progressive impedance function that amends the parameters describing the relationship between travel time and distance. Wang et al. [11] improved the impedance model based on traffic flow by considering travel costs, time, hub locations, and other factors. Despite their benefits, however, as mentioned earlier, these traditional impedance models require a significant amount of supporting data.
Another research field related to urban accessibility is urban computing or smart city. Urban computing [12] aims to tackle urbanization issues through a process involving acquisition, integration, and analysis of data generated from various sources in urban spaces, such as sensors, devices, vehicles, buildings, and humans, to reflect traffic flows, human mobility, geographical information, and so forth. For example, Zheng et al. [13] described the underlying problems in Beijing's transportation network using the hypothesis that the connection between two regions cannot effectively support the traffic traveling between them, thereby resulting in a large volume, low speed, and high detour ratio. Fu et al. [14] predicted the rankings of residential real estate in a city at a future time according to their potential values inferred from a variety of data sources, such as human mobility data and urban geography, currently observed around the real estate. Chen et al. [15] leveraged a combination of location-based social networks and taxi GPS digital footprints to achieve personalized, interactive, and traffic-aware trip planning. Yu et al. [16] recommended personalized travel packages with multiple points of interest based on crowdsourced user footprints. These works are mostly application-oriented and can be embedded into and improved by our model.

Preliminary
Definition 1 (block center). Road networks divide the city area into polygonal blocks [17,18]. Let $\mathcal{V} = \{V_1, V_2, \ldots, V_{|\mathcal{V}|}\}$ indicate the centers of these blocks and let $|\mathcal{V}|$ indicate the number of blocks available. People and facilities in a block usually share the same transport environment; thus, block centers may be considered representative locations when analyzing the impedance of a block (we can compute the impedance for all locations, but doing so is time-consuming and unnecessary). The position of the center of a block can be easily obtained given its vertices and edges.
Definition 2 (function density). This term considers how many functional locations (i.e., POIs) can be reached from an object location (e.g., the center of a block) given an accessibility radius. The positions of POIs are indicated by longitude and latitude. The function density is displayed in formula (1):

$$\mathrm{Den}_i = \left|\{\, p_j : d(p_j, V_i) \le r,\ 1 \le j \le N \,\}\right|, \quad (1)$$

where $d(p_j, V_i)$ denotes the road network distance between a POI $p_j$ and the center of a block $V_i$, $r$ is the accessibility radius defined in terms of road network distance (because people travel along road networks and care about transport distances rather than absolute distances), and $N$ is the total number of POIs.
POIs, such as shops, restaurants, and tourist attractions, have business hours, for example, opening from 8:00 a.m. to 8:00 p.m. Thus, a time factor must be considered when calculating their impedance. Outside their business hours, these POIs are regarded as absent. We use time slots lasting two hours and classify days into weekdays or weekends. As a result, a total of 24 time intervals (12 slots for each of the two day types) should be considered. For each block center, the function density is adjusted according to formula (2):

$$\mathrm{Den}_{i,t} = \left|\{\, p_j : d(p_j, V_i) \le r,\ t \in \mathrm{time}(p_j),\ 1 \le j \le N \,\}\right|, \quad (2)$$

where $\mathrm{time}(p_j)$ denotes the business hours of a POI $p_j$, $t$ is a specific time slot, and the other variables are identical to those in formula (1). Only POIs with at least 10 check-ins are taken into account in this work.
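The time-adjusted function density can be illustrated with a minimal Python sketch. It assumes that the road network distance from the block center to each POI has already been computed; the `Poi` record, its field names, the `open_slots` encoding of business hours, and the numbering of the 24 intervals are all hypothetical choices made for illustration and do not come from the original implementation.

```python
from dataclasses import dataclass, field

MIN_CHECKINS = 10  # POIs with fewer check-ins are ignored (threshold from the text)


def time_interval(hour: int, is_weekend: bool) -> int:
    """Map an hour (0-23) and a day type to one of the 24 two-hour intervals.

    Intervals 0-11 are weekday slots and 12-23 are weekend slots
    (this numbering is an assumption made for illustration).
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    return hour // 2 + (12 if is_weekend else 0)


@dataclass
class Poi:
    distance_m: float   # road network distance to the block center, in meters
    checkins: int       # number of check-ins recorded at this POI
    open_slots: set = field(default_factory=set)  # intervals in which the POI is open


def function_density(pois, radius_m, interval=None):
    """Count POIs within the accessibility radius.

    If `interval` is given, only POIs open during that interval are counted.
    """
    count = 0
    for poi in pois:
        if poi.checkins < MIN_CHECKINS:
            continue  # too few check-ins to be a meaningful functional location
        if poi.distance_m > radius_m:
            continue  # outside the accessibility radius
        if interval is not None and interval not in poi.open_slots:
            continue  # closed during the requested interval
        count += 1
    return count
```

For example, `function_density(pois, 1000.0, time_interval(9, True))` would count the POIs within 1,000 meters that are open in the weekend slot covering 9:00 a.m.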

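Definition 3 (path). A path is a part of the road network that starts from an origin location and ends at a destination location. A path is described by five elements $\langle O, D, \mathbf{T}, \mathbf{J}, \mathbf{L} \rangle$: origin ($O$), destination ($D$), turns ($\mathbf{T}$), junctions ($\mathbf{J}$), and line segments among them ($\mathbf{L}$).
Consider $\mathbf{T} = \{\theta_1, \theta_2, \ldots, \theta_{|\mathbf{T}|}\}$, where $|\mathbf{T}|$ is the number of turns and $\theta_k$ is the angle of each turn. Consider $\mathbf{J} = \{\varphi_1, \varphi_2, \ldots, \varphi_{|\mathbf{J}|}\}$, where $|\mathbf{J}|$ is the number of junctions and $\varphi_k$ is the direction to take upon arriving at a junction, that is, turning right, turning left, or going straight. Consider $\mathbf{L} = \{l_1, l_2, \ldots, l_{|\mathbf{T} \cup \mathbf{J}|+1}\}$, where $|\mathbf{T} \cup \mathbf{J}|$ is the total number of turns and junctions and $l_k$ is the length of each line segment.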
We adopt the Baidu map traffic routing API [19] to obtain paths between block centers and POIs.
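One possible in-memory representation of a path as defined above is sketched below. The class name, field names, units, and the string encoding of junction directions are assumptions for illustration; they do not reflect the format returned by the routing API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Path:
    origin: Tuple[float, float]       # (longitude, latitude) of the block center
    destination: Tuple[float, float]  # (longitude, latitude) of the POI
    turns: List[float]                # angle of each turn, in degrees
    junctions: List[str]              # direction at each junction: "left", "right", or "straight"
    segments: List[float]             # length of each line segment, in meters

    def is_consistent(self) -> bool:
        """Check that len(L) == len(T) + len(J) + 1.

        Turns and junctions are treated as distinct points on the path,
        which is an assumption of this sketch.
        """
        return len(self.segments) == len(self.turns) + len(self.junctions) + 1

    def length(self) -> float:
        """Path length: the sum of the lengths of its line segments."""
        return sum(self.segments)
```

A path with two turns and one junction would therefore carry four segment lengths, and `length()` returns their sum.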

Modeling.
The overall impedance situation of a city can be understood through the impedances of its block centers, which are denoted by $\{I_1, I_2, \ldots, I_{|\mathcal{V}|}\}$, one for each of the $|\mathcal{V}|$ block centers. The larger the value of $I_i$, the larger the impedance of block center $V_i$.
Drawing lessons from existing models, we take both path parameters and population parameters into consideration. We also use check-in records to adjust these parameters. Check-in records can help our modeling procedure in two ways. First, check-in records can filter functional locations. Only locations with many or frequent check-ins are meaningful for impedance computing. Second, check-in records can supply the time factor. A location presents different impedances at different times. Time not only influences the function density (as shown in formula (2)) but also impacts the real-time population (which will be described in Section 3.3.4).
Three parameters are considered for paths, that is, path length ($L$), number and angle of turns ($T$), and number and direction of junctions ($J$), and one parameter is considered for population, that is, the population density ($P$). The impedance of the block center $V_i$ at the time interval $t$ is $I_{i,t} = \alpha L_{i,t} + \beta T_{i,t} + \gamma J_{i,t} + \delta P_{i,t}$, where $\alpha$, $\beta$, $\gamma$, and $\delta$ are the weights of the different parameters.
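The weighted combination can be written in a few lines of Python. This is only a sketch: the `Weights` container and its field names are hypothetical, and the text does not state the weight values or whether the four parameters are normalized before being combined, so any values passed in are placeholders.

```python
from typing import NamedTuple


class Weights(NamedTuple):
    length: float      # weight of the average path length
    turns: float       # weight of the average path turns
    junctions: float   # weight of the average path junctions
    population: float  # weight of the population density


def impedance(path_length, path_turns, path_junctions, population_density, w):
    """Weighted sum of the four parameters for one block center and time interval."""
    return (w.length * path_length
            + w.turns * path_turns
            + w.junctions * path_junctions
            + w.population * population_density)
```

Because the four parameters have different units, the weights (or a prior normalization step) determine how strongly each one influences the result.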
We use the CartoDB map tool [20] for visualization. Figure 1 shows the average path lengths of Tianjin's block centers (at a particular time interval). Blocks with darker colors represent blocks with longer average path lengths. A detailed description of the data will be provided in Section 4.

Number and Angle of Turns.
The path from the block center to a POI may contain turns. Obviously, more and sharper turns will slow down traffic. Thus, the number and angle of turns in a path must be taken into account in the urban impedance model. Baidu map traffic routing reports a very large number of slight turns. Because modeling all of these turns is time-consuming and unnecessary, angles less than 10 degrees are neglected. Traditional impedance models tend to ignore the effect of the segment between two turns. In practice, however, if only a short distance must be traveled from one turn to another, the overall passing time will be greater. The average of path turns (shorthand for the number and angle of turns; the same applies below) from a block center to all accessible POIs is taken as the second parameter of our model. In the formula for this parameter, $l_k$ is the length of the line segment between the $(k-1)$th turn and the $k$th turn. The function $\tan(\cdot)$ can show angle differences properly, and the function $\ln(\cdot)$ can restrict the value range. Figure 2 shows the average path turns of Tianjin's block centers.
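The 10-degree filter can be sketched as follows. The text only states that slight turns are neglected; merging the two segments on either side of a neglected turn into one segment is an assumption of this sketch, as are the function name and the simplification that the path contains turns only (no junctions).

```python
MIN_TURN_DEG = 10.0  # turns below this angle are neglected (threshold from the text)


def drop_slight_turns(angles, segments, min_angle=MIN_TURN_DEG):
    """Remove turns sharper than none at all but below `min_angle`.

    `angles[k]` is the turn between `segments[k]` and `segments[k + 1]`.
    Returns (kept_angles, merged_segments); the inputs are not modified.
    """
    if len(segments) != len(angles) + 1:
        raise ValueError("expected exactly one more segment than turns")
    kept = []
    merged = [segments[0]]
    for angle, seg in zip(angles, segments[1:]):
        if abs(angle) < min_angle:
            merged[-1] += seg   # slight turn: treat both segments as one
        else:
            kept.append(angle)  # real turn: keep it and start a new segment
            merged.append(seg)
    return kept, merged
```

Merging keeps the segment lengths between the remaining turns meaningful, which matters because the turn parameter depends on the distance traveled from one turn to the next.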

Number and Direction of Junctions.
Junctions on a path increase the passing time because people must wait for traffic lights to change and for pedestrians from other directions to cross roads. The more junctions on a path, the larger its impedance. Different directions at a junction present different waiting and passing times. Turning left, for example, costs more than going straight or turning right.
In the formula for this parameter, $l_k$ is the length of the line segment between the $(k-1)$th junction and the $k$th junction. Figure 3 shows the average path junctions of Tianjin's block centers.

Population Density.
The population density directly affects the traffic situation in a block; thus, its impact on impedance cannot be ignored. Different times of day show different population densities. Although some traditional impedance models also consider population, real-time population distribution data are difficult to obtain, and their precision cannot meet the demands of accurate calculations. Users' check-in records can properly represent the population distribution and can be easily obtained from mobile social networks. As different area sizes present different abilities to relieve traffic congestion, the population density of an approximately circular area determined by the block center and the accessibility radius is taken as the fourth parameter of our model: $P_{i,t} = \sum_{j=1}^{\mathrm{Den}_{i,t}} \mathrm{check}_{j,t} / S_{i,r}$, where $\mathrm{check}_{j,t}$ is the number of check-ins at POI $p_j$ and $S_{i,r}$ is the size of an approximately circular area with center $V_i$ and accessibility radius $r$ (simply calculated as $\pi r^2$ in this work). Figure 4 shows the population density of Tianjin's block centers (from 8:00 a.m. to 10:00 a.m. on a weekend).
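A minimal sketch of this parameter is given below, assuming the check-in counts of the accessible POIs for the chosen time interval have already been gathered into a list. The function name and the choice of meters as the unit are illustrative assumptions.

```python
import math


def population_density(checkins_in_range, radius_m):
    """Check-ins per square meter in the circular area around a block center.

    `checkins_in_range` holds the check-in count of every accessible POI
    for the time interval of interest; the area is taken as pi * r^2.
    """
    if radius_m <= 0:
        raise ValueError("radius must be positive")
    area = math.pi * radius_m ** 2
    return sum(checkins_in_range) / area
```

With a 1,000-meter radius the area is about 3.14 square kilometers, so the resulting values are small; rescaling to check-ins per square kilometer may be more convenient in practice.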

Experiments
The experimental data include road networks from the national data center and POIs and check-in records from Sina Weibo [21]. We use review records of POIs from Dianping [22] for cross-checking, as well as two traditional impedance models (the potential and utility models) for comparison. Real-time road conditions from the Baidu map are considered the ground truth. This paper describes the case of Tianjin (one of the four municipalities in China) as an example. Road networks divide the city area into 2,754 blocks. A total of 90,731 POIs and 533,006 check-in records were collected in 2014. The accessibility radius was set to 1,000 meters in terms of road network distance, and 357,681 paths had to be calculated.
Using the same impedance model (see formula (3)), we compute Tianjin's impedance according to Dianping data and display the results in Figure 6. Dianping data include 45,767 POIs and 311,794 review records (regarded as check-in records). For the same time interval, the results of the Dianping data are basically identical to those of the Weibo data. This finding shows that our model works robustly and steadily with different data sources.

Comparison with Other Models.
The utility and potential models are two common impedance models.
The utility model is based on the discrete-choice model. The basic idea of this model is that different facilities present different utilities to people. For example, a supermarket is more likely to be visited than a car shop. For each destination within the accessibility radius, a utility value is assigned to the origin. The greater the utility, the more likely the destination is to be visited. The impedance function of this model is described as a logarithmic sum: $I_i = \ln\left[\sum_{j=1}^{\mathrm{Den}_i} \exp(u_{i,j})\right]^{-1}$.
Real-time road conditions are a good reflection of impedance. We take real-time road conditions from the Baidu map as a reference to evaluate all three impedance models. Figure 7 shows the real-time road conditions in Tianjin at 9:00 a.m. on a Saturday. Here, green lines indicate clear roads, yellow lines indicate slow roads, and red lines indicate roads with traffic congestion. To achieve a more intuitive contrast among the results of the impedance models, we project the road conditions to the corresponding blocks. Figure 8 shows the projected result of road conditions in Tianjin (average from 8:00 a.m. to 10:00 a.m. on a weekend). A comparison of Figures 5 and 8 indicates nearly coincident results.
We now present a quantitative comparison to demonstrate which model yields results most similar to those of the reference (i.e., projected results of real road conditions). The Minkowski distance, a proper similarity criterion, is written as $\left(\sum_{i} |I_i - R_i|^{p}\right)^{1/p}$, where $I_i$ is the impedance of the block center $V_i$ computed by each model, $R_i$ is the road condition of block $V_i$, and $p$ is set to 2. Here, a smaller distance means more similarity. Notice that the time factor is considered for all models. Table 1 shows the similarity between each model considered in this work and the actual road conditions.
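The similarity criterion can be sketched in Python as follows. The function name is illustrative, and the sketch assumes that the model outputs and the projected road conditions are already expressed on comparable scales; how the values were scaled before comparison is not stated in the text.

```python
def minkowski_distance(model_values, reference_values, p=2):
    """Minkowski distance between per-block model impedances and road conditions.

    With p = 2 (the setting used in the text) this is the Euclidean distance.
    """
    if len(model_values) != len(reference_values):
        raise ValueError("both sequences must cover the same blocks")
    if p <= 0:
        raise ValueError("p must be positive")
    total = sum(abs(a - b) ** p for a, b in zip(model_values, reference_values))
    return total ** (1.0 / p)
```

A model whose per-block values match the reference exactly yields a distance of 0, and larger distances indicate less similarity.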
Table 1 reveals that our proposed model yields results with the most similarity to actual road conditions. The utility model performs the poorest among the models studied, and the potential model shows moderate accuracy. The poor performance of the traditional models may be explained as follows: the utility model only considers the attractiveness of the destination and ignores the path between origin and destination, while the potential model takes the distance from the origin to the destination into consideration but ignores other parameters, such as turns and junctions. Our proposed model combines path length, number and angle of turns, number and direction of junctions, and population density, thereby accurately depicting the city's impedance at different time periods.

Conclusions
Urban impedance is an important indicator to consider in assessments of transportation and land-use systems. However, traditional impedance models require extensive data collection, which is costly, but yield only coarse-grained results. The present work leverages check-in records obtained from mobile social networks to build a fine-grained but inexpensive urban impedance model. We use check-in records to adjust the path and population parameters of the model. Check-in records can filter functional locations and supply the time factor, which not only influences the function density but also impacts the real-time population of an area. Several experiments confirmed that our proposed model yields more accurate results than traditional impedance models.
In future research, we aim to take multiple-path factors into consideration and employ more types of data to obtain an improved impedance model.

3.3. Parameter Calculation
3.3.1. Path Length.
Path length (i.e., road network distance), the most important component of the impedance model, refers to the sum of the lengths of the line segments in a path. The average of all path lengths from a block center to all accessible POIs is taken as the first parameter of our model: $l_{i,j} = \sum_{k=1}^{|\mathbf{T} \cup \mathbf{J}|+1} l_k$, where $l_k \in \mathbf{L}$ of the path from $V_i$ to $p_j$, and $L_{i,t} = \sum_{j=1}^{\mathrm{Den}_{i,t}} l_{i,j} / \mathrm{Den}_{i,t}$.
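As a small illustration of this parameter, the following sketch averages path lengths over the accessible POIs of one block center. Each path is represented simply as a list of segment lengths, and returning `None` when no POI is accessible is a choice made for this sketch, since the text does not say how that case is handled.

```python
def path_length(segments):
    """Length of one path: the sum of its line segment lengths."""
    return sum(segments)


def average_path_length(paths):
    """Average path length from a block center to all accessible POIs.

    `paths` holds one list of segment lengths per accessible POI.
    Returns None if there are no accessible POIs (an assumption of this sketch).
    """
    if not paths:
        return None
    return sum(path_length(segments) for segments in paths) / len(paths)
```

For two paths of 300 and 200 meters the function returns 250.0.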

Figure 2: Average path turns of Tianjin's block centers.

Figure 7: Real-time road conditions in Tianjin at 9:00 a.m. on a Saturday.

In the utility model, $I_i = \ln\left[\sum_{j=1}^{\mathrm{Den}_i} \exp(u_{i,j})\right]^{-1}$, where $u_{i,j}$ is the utility value of $V_i$ assigned by $p_j$. We use the number of ODs (an OD is a pair of check-in records left by the same user in a day) as the utility value.
The potential model is based on Newton's gravity model. Here, two factors are considered: the attractiveness of the destination, usually indicated by the population or facility density at the destination, and the distance decay. A typical expression of the potential model is $I_i = \sum_{j=1}^{\mathrm{Den}_i} \left(\exp(2 d_{i,j}) / \mathrm{check}_j\right)$, where $d_{i,j}$ is the distance from $V_i$ to $p_j$ and $p_j$'s check-in number $\mathrm{check}_j$ indicates the attractiveness of $p_j$.

Table 1: Similarity between each model and road condition (weekends).