Total-delay-based Max Pressure: A Max Pressure Algorithm Considering Delay Equity

This paper proposes a novel decentralized signal control algorithm that seeks to improve traffic delay equity, measured as the variation of delay experienced by individual vehicles. The proposed method extends the recently developed delay-based max pressure (MP) algorithm by using the sum of cumulative delay experienced by all vehicles that joined a given link as the metric for weight calculation. Doing so ensures that movements with lower traffic loads have a higher chance of being served as their delay increases. Three existing MP models are used as baselines against which the proposed algorithm is compared in microscopic simulations of both a single intersection and a grid network. The results indicate that the proposed algorithm can improve delay equity under various traffic conditions, especially for highly unbalanced traffic flows. Moreover, this improvement in delay equity does not come with a significant increase in the average delay experienced by all vehicles. In fact, the average delay from the proposed algorithm is close to, and sometimes even lower than, that of the baseline models. Therefore, the proposed algorithm can serve both objectives at the same time. In addition, the performance of the proposed control strategy was tested in a connected vehicle environment. The results show that the proposed algorithm outperforms the baseline models in both reducing traffic delay and increasing delay equity when the penetration rate is less than or equal to 60%, a level that would not be exceeded in practice in the near future.

Traffic signals are one of the most widely used tools to manage conflicting vehicle movements at intersections. However, they are also the main bottleneck that impedes network mobility, since they directly stop vehicles and impose travel delays (1). Therefore, proper design of signal timing is required to ensure efficient traffic operations at signalized intersections. Adaptive signal control methods use real-time traffic measurements to predict how traffic patterns will evolve and then optimize the signal timing based on prevailing patterns and/or these predictions. They have been studied extensively in the past decades and are regarded as a promising technique to improve network mobility (2)(3)(4)(5). However, adaptive control remains a challenging research topic, especially for network-wide signal control problems, because of its complexity. One difficulty is to accurately model the spatiotemporal interdependence of performance across intersections. Another is the computational speed required to solve a complex optimization problem in real time. Decentralized traffic signal control algorithms address both concerns, since they optimize signal timings for each intersection based only on local traffic conditions. Many decentralized traffic signal control approaches have been proposed in the past decades, such as OPAC (6), RHODES (2), SCOOT (7), and so forth. A comparison between centralized and decentralized control approaches can be found in Chow et al. (8).
The max pressure (MP) signal control algorithm, also known as backpressure (BP), is a decentralized optimization algorithm that was originally proposed for the routing and scheduling of packet transmission in a wireless network (9). Varaiya (10) was the first to implement MP in traffic signal control problems; we refer to that algorithm as the Original-MP in the rest of this paper. The MP algorithm requires that the turning ratios and saturation flows at an intersection be known, but it does not require any knowledge of demand. Because of its ease of implementation and decentralized nature, numerous studies have proposed and/or tested variants of MP in signalized street networks; see Dixit et al. (11) and Gregoire et al. (12), among others. MP algorithms generally proceed in four steps:
1. A selected metric is measured for each vehicle movement at an intersection at each instance when a signal needs to be updated. Metrics that are commonly used include queue length, travel time, and traffic delay.
2. The weight of each movement is computed as this measured metric minus the average value of this metric for all downstream movements.
3. The pressure of a signal phase is computed as the sum of the weights multiplied by the saturation flows of the movements served during that phase.
4. The pressure is then used to change the signal phasing and timing plan. For cycle-less implementation, the phase with the maximum pressure is activated for the next time step. When a specific cycle length is to be maintained, the phase splits for the next cycle are determined based on the proportion of the pressures.
The control performance depends strongly on the selection of the metric. The Original-MP algorithm uses "queue length" as the metric for weight calculation. However, it also uses point queue models to represent the traffic state on links and store-and-forward models to depict the vehicle transitions between links. Since vehicles join the point queue immediately once they enter the link, this model does not track the positions and moving status of vehicles. Therefore, it should be emphasized that the "queue length" in the Original-MP algorithm represents the number of vehicles on the link as opposed to the number of vehicles stopped in a queue. In addition, it assumes the queue storage capacity is infinite, which is problematic when the traffic volume is high. To address these issues, many queue-based MP algorithm variants have been proposed. Xiao et al. (14) proposed a pressure releasing policy (PRP) that considers both the local queue length and the queues on the entry links. The latter was used to adjust the weight while accounting for queue capacity. Another method is to normalize queue lengths using queue capacities (18,25). Gregoire et al. (12) used a convex function of queue capacity to normalize the queue length, which outperforms the Original-MP algorithm under heavy traffic conditions. Although these methods normalize or adjust the number of vehicles for the weight calculation to reduce queue spillovers, one drawback is that vehicles' positions and moving status are not considered. For example, a vehicle stopped in a queue close to the stop line should weigh more than a vehicle moving at free flow speed far away from the intersection. To address this, Li and Jabari (21) proposed a position weighted backpressure (PWBP) algorithm that uses the sum of normalized distance over all vehicles on a link to define the weight of a movement.
For an upstream movement, the normalized distance is the actual distance from the upstream end of the link divided by the link length, which means a vehicle closer to the intersection has a higher weight. For the downstream movement, the actual distance from the downstream end of the link is normalized. This method generates a low weight if the downstream vehicles are closer to the intersection and thus can reduce the occurrence of queue spillover. Most of the MP algorithm variants switch phases at a pre-defined frequency. Levin et al. (20) pointed out that the loss of a regular cycle pattern may not be preferable, since it can confuse travelers. Therefore, that study proposed a model with a fixed phase sequence and maximum cycle length. Other examples of cycle-based MP algorithms are given in Dixit et al. (11) and Mercader et al. (23).
While the metric plays an important role in the control performance, the ability to measure this metric also influences its use. Mercader et al. (23) argued that queue length can be difficult to obtain in practice and instead proposed a MP algorithm based on vehicular travel time in the previous cycle. The model was validated in a field experiment. Since minimization of travel delay is one of the most commonly used objectives in traffic signal control problems, delay-based MP (D-MP) algorithms have also been proposed. Moreover, delay is inherently capacity-aware, that is, the marginal delay increases with queue length. Wu et al. (13) developed a D-MP algorithm using head-of-line (HOL) information, but the proposed model only works for isolated intersections. Dixit et al. (11) proposed a parsimonious D-MP and showed that high-quality real-time delay data could be obtained at a lower cost than the physical sensors required for queue measurement. However, this model cannot relate the queue lengths and delay at an arbitrary time. More recently, Liu and Gayah (15) proposed a D-MP that can overcome this drawback. The model divides vehicles on a link into two groups: moving vehicles (at free flow speed) and stopped vehicles. Then, the delay incurred in one time step is equal to the number of stopped vehicles multiplied by the time step size. It has been mathematically proven that the model retains the most desirable property of the Original-MP algorithm: maximum stability, which means a demand can be accommodated by this algorithm if it can be accommodated by any control policy. Simulation results show that the D-MP algorithm performs well under various traffic conditions, including partially connected vehicle (CV) environments.
To the best of our knowledge, none of the aforementioned models has considered delay equity, measured as the variation of delay experienced among all individual vehicles. This is another common signal control objective that has been widely studied; see Liang et al. (26), Hitchcock and Gayah (27), and Lertworawanich et al. (28). The lone exception is Wu et al. (13), which is only applicable to isolated intersections. Moreover, while the HOL model improves fairness compared to the queue-based MP algorithm, it also leads to longer queue lengths. This tradeoff between fairness and queue length can be adjusted by using a weighted sum of the HOL term and queue term as the weight; however, there is no systematic method to find the optimal coefficients in the weighted sum. In light of this gap in the literature, we propose here a delay-based model that improves both delay equity and average delay under various traffic conditions. The performance of the proposed algorithm, which is referred to as TD-MP (total-delay-based MP), is compared to three benchmark models: the Original-MP, D-MP, and PWBP. The paper shows that the proposed algorithm has the lowest variation in travel delay under all tested scenarios, and it has even lower average delay than the D-MP algorithm under most scenarios. The performance of all methods is also tested and compared in a CV environment in which a subset of vehicles is monitored and can contribute to estimates of the MP metrics. The results show that the TD-MP algorithm is more stable than the baseline models, and it generates both lower traffic delay and higher equity when the penetration rate is below 60%.
The rest of this paper is organized as follows. The next section introduces the general form of a MP algorithm and shows the proposed algorithm along with the baseline algorithms. The subsequent section conducts the simulation and shows the comparison of the control performance between the proposed algorithm and baseline models under various scenarios. We conclude in the last section.

Methods
This section first introduces the general form of a MP algorithm, focusing on the definitions of weight and pressure. Then, the baseline models are described, with emphasis on how they differ. Next, we propose the novel MP algorithm, TD-MP, that can be used to improve delay equity.
Useful notation for the network configuration in this paper is defined as follows. A link pair $(l, m)$ that allows vehicle transitions at an intersection is called a movement. The set of link pairs served by phase $p$ at intersection $i$ is denoted by $L_i^p$; $r(l, m)$ is the turning ratio from link $l$ to link $m$; $O(l)$ is the set of allowed movements from link $l$; $\mathcal{P}_i$ is the set of phases at intersection $i$; and $V(l, m)$ is the set of vehicles of movement $(l, m)$.

General Form of a MP Algorithm
In all MP algorithms, the signal timing is determined by the pressure of all phases. In a cycle-less algorithm, the phase with the maximum pressure is activated for the next step, while in a cycle-based algorithm, the effective green time is allocated according to the proportion of the pressures. The proposed algorithm is cycle-less, so we focus on this type in the following. The general form for the pressure of a phase, $P_i^p$, can be expressed as follows:
$$P_i^p = \sum_{(l,m) \in L_i^p} w(l,m)\, s(l,m) \left(1 - \delta(l,m)\,\frac{\epsilon}{T}\right) \tag{1}$$
where $w(l, m)$ is the weight of movement $(l, m)$, $T$ is the pre-defined time step size for the signal update, $\epsilon$ is the lost time between phase switches, $\delta(l, m)$ is 0 if movement $(l, m)$ is currently served and 1 otherwise, and $s(l, m)$ is the saturation flow of movement $(l, m)$. Equation 1 shows that the pressure of a phase is a linear combination of the products of the weight and saturation flow associated with each movement. The term $1 - \delta(l,m)\,\epsilon/T$ accounts for the lost time between phase switches: it equals 1 for a movement that is currently served and $(T - \epsilon)/T$ otherwise. The pressure is dependent on time $t$; however, we omit it from Equation 1 for simplicity. Note that the time step for the signal update, $T$, differs from the time step for the traffic state update, which is usually smaller. After obtaining the pressure, the phase with the largest pressure is activated for the next time step:
$$p_i^* = \arg\max_{p \in \mathcal{P}_i} P_i^p \tag{2}$$
All cycle-less MP variants have the same form for the pressure calculation; they only differ in the computation of the weight associated with each of the movements. The general form for the weight of a movement can be expressed as follows:
$$w(l,m) = x_1(l,m) - \sum_{(m,n) \in O(m)} r(m,n)\, x_2(m,n) \tag{3}$$
where $x_1(l, m)$ and $x_2(m, n)$ are the upstream and downstream values for the selected metric, respectively, for example, number of vehicles, travel time, traffic delay, and so forth. The second term in Equation 3 is the weighted average metric value for the downstream movements using turning ratios. Its main function is to reduce the chance of activating the phase if the downstream is congested.
For example, the Original-MP algorithm uses the number of vehicles as the metric; if $x_2(m, n)$ is high, the downstream links are congested, so it might not be optimal to activate the corresponding phase. Another function is that a reasonable design of $x_1$ and $x_2$ ensures maximum stability. A control policy has maximum stability if it can accommodate any demand that can be accommodated by any other control policy, where accommodation means that the number of vehicles in the network always remains bounded under the control policy. This property has been proved for certain models, for example, the Original-MP (10) and D-MP (15), under the assumption of infinite queue capacity. Note that maximum stability is still an open question if the queue capacity is assumed to be finite. Readers interested in the details of these proofs are referred to Varaiya (10) and Liu and Gayah (15). In general, $x_1$ and $x_2$ have the same form, but this is not required. The PWBP and the proposed model both use different expressions for the two terms, which will be explained later.
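The general form described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the dictionary-based data layout, the function names, and the link labels are all hypothetical, and the lost-time factor is assumed to scale a movement that is not currently served by $(T - \epsilon)/T$ and to leave a currently served movement unscaled.

```python
from typing import Dict, List, Set, Tuple

Movement = Tuple[str, str]  # (upstream link l, downstream link m); hypothetical layout


def weight(mv: Movement,
           x1: Dict[Movement, float],
           x2: Dict[Movement, float],
           turning: Dict[Movement, float],
           outgoing: Dict[str, List[str]]) -> float:
    """Upstream metric minus the turning-ratio-weighted downstream metric."""
    _, m = mv
    downstream = sum(turning[(m, n)] * x2[(m, n)] for n in outgoing.get(m, []))
    return x1[mv] - downstream


def pressure(movements: List[Movement],
             w: Dict[Movement, float],
             sat: Dict[Movement, float],
             served_now: Set[Movement],
             T: float,
             eps: float) -> float:
    """Sum of weight times saturation flow, discounted for lost time."""
    total = 0.0
    for mv in movements:
        delta = 0 if mv in served_now else 1  # 0 if currently served, 1 otherwise
        total += w[mv] * sat[mv] * (1 - delta * eps / T)  # assumed lost-time factor
    return total


def select_phase(phases: Dict[str, List[Movement]],
                 w: Dict[Movement, float],
                 sat: Dict[Movement, float],
                 served_now: Set[Movement],
                 T: float,
                 eps: float) -> str:
    """Cycle-less rule: activate the phase with the largest pressure."""
    return max(phases, key=lambda p: pressure(phases[p], w, sat, served_now, T, eps))


# Two phases with one movement each; downstream links are treated as sinks.
phases = {"NS": [("n_in", "s_out")], "EW": [("w_in", "e_out")]}
w = {("n_in", "s_out"): 4.0, ("w_in", "e_out"): 3.0}
sat = {mv: 0.5 for mv in w}
served_now = {("w_in", "e_out")}  # EW is green right now

print(select_phase(phases, w, sat, served_now, T=5, eps=0))  # NS
print(select_phase(phases, w, sat, served_now, T=5, eps=3))  # EW
```

With no lost time the controller switches to NS because its weight is larger; with 3 s of lost time in a 5 s step the switch is no longer worthwhile and EW stays green.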

Baseline Models
This section shows the expressions of weight of the three baseline models: the Original-MP, PWBP, and D-MP. All three models are cycle-less, so we only focus on the weight calculation.
Original-MP. The Original-MP algorithm uses point queue models for the traffic states, and the number of vehicles of each movement is used to calculate the weight in Equation 3, which ignores the locations and moving status of vehicles. It can be expressed as follows:
$$w(l,m) = n(l,m) - \sum_{(m,n') \in O(m)} r(m,n')\, n(m,n') \tag{4}$$
where $n(l, m)$ is the number of vehicles of movement $(l, m)$. This model can lead to unreasonable decisions in some scenarios. For example, Figure 1 shows an intersection of two one-way streets with only straight-through movements. Although phase 1 has more vehicles on its associated link than phase 2 (four on the former, three on the latter), it is more reasonable to activate phase 2 because the three vehicles on its link are already queued at the stop line and are waiting to be served. However, the Original-MP algorithm will activate phase 1 since it has a larger number of vehicles, which could waste precious green time at the signal while the moving vehicles approach the intersection.
Position Weighted Back Pressure. To address the issue with vehicle positions in the Original-MP algorithm, Li and Jabari (21) proposed a MP model that involves the location of vehicles on the link in the weight calculation:
$$w(l,m) = \sum_{i \in V(l,m)} \frac{X(l,m) - x(i,l,m)}{X(l,m)} - \sum_{(m,n) \in O(m)} r(m,n) \sum_{j \in V(m,n)} \frac{x(j,m,n)}{X(m,n)} \tag{5}$$
where $X(l, m)$ is the link length and $x(i, l, m)$ is the distance of vehicle $i$ from the downstream end of link $l$. As Equation 5 shows, the expressions of the weights for upstream and downstream movements are different. For an upstream movement, vehicles closer to the downstream node have a higher weight because they could suffer higher control delay if the phase is idle. On the other hand, for a downstream movement, vehicles closer to the upstream node have a higher weight because they have a more significant impact on the upstream discharge. Using this method, the PWBP algorithm activates phase 2 in the example shown in Figure 1, which should lead to better control performance.
Delay-Based MP. The D-MP algorithm proposed by Liu and Gayah (15) assumes that vehicles travel at free flow speed until they join the queue and, thus, all traffic delay is related to the stopped vehicles. Travel delay incurred in the previous time step is used for the weight calculation in the D-MP algorithm. Therefore, the D-MP algorithm uses an average metric, unlike the Original-MP and PWBP algorithms, which both use a snapshot metric in the weight calculation. The weight at the $i$th time step can be expressed as follows:
$$w(l,m) = d(l,m,t_{i-1}) - \sum_{(m,n) \in O(m)} r(m,n)\, d(m,n,t_{i-1}) \tag{6}$$
where $d(l, m, t)$ is the delay generated from movement $(l, m)$ in time step $t$, and $t_{i-1}$ denotes the previous time step.
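The contrast between the Original-MP and PWBP weights in the Figure 1 example can be illustrated with a short Python sketch. The link length and the vehicle positions below are hypothetical values chosen only to mimic the figure (four approaching vehicles on the phase 1 link, three queued vehicles on the phase 2 link), the downstream links are assumed to be empty, and the function names are illustrative.

```python
LINK_LENGTH = 200.0  # metres; hypothetical value

# Distance of each vehicle from the downstream end (stop line), in metres.
link_phase1 = [120.0, 140.0, 160.0, 180.0]  # four vehicles still approaching
link_phase2 = [0.0, 7.0, 14.0]              # three vehicles queued at the stop line


def original_mp_weight(distances):
    """Original-MP: every vehicle on the link counts the same."""
    return float(len(distances))


def pwbp_upstream_weight(distances, link_length=LINK_LENGTH):
    """PWBP upstream term: vehicles nearer the stop line weigh more."""
    return sum((link_length - x) / link_length for x in distances)


def choose(weight_fn):
    """Return the phase with the larger weight (saturation flows assumed equal)."""
    weights = {1: weight_fn(link_phase1), 2: weight_fn(link_phase2)}
    return max(weights, key=weights.get)


print(choose(original_mp_weight))    # 1
print(choose(pwbp_upstream_weight))  # 2
```

Under these assumed positions the position-weighted sum is 1.0 for the phase 1 link and 2.895 for the phase 2 link, so the queued vehicles win even though they are fewer.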

Proposed Algorithm
Total-Delay-Based MP. As demonstrated by Liu and Gayah (15), the D-MP algorithm outperforms three baseline models in average traffic delay, queue length, and network throughput under various scenarios. In this model, the signal phase is selected at a pre-defined frequency depending on the delay of each movement in the previous time step. However, this means that the weight for each movement is reset to be zero at the beginning of each step regardless of the cumulative delay incurred by the current vehicles. Although this method achieves reasonable performance overall, it may lead to inequitable situations in which one movement experiences extremely large delay to the benefit of others. This method specifically favors approaches with heavier traffic when demand is unbalanced and, thus, approaches with less traffic may experience very high delays. This phenomenon can be explained using the example shown in Figure 2. The intersection consists of a one-way street and a two-way street, and each direction only has one lane. Vehicles from the northbound street can either go straight or turn right; the southbound street allows straight movement and left-hand turns; the eastbound street allows through movement only. These three movements are served by three individual phases, as denoted in Figure 2a. Assume vehicles arrive to phases 2 and 3 continuously at a fixed rate of 1 veh/s, and that vehicle arrival rate to phase 1 is 0 veh/s. Also assume the saturation flow for all movements is 2 veh/s, signal phases are updated every 4 s, and the lost time between phase switches can be ignored. For simplicity, we also assume all vehicles move at free flow speed when the signal is green; that is, vehicles do not experience delay during the green time. Let the left-hand side of Figure 2b be the initial state at t = 0 s; note that one vehicle is waiting for phase 1, four vehicles are waiting for phase 2, and no vehicles are waiting for phase 3. 
Consider the case where the D-MP algorithm activates phase 2 for the first time step since it has the largest number of vehicles in the queue. At time t = 4 s, the queue of phase 2 will be cleared since the saturation flow of 2 veh/s is able to just process the initial queue and new arrivals in the first time step. For phase 3, one vehicle arrives in each second. Consequently, there are four vehicles waiting for phase 3 at t = 4 s. The first vehicle arrives at t = 1 so the delay it incurs is 3 s. Similarly, the other three vehicles experience delay of 2, 1, and 0 s, respectively. Therefore, the delay incurred by vehicles of phase 3 in this time step is 6 s. However, for phase 1, the delay is always equal to 4 s in each time step since there is always only a single vehicle waiting in the queue. Therefore, phase 3 is activated at t = 4 s. Moving forward, the signal will keep switching between phases 2 and 3, and phase 1 will always be idle. Consequently, the vehicle waiting for phase 1 will incur an arbitrarily large delay.
To overcome this drawback, we replace the delay term in Equation 6 with the sum of the delay incurred by all vehicles on the link since they joined the link. Since we use the total delay from all vehicles, we refer to this model as the TD-MP. The weight can be expressed as follows:
$$w(l,m) = \sum_{i \in V(l,m)} d_v(i,l,m) - \sum_{(m,n) \in O(m)} r(m,n)\, d(m,n,t) \tag{7}$$
where $d_v(i, l, m)$ is the delay incurred by vehicle $i$ since it joined movement $(l, m)$, and the downstream term $d(m, n, t)$ is the delay generated from movement $(m, n)$ in the previous time step, as in the D-MP algorithm. This should provide improved delay equity by helping to ensure that some vehicles do not experience extremely high delays to benefit others. Considering the example above, the weight for phase 1 is 8 s at the beginning of the third time step, which exceeds the 6 s accumulated by the four vehicles waiting for phase 2 and therefore allows phase 1 to be served at t = 8 s.
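The Figure 2 example can be replayed numerically. The sketch below is a simplified bookkeeping of that example under the stated assumptions (arrivals once per second at 1 veh/s, saturation flow of 2 veh/s, 4 s updates, no lost time, no delay during green, phase 2 served first); the function and variable names are illustrative, and the downstream term is omitted because the example has none. It is only meant for the first few steps, in which every served queue clears completely.

```python
T = 4                          # signal update interval (s)
SAT = 2                        # saturation flow (veh/s), same for all phases
ARRIVALS = {1: 0, 2: 1, 3: 1}  # arrival rate per phase (veh/s)
INITIAL = {1: 1, 2: 4, 3: 0}   # vehicles waiting at t = 0


def simulate(policy, n_steps=3, first_phase=2):
    """Return the phase served in each of the first n_steps time steps."""
    # queues[p] holds the arrival time of every vehicle waiting for phase p
    queues = {p: [0] * n for p, n in INITIAL.items()}
    active, served = first_phase, []
    for k in range(n_steps):
        t0, t1 = k * T, (k + 1) * T
        served.append(active)
        for p, q in queues.items():
            # arrivals at t0+1, ..., t1 (one per second when the rate is 1 veh/s)
            q.extend(t for t in range(t0 + 1, t1 + 1) for _ in range(ARRIVALS[p]))
        # the green phase discharges up to SAT * T vehicles, first in first out
        del queues[active][: SAT * T]
        weights = {}
        for p, q in queues.items():
            if policy == "D-MP":
                # delay accrued during the last step only; none during green
                weights[p] = 0 if p == active else sum(t1 - max(a, t0) for a in q)
            else:  # "TD-MP"
                # cumulative delay of every waiting vehicle since it arrived
                weights[p] = sum(t1 - a for a in q)
        # equal saturation flows, so the largest weight is the largest pressure
        active = max(weights, key=weights.get)
    return served


print(simulate("D-MP"))   # [2, 3, 2]
print(simulate("TD-MP"))  # [2, 3, 1]
```

At t = 8 s the D-MP weights are 4, 6, and 0 s for phases 1, 2, and 3, so the controller returns to phase 2, whereas the TD-MP weights are 8, 6, and 0 s, so the single vehicle on phase 1 is finally served.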
Note that we keep the downstream term from the D-MP algorithm. As mentioned before, the main goal of the downstream term is to avoid activating a phase if the downstream links are all congested. If the downstream link has fewer vehicles, it should impose less (negative) impact on the weight. However, $d_v(i, l, m)$ considers the cumulative delay since the vehicle joined the link rather than the current level of congestion. As the previous example shows, a downstream link with fewer vehicles can have a higher cumulative delay, which makes it unsuitable for the downstream term. Therefore, we use the same downstream term as the D-MP algorithm, which only counts the delay from the previous time step. By using the cumulative delay from all vehicles for the upstream movements, the TD-MP algorithm is expected to reduce the traffic delay for approaches with low traffic demand and improve the delay equity. It is worth mentioning that Equation 7 does not maintain maximum stability. In general, maximum stability requires the control policy to serve the busiest phase when making decisions. However, the main purpose of the proposed algorithm is to improve the delay equity, so Equation 7 does not always activate the busiest phase, especially when the demand is unbalanced. As mentioned before, maximum stability is established under one strong assumption: the queue capacity of all links is infinite, which is questionable in reality. In addition, the simulation results in the following sections show that, under certain scenarios, our model improves delay equity while also reducing delay compared to all baseline models that have maximum stability.
Modification for the Work Conservative Property. A signal control policy is said to be work conservative if the activated phase can serve at least one vehicle (i.e., at least one served upstream link is not empty and its downstream links are not all full), as long as there exists at least one vehicle that can be served at the intersection. This property has been discussed in some MP-based algorithms, for example, Gregoire et al. (12) and Mercader et al. (23). As defined so far, the weights of the proposed algorithm and of all baseline models cannot ensure the work conservative property. To this end, we change the pressure expression in Equation 1 to the form given in Equation 8. Here, a large constant M is used to make the second term small, and thus maintain the pressure for feasible phases. All models become work conservative by using Equation 8 to update the signal timing. Note that, in the following simulations, if all phases have empty upstream links or fully congested downstream links, we randomly select a phase to activate at that intersection.
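The work conservative condition itself can be illustrated with a small Python sketch. Note that this is not the big-M formulation of Equation 8: it swaps in an explicit feasibility filter that has the same intent, namely that a phase unable to serve any vehicle is never preferred over one that can. The data layout, the function names, and the treatment of links without a listed capacity as unbounded sinks are assumptions made for illustration.

```python
import random


def is_feasible(movements, queue, capacity):
    """True if some served upstream link is non-empty and not all of its
    downstream links (within this phase) are full."""
    for l, _ in movements:
        if queue.get(l, 0) == 0:
            continue
        downstream = [m for (u, m) in movements if u == l]
        # links without a listed capacity are treated as unbounded sinks
        if any(queue.get(m, 0) < capacity.get(m, float("inf")) for m in downstream):
            return True
    return False


def select_work_conserving(phases, pressures, queue, capacity, rng=random):
    """Pick the max-pressure phase among feasible phases; random if none."""
    feasible = [p for p in phases if is_feasible(phases[p], queue, capacity)]
    if not feasible:
        return rng.choice(list(phases))
    return max(feasible, key=pressures.get)


phases = {"A": [("a", "x")], "B": [("b", "y")]}
queue = {"a": 0, "b": 3, "x": 0, "y": 1}
capacity = {"x": 10, "y": 10}
pressures = {"A": 5.0, "B": -1.0}  # A has the larger pressure but nothing to serve

print(select_work_conserving(phases, pressures, queue, capacity))  # B
```

Phase A is skipped despite its larger pressure because its upstream link is empty, which is the situation the modification is meant to rule out.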

Simulation Tests
To validate the proposed algorithm, microscopic traffic simulation tests are conducted using SUMO software (29). Since the proposed TD-MP algorithm is an extension of the D-MP algorithm developed by Liu and Gayah (15), as a preliminary test, we first compare these two algorithms for an isolated intersection. Then, we compare the proposed algorithm against all baseline models on a grid network assuming the full knowledge of the required metrics can be obtained. Subsequently, the control performance of the proposed algorithm in a partially connected environment is tested.

Test at an Isolated Intersection
Simulation Settings. We examine the performance of the proposed model when applied to an isolated intersection, illustrated in Figure 3. The incoming links are marked by thicker dashed lines, and the arrows show the travel direction. For each incoming link, the right-hand lane is shared by the right-turn and through movements, and the left-hand lane is for left-hand turns only. Turning ratios are assumed to be identical for all incoming links. The speed limit is set to 20 m/s and the saturation flow is 1800 vehicles per hour per lane (vphpl). We use the default car-following model of Krauß (30) in SUMO. Available phase options are shown on the right-hand side of the figure. The lost time between phase changes is set to 3 s. The signal update time step is T = 5 s, while the simulation time step is 1 s. At the time of each update, the TD-MP algorithm calculates the pressure from the cumulative delay that all vehicles currently on each link have incurred since they entered the link. Note that for this isolated intersection, the downstream links are essentially sinks, so we do not include the downstream term in the weight and pressure calculations.
For simplicity, we use equal demands for the two north-south (NS) incoming links and equal demands for the two east-west (EW) incoming links, denoted by $f_{NS}$ and $f_{EW}$, respectively. The sum of these two demands is fixed for all simulations and set equal to 1400 vphpl. To study the delay equity, one balanced and five unbalanced demand scenarios are considered. In all unbalanced scenarios, NS links have higher traffic demand than EW links. Ten runs with different random starting seeds were performed for each scenario. This section serves as a preliminary demonstration of the equity improvement of the proposed algorithm.
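The demand scenarios can be enumerated as follows. This sketch assumes that the five unbalanced scenarios are spaced 200 vph apart, which is consistent with the demand differences named in the text (0, 200, 400, and 1000 vph) but is not stated explicitly; the function name and output layout are illustrative.

```python
TOTAL_DEMAND = 1400  # f_NS + f_EW, fixed for all scenarios


def demand_scenarios(step=200, n_unbalanced=5):
    """Return (difference, f_NS, f_EW) tuples; the 200 vph spacing is assumed."""
    scenarios = []
    for k in range(n_unbalanced + 1):
        diff = k * step
        f_ns = (TOTAL_DEMAND + diff) // 2
        f_ew = (TOTAL_DEMAND - diff) // 2
        scenarios.append((diff, f_ns, f_ew))
    return scenarios


for diff, f_ns, f_ew in demand_scenarios():
    print(f"difference {diff:4d}: f_NS = {f_ns}, f_EW = {f_ew}")
```

Under this assumption the most unbalanced scenario pairs an NS demand of 1200 with an EW demand of 200.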
Results. Figure 4 shows the average delay measured for each individual lane at the intersection. In the legend, L indicates the left-turn-only lane, and RT indicates the shared lane. The x-axis indicates the difference in demand between the NS and EW links. For example, 0 indicates the balanced scenario, in which the demand for all four approaches equals 700 vehicles per hour (vph), and 200 vph means the demands for NS and EW links are 800 and 600 vph, respectively. Darker colors are used to represent the results from the D-MP algorithm and lighter colors are used for the TD-MP algorithm. We consider the distribution of delay across intersection movements as a measure of delay equity. The results reveal that the D-MP algorithm generates inequitable travel delay even under the balanced scenario, which is caused by the unequal turning ratios. Since there are fewer left-hand turns than right-hand turns and through movements combined, the delay from the left-hand lanes is significantly higher than from the right-hand lanes. Notice that the delay on the NS links, which carry the heavier load, decreases as the demand imbalance increases; by contrast, the delay on the EW links increases with the demand imbalance. As a result, the delay variation increases significantly with the level of demand imbalance. The difference in delay for individual movements can be quite large; for example, when the demand difference is 1000 vph, the difference in delay under the D-MP algorithm between vehicles on the EW left-turn lanes and NS through-right lanes is 59 s, which is 317% of the average delay experienced at the intersection.
The TD-MP algorithm can improve the delay equity effectively, even for the balanced scenario. When the demand difference is less than or equal to 400 vph, the TD-MP algorithm produces a higher delay for right-hand lanes and a lower delay for left-hand lanes compared to the D-MP algorithm. Consequently, the delay difference among all lanes is reduced. When the demand difference exceeds 400 vph, the demand of the left-hand lane from the NS direction becomes higher than the demand of both lanes from the EW direction. Accordingly, the TD-MP algorithm increases the delay of both lanes from the NS direction and reduces the delay of the other direction. Thus, the TD-MP model improves the delay equity for all tested scenarios. Figure 5 shows the average delay and standard deviation of delay experienced by all vehicles in the simulation. Figure 5a shows that the average delay from the TD-MP algorithm is larger than that of the D-MP algorithm; however, the increase is small (less than 2 s per vehicle and less than 10% of the delay observed under the D-MP algorithm), and this difference diminishes with the level of imbalance. In contrast, Figure 5b shows that the TD-MP algorithm can reduce the standard deviation in traffic delay considerably; the improvement ranges from 2 s for the balanced case (13%) to 7 s for the most unbalanced case (35%). Overall, these results demonstrate the ability of the TD-MP algorithm to improve the delay equity without a significant sacrifice of overall mobility for a single intersection. The next section will demonstrate that these benefits can be further improved when the TD-MP algorithm is applied network-wide.
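The two summary measures used here, the average delay and the standard deviation of delay across vehicles, can be computed as in the sketch below. The per-vehicle delay values are made up purely to illustrate how two policies can have similar averages but very different equity; they are not simulation results. The use of the population standard deviation is also an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def summarize(delays):
    """Return (average delay, standard deviation of delay) in seconds."""
    return mean(delays), pstdev(delays)


# Hypothetical per-vehicle delays (s) under two policies.
delays_inequitable = [2.0, 4.0, 6.0, 8.0, 60.0]    # one vehicle waits very long
delays_equitable = [14.0, 16.0, 17.0, 18.0, 20.0]  # delay is spread evenly

avg_a, std_a = summarize(delays_inequitable)
avg_b, std_b = summarize(delays_equitable)
print(f"inequitable: mean = {avg_a:.1f} s, std = {std_a:.1f} s")
print(f"equitable:   mean = {avg_b:.1f} s, std = {std_b:.1f} s")
```

In this illustration the second policy has an average delay that is 1 s higher but a far smaller standard deviation, which mirrors the kind of tradeoff discussed above.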

Test on a Grid Network
Simulation Settings. To further demonstrate the benefits of the TD-MP model, its performance is tested on an idealized 4 × 4 grid network, illustrated in Figure 6. The phase pattern, turning ratio, and lane configuration are the same as for the single intersection. Similarly, all NS links have the same demand, and all EW links have the same demand. We tested three demand levels for $f_{NS} + f_{EW}$: 1200, 1400, and 1500 vph. We refer to these three levels as the low, medium, and high demand levels. For each demand level, one balanced scenario and five unbalanced scenarios are considered.

Simulation-Based Optimal Time Step. Liu and Gayah (15) showed that the update time step is critical for MP-based control. Thus, the optimal update time step is needed for each method for a fair comparison of their performance. To this end, all models were applied using four values for the time step (including 5, 7, and 9 s); see Figure 7. Ten simulations with different random starting seeds were performed, and the shaded area around each curve corresponds to the confidence interval for the mean values. The results reveal that, for the unbalanced scenario, all models have an optimal time step of 5 s, at which both the average delay and standard deviation are the lowest. For the balanced scenario, 5 s is also optimal for the TD-MP, D-MP, and PWBP algorithms. The Original-MP algorithm has a different, larger optimal value, although the difference in its performance across time steps is much smaller than in the unbalanced scenario. A possible reason that the Original-MP algorithm has a larger optimal time step when the demand is balanced is that when the time step is very small, for example, 5 s, the controller could switch phases too frequently because the number of vehicles is sensitive to the traffic state. Specifically, when the demand is balanced, we can reasonably assume that the numbers of vehicles from all approaches are similar. Therefore, a phase activated in the previous time step tends to have fewer vehicles than other phases at the next one or two time steps. As a result, the signal phase changes very frequently, which creates unnecessary phase switches and high time losses that lead to poor performance. This issue is less pronounced in the other three models, because their metrics are less sensitive to the number of vehicles. In addition, most scenarios we will test are unbalanced. Therefore, to ensure fairness, we use 5 s for all models.

Results With Full Knowledge of Required Metrics Under Stable Demand. Figure 8 shows the average delay and standard deviation of delay obtained from all models for the three demand levels. For each scenario, although the time headway of vehicle generation is random, the average arrival rate is fixed throughout the whole simulation. Therefore, we refer to this demand pattern as stable demand. For all demand levels and models, the average and variation of delay increase with the level of imbalance. The Original-MP algorithm has the highest values for both metrics in all scenarios. This is not unexpected, since it uses a snapshot metric to determine the signal timing and treats all vehicles on a link identically, even though their locations and moving status are crucial for performance. As travel demand increases, the average queue length increases and the number of non-queued vehicles decreases. Thus, the negative influence of this treatment is diminished. Therefore, the gap in average delay between the Original-MP and the other models narrows as demand increases, especially for the balanced scenarios. Interestingly, the proposed TD-MP algorithm has the lowest delay for several cases tested; specifically, for demand differences less than 600 vph under the low demand level or demand differences higher than 800 vph under the medium and high demand levels. For other scenarios, the PWBP algorithm has the lowest average delay, but the average delay difference among the D-MP, TD-MP, and PWBP models is negligible. However, the standard deviation from the TD-MP model is considerably smaller than that of all three baseline models for most cases, which suggests that the TD-MP model provides better delay equity than all other methods. The differences are starker for the highly unbalanced scenarios and under higher demand levels.
Figure 8 leads to a conclusion similar to that of the single intersection case study: for a network, the TD-MP model can improve the delay equity significantly, and this improvement does not come at any significant cost to overall mobility.

Results With Full Knowledge of Required Metrics Under Unstable Demand. The previous section demonstrated the control performance of the TD-MP model under stable demand. However, since adaptive signal control strategies are designed to tackle real-time traffic conditions, it is critical to investigate the control performance under unstable demand. To this end, we create a typical scenario in which the average demand increases from a low starting level to a high level and then drops back to the low level.
We assume the demand difference between NS entry links and EW entry links is 600 vph for the whole simulation. In addition to this variation in the average demand, we assume that under unstable demand, the real arrival rate changes every 5 min and follows a normal distribution with a mean equal to the average demand in that interval and a standard deviation equal to 5% of the mean. For example, if $f_{NS} = 900$ vph for the first 20 min, a sample is drawn every 5 min from the distribution $N(900, 45)$ to be used as the arrival rate for the corresponding 5-min interval. Figure 9 shows the unstable demand pattern generated using this method. Figure 10a shows the cumulative delay for all models under unstable demand, and for visualization purposes, Figure 10b shows the delay reduction from the other three models compared to the Original-MP. As expected, since the MP algorithms determine the signal timing from a pre-defined metric value that depends solely on the current traffic states on the local links, the uncertainties do not have a significant impact on the control performance. Similar to the scenarios under stable demand (when the demand difference is 600 vph), the PWBP model has the lowest delay while the Original-MP model has the worst performance; furthermore, the TD-MP model reduces delay compared to the D-MP model. Figure 11 shows the standard deviation of travel delay per vehicle in all 10 simulation runs. It shows that for all random seeds, the TD-MP model can improve the delay equity significantly compared to all baseline models. The PWBP model has better delay equity than the D-MP and Original-MP models for most random seeds, while it generates higher delay variation for a few random seeds. Overall, this section shows that the MP-based algorithms are robust to the randomness in the travel demand.
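The demand perturbation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used in the paper: the function name `sample_arrival_rates`, its parameters, and the clipping of draws at zero are all hypothetical choices introduced here.

```python
import random


def sample_arrival_rates(mean_demand_vph, duration_min, interval_min=5,
                         rel_std=0.05, seed=None):
    """Draw one arrival rate (vph) per interval from N(mean, rel_std * mean)."""
    rng = random.Random(seed)
    n_intervals = duration_min // interval_min
    std = rel_std * mean_demand_vph
    # Clipping at zero is an assumption made here; with a 5% standard
    # deviation a negative draw is practically impossible.
    return [max(0.0, rng.gauss(mean_demand_vph, std))
            for _ in range(n_intervals)]


# Example from the text: f_NS = 900 vph for the first 20 min, so four
# 5-min arrival rates are drawn from N(900, 45).
rates = sample_arrival_rates(900, 20, seed=0)
```

Repeating the call for each segment of the rising and falling average demand profile, and concatenating the results, would give a pattern of the kind the text attributes to Figure 9.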
Results in a Partially Connected Vehicle Environment. The results shown in the previous section are obtained when perfect information is available to compute metrics required to implement all MP models. However, this requires perfect sensing across the network, which is not realistic or expected. Compared to the baseline models, it is more challenging to obtain the pressure for the TD-MP model since it requires the exact arrival time and departure time of each vehicle on a link for the total delay computation. Thanks to the advancement and increasing deployment of CVs, traffic states can also be obtained through communication with the CVs without the help of other equipment. However, since a fully connected environment is not expected to occur in the near future, it is critical to test the model performance in a partially CV environment. In this case, traffic states may be estimated based on the information provided by only a portion of vehicles that serve as CVs, and the information from other vehicles is ignored. For example, if there are two CVs and five non-CVs on a link, the value for the metric of the Original-MP model, which is the number of vehicles, is 2.
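The CV-only estimation in this example can be sketched as below. The `Vehicle` class and the function `observed_vehicle_count` are hypothetical names introduced for illustration only; the sketch covers just the vehicle-count metric of the Original-MP model, not the weights of the other models.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    vehicle_id: str
    is_cv: bool  # True if the vehicle is a connected vehicle


def observed_vehicle_count(vehicles_on_link):
    """Vehicle count as seen by the controller: non-CVs are ignored."""
    return sum(1 for v in vehicles_on_link if v.is_cv)


# Example from the text: two CVs and five non-CVs on a link.
link = ([Vehicle(f"cv{i}", True) for i in range(2)]
        + [Vehicle(f"hv{i}", False) for i in range(5)])

estimated = observed_vehicle_count(link)  # metric available to the controller
actual = len(link)                        # metric under perfect information
```

Here `estimated` is 2 while `actual` is 7, which is the gap between the CV-based and full-information metric that the penetration-rate experiments examine.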
To test performance in a partially connected environment, we considered five CV penetration rates, $\lambda \in \{20, 40, 60, 80, 100\}$ (in percent), used to provide inputs for the required metrics in all MP algorithms. For simplicity, we only investigated the balanced scenario and one unbalanced scenario (with a demand difference of 600 vph) for the medium demand level. (To have a relatively fair comparison, we do not use the most unbalanced scenario, in which the TD-MP model outperforms other models considerably with respect to delay equity.) The influence of the penetration rate is shown in Figures 12 and 13. The horizontal dashed lines indicate the values of the baseline models in a fully connected environment, that is, $\lambda = 100$. The error bars indicate the standard error across the 10 random seeds. Both the average delay and the standard deviation when $\lambda = 20$ are significantly larger than at the other penetration rates. For example, the average delay of the PWBP and TD-MP models, which are the best models in a fully connected environment, from the balanced scenario is equal to 883 and 186 s, respectively. This suggests that an extremely low penetration rate is not sufficient to obtain these metrics. Therefore, for better visualization, the results for $\lambda = 20$ are omitted.
In line with our expectation, both the average delay and delay variation of all models worsen with a decrease in the penetration rate. This influence of penetration rate is more significant for the unbalanced scenario than the balanced scenario, that is, for a given reduction in the penetration rate, the increase in the two metrics for the unbalanced scenario is higher than for the balanced scenario. In addition, for both demand scenarios, this impact of the penetration rate is more significant when the penetration rate is low. For instance, for the unbalanced scenario shown in Figure 13b, the increase in the standard deviation when the penetration rate changes from 60% to 40% is much more significant than the increase when the penetration rate changes from 100% to 80%.
In terms of model performance, the proposed TD-MP model provides the best delay equity for all penetration rates under both unbalanced and balanced scenarios. It also generates the lowest average delay when the penetration rate is equal to or lower than 60%. This is reasonable, as the TD-MP model is able to better account for "lone" vehicles at an approach (see example associated with Figure 2), which are more likely under smaller penetration rates. When the penetration rate is higher, the average delay from the TD-MP model is still very close to the minimum value, which is achieved by the PWBP model. Note that although the PWBP model has the lowest average delay when the penetration rate is higher than 80%, its performance worsens drastically when the penetration rate decreases. When the penetration rate drops to 40%, it even generates a larger delay variation than the Original-MP model, which has the worst control performance under all other scenarios. The PWBP model requires both the number of vehicles and the positions of all vehicles for the weight calculation. In an environment with high CV penetration, the position information benefits the control performance, since it provides more detailed and accurate traffic state information. However, when the penetration rate is low, the high randomness can result in large errors in the weight estimates and worsen the control performance.
Beyond the average delay and the standard deviation, the TD-MP model is also more stable than the other models in responding both to traffic randomness, which is reflected by the error bars in Figures 12 and 13, and to the penetration rate, which is reflected by the changes in both metrics as the penetration rate changes. This is a very desirable property for a signal control algorithm.
To investigate the estimation error from the CVs in the pressure calculation, we compute the ratio of the sum of the value estimated from the CVs over all links and all steps during the whole simulation to the corresponding value computed from all vehicles. Figure 14 shows the ratio (in percentage) of the estimation under both demand scenarios tested above. The Original-MP model has the highest estimation ratio, while the TD-MP model has the lowest. This is not unexpected. The reason is that the TD-MP and D-MP models use average metrics, that is, delay in each time step and total delay incurred by current vehicles, while the Original-MP and PWBP models use snapshot metrics, that is, the number of vehicles and position-weighted number of vehicles at the instants of signal update. Assume there is one stopped non-CV and one stopped CV on a link at the first time step, and the CV leaves in the second time step. For the Original-MP model, the ratio over these two time steps is equal to $\frac{1 + 0}{2 + 1} = 0.33$; however, the ratio for the TD-MP model is equal to $\frac{1 + 0}{2 + 2} = 0.25$. Therefore, the TD-MP model has a lower estimation ratio than the Original-MP model because of this faster increase in the denominator in the estimation ratio calculation. The results for the other models can be explained in the same manner. Although the TD-MP model has the lowest estimation ratio, it has the best performance in both average delay and delay equity under various conditions. The reason is twofold: first, total delay is inherently a better metric than the others for reflecting the traffic condition across various traffic conditions; second, it is the relative estimation accuracy among all phases, rather than the estimation accuracy itself, that determines the control performance. Although the estimation ratio of the TD-MP model is lower, the CVs are randomly distributed in the system, so it is reasonable to assume the estimation ratio for all approaches is very similar in the long run.
Therefore, the TD-MP model is still able to select the phase with the actual maximum pressure in certain time steps. Consequently, compared to other algorithms, the TD-MP algorithm can provide the correct control for enough time steps to outperform the baseline algorithms. Overall, in a CV environment, the performance of all the tested MP-based models improves with an increase in the number of CVs. The results show that the proposed TD-MP model has the best performance in both the average delay and delay equity for most scenarios. Moreover, the TD-MP model is a more stable control policy for responding to the randomness in both traffic conditions and penetration rate.
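The two-step example used in this section to explain the estimation ratio can be reproduced with a short sketch. The function name `estimation_ratio` is hypothetical, delay is counted in time steps as an assumption for illustration, and the phase pressures in the second part are made-up values that only demonstrate why a common estimation ratio leaves the selected phase unchanged.

```python
def estimation_ratio(cv_values, all_values):
    """Sum of the CV-based metric over time steps divided by the sum of the
    same metric computed from all vehicles."""
    return sum(cv_values) / sum(all_values)


# One stopped CV and one stopped non-CV at step 1; the CV leaves at step 2
# while the non-CV remains stopped.

# Original-MP metric: number of vehicles on the link.
count_ratio = estimation_ratio(cv_values=[1, 0], all_values=[2, 1])

# TD-MP metric: cumulative delay of the vehicles currently on the link.
# Step 1: each vehicle has 1 step of delay. Step 2: the non-CV has 2.
total_delay_ratio = estimation_ratio(cv_values=[1, 0], all_values=[2, 2])

print(f"Original-MP: {count_ratio:.2f}, TD-MP: {total_delay_ratio:.2f}")

# Relative accuracy: if every phase is underestimated by the same ratio,
# the phase with the maximum pressure does not change.
true_pressure = {"NS": 40.0, "EW": 25.0}  # hypothetical values
estimated_pressure = {p: total_delay_ratio * v for p, v in true_pressure.items()}
same_phase = (max(true_pressure, key=true_pressure.get)
              == max(estimated_pressure, key=estimated_pressure.get))
```

The printed ratios match the 0.33 and 0.25 given in the text, and `same_phase` is true because scaling all pressures by one positive factor preserves their ordering.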

Conclusions
This paper develops a new MP model that uses the sum of delay experienced by all vehicles since they joined a link as the metric to compute the weight. The proposed model is compared to multiple other MP-based signal control methods, including the Original-MP, PWBP, and recently proposed D-MP models. The simulation results suggest that the proposed model can improve equity significantly, especially for highly unbalanced traffic conditions, while simultaneously keeping the average delay almost as low as for other models. Simulations also reveal that when data on travel times, delays, and vehicle positions are only available from a subset of vehicles, as would be the case in a CV environment, the proposed model is more stable than the Original-MP and PWBP models and demonstrates similar stability to the D-MP model. Further, the proposed TD-MP model provides the highest delay equity for all tested penetration rates and the lowest average delay when the penetration rate is equal to or lower than 60%.
The existing MP algorithms control traffic operation through signal timings. The development and increasing popularity of autonomous vehicles (AVs) have the potential to boost efficiency further. Therefore, proposing a method to combine the MP algorithm and the control of AVs is a promising future research direction. In addition to travel time, traffic delays, and queues, other measures of effectiveness, such as the number of stops and fuel consumption, also need to be investigated. This paper focuses on the time-step-based MP algorithm, in which the controller switches phase at a fixed frequency. Cycle-based MP algorithms (11,20,23) have also been proposed, in which the phase sequence and duration are adjusted every cycle. Since a minimum green time is usually imposed, they are expected to exhibit less severe delay inequity. However, the green allocation in such models is proportional to the pressure, which can lead to delay inequity as well. Therefore, investigating the delay equity of cycle-based MP algorithms is another interesting topic.

Author Contributions
The authors confirm contribution to the paper as follows: study conception and design: H. Liu  authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by NSF Grant CMMI-1749200.

Data Accessibility Statement
All data used in this paper were generated by the authors using simulation software as described in the paper. Copies of the simulated data are available on request.