Receding Horizon Cooperative Platoon Trajectory Planning on Corridors with Dynamic Traffic Signal

In this paper, a trajectory control approach using model predictive control is proposed for cooperative (automated) vehicles. The approach optimizes the accelerations of a controlled connected and automated vehicle (CAV) platoon along a corridor with signalized intersections. Its objectives are to maximize the throughput first and then to optimize comfort, travel delay, and fuel consumption simultaneously. The throughput is determined as the maximal number of CAVs that can pass the intersection during the green phase. Safety is included by penalizing small gaps between CAVs in the running cost. The red phase is represented by a virtual vehicle at the stop-line during the red time; the safe gap penalty with respect to this virtual vehicle causes the first-stopping vehicle to decelerate, or even stop, when facing the red phase. Acceleration and speed are constrained within upper and lower bounds. The proposed approach is flexible in dealing with platoon merging, splitting, stopping, and queue-discharging characteristics at signalized intersections. Finally, the approach is verified by simulation under a baseline scenario and four further scenarios, which vary the signal settings and the anticipation of the red phase. The simulation results demonstrate the fuel savings of the proposed approach compared with the state-of-the-art approach, which used the virtual vehicle term without anticipation. The adjustments of signal parameters in Scenario 3 and Scenario 4 demonstrate the applicability of the control approach under actuated signal control.

tested in few studies, considering the partial automation (9) or the adaptive signal control (12). In addition, these systems were designed only for the individual vehicle benefits, rather than the benefits of the platoon or the traffic flow.
A cooperative vehicle intersection is a "signal-free" intersection which enables the CAVs to communicate with each other and thereby pass the intersection cooperatively without collision (13). Although these intersection control algorithms had the potential to improve the traffic operations of CAVs at a typical four-arm intersection (14)(15)(16)(17) or along a corridor (18), driver/user acceptance in relation to safety perception and potential conflicts with pedestrians and bicyclists have been neglected, which calls into question the applicability of this line of research in reality.
With respect to the CAV trajectory optimization by controlling speeds or acceleration rates at fixed-timing intersections, some CAV trajectory control approaches at isolated intersections only applied simple objective functions to optimize energy economy, ride comfort, or both (19,20). These control algorithms used terminal costs to represent the red phase, assuming that the terminal conditions (time and position) were known at an isolated intersection. However, terminal costs are confined to be applicable at isolated intersections, because it is difficult and suboptimal to combine intersections along an arterial using terminal costs. Several instantaneous fuel consumption, emission models, or both (21)(22)(23) were adopted in these control approaches to minimize the fuel use, or to validate the reduction of fuel consumption and emission by simulation. More sophisticated systems on corridors with multiple pre-timed intersections were designed for an individual vehicle considering multiple criteria (24)(25)(26)(27)(28)(29). The key in the design was supposed to make vehicles stop facing the red phase. There are three ways to achieve this performance: using a virtual vehicle, tracking the target speed, and constraining the position. The first approach applied a virtually preceding vehicle at the stop bar representing the red phase. Together with the safe gap requirement, the followers behind the virtual vehicle were able to stop to keep the safe gap between the virtual vehicle and the followers. The control approach in Asadi and Vahidi (24) considered the red phase in constraints by introducing a virtual vehicle in front, but the signal information was implemented with no prediction. The other approach aimed to track the piecewise target speed (including desired deceleration rates/speeds) facing the red phase (25)(26)(27)(28). 
To track the pre-defined target speed would produce large decelerations once the vehicle was recognized to miss the green phase, and then accelerate dramatically at the beginning of the green phase. Therefore, more attention should be paid to design the target speed in an optimal way, and relieve computational burden when tracking the piecewise target speed cost term. Another approach was to regard the red phase as a position constraint that the stopping vehicle could not pass (29). However, the work in Liu et al. (29) did not track the preceding vehicles in desired gaps. Therefore, elaborate work on tuning cost weights was necessary to make a trade-off between maximizing speeds and minimizing fuel consumption. Otherwise, the vehicles might stop far away from the intersection to save fuel. A parsimonious shooting heuristic algorithm was proposed subject to constraints of vehicle arrivals, vehicle mechanical limits, traffic lights, and car-following safety. The vehicle trajectory was decomposed into a few analytically solvable sections for a simple constructive heuristic. Based on this algorithm, an optimization framework was proposed, optimizing the travel time, a surrogate safety measure, and fuel consumption simultaneously (30,31).
There were also research interests focusing on the integrated optimization of adaptive traffic signals and vehicle trajectories in a unified framework (32)(33)(34). The platoons were designed to decelerate but not stop when approaching the intersection during the red phase. However, these control algorithms were designed to optimize simple objective functions of the platoon leader in the vicinity of an isolated intersection for relieving computational load.
From the discussion above, it can be concluded that most current approaches only optimize the trajectories of an individual vehicle using simple objective functions of a few criteria. In addition, it is evident that the existing optimization-based control algorithms under traffic signals mostly focus on design for pre-timing signals, and the current way to represent the red phase using piecewise target speed term may result in computational issues. The previous work in Liu et al. (29) was designed for an arterial by optimizing a comprehensive objective function, considering throughput, ride comfort, travel delay, and fuel savings. However, the previous control system was open-loop based on feedforward optimal control, and thereby was restricted to a fixed-timing signal plan. One advantage of closed-loop control systems over open-loop systems is that the use of feedback allows the system to be insensitive to both external disturbances and internal variations in system parameters (35), such as changes in signal settings. Although the control approaches allowed for system feedback in other work (19,24,25,28), they did not take advantage of it and were thereby confined to pre-timing signals. The reason is that signal information in these approaches is an input when tracking the pre-defined target speed, which excludes signal changes within the system. To include the adaptive signal plan, a closed-loop system is developed to overcome the limitations of open-loop systems. The feedback at each time step in the closed-loop can replan the trajectory under actuated or semi-actuated signal plans. In addition, the previous work in Liu et al. (29) required elaborate work on tuning cost weights to avoid stopping away from the stop-line. An improvement to address this problem is to transform the red phase position constraint to a penalty term in the running cost, which helps tune the cost weights under the workings of both the safe following and red phase terms.
In this paper, a model predictive control (MPC) framework is proposed for urban corridors under traffic signals to overcome the aforementioned limitations of platoon trajectory control approaches. The proposed MPC framework is computationally efficient owing to an iterative Pontryagin maximum principle (iPMP) approach (36). An optimal platoon trajectory control algorithm is presented that optimizes the accelerations of the whole controlled CAV platoon. The control algorithm determines the optimal throughput first, and then optimizes multiple criteria including ride comfort (by minimizing accelerations), average travel delay (by maximizing vehicle speeds), safe space gap, and fuel consumption rates, subject to admissible constraints on acceleration and speed. Safety requirements are incorporated through a penalty term in the running cost that encourages inter-vehicle distances larger than the minimum safe gap. The red phases are represented by introducing virtual vehicles at the stop bars during the red phases, so that the first-stopping vehicles avoid departing in red time through the safe gap penalty with the virtual vehicles. The red phase is implemented with anticipation by updating the cost terms in the running cost at the beginning of the current signal cycle. The proposed control approach is flexible in accounting for the platoon dynamics of merging, splitting, stopping, and queue discharging along a corridor with multiple intersections in an oversaturated traffic flow. The proposed trajectory control approach is not restricted to fixed signal timing. It also works under an actuated signal plan by updating the signal parameters in the closed-loop, which reveals the flexibility of the control approach under different signal control approaches. Finally, the performance of the proposed control algorithm is verified by simulation using four scenarios and a baseline scenario, which take the signal settings and the anticipation time of the red phase into account.
The remainder of the paper is organized as follows: the following section introduces the control formulation for longitudinal driving task, followed by the experiment design and analysis of the simulation results. We conclude the study in the final section.

Control Formulation
The longitudinal platoon control problem is formulated in this section, including control problem, control objectives and constraints, system dynamics, controller formulization, running cost specification, derivation of the optimal control input and solution approach.

Control Problem
To demonstrate the workings of the proposed algorithm, a 100% CAV environment and pre-timed signal control are considered. It is assumed that signal phasing and timing information is available to the platoon controller through I2V communication, and that CAVs can communicate with each other and be controlled via accelerations. The actuator lag and the sensor delay are not considered. Merging behaviors from side streets or adjacent lanes are not taken into account.
The statement of the control problem can be described as a CAV platoon traveling on the corridor with multiple intersections where downstream CAVs are queuing before the stop-lines. The platoon trajectory control system will be activated if the platoon leader reaches the control zone (e.g., 200 m upstream of the stop-line at the upcoming intersection). The control objective is to determine the accelerations of the CAV platoon and CAVs in the queue to fulfill control objectives and constraints. The maximal throughput is pre-determined, and will be detailed in the forthcoming subsection.

Control Objectives
The control design is expected to fulfill (a trade-off between) the following control objectives:
(1) To maximize the throughputs during the (remaining) green phases
(2) To maximize the ride comfort (by minimizing accelerations)
(3) To minimize the travel delay (by maximizing vehicle speeds)
(4) To minimize the fuel consumption
(5) To maintain the safe gap with the preceding vehicle
(6) To decelerate or even stop when confronting the red phase if unable to pass the intersection
The throughput is optimized first by determining the maximal number of vehicles that are able to pass the intersection during the green phase. The reason is to identify the first-stopping vehicle facing the red phase; the red phase term of the sixth objective is then applied to that vehicle.

System Dynamics Model
To describe the longitudinal dynamics, a second-order model is proposed in this subsection. The control input of vehicle $i$ is the acceleration $u_i(t)$, where $i$ ($1 \le i \le N$) denotes the vehicle sequence number on a single lane and $N$ is the total number of vehicles in the controlled platoon. The state variables of the controlled vehicle $i$ are the longitudinal position $x_i(t)$ and the speed $v_i(t)$. The control and state variables can be defined as:
$$\mathbf{u} = (u_1, \ldots, u_i, \ldots, u_N)^T \tag{1}$$
$$\mathbf{x} = (\mathbf{x}_1, \ldots, \mathbf{x}_i, \ldots, \mathbf{x}_N)^T \tag{2}$$
$$\mathbf{x}_i = (x_i, v_i)^T \tag{3}$$
The longitudinal dynamics model is described by the following ordinary differential equation:
$$\frac{d}{dt}\mathbf{x}_i = \frac{d}{dt}\begin{pmatrix} x_i \\ v_i \end{pmatrix} = \begin{pmatrix} v_i \\ u_i \end{pmatrix} \tag{4}$$
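As a numerical illustration of a second-order longitudinal model of this kind (position driven by speed, speed driven by acceleration), the following minimal sketch advances a platoon by one time step. The forward-Euler discretization and the function name `step_platoon` are assumptions made here for illustration; the paper does not state which discretization scheme is used.

```python
import numpy as np

def step_platoon(x, v, u, dt):
    """Advance positions and speeds by one forward-Euler step.

    Implements dx_i/dt = v_i and dv_i/dt = u_i for every vehicle i.
    The Euler scheme is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    return x + v * dt, v + u * dt

# Two vehicles, 35 m apart, both at 15 m/s; the leader accelerates, the follower brakes.
x_next, v_next = step_platoon([0.0, -35.0], [15.0, 15.0], [1.0, -2.0], dt=1.0)
```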

Controller Formulization and Running Cost Specification
If $q_j$ (veh) denotes the maximal number of vehicles able to pass the $j$th intersection, then the cost function $J$ of the control system can be formulated as follows:
$$\min_{\mathbf{u}} J = \int_{0}^{T_p} L(\mathbf{x}, \mathbf{u}, t)\,dt + G(\mathbf{x}(T_p)) \tag{5}$$
subject to
(1) the system dynamics model of Equation 4
(2) the initial condition: $\mathbf{x}(0) = \mathbf{x}_0$
(3) the constraints on state and control variables:
$$a_{min} \le u_i \le a_{max}, \qquad 0 \le v_i \le v_{max} \tag{6}$$
where $L$ denotes the running cost and $G$ denotes the terminal cost at the end of the prediction horizon $T_p$. Although the terminal cost function has an influence on the controller stability and performance, a longer prediction horizon can compensate for this impact of $G$ at the cost of computational load (37). A zero terminal cost ($G = 0$) and an appropriate prediction horizon are chosen in this work to guarantee the controller performance. It is noteworthy that the maximal throughput $q_j$ can be pre-determined before the final optimal solution is computed. The value of $q_j$ can be obtained beforehand from the optimal position trajectory $x_i(t)$ computed with the red phase penalty removed from the control objectives. In other words, the last vehicle that can depart the $j$th intersection during the green phase is pre-determined as the $q_j$th vehicle.
Here, the first vehicle unable to pass behind the $q_j$th vehicle is defined as the first-stopping vehicle ($i = q_j + 1$) at the $j$th intersection.
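The pre-determination of the throughput can be sketched as a simple count over trajectories that were planned without the red phase penalty. The function name `max_throughput`, the array layout, and the example numbers below are hypothetical and serve only to illustrate the idea; the paper does not give an implementation.

```python
import numpy as np

def max_throughput(times, positions, stop_line, green_end):
    """Count vehicles whose trajectory reaches the stop line before green ends.

    times:     shape (K,), sample times in seconds.
    positions: shape (N, K), positions[i, k] of vehicle i at times[k], planned
               without the red phase penalty; row 0 is the platoon leader.
    """
    times = np.asarray(times, dtype=float)
    positions = np.asarray(positions, dtype=float)
    in_green = times <= green_end
    q = 0
    for trajectory in positions:
        if np.any(trajectory[in_green] >= stop_line):
            q += 1
        else:
            break  # vehicles keep their order on a single lane
    return q

# Illustrative data: three vehicles at a constant 10 m/s, stop line at 0 m, green ends at 8 s.
times = np.arange(0.0, 11.0)
starts = np.array([-20.0, -60.0, -120.0])
positions = starts[:, None] + 10.0 * times[None, :]
q = max_throughput(times, positions, stop_line=0.0, green_end=8.0)
first_stopping = q + 1  # 1-based sequence number of the first-stopping vehicle
```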
In this control design, the running cost of vehicle $i$, $L_i$ (a constituent of $L$), is defined as follows (the time $t$ is omitted to simplify equations): Here, $l_i$ denotes the length of vehicle $i$, $t_{min}$ denotes the minimum safe car-following time gap, and $s_0$ is the minimum space gap at standstill conditions. Vehicles that turn off at an intersection can be included in the control approach by setting different values of $t_{min}$ for different turning movements. To represent the red phase, a virtual standstill vehicle is introduced in the last term of the running cost. $v^j_{vir}$ ($= 0$) and $x^j_{vir}$ are the speed and the position of the virtual vehicle at the $j$th intersection. The first cost term in the running cost is designed to maximize ride comfort by minimizing accelerations. The second cost term maximizes speeds to minimize travel delay. The third cost term tracks the preceding vehicle and accounts for safety through a large penalty if the distance to the predecessor is short. The fourth cost term encourages the gap to follow the desired time gap, $t_{min}$. The fifth cost term represents the minimization of fuel consumption. The last cost term is designed only for the first-stopping vehicle at the $j$th intersection ($i = q_j + 1$) during the red phase. This term makes the stopping vehicles stay in front of the stop-line by means of the safe gap penalty with the virtual vehicle.
In the fifth term, $f_{eco}$ is the instantaneous fuel consumption rate (ml/s). Detailed parameter values can be found in Kamal et al. (23). Although minimizing $f_{eco}$ alone would drive accelerations and speeds toward zero, the other criteria in the running cost trade off against the fuel consumption cost to generate optimal trajectories in the vicinity of signalized intersections. For typical vehicles on a flat road, $f_{eco}$ (ml/s) can be estimated as:
It should be noted that the running cost in Equation 7 is a piecewise function according to the vehicle sequence in the platoon. The running cost is categorized into three modes for better illustration, that is, the leading mode, the following mode, and the first-stopping mode. The leading mode is designed for the platoon leader, so the third and fourth (safe following and desired time gap) cost terms vanish because there is no preceding vehicle. The following mode is used for the following vehicles, so the sixth (virtual vehicle) term is removed. The first-stopping mode is used for the first-stopping vehicle, which must avoid collision with the virtual vehicle and anticipate the signal when facing the red phase, so the fourth (desired time gap) term is unnecessary.
This switch of the running cost between the three modes can be achieved by updating the cost weights $b_3$ and $b_4$ (leading mode), $b_4$ (first-stopping mode), and $b_6$ (following mode). Assuming the signal cycle starts from the green phase, all cost weights can remain unchanged within the cycle. This makes it easier to apply the proposed control approach under an actuated signal plan, because the red and green phase lengths are flexible during a signal cycle. In addition, the red phase refers to the red phase with anticipation, which will be illustrated in the "Solution Approach" subsection.
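The mode switch described above amounts to zeroing the weights of the cost terms that do not apply to a vehicle. The following sketch shows one way to express this; the function name `mode_weights`, the dictionary layout, and the unit weight values are hypothetical and are not the tuned weights of the paper.

```python
def mode_weights(base, mode):
    """Return a copy of the cost weights with the unused terms switched off.

    base: dict mapping the cost term index (1..6) to its weight.
    mode: "leading", "following", or "first_stopping".
    """
    switched_off = {
        "leading": (3, 4),        # no predecessor: safe following and time gap terms vanish
        "following": (6,),        # no virtual vehicle term
        "first_stopping": (4,),   # desired time gap term is unnecessary
    }
    if mode not in switched_off:
        raise ValueError(f"unknown mode: {mode}")
    weights = dict(base)
    for term in switched_off[mode]:
        weights[term] = 0.0
    return weights

# Placeholder weights, chosen only to make the example readable.
base = {k: 1.0 for k in range(1, 7)}
leader = mode_weights(base, "leading")
follower = mode_weights(base, "following")
stopper = mode_weights(base, "first_stopping")
```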

Derivation of the Optimal Control Input
Hereafter, the control problem is solved based on the Pontryagin maximum principle. Without providing too much detail, the Hamiltonian $H$ is defined as follows ($t$ is again omitted):
$$H = L + \lambda^T \mathbf{f}(\mathbf{x}, \mathbf{u})$$
where $\mathbf{f}$ is the right-hand side of the system dynamics in Equation 4 and $\lambda$ denotes the co-state of the system. Thus, the optimal control law can be obtained from the necessary condition for optimality of the Hamiltonian. To simplify the piecewise feature of the instantaneous fuel consumption model $f_{eco}$, the Heaviside function $h$ is introduced:
$$h(n) = \begin{cases} 0, & n \le 0 \\ 1, & n > 0 \end{cases} \tag{13}$$
In Equation 13, the Heaviside function is zero for negative and zero arguments ($n \le 0$) and one for positive arguments ($n > 0$). The co-state dynamics are thereby derived as:

Controller Constraints
The control problem should respect constraints on the control and state variables. The admissible acceleration is restricted to lie between the minimum acceleration, $a_{min}$, and the maximum acceleration, $a_{max}$. The speed should not exceed the speed limit, $v_{max}$, and must be nonnegative.
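One way to enforce both bounds through the control input alone is to clip the acceleration so that the next speed stays admissible. The sketch below assumes a forward-Euler speed update over one time step; the function name and the numerical values are hypothetical and do not come from Table 1.

```python
def admissible_acceleration(u, v, dt, a_min, a_max, v_max):
    """Clip u so that a_min <= u <= a_max and 0 <= v + u*dt <= v_max."""
    lower = max(a_min, (0.0 - v) / dt)     # do not let the speed become negative
    upper = min(a_max, (v_max - v) / dt)   # do not exceed the speed limit
    return min(max(u, lower), upper)

# Placeholder bounds: a_min = -5 m/s^2, a_max = 2 m/s^2, v_max = 15 m/s.
near_limit = admissible_acceleration(3.0, 14.0, 1.0, -5.0, 2.0, 15.0)
near_stop = admissible_acceleration(-5.0, 2.0, 1.0, -5.0, 2.0, 15.0)
```

In both calls the speed bound is the binding one: the first request is cut to 1.0 m/s^2 and the second to -2.0 m/s^2.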

Solution Approach
An iPMP approach is applied to solve this control problem; see Hoogendoorn et al. (36) and Wang et al. (38) for details. The continuous-time control problem is discretized in time within the prediction horizon with respect to the control and co-state variables. The iPMP approach solves the state dynamics forward in time and the co-state dynamics backward in time, and then updates the co-state with a weight factor. The updated co-state is passed to the next iteration as an input. The iteration stops once the error between the state and co-state dynamics is smaller than a pre-defined threshold, at which point the optimization is considered converged. The solution approach is illustrated in Figure 1.
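The forward and backward passes described above can be illustrated on a deliberately small problem. The sketch below is not the authors' algorithm or cost function: it assumes a single vehicle whose only state is its speed, a running cost 0.5 u^2 + 0.5 c (v - v_des)^2, a zero terminal cost, forward-Euler discretization, and convergence measured by the change in the co-state between iterations. All names and parameter values are hypothetical.

```python
import numpy as np

def forward_backward_sweep(v0, v_des, c=0.1, horizon=10.0, dt=0.1,
                           alpha=0.2, tol=1e-6, max_iter=500):
    """Toy problem: minimise the integral of 0.5*u**2 + 0.5*c*(v - v_des)**2
    subject to dv/dt = u, with zero terminal cost.

    Hamiltonian: H = 0.5*u**2 + 0.5*c*(v - v_des)**2 + lam*u, so dH/du = 0
    gives u = -lam, and d(lam)/dt = -dH/dv = -c*(v - v_des) with lam(T) = 0.
    """
    n = int(round(horizon / dt))
    lam = np.zeros(n + 1)                  # initial co-state guess on the time grid
    v = np.full(n + 1, float(v0))
    for iteration in range(1, max_iter + 1):
        u = -lam                           # control implied by the current co-state
        # Forward pass: integrate the state from the initial condition.
        v = np.empty(n + 1)
        v[0] = v0
        for k in range(n):
            v[k + 1] = v[k] + u[k] * dt
        # Backward pass: integrate the co-state from lam(T) = 0.
        lam_new = np.zeros(n + 1)
        for k in range(n, 0, -1):
            lam_new[k - 1] = lam_new[k] + c * (v[k] - v_des) * dt
        error = np.max(np.abs(lam_new - lam))
        # Relaxed co-state update with weight factor alpha.
        lam = (1.0 - alpha) * lam + alpha * lam_new
        if error < tol:
            return v, -lam, True, iteration
    return v, -lam, False, max_iter

v, u, converged, iterations = forward_backward_sweep(v0=10.0, v_des=15.0)
```

With these settings the speed rises from 10 m/s toward, but not beyond, the desired 15 m/s within the 10 s horizon. The weight factor `alpha` trades convergence speed against stability of the fixed-point iteration; values that are too large can make the sweep diverge.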
The MPC framework is applied, which solves the control problem in a shorter horizon than the optimal control framework in our previous work (29). This shorter horizon of the MPC framework results in an efficient computational time. The MPC framework only selects the first time step of the optimal solution in the iPMP algorithm. The constraints on control and state variables are implemented on restricting control variables based on system dynamics.
Platoon dynamics of merging, splitting, stopping, and queue discharging along a corridor are achieved by switching three modes of the running cost (updating the values of cost weights), which is included within the MPC closed-loop at every time step. In the presence of signal anticipation, the red phase can be anticipated by implementing the virtual vehicle term as early as possible, that is, at the beginning of the current signal cycle under the pre-timing signals, or at the moment when the platoon controller receives the updated signal plan under actuated signals. The MPC framework allows for system feedback, that is, signal changes, thus actuated signal settings can also be incorporated in the MPC closed-loop. In addition to signal anticipation, the signal settings (the green and red time) can also be updated in the MPC closed-loop by switching the cost weights in response to the actuated signals. Therefore, this approach can also be applied under the actuated signal control approach.
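The timing of the virtual vehicle term with and without anticipation can be sketched as follows. This is one reading of the description above, assuming the cycle starts with the green phase; the function name `red_term_start` and its arguments are hypothetical.

```python
def red_term_start(cycle_start, green, anticipate, plan_received=None):
    """Time from which the virtual-vehicle (red phase) term is switched on.

    Without anticipation the term starts with the red phase.
    With anticipation it starts at the beginning of the cycle (pre-timed plan),
    or when an updated plan is received (actuated plan), but never later than
    the start of the red phase.
    """
    red_start = cycle_start + green
    if not anticipate:
        return red_start
    start = cycle_start if plan_received is None else max(cycle_start, plan_received)
    return min(start, red_start)

no_anticipation = red_term_start(0.0, 30.0, anticipate=False)
pre_timed = red_term_start(0.0, 30.0, anticipate=True)
actuated = red_term_start(0.0, 35.0, anticipate=True, plan_received=10.0)
```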

Simulation Results and Analysis
This section verifies the platoon performances of this control algorithm under four scenarios, considering the signal settings and the anticipation time of the red phase. A baseline scenario is also presented for comparison.

Experiment Design
To test the behavior of the platoons resulting from the proposed control approach, trajectories on a corridor with two signalized intersections are simulated, taking into account the signal settings, the lane length between two adjacent intersections, the speed limit, and the numbers of vehicles in the controlled platoon and in the queue. Four scenarios and a baseline scenario are designed to verify the characteristics of platoon splitting, merging, decelerating, accelerating, stopping, and queue discharging. The control effects on the fuel savings are revealed by comparing the total fuel consumptions of all controlled vehicles within the simulation horizon. Hereafter, the intersection in the upstream direction on the arterial is referred as the upstream intersection, and the intersection in the downstream direction is considered as the downstream intersection.
Two pre-timed signal settings are designed to test the workings of the red phase term, that is, the opposite and overlapped signal settings, as shown in Figure 2. The length of the effective green phase is 30 s in both settings, whereas the effective red phase is 30 s and 20 s in opposite and overlapped signal settings, respectively. Therefore, the simulation horizons are 90 s and 80 s separately in different signal settings. The prediction horizon is selected to be 10 s, because the influence of the zero terminal cost is negligible with respect to 5 s and larger prediction horizon (37).
To test the performance with respect to the signal settings and the anticipation time of the red phase, four scenarios and a baseline scenario are designed. These scenarios are appropriate to verify the feasibility of the platoon trajectory control approach for applications on an arterial with intersections. The characteristics of platoon splitting, merging, decelerating, accelerating, stopping, and queue discharging in all the scenarios provide insights into the effectiveness of the control approach. The benefits in fuel savings are explored in all scenarios. Similar settings (e.g., the number of controlled vehicles, vehicle queues, the number of intersections, and the signal timing plans) can be implemented easily in the same way. The cost weights are tuned in Scenario 1 and then applied in the other scenarios. The parameter values in the simulation are detailed in Table 1. The choices for the parameter values mostly come from previous work (29). In our experiment settings, the time step is 1 s, which means delays under 1 s have no effect on the optimal trajectories.
The baseline scenario is presented under the opposite signal setting without anticipating the red phase. The anticipation of the red phase under pre-timed signal control is removed, thus the virtual vehicle term is added just at the beginning of the red phase. The objective of this baseline scenario is to obtain insights of the validity of the red phase (virtual vehicle) term, which is similar to the application in previous work (24).
In the four scenarios that follow, the red phase is anticipated from the beginning of the current signal cycle. Anticipating the red indication before the start of the red phase is expected to outperform the baseline scenario, where no anticipation exists (e.g., by saving more fuel). Scenario 1 is simulated under the opposite, pre-timed signal plan with anticipation of the red phase. The comparison between the baseline scenario (no anticipation) and Scenario 1 (anticipation from the beginning of the current signal cycle) explores the benefits of anticipating the red phase in the proposed control approach. Scenario 2 is designed under the overlapped, pre-timed signal setting, which tests the workings of the approach when the signal settings are adjusted under pre-timed signal control. In Scenario 3, the actuated signal is included in the MPC closed-loop. The lengths of the green phases increase by 5 s, whereas the red windows decrease by 5 s, relative to the overlapped signal settings. Scenario 4 updates the signal plan based on the initial overlapped signal setting according to the oversaturated traffic flow. The lengths of the red (and green) phases in sequence are 15 s and 18 s (17 s and 20 s) at the downstream intersection, and the counterparts are 18 s (27 s and 25 s) at the upstream intersection. The last two scenarios aim to investigate the workings of the proposed control approach under the actuated signal plan. It is assumed that the platoon controller receives the actuated signal plan after the first prediction horizon, that is, 10 s after the beginning of the signal cycle.

Platoon Performance
The aforementioned scenarios are simulated to evaluate the control effects based on trajectory analysis, as depicted in the trajectory figures for each scenario. The maximal throughputs are determined first with the red phase penalty removed, as discussed in the previous section. Note that the iPMP approach is more computationally efficient than the solver used in Liu et al. (29), which is based on the optimal control framework, even though the experiments are similar. The remainder of this section analyzes the platoon performance and the spacing gaps in each scenario. The advantages of the proposed control approach are discussed in comparison with the baseline scenario.
Tuning Cost Weights. First, the cost weights are tuned to gain insights into the optimal trajectories based on Scenario 1.
Analysis of Baseline Scenario. The baseline scenario represents the situation in which no anticipation of the red phase is provided under the opposite signal setting using the virtual vehicle term, as shown in Figure 3. The total fuel consumption of all vehicles within the simulation horizon is 1888.3 ml (0.0606 ml/m) in this baseline scenario. The vehicles in the queue at the upstream intersection (vehicles 9-11) start from zero speed, whereas the other vehicles begin with an initial speed of 15 m/s. The first 12 vehicles are able to pass the downstream intersection, and vehicles 9-20 leave the upstream intersection. These vehicles depart the intersections by accelerating to the speed limit $v_{max}$, so the maximal throughput can be guaranteed.
Although the red phase is not anticipated, the vehicles are able to stop in front of the stop-line, but with drastic decelerations at the beginning of the red phase (i.e., for vehicle 21 at the upstream intersection at 30 s and for vehicle 13 at the downstream intersection at 60 s). Without anticipation of the red phases, the passing vehicles (e.g., vehicles 1-8) are optimized to decelerate during the red phase and then accelerate suddenly at the beginning of the green phase, which causes more fuel consumption. In addition, stops during the red phase cannot be avoided if the vehicles have to catch the next green phase. Note that vehicle 9 decelerates from 38 s to 43 s to keep the safe gap when merging with the preceding platoon. The same holds for vehicle 12 from 10 s to 13 s.
Analysis of Scenario 1. Scenario 1 is simulated under the opposite signal setting, but with anticipation of the red phase from the beginning of the current signal cycle. The total fuel consumption of all vehicles is 1854.9 ml (0.0579 ml/m) in Scenario 1, which is smaller than its counterpart in the baseline scenario. The differences in trajectory performance between the baseline scenario and Scenario 1 (Figure 4) demonstrate the benefits of anticipating the red phase in the proposed control approach. The sharp decelerations and stops facing the red phase are avoidable, and larger fuel savings are verified in Scenario 1.
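The size of the reported saving can be checked with simple arithmetic on the figures quoted for the baseline scenario and Scenario 1. The snippet below uses only those four reported numbers; the helper name `relative_saving` is introduced here for illustration.

```python
# Totals (ml) and per-metre rates (ml/m) as reported in the text.
baseline_ml, scenario1_ml = 1888.3, 1854.9
baseline_rate, scenario1_rate = 0.0606, 0.0579

def relative_saving(reference, value):
    """Fractional reduction of value with respect to reference."""
    return (reference - value) / reference

total_saving = relative_saving(baseline_ml, scenario1_ml)
rate_saving = relative_saving(baseline_rate, scenario1_rate)
print(f"total fuel: {100 * total_saving:.2f} % lower")
print(f"fuel per metre: {100 * rate_saving:.2f} % lower")
```

Anticipation therefore reduces total fuel use by roughly 1.8% and fuel use per metre by roughly 4.5% relative to the baseline.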
Analysis of Scenario 2. In Scenario 2, the overlapped signal setting is presented, as depicted in Figure 5. The total fuel consumption of all vehicles is 1736.2 ml (0.0550 ml/m) in Scenario 2. Although the trajectories in Scenario 2 show the same features as in Scenario 1 apart from the signal setting, Scenario 2 validates the flexibility of the control approach with respect to changes in signal settings under a pre-timed signal plan.
Analysis of Scenario 3. Scenario 3 explores the potential to implement the proposed approach under the actuated signal plan. The initial signal plan is the same as the overlapped signal settings. However, the signal plan is updated in the MPC closed-loop after the first prediction horizon (10 s) in the signal cycle. The lengths of the green phases increase by 5 s, and the lengths of the red phases decrease by 5 s. The total fuel consumption of all vehicles is 1879.2 ml (0.0550 ml/m) in Scenario 3. The optimal trajectories depicted in Figure 6 demonstrate the feasibility of the control approach for application under actuated or adaptive signal plans.
Analysis of Spacing Gap. The spacing gaps of all controlled vehicles can be categorized into four groups, that is, the splitting gaps, the stopping gaps, the following gaps, and the merging/catching gaps. The splitting gaps reflect the increases in gaps resulting from the red indication, that is, the gaps between the first-stopping vehicles and the immediately preceding vehicles. For the other stopping vehicles behind the first-stopping vehicles, the stopping gaps describe the gaps between two adjacent stopping vehicles. Table 2 details the vehicle sequence numbers (represented by V) of the first-stopping vehicles, the splitting gaps, and the stopping gaps under the four scenarios at the two intersections. The following gaps account for gaps between vehicles that can pass the downstream intersection during the first green phase. The merging or catching gaps capture declines in spacing owing to the signal settings and the initial position settings, namely, the gaps between vehicles 8 and 9 and between vehicles 11 and 12. The difference between the merging gaps and the catching gaps is whether the gaps drop to the level of the following gaps within the horizon.
It is noted that the merging/catching gap and the splitting gap may occur on a certain vehicle sequentially under different signal phases, for example, in Scenario 2, Scenario 3, and Scenario 4.
To explore the performance of spacing, the gaps between two adjacent vehicles under four scenarios are illustrated in Figure 8. The vertical ordinate of the spacing subfigures is presented compactly by way of logarithmic scale. The four spacing gap categories are depicted in different colors and line types. It can be concluded that the spacing gaps are in accordance with the system design, because the space gaps satisfy the safe requirement over the simulation horizon in all scenarios, and the spacing gaps fluctuate with changes in splitting and merging performances and signal changes.
Some characteristics are common to all scenarios. The initial space gaps are 5 m for queuing vehicles at standstill and 35 m for the moving vehicles. The maximal following gap is 45 m, which is calculated using $v_{max} t_{min} + s_0 + l_i$. Because the speed constraint limits the controlled speeds to be equal to or lower than the maximal speed, the following and stopping gaps cannot exceed the maximal following gap (45 m). The stopping gaps of stopping vehicles decline during the red phases at the upstream and downstream intersections, as shown by the two declining trends of the dashed lines in subfigures a-d of Figure 8. The depths of the declines in stopping gaps vary between scenarios as a result of the different red phase lengths. Longer red phases, such as under the pre-timed signals in Scenarios 1 and 2, give rise to deeper drops. In addition, the merging gaps between vehicles 8 and 9 increase slightly at the beginning of the horizon in all scenarios, because vehicle 9 needs to accelerate from standstill to pass the upstream intersection whereas vehicle 8 is already moving forward.
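The bound on the following gap can be written out as a one-line derivation. It restates the argument above under the assumption that the desired spacing of vehicle $i$ at speed $v_i$ has the form $v_i t_{min} + s_0 + l_i$, which is the form implied by the expression for the maximal following gap:

```latex
\[
0 \le v_i \le v_{max}
\;\Longrightarrow\;
s_0 + l_i \;\le\; v_i\, t_{min} + s_0 + l_i \;\le\; v_{max}\, t_{min} + s_0 + l_i = 45~\text{m}.
\]
```

The lower end of this range is reached at standstill ($v_i = 0$), and the upper end when the vehicle travels at the speed limit.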
Taking Scenario 1 as an example, the merging gaps between vehicles 8 and 9 and between vehicles 11 and 12 fall below the maximal following gap (45 m), which means vehicles 9 and 12 merge with their predecessors into a platoon. As shown in subfigure (a) of Figure 8, the splitting gap between vehicles 12 and 13 rises when vehicle 13 confronts the red indication at the downstream intersection. Vehicle 21 decelerates facing the red phase at the upstream intersection, resulting in the splitting gap, and then accelerates to catch up with the vehicles in front during the subsequent green phase, causing the catching gap. The same explanation holds for the other scenarios.

Conclusions and Future Work
In this study, a flexible CAV trajectory control approach is proposed on arterials with signalized intersections based on the MPC framework. The throughput is first maximized during the green phase, and multiple criteria of ride comfort, travel delay, and fuel consumption are optimized after that, subject to linear constraints on acceleration and speed. The safe following requirement is formulated as a penalty in the running cost to regulate vehicles following at a safe gap to the predecessors. The red phase is represented by keeping the safe gap with a standstill virtual vehicle at the stop bar, and it can also be anticipated by the first-stopping vehicle since the beginning of the signal cycle. The control approach is flexible in incorporating platoon merging, splitting, stopping, and queue-discharging characteristics. Simulation under four scenarios verified the performance of the approach.
The simulation results show that the red phase term with anticipation works better than the case where no anticipation is provided. The performance of the control approach also demonstrates its flexibility in relation to application in different settings, that is, changes in signal parameters under pre-timed signal plan and actuated signal plan. Further research should aim to incorporate the adaptive signal control and the trajectory control in a unified framework.

Author Contributions
The authors confirm contribution to the paper as follows: study conception and design:  Wang. All authors reviewed the results and approved the final version of the manuscript.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research presented in this paper is funded by the Chinese Scholarship Council (CSC) under Grant number 201706320313.

Data Accessibility Statement
Data sharing is not applicable to this paper as no new data were created or analyzed in this study.