Multi-robot path planning using an improved self-adaptive particle swarm optimization

Path planning is of great significance in motion planning and cooperative navigation of multiple robots. Nevertheless, because of its high complexity and nondeterministic polynomial time hard nature, efficiently tackling with the issue of multi-robot path planning remains greatly challenging. To this end, enhancing a coevolution mechanism and an improved particle swarm optimization (PSO) algorithm, this article presents a coevolution-based particle swarm optimization method to cope with the multi-robot path planning issue. Attempting to well adjust the global and local search abilities and address the stagnation issue of particle swarm optimization, the proposed particle swarm optimization enhances a widely used standard particle swarm optimization algorithm with the evolutionary game theory, in which a novel self-adaptive strategy is proposed to update the three main control parameters of particles. Since the convergence of particle swarm optimization significantly influences its optimization efficiency, the convergence of the proposed particle swarm optimization is analytically investigated and a parameter selection rule, sufficiently guaranteeing the convergence of this particle swarm optimization, is provided in this article. The performance of the proposed planning method is verified through different scenarios both in single-robot and in multi-robot path planning problems. The numerical simulation results reveal that, compared to its contenders, the proposed method is highly promising with respect to the path optimality. Also, the computation time of the proposed method is comparable with those of its peers.


Introduction
Thanks to its significance in motion planning and cooperative navigation, the multi-robot path planning problem has recently aroused great research interest of researchers from the community of robotics. 1 Generally, the multi-robot path planning issue aims to generate an optimal or near-optimal obstacle-free path for each single robot from a start location to a destination location in a given obstacle-rich workspace, meanwhile optimizing the global cost function of the overall system. 2 However, the path planning problem is naturally proven to be a nondeterministic polynomial time hard (NP-hard) issue, which results in efficiently coping with this problem being difficult. 3 Besides, since the space coordinations among robots need to be considered, the multi-robot path planning problem is more complicated than the single-robot path planning problem.
Basically, the issue of multi-robot path planning can be solved based on centralized or decentralized manners. Under the centralized scenario, a central path planner considers all interactions and constraints among all robots and simultaneously yields a candidate path for each robot. This would lead the centralized path planner to the production of a complicated configuration space and increase the likelihood of failure in generating paths. 4 On the other hand, the decentralized planning assigns an independent planner for each robot. In such a case, each robot strives to separately generate a path in its own configuration space, which can not only simplify the planning issue to enhance the possibility of searching feasible paths for all robots but also save the computation time of the planner. 4 In virtue of the aforementioned advantages of the decentralized planning, developing efficient decentralized path planning methods becomes one of the most popular streams in the multi-robot path planning field. Implementing the lattice structure to be the road map, Yu et al. developed a decentralized algorithmic framework to deal with the multi-robot path planning problem in Yu and Rus. 5 Based on the generalized connectivity, Solana et al. 6 have addressed the issue of producing paths for a team of robots in a cluttered workspace. Recently, a decentralized model predictive control based multi-robot path planning algorithm has been proposed by Tallamraju et al. in their work. 7 Integrating a coevolution strategy with genetic algorithm (GA), a coevolutionary improved GA (CIGA) has been developed for global path planning of multiple mobile robots in Qu et al. 8 To efficiently tackle with the path planning issue of multiple mobile robots in continuous environments, an enhanced GA has been mixed with artificial potential field algorithm to solve the multi-robot path planning in Nazarahari et al. 9 As one of the most well-known evolutionary algorithms, probably thanks to its formidable capability over NP-hard problems, easy implementation, and fast convergence speed, particle swarm optimization (PSO) algorithm has recently drawn increasing attention in the field of multirobot path planning. [9][10][11][12] Unfortunately, the overall performance of the conventional PSO is constrained by its insufficient capability in adjusting the global and local search capabilities as well as the ease of plugging into stagnation in the case where the particle cannot search a position better than the previous ones. 13 -17 In addition, when developing a PSO-based optimization method, the convergence of PSO remains paramount and significantly affects the performance of PSO. 18 Nevertheless, like the majority of the population-based algorithms, the stochastic nature of PSO leads the analytical convergence investigation on PSO to be challenging. 19 This leads to the hypothesis that the optimization performance of PSO over the multi-robot path planning problem could be amended via remedying these aforementioned deficiencies of the conventional PSO.
Combining standard PSO 2011 (SPSO 2011) 20 with evolutionary game theory (EGT), 21,22 a self-adaptive evolutionary game-based PSO (SAEGBPSO) algorithm is first proposed in this article. Attempting to mitigant the stagnation in SAEGBPSO, particles in this algorithm update their position and velocity information based on moving rules defined in SPSO 2011. To well trade-off the global and local search abilities, a new self-adaptive parameter adaption rule determined by the evolutionary strategy of EGT is proposed to update the three control parameters of particles in SAEGBPSO. Besides, a convergence-guaranteed parameter selection principle is provided for the proposed PSO algorithm followed after the analytical convergence investigation of this algorithm. Leveraging SAEGBPSO and the coevolutionary strategy, 8,23 a SAEGBPSO-based decentralized path planner is then developed in this study to generate collision-free and intercollision-free paths for multiple robots and achieve simultaneous arrivals of all robots to the target position. In the designed path planning method, each robot assigns a SAEGBPSO subpopulation and searches its path in the given workspace based on two steps. Firstly, merely considering the physical constraints and the workspace information of each robot, each SAEGBPSO subpopulation searches several feasible and obstacle-free candidate paths for the associated robot. Next, only considering the space and time constraints among different robots, each SAEGBPSO subpopulation exchanges its information with the others among an information interaction operator, such that an elite intercollision avoidance path is generated for each robot to achieve simultaneous arrival of each robot to the destination position.
The remainder of this article is organized as follows. The second section formulates the studied multi-robot path planning problem. The third section mainly states the proposed SAEGBPSO algorithm. The fourth section investigates the analytical properties, mainly including the convergence analysis and the convergence-guaranteed parameter selection principle, of the proposed SAEGBPSO. The coevolution-based SAEGBPSO method for the multi-robot path planning issue is presented in the fifth section. The sixth section conducts the numerical simulations and comparisons. The seventh section ends this article by drawing conclusions.

Modeling of the workspace
Under the background of multi-robot cooperative operation, the method stated in Dun-wei et al. 24 is used to model the workspace of robots in virtue of its easy implementation and small sensitivity to shapes of obstacles. As shown in Figure 1, a global coordinate system, represented by o xy, is first established, where st and ta, respectively, are the start and destination positions of a robot. The x-axis of o xy is assumed to coincide with line st ta. The y-axis of o xy is supposed to be perpendicular to st ta. In this global coordinate system, line st ta is equally divided into ns þ 1 subsections by ns waypoints. Here, ns is a constant parameter that is predefined by users or decision-makers. As displayed in Figure 1, after gaining a set of vertical lines l 1 ; l 2 ; . . . l nsþ1 , a candidate path for each robot, represented by ph ¼ ½st; w 1 ; w 2 ; :::; w ns ; ta, can be then searched on lines l 1 ; l 2 ; . . . l nsþ1 .

Problem formulation
This article focuses on the scenario where multiple robots cooperate with each other to execute some complex tasks in a hazard-rich workspace. To enhance the success possibility of performing tasks, the generated paths for robots need to satisfy the following conditions: (a) the generated paths need to be obstacle-free; (b) the generated paths ought to be intercollision-free; and (c) the generated paths must guarantee the simultaneous arrivals of robots to the destination location. Moreover, the velocity of each robot i is assumed to be bounded within ½V min , respectively, stand for the minimum and maximum velocities of robot i. During navigating a robot to the destination location, the velocity of each robot varies within its velocity boundary. Also, during each robot moving to the destination location, the yaw angle and moving distance of each robot need to stay within its maximum yaw angle and distance constraints, respectively.
In this study, the total path length of the multi-robot system, that is, the summation of each robot's path length is regarded as the global objective function in the studied path planning issue. Assume that w i;0 and w i;nsþ1 , respectively, denote the start and destination locations of the ith robot. Let ph i ¼ ½w i;0 ; w i;1 ; :::; w i;ns ; w i;nsþ1 denote the generated path for the ith robot, where w i;k ðk ¼ 0; 1; :::; ns þ 1Þ denote the kth waypoint of the generated path for the ith robot. The multi-robot path planning issue we concerned aims to generate an obstacle-free and collision avoidance path ph i for each robot i to guarantee the simultaneous arrival of each robot to the destination, meanwhile minimizing the overall path length of the multi-robot system. Thus, the multi-robot path planning issue can be mathematically established as follows finding : ph i ¼ ½w i;0 ; w i;1 ; :::; w i;ns ; w i;nsþ1 Subject to i;k max i ; 1 k ns ð3Þ w i;k w i;kþ1 2 semiÀfree workspace; 0 k ns ð4Þ T i \ T j \ ; ::::; \T N = 2 nonÀnull ð6Þ where where N denotes the total number of robots. L i and L max i denote the path length and the maximum moving distance constraint of the ith robot, respectively. i;k represents the yaw angle of the ith robot at the kth path segment along the generated path. max i stands for the maximum yaw angle constraint of the ith robot. T i is arrival time of the ith robot to the destination location. disðw i;k ; w i;kþ1 Þ denotes the Euclidean distance between waypoints w i;k and w i;kþ1 . x i;k and y i;k are, respectively, the x-axis and y-axis values of waypoint w i;k . "Semi-free workspace" in (4) indicates the entire space which is uncovered by any obstacle in the whole workspace. "Null" in (5) defines that the intersection of the paths of any two different robots is an empty set, requiring that the generated paths of any two different robots need to be intercollision-free. "Non-null" in (6) denotes that the intersection of the arrival time of any two different robots is a nonempty set, indicating that the generated paths of any two different robots can guarantee the simultaneous arrival of different robots to the destination location.

Statement of the proposed SAEGBPSO
To achieve the nonstagnation in SAEGBPSO, particles in this algorithm follow the moving rules defined in SPSO 2011 to update their positions and velocities as follows 20 where w is a real coefficient, standing for the inertia weight parameter of the particle. X m ðtÞ and V m ðtÞ, respectively, denote the position and velocity of particle m at iteration t. X 0 m ðtÞ indicates a position which is randomly generated in a hypersphere.
Suppose that HPðBC m ðtÞ; jjBC m ðtÞ À X m ðtÞjjÞ denotes the hypersphere for particle m at iteration t, where BC m ðtÞ and jjBC m ðtÞ À X m ðtÞjj denote the isobarycenter and radius of the hypersphere, respectively. Similar to SPSO 2011, the coordination of BC m ðtÞ in SAEGBPSO is calculated as where c 1 and c 2 are two positive parameters, denoting the cognitive and social acceleration parameters of the particle, respectively. Pbest m ðtÞ and GbestðtÞ represent the personal best position of particle m and the global best position of the swarm at iteration t, respectively. After obtaining the hypersphere HPðBC m ðtÞ; jjBC m ðtÞ À X m ðtÞjjÞ for particle m based on (11) at each iteration, the random position point X 0 m ðtÞ in (9) is then randomly produced in this hypersphere. As given in (9), since the randomly generated position X 0 m ðtÞ is added to the velocity of the particle to be a disturbance, similar to SPSO 2011, particles in SAEGBPSO can keep searching in the search space with a non-null velocity, which could thus avoid particles plugging into stagnation. For more detailed information of SPSO 2011, the reader is referred to Maurice. 20 Despite being able to achieve nonstagnation in SPSO 2011, this algorithm could not well adjust its global and local search capabilities due to the fact that the three main control parameters (i.e. the inertia weight w, the cognitive acceleration parameter c 1 and social acceleration parameter c 2 ) remain constant and there exists no distinguish between the cognitive and the social acceleration parameters. 18 To mitigant this flaw in our proposed SAEGBPSO, a selfadaptive parameter updating rule determined by the evolutionary stables strategy (ESS) of EGT 21,22 and the iteration number of the particle is developed in this study. Prior to introducing the proposed self-adaptive parameter updating rule in SAEGBPSO, the analogy between EGT and SAEGBPSO is first described as follows: (a) the players or individuals in EGT analogize particles in SAEGBPSO; (b) every particle in SAEGBPSO has three candidate strategies, these are, respectively, moving only according to its inertia weight, just following its personal best memory, and merely following the global best memory of the swarm; and (c) the payoff matrix of EGT is consisted by the mean performance obtained by each particle in SAEGBPSO following a specific strategy.
Let e 1 , e 2 , and e 3 represent the three aforementioned strategies, respectively. The payoff matrix applied in this article is then calculated by 19 where pf ðe l Þ (l ¼ 1,2,3) represents the payoff that the particle obtains by only following the lth strategy. In EGT, the replicator dynamics equations (RDEs) can be defined by a commonly accepted form as follows 22 where p l ðl ¼ 1; 2; 3Þ represents the probability distribution of strategy p l over the pure strategy e l . K is the payoff matrix. p ¼ ðp 1 ; p 2 ; p 3 Þ is the set of mixed strategies. Here, it is notable that each element in the set of mixed strategies satisfies: P 3 l¼1 p l ¼ 1 and 0 p l 1. At each iteration t, the ESS in EGT represents the ratio of each strategy when the population converges toward a stable point and is used to weight the average of the fitness gained by every particle. In this article, the ESS can be denoted as follows E ss ðtÞ ¼ ½Z 1 ðtÞ; Z 2 ðtÞ; Z 3 ðtÞ ð14Þ Subject to At each iteration t, the value of pf ðe l Þ ðl ¼ 1; 2; 3Þ is obtained based on the previous experience of each particle as follows pf ðe l Þ ¼ where t denotes the current iteration number of SAEGBPSO. FðX ðt 1 ÞÞ is the previously gained fitness value of the particle X at the previous iteration t 1 ð1 t 1 tÞ. For more details of EGT, the reader is referred to the literatures. 19,21,22 Once the pf ðe l Þ ðl ¼ 1; 2; 3Þ of every particle in SAEGBPSO is obtained based on (16), it is then applied to fill the payoff matrix given by (12). Once the payoff matrix is filled, the RDE defined by (13) is used to calculate the corresponding ESS, namely E ss ðtÞ ¼ ½Z 1 ðtÞ; Z 2 ðtÞ; Z 3 ðtÞ. After the obtainment of E ss ðtÞ ¼ ½Z 1 ðtÞ; Z 2 ðtÞ; Z 3 ðtÞ, the three ratios Z 1 ðtÞ, Z 2 ðtÞ, and Z 3 ðtÞ in this E ss are used to adaptively update the three aforementioned control parameters of particles in the proposed self-adaptive parameter updating rule in SAEGBPSO as follows where where subscripts "s" and "f" in each control parameter denote the initial and final values of the corresponding control parameter, respectively. t max is a predefined constant, denoting the maximum iteration number. d 4 is a sufficiently small positive parameter to avoid the denominator of b in (23) becoming zero (d 4 ¼ 1e À 05 in this article).
Here, it is noticeable that w s > w f , c 1s > c 1f , and c 2f > c 2s in the above self-adaptive parameter updating rule. Since the three ratios Z 1 ðtÞ, Z 2 ðtÞ, and Z 3 ðtÞ represent a stable search direction of a population, when these ratios are implemented to update the control parameters of particles in the proposed self-adaptive strategy in SAEGBPSO, particles could adapt the "shape" of the search space to optimize the search direction of the swarm as far as possible. Also, since the ESS potentially implies the stability nature of EGT, when the three ratios Z 1 ðtÞ, Z 2 ðtÞ, and Z 3 ðtÞ in ESS are implemented to update the control parameters of particles, they may could face the potential irregularity of the solution space to prevent them plugging into some local optimum. Thus, the implementations of the three ratios in the proposed self-adaptive strategy in SAEGBPSO may enhance the performance of the optimizer in finding high-quality solutions.
Also, it can be seen from (17) to (19) that the inertia weight and cognitive components of particles in SAEGBPSO decease, whereas the social component increases with the iteration number increasing, which implies that the global search powers of particles are likely to promote in the early stages of the evolution and the local search abilities of particles would enhance in the late of the evolution. Besides, it is evident from (17) to (19) that the inertia weight and the cognitive acceleration parameter decrease, whereas the change in the social acceleration parameter grows greater with b increasing. This may indicate that the global search abilities of particles in SAEGBPSO would be more retained in the case where the value of b remains relatively big. Based on (23), a large value of b means that the value of ðZ 1 ðtÞ þ Z 2 ðtÞÞ is relatively big. Since a relatively big of ðZ 1 ðtÞ þ Z 2 ðtÞÞ denotes that the search direction of the particle is more stable when the particle mainly adopts the strategies of its inertial and personal best experience, the swarm can benefit more in the case where each particle follows these two strategies. Thus, in the case where b is great, it is reasonable to increase the inertia weight and the cognitive acceleration parameter, so that the global search abilities of particles in SAEGBPSO can be enhanced. On the other hand, Z 3 ðtÞ has a relatively big value in the case where b is small, based on (23). In such case, the searches of particles can be more stable when particles follow the strategy of the global best experience of the swarm. Therefore, it is natural to increase the social acceleration parameter of the particle to enhance the local search ability of the proposed SAEGBPSO in the case where b is small.

Convergence investigation of SAEGBPSO
As stated previously, the convergence property of PSO remains an important issue when designing a PSO-based optimizer. The convergence investigation of PSO aims to find the control parameter boundaries to guarantee the convergence of the developed PSO optimizer. The convergenceguaranteed control parameter boundaries of different PSO algorithms could be different since different PSO algorithms may adopt different moving and control parameter updating rules. Thus, there exists necessity to theoretically analyze the convergence of different PSO algorithms.
Based on this logical flow noted above and inspired by some studies on the convergence analysis of their proposed PSO algorithms, such as Tang et al. and Cédric et al., 18,19 this study first investigates the convergence of SAEGBPSO, such that the convergence-guaranteed control parameter boundaries can be found. Based on the obtained convergence-guaranteed control parameter boundaries and the proposed control parameter updating rule defined by (17)- (22), this article then provides a convergenceguaranteed control parameter setting rule to sufficiently guarantee the convergence of the proposed SAEGBPSO.
In this section, the convergence of SAEGBPSO is investigated based on the deterministic model convergence analysis, 25 that is, under the situation where X 0 m ¼ BC m ðtÞ. Recall that each particle in the proposed SAEGBPSO adopts the moving rule defined in (9)-(10) to renew its velocity and position information. Thus, without loss of generality and for simplicity, the subscript m in each variable in (9)-(10) can be omitted in terms of the convergence investigation of the proposed SAEGBPSO. Also, please note that since every dimension in the velocity and position vectors of each particle in SAEGBPSO is updated independently from the others in the moving rules defined in (9)-(10), the moving rule defined by (9)-(10) in SAEGBPSO can be simplified and rewritten into a one-dimensional matrix form as follows Here, it is important to note that since the proposed SAEGBPSO is rewritten into the dynamic system denoted by (24), the convergence stability of the proposed SAEGBPSO is equivalent to that of the dynamic system denoted by (24). Thus, if the convergence stability of the dynamic system denoted by (24) is analytically investigated, the convergence of the proposed PSO algorithm is then obtained. Also, note that the dynamic system (24) is a first-order constant-coefficient nonhomogeneous difference equation. There are many mature methods to solve this difference equation. The characteristic equation could be one of the most typical and popular methods. The characteristic equation of system (24) is easily obtained as Then, the two characteristic roots, denoted as h 1 and h 2 , to (27) are easily gained as According to the standard results of the dynamic system theory, system (24) converges iff ("iff" denotes "if and only if" in this study) Because h 1 and h 2 can be two real or complex roots in (28), both these two cases are investigated separately below.
1. Case 1. Both h 1 and h 2 are complex, denoted as Proof. Clearly, we have that Based on the classical mathematical approach, the proof of Lemma 1 can be easily completed.
Proof. Because the magnitude of an imaginary number , where H 1 and H 2 , respectively, stand for the real and imaginary parts this imaginary number, for h 1;2 2 C, we have Therefore, for h 1;2 2 C, system (24) converges, iff ffiffiffi ffi For h 1;2 2 C, (30) holds, according to Lemma 1. Hence, considering conditions that h 1;2 2 C and Maxfjh 1 j; jh 2 jg < 1 together, system (24), that is, SAEGBPSO, converges under the situation where h 1 and h 2 are complex, iff For h 1;2 2 C, the convergence domain of SAEGBPSO on different control parameter plans is demonstrated in Figure 2.
Lemma 3. The two roots h 1;2 2 R, iff Proof. Clearly, for (27), both h 1;2 are two real roots, iff The proof of Lemma 3 can be easily completed by expanding (37).
Next, in the case where h 1;2 2 R, conditions on w and c, guaranteeing the convergence of system (24), need to be discovered.
Finally, integrating the conclusion drawn in Lemma 2 with that in Lemma 4, it allows us to conclude that system (24), namely, SAEGBPSO, converges, iff The real convergence region of the SAEGBPSO on different parameter plans is shown in Figure 4.

Parameter selection principle for SAEGBPSO
Please recall that, through the analytical convergence investigation of SAEGBPSO conducted in the above subsection, it is conclusive that if and only if conditions given by (44) are satisfied, SAEGBPSO converges. Now, for the sake of sufficiently satisfying the conditions given by (44), so that the convergence of SAEGBPSO can be sufficiently guaranteed, this subsection provides a convergenceguaranteed parameter selection principle for this algorithm.
Lemma 5. The convergence of SAEGBPSO can be sufficiently guaranteed, only if the three control parameters shown in the proposed self-adaptive updating strategy defined by (17)-(22) satisfy the following conditions Proof. Since c ¼ ðc 1 þ c 2 Þ=3, the sufficient and necessary condition given by (44) for the convergence of SAEGBPSO can be rewritten as follows, namely, SAEGBPSO converges, iff It is trivial from (18), (19), (21), and (22) that (18) to (20), it is clear that w f w w s , c 1f c 1 c 1s , and c 2s c 2 c 2f . Therefore, we can obtain that Notice that the right-hand side conditions given in (47) denote the necessary and sufficient condition for the convergence of SAEGBPSO. Thus, the proof of Lemma 5 can be easily completed according to (47).
Note that the condition given by (45) is a sufficient condition for the convergence of SAEGBPSO, which means that only if the condition given by (45) is satisfied, the convergence of SAEGBPSO can be sufficiently guaranteed. Also, it is noticeable that, since w s , w f , c 1s , c 1f , c 2s , and c 2f are predefined constant parameters, the condition given by (45) can satisfy easily through setting proper values of these predefined parameters. Here, the values of these mentioned parameters are empirically suggested as w s ¼ 0:9, w f ¼ 0:1, c 1s ¼ c 2f ¼ 2:5, and c 1f ¼ c 2s ¼ 0:1. The convergence trajectories of the position and velocity of the particle in SAEGBPSO are visualized in Figure 5 under the suggested parameter settings mentioned here.
The coevolution-based SAEGBPSO method for multi-robot path planning As stated above, a coevolutionary strategy presented in Qu et al. and Kala 8,23 is incorporated with SAEGBPSO to solve the mutli-robot path planning problem based on a decentralized manner. In the developed planning method, each robot is assigned a SAEGBPSO subpopulation. At each iteration, only considering the physical constraints (i.e. the maximum moving distance and yaw angle constraints) of the robot and the path safety constraint caused by obstacles, each subpopulation first separately and independently evolves in its own configuration to find EP candidate paths for the associated robot. Then, each   subpopulation reports the EP candidate paths to an information interaction operator, so that each subpopulation can exchange information with the others. During the information exchange stage, considering the space and time constraints among different robots, an elite obstacle-free and intercollision avoidance path that can achieve simultaneous arrivals of robots to the destination position is selected from the EP candidate paths for each SAEGBPSO subpopulation.
Obviously, the value of EP greatly influences the performance of the path planner. The greater the value of EP is, with the higher possibility the planner can find an optimal path for the robot; however, the more computation time the planner consumes. 8,23 Usually, compromisingly considering the path optimality and the computation time, EP is set to be a predefined constant. 8,23 In this study, similar to, 8,23 the exact mechanisms or impacts regarding how the value of EP affects the performance of the path planner is uncovered, which could be considered as an extension of this paper in the near future. Besides, we must highlight that, similar to other terrific studies, such as literatures, 8,23, 26-28 using coevolutionary mechanism to handle multiple vehicles path planning, the coevolution-based SAEGBPSO method can also solve the single-robot path planning problem in the case where EP ¼ 1 and only one SAEGBPSO subpopulation is considered.
The flowchart of the coevolution-based SAEGBPSO for the multi-robot path planning issue is illustrated in Figure 6. In the herein subsections, the ways of encoding particles and handling constraints as well as the design of the information interaction operator are presented during the application of the designed path planning method on the multi-robot path planning problem.

Encoding particles
The aim of encoding the particle is to find a mapping between the particle's position and the potential solution to an optimization problem. As depicted previously, since the purpose of the studied path planning problem is to produce a path for each robot, constructed by waypoints w 1 ; w 2 ; . . . ; w ns that are randomly sampled in lines l 1 ; l 2 ; . . . ; l ns Because the x-axis values of lines l 1 ; l 2 ; . . . ; l ns are given beforehand after the establishment of the workspace of robot, values of waypoints w 1 ; w 2 ; . . . ; w ns lie only with the y-axis values decided by lines l 1 ; l 2 ; . . . ; l ns . Thus, these y-axis values, represented by ðy w i;0 ; y w i;1 ; . . . ; y w i;ns ; y w i;nsþ1 Þ, are used to encode particles in the designed path planning method, where y w i;0 and y w i;nsþ1 are the y-axis values of the start and destination positions of the ith robot, respectively. For guaranteeing each particle to search within the workspace, the following saturation strategy is implemented to modify the designing variable y w i ði ¼ 1; :::; nsÞ in the case where y w i locates outside the workspace as 3 where width is the width of the workspace, as shown in Figure 1.

Constraints handling
As given in (2)-(6), the studied path planning problem is modeled into a constrained optimization. To efficiently solve this problem and guarantee the feasibility of the generated path, the mission concerning how to tackle with constraints of the problem must address. To deal with the physical and path safety constraints given by (2)-(4), the total constraint violation degree of the physical and path safety constraints of each particle m in designed path planning method is calculated as follows .... Figure 6. The flow chart of coevolution-based SAEGBPSO method for multi-robot path planning. SAEGBPSO: self-adaptive evolutionary game-based particle swarm optimization.
where D s m , D f m , and D y m stand for the degrees that particle m violates the path safety constraint caused by obstacles, maximum moving distance, and yaw angle constraints, respectively. Given Nob obstacles, D s m can be obtained by Given the maximum moving distance constraint L max i , D f m is calculated as follows where L m denotes the path length of particle m and is calculated according to (7). Given the maximum yaw angle constraint max , D y m can be obtained as follows where ns is the predefined parameter described in "Modeling of the workspace" section. i denotes the ith yaw angle along the path represented by particle m and is calculated based on (8).
After the calculation of the total constraint violation degrees of the physical and path safe constraints using (49)-(52), the feasibility-based rule presented in Deb et al. 29 is then adopted to select the elite solution between any two candidate solutions in each SAEGBPSO subpopulation. In a minimization optimization problem, this rule can be summarized as (a) the solution having smaller fitness value is preferred over the solution with larger fitness value in the case where any two different candidate solutions have a same constraint violation degree and (b) the solution owing smaller constraint violation degree dominates the solution with greater constraint violation degree in the case where any two different candidate solutions have different constraint violation degrees.
Since the fitness value and the constraint violation degree of each candidate solution (or particle) are compared separately in the above feasibility-based rule, no additional penalty or control factor is needed when using this technology to deal with constraints, which can thus reduce the burden of the optimizer. 29 Moreover, despite violating partial constraints, some nonfeasible candidate solutions may also involve some valuable information of the solution space. In such a case, the diversifications of solutions and the possibility of searching high-quality solutions can be increased when considering those nonfeasible solutions in the feasibility-based rule. 30 In addition to the physical constraints and path safety constraint, as shown in (2)-(4), the space and time constraints given by (5)-(6) also need to be handled in the studied multi-robot path planning problem. Here, please recall that the space and time constraints defined by (5)- (6) require that the generated paths for any two different robots are free of intercollision and can achieve simultaneous arrival of each robot to the destination location, respectively.
For dealing with the space constraint given by (5), the space constraint violation degree of any two different paths ph i and ph j is calculated as follows 28 where d i;j denotes the minimum flight distance between two any different paths ph i and ph j . As depicted in "Problem formulation" section, since each robot i moves along its path ph i with velocity constraints ½V min i ; V max i , the range of the arrival time of this robot to the destination position is then determined by To achieve simultaneous arrivals of any two different robots i and j to the destination location, only the condition: ; L j =V min j = 2non À null needs to be satisfied. Therefore, for tackling with the time constraint given by (6), the time constraint violation degree between any two different paths ph i and ph j can obtain as follows 28 where V min

Design of the information interaction operator
Only considering the space and time constraints among different robots, the primary mission of the information interaction operator is to determine an elite path from the generated EP candidate paths for each SAEGBPSO subpopulation. During the design of the information interaction operator, the first task needed to sort out is the model of the information interaction operator. So far, many existing models, such as the cooperative coevolution model, competitive coevolution model, and island model, can be used as options. 8 Thanks to its superiorities of maintaining solution diversifications and employing parallelism, 8 the island model, as visualized in Figure 7, is applied to build the information interaction operator model in this study. For more information about the island model, the reader is referred to Qu et al. 8 The second task in the design of the information interaction operator is to decide the criterion for choosing the elite path from EP candidate paths found by each SAEGBPSO subpopulation. Here, two criteria are used to determine the elite path for each subpopulation at each iteration. The first criterion is the summation of the space and time constraint violation degrees of each robot. Note that the space and time constraint violation degrees are calculated by (53) and (54), respectively. The second criterion is the path length of each robot. After the determinations of the model of information interaction operator and criteria for choosing the elite path, the algorithmic steps of the information interaction operator are described as follows: Step 1: Based on the island model, for a given SAEGBPSO subpopulation A, randomly select a different SAEGBPSO subpopulation B.
Step 2: Using the feasibility-based rule stated above, sort individuals in A and B in ascending order according to the total constraint violation degrees of the physical and path safety constraints of each individual. Suppose the sorted individuals in A and B are denoted as ða 1 ; a 2 ; :::; a EP Þ and ðb 1 ; b 2 ; :::; b EP Þ, respectively.
Step 3: Traverse each individual in ða 1 ; a 2 ; :::; a EP Þ. Compute the summation of the space and time constraint violation degrees of the current individual in ða 1 ; a 2 ; :::; a EP Þ as well as those of each individual in ðb 1 ; b 2 ; :::; b EP Þ.
Step 4: Select the individual in ða 1 ; a 2 ; :::; a EP Þ with least summation of the space and time constraint violation degrees as the elite path for subpopulation A. If multiple individuals in ða 1 ; a 2 ; :::; a EP Þ have the same least summation of the space and time constraint violation degrees, the individual with the least path length is selected as the elite path for subpopulation A to minimize the total path length of the multi-robot system.
The algorithmic steps of the coevolution-based SAEGBPSO method for the multi-robot path planning are summarized in Table 1. Note that the main loop of the developed path planning method does not exist until the iteration number t of each SAEGBPSO subpopulation reaches its given maximum iteration number t max .

Numerical simulations
To verify the proposed method, its performance is compared against with those of three evolutionary algoirthms: the CIGA, 8 SPSO 2011, 20 and the fitness-scaling adaptive chaotic PSO (FACPSO). 31 The simulation parameters for SAEGBPSO are set to be w s ¼ 0:9, w f ¼ 0:1, c 1s ¼ c 2f ¼ 2:5, and c 1f ¼ c 2s ¼ 0:1 based on the analysis results in "Parameter selection principle for SAEGBPSO" section. The simulation parameters of the three compared methods are extracted from their corresponding literature  1. Assign subpopulations and randomly initialize each subpopulation 2. Obtain Pbest and Gbest as well as E ss for each initial subpopulation 3. for each subpopulation do 4. while t t max do 5. obtain Z 1 ðt À 1Þ, Z 2 ðt À 1Þ, and Z 3 ðt À 1Þ and the fitness value FðXðt À 1ÞÞ of each particle 6. Calculate fpðe 1 Þ, fpðe 2 Þ, and fpðe 3 Þ for each particle based on (16) 7. Compute the payoff matrix K for each subpopulation based on (12) 8. Calculate Z 1 ðtÞ, Z 2 ðtÞ, and Z 3 ðtÞ for each subpopulation based on (13) 9. Update control parameters w, c 1 , and c 2 of each particle using (17)-(23) 10. Calculate the barycenter BCðtÞ of each particle based on (11) 11. Randomly obtain X 0 ðtÞ within the hypersphere HPðBCðtÞ; jjBCðtÞ À XðtÞjjÞ for each particle 12. Update the velocity of each particle based on (9) 13. Update the position of each particle based on (10) 14. Modify the position vector of each particle based on the saturation strategy given by (48) 15. Calculate the fitness value of each particle based on (7) 16. Calculate the constraint violation degree of each particle based on (49)-(52) 17. Update Bbest of each particle based on the feasibility-based rule 18. Preserve EP best candidate paths based on the feasibilitybased rule 19. Send the EP candidate paths to the information interaction operator.

After the information interaction operator, determine
Gbest of the current subpopulation 21. Increase the iteration number t by 1 22. end while 23. end for 24. Output Gbest of each subpopulation to navigate the associated robot SAEGBPSO: self-adaptive evolutionary game-based particle swarm optimization.
and presented in Table 2. In each numerical simulation, the final path of each subpopulation in each method is output after 40 particles evolve 300 iterations. As highlighted previously, since the proposed method can not only solve the multi-robot path planning problem but also the single-robot path planning issue in the case where EP ¼ 1 and the number of SAEGBPSO subpopulation equals to 1, the feasibility and effectiveness of the proposed method are verified in such two cases in the following contents of this section. Also, in each numerical simulation conducted below, SAEGBPSO is referred to be the coevolution-based SAEGBPSO method for the convenience of writing.

Numerical simulations and comparisons on singlerobot path planning
Two numerical simulations under different planning scenarios are conducted in this subsection to evaluate the performance of the proposed method on single-robot path planning. The needed simulation parameters and workspace information of those two numerical simulations are displayed in Table 3. The generated paths and corresponding cost curves of different methods for these two numerical simulations are displayed in Figures 8 and 9, respectively. The simulation results of each method for these two numerical simulations are reported in Tables 4  and 5, respectively. From Figures 8(a) and 9(a), it is apparent that each planning method can generate an obstacle-free path for the robot, which can reflect the feasibility of these methods in the single-robot path planning problem. Besides, it is evident from Tables 4 and 5 that SAEGBPSO outperforms its contenders in terms of path optimality. Moreover, it can be easily observed from Tables 4 and 5 that SPSO 2011 and SAEGBPSO are, respectively, ranked the first and second among the four methods as far as the computation time is considered. Probably because the three control parameters of particles in SPSO 2011 are invariant, no additional computation resource is required to adjust these control parameters in this algorithm. Therefore, compared to the other three approaches, SPSO 2011 is the computationally cheapest.
However, it is important notice that, despite consuming the least computation time, SPSO 2011 provides the worst performance with respect to the path optimality, and even if the proposed SAEGBPSO is ranked the second in terms of the computation time, SAEGBPSO obtains the best performance in the path optimality, as confirmed in Tables 4 and 5. Also, it is worth mentioning from Tables 4 and 5 that the computation time difference between SPSO 2011 and SAEGBPSO is less than 3 s in these two numerical simulations, which may be neglected based on the fact that the studied path planning problem is executed off-line. Considering all concerns mentioned here, we can conclude that the proposed SAEGBPSO method dominates its opponents in terms of the path optimality. Besides, the proposed method is promising with respect to the computation time in the single-robot path planning problem.

Numerical simulations on multi-robot path planning
Two numerical simulations are conducted in this subsection to evaluate the feasibility and effectiveness of the proposed method on the multi-robot path planning. In the first simulation scenario, two robots are initially located at a same start position in a 100 m Â 80 m workspace and expected to move toward a same destination position. In the second simulation case, five robots are initially located at different positions in a 50 m Â 50 m workspace and required to arrive at a same destination position. The position information and physical constraints of each robot for these two simulations are presented in Tables 6 and 7,  respectively.  Tables 8 and 9 report the numerical simulation results of different methods for these two multi-robot path planning numerical simulations, respectively. Figures 10-13 visualize the generated paths, the corresponding cost curves, and the arrival time of robots to destination position searched by different methods for the first simulation case. Figures 14-17 display the generated paths, the corresponding cost curves, and the arrival time of robots to destination position searched by different methods for the second simulation case.
From Figures 10-17, it can be observed that each considered method can successfully generate an obstacle-free and intercollision avoidance path for each robot in each numerical simulation case, which, in a certain degree, reveals the feasibilities of these methods on the multi-robot path planning problem. From Tables 8 and 9, it is clear that the proposed method performs superior to its opponents in terms of the simultaneous arrival time and the total fitness value. Moreover, SPSO 2011 and the proposed method provide the best and second best performances over these two numerical simulations in terms of computation time, as confirmed in Tables 8 and 9.
Again, it is significant to note that although SPSO 2011 is the fastest method with respect to the computation time, this method provides the worst performance in terms of path optimality in these two multi-robot simulation cases,    SAEGBPSO: self-adaptive evolutionary game-based particle swarm optimization; CIGA: coevolutionary improved genetic algorithm; SPSO: standard particle swarm optimization; FACPSO: fitness-scaling adaptive chaotic particle swarm optimization. Bold values denotes the best result obtained with regarding to each metric.    PL: path length of each robot; MV: moving velocity of each robot; CIGA: coevolutionary improved genetic algorithm; SPSO: standard particle swarm optimization; FACPSO: fitness-scaling adaptive chaotic particle swarm optimization; SAEGBPSO: self-adaptive evolutionary game-based particle swarm optimization. Bold values denotes the best result obtained with regarding to each metric. and even if the proposed method is ranked the second in the computation time among the four methods, this proposed method outperforms the other three methods in the path optimality, as confirmed in Tables 8 and 9. Also, the difference of the computation time between our proposed method and SPSO 2011 is less than 6 s in each simulation scenario, as presented in Tables 8 and 9, which could be neglected due to the considered path planning problem is performed off-line. Taking all these concerns stated here into account, it allows us to conclude that the proposed method can be considered as a vital alternative in the multi-robot path planning problem.
Here, one may be confused by the obtainment of the simultaneous arrival time of each robot to the destination position presented in Tables 8 and 9. To interpret by how the simulation arrival time of each robot is obtained, only the proposed method for the first multi-robot numerical simulation is set to be an example. As presented in Table 8, the path   lengths of Robot 1 and Robot 2 generated by the proposed method are, respectively, 118.03 m and 122.22 m in the first numerical simulation. In such a case, when Robot 2 and Robot 1, respectively, move at speeds of 6 m/s and 5.79 m/s, these two robots can simultaneously arrive at the destination location in 20.37 s, as reported in Table 8.

Conclusions
In this study, a novel-PSO-based method is proposed for solving the multi-robot path planning problem. To enhance the performance of the optimizer, a novel SAEGBPSO algorithm is first proposed via integrating SPSO 2011 and  the evolutionary stable strategy of EGT. Aiming at addressing the stagnation issue of PSO, particles in SAEGBPSO update their movement information according to the moving rules defined in SPSO 2011. Subsequently, for well-balancing the global and local search capabilities of particles, a novel self-adaptive strategy is proposed to update the three main control parameters of particles in SAEGBPSO based on the EGT and the iteration number of the algorithm. Moreover, the convergence of the proposed SAEGBPSO is anlaytically studied and a parameter selection principle which sufficiently guarantees the convergence of SAEGBPSO is presented in this article. Leveraging the proposed SAEGBPSO and a coevolutionary strategy, this article develops of a coevolutionbased SAEGBPSO for the multi-robot path planning problem. Finally, the performance of the proposed approach is validated through different numerical simulations both in single-robot and in multi-robot path planning problems. The simulation results confirm that the proposed method is highly competitive in terms of the path optimality. Moreover, the computation time of the proposed method is comparable with those of the other approaches compared in this article. Therefore, the proposed method can be regarded as an effective alternative in robot path planning.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is financially supported by the National Science Foundation of China under Grant No. 61903286.