Flexible model predictive control based on multivariable online adjustment mechanism for robust gait generation

The gait generation algorithm considering both step distance adjustment and step duration adjustment could improve the anti-disturbance ability of the humanoid robot, which is very important to the dynamic balance, but the step duration adjustment often brings non-convex optimization problems. In order to avoid this situation and improve the robustness of the gait generator, a gait generation mechanism based on flexible model predictive control is proposed in this article. Specifically, the step distance adjustment and step duration adjustment are set to be optimization objectives, while the change of pressure center is treated as the optimal input to minimize those objectives. With the current system state being used for online re-optimization, a feedback gait generator is formed to realize the strong stability of variable speed and variable step distance walking of the robot. The main contributions of this work are twofold. First, a gait generation mechanism based on flexible model predictive control is proposed, which avoids the problem of nonlinear optimization. Second, a variety of feasible optimization constraints were considered, they can be used on platforms with different computing resources. Simulations are conducted to verify the effectiveness of the proposed mechanism. Results show that as compared with those considering step adjustment only, the proposed method largely improves the compensation ability of disturbance and shortens the adjustment time.


Introduction
Leg movement affects the dynamic balance of humanoid robots, and it is the basic guarantee for robots to complete various advanced tasks. However, owing to the complexities of its hybrid dynamics, the unidirectional constraints on contact forces, as well as the high dimensionality and nonlinearity of robot general dynamics, the robot leg movement is widely regarded to be a difficult problem 1 and various methods have been proposed. [1][2][3][4][5][6][7] Specifically, to address the dynamic balancing problem of humanoid robots, a general method 8,9 is to use a hierarchical structure shown in Figure 1 to control the robot. As illustrated, the bottom layer is a motion control layer, and it is responsible for tracking the planned robot trajectory by utilizing inverse kinematics or inverse dynamics. While the top level is the motion planning layer, which is utilized to generate a set of desired targets, for example, the Center of Mass (CoM), the Center of Pressure (CoP), the Center of Gravity (CoG), 10 limb movements as well as some other tracks. The top layer determines the walking gait of the robot and is a prerequisite for stable work. In the case of strong disturbance, it has to adjust the step of the robot quickly to the right position within a period of time like the human beings.
For top-level motion planning, many gait generation schemes perform zero moment point (ZMP, equivalent to CoP 2 ) conditions by calculating the trajectory suitable for robot's CoM. Most of them use simplified CoM dynamics models, such as the famous linear inverted pendulum (LIP) model 3 and Cart-Table (CT) model. 4 Using a simplified model to capture task-related dynamic processes to a set of linear equations is very useful for real-time generation of walk patterns. Specifically, Kajita et al. 4 put forward the concept of preview control and solved a linear quadratic (LQ) regulation problem for a dynamical extension of the CT. Wieber 5 further developed it into model predictive control (MPC) and selected the CoM jerk as control signal to achieve asymptotic tracking of the desired ZMP trajectory. However, both methods are not strong enough to disturbance. Recently, Englsberger et al. 6 and Takenaka et al. 7 split the LIP model into a stable first-order system and an unstable first-order system, and then introduced the intermediate variables of the divergence component of motion (DCM or capture point 11 ) to solve the CoP and CoM trajectories online. Within these methods, the input of the unstable subsystem in LIP is planned online according to the predetermined footstep locations, while both the CoP and CoM trajectories were generated in real time by tracking the DCM according to the robot natural dynamics. Englsberger et al. 12 extended the method to 3-D case. Yet, it is worth noting that although these methods have certain resistance to perturbations, their effects are still limited.
To enhance the robustness of the generated gait, Herdt et al. 13 and Diedam et al. 2 used predictive control to include step distance to the optimization objective function, while made footsteps variable to generate both CoM and CoP trajectories under the CoP constraints. Results showed that the generated walking mode is robust to disturbances and achieves high level target tasks, such as desired step position or walking speed. However, the step distance adjustment contributes only to disturbance energy consumption, and they do not take into account the step timing to change the landing speed either. In practice, step duration adjustment could help adjust the robot foot movement speed for more robust gait generation. Since the CoM trajectory is a nonlinear function of time and of initial conditions, in order to deal with nonlinear optimization problems, many algorithms have been proposed in the literature. 14-17 Kryczka et al. 18 and Aftab et al. 19 proposed to utilize a nonlinear optimization technique to modify both footstep positions and step-timing to maintain dynamic stability during the robot walking process. Specifically, Aftab et al. introduced a simple model of the mechanical cost in the objective function by penalizing the acceleration of the swing foot, while the acceleration can't anyway exceed a given maximum value. Kryczka et al. rewrote the optimization objective to be a nonlinear function of the optimization variables. Although satisfactory results have been achieved, the nonlinear optimization process introduced high computational costs and cannot guarantee convergence to the minimum either.
Khadiv et al. 20 adopted a DCM method by taking the constraints of both location and time of stepping into account in robot gait generation process and modeled the problem as a quadratic program that can be solved real time. The results showed that such the proposed strategy could help the robot recover from severe pushes. However, since the CoM motion constraint has not been taken into account in the objective function, the CoM speed tracking error cannot be minimized in the control process. Sun et al. 21 proposed a class of global and feasible projected Fletcher-Reeves conjugate gradient approach. This method guarantees the tracking accuracy of each physical quantity to the target value in the optimization process but requires the optimization constraints to be linear. Sun et al. 22 further proposed a superlinearly convergent trust region-sequential quadratic programming (QP) approach. The method incorporates a combination algorithm that allows both the trust region technique 23,24 and the sequential QP method to be used. It avoids solving the QP subproblem for nonlinear constrained optimization problems, which gives the potential for fast convergence in the neighborhood of an optimal solution.
In this article, we propose to flexibly choose the analytical bounds of position, velocity, and acceleration of CoM in MPC frame to predict the stability and manually set the target footstep locations and step duration in the prediction horizon. The optimal outputs are allowed to deviate from the target, that is, to adjust the item weight coefficient in the optimization objective, and the variation of CoP are selected as the input to minimize the objective. With the current system state being used to recalculate the optimization online, a feedback controller as shown in Figure 1 is formed, which outputs the required CoM, CoP, footstep locations, and the next step timing. Finally, according to the practical robot structure, the maximum and minimum stride distances of the lateral plane and the sagittal plane are also included as the optimization constraint of QP to achieve variable speed and variable step.
The main contributions of this study could be summarized as follows. First, we improved the MPC objective function with adjustment of step duration taken into account, and therefore, such a step timing adjustment strategy could largely suppress the interference in forward visual field and stabilize CoM velocity mutation with the non-convex optimization problem avoided, as compared with those in the existing work. 2,4,25 Second, the proposed method flexibly handles the stability optimization problem for biped robots and also presents a variety of gait optimization constraints, which can be easily implemented on control platforms with different computing resources. With the foot CoP stabilized without any mutations in the forward motion, the proposed mechanism mimics human walking, and thus is conducive to walking smoothly with small errors.
The article is organized as follows. In the "The MPC method for walking machine" section, we recall the LIP and the CoM trajectory characterization of standard MPC. The "System optimization model" section describes the motion model adopted, the "Walking planning with adjustable step duration" section achieves flexible walking optimization through carefully selected objective function and constraints, and the "Push recovery planning" section describes push recovery. To illustrate its benefits, all the simulation results are shown in "Results and discussion" section, the conclusion points out the next research work. For simplicity purpose, the abbreviations utilized in this article are summarized in Table 1 as below.

The MPC method for walking machine
When a robot is walking on the ground, it comes down to the fact that the CoP can lie only within the convex hull of the contact points between the robot's feet and the ground. For the LIP model used by most scholars, it describes the motion of the CoM of a robot when its height is constant while the rotation effect is not considered. In addition to its linear properties, the LIP has the advantage that dynamics along the x-and y-directions are decoupled and can be represented by the same differential equations. Many studies 4,26-30 have proved that, despite its simplicity, the use of  Figure 2) for gait generation is effective. Simplified LIP model for real robots is shown in Figure 3. Establish LIP dynamic equation in the x-axis (the y-axis identical) where ! ¼ ffiffiffiffiffiffiffiffiffi g=z c p , with g and z c being the acceleration of gravity and the constant altitude of CoM, respectively. x is the horizontal position of the CoM, € x is horizontal acceleration of the CoM, and x cop is the position of the CoP.
It could be found in equation (1) that LIP has a right half plane pole, and there is always a divergence component in the system, which leads to the divergence of CoM. In addition, the change of CoP will directly lead to the change of the divergence velocity of CoM. However, CoP, the input signal has some constraints and must be within the range of stable polygons. The scheme proposed by Kajita et al. 4 generates a trajectory of the CoM under the constraint that the footsteps are fixed and impossible to change. CoM and CoP are programmed as a series of discrete points, with varied jerks ::: x over time intervals of constant lengths T. Therefore, the dynamics at t kþ1 ¼ ðk þ 1ÞT could be calculated as This scheme is to minimize this jerk while maintaining a position of the CoP as close as possible to some prescribed reference positions, but in the process, there is no consideration of the robot's foothold and step timing. Therefore, to address the problems of anti-disturbance and variable speed walking for the robot, we propose to add the velocity deviation and acceleration deviation of CoM, as well as landing point deviation and stepping time deviation to the optimization function, and select the variation of CoP as the optimal input. After the first QP solution is finished, the feedback value of the system state is recalculated online to form a feedback controller, and the optimal trajectory is constantly updated.

System optimization model
For all of the implementations presented, the model is the LIP with 2-D dynamics given by equation (1). However, the math is generally independent of the model. It is straightforward to implement any other linear model. As shown in Figure 4, a complete step includes double support phase t double and single support phase t single , and NT time span may contain M steps. With the state space equation of the discrete linear system shown in equation (3) taken into account, the control input u is the variation of x cop . Given a sequence of control inputs u ¼ ½0; u k ; u kþ1 ; :::; u kþN À1 T , the X ¼ ½x k T ; x kþ1 T ; x kþ2 T ;:::; x kþN T T and Y ¼ ½y kþ1 T ; y k þ2 T ;:::; y k þN þ1 T T sequences within NT time intervals can be calculated   ; x fr are the left and right foot position ; ; In order to determine the best trajectory, a cost function is constructed. The function is a scalar function that scores the trajectory, by defining a quadratic cost on the elements of X and u such that Introducing it into equation (4), then It has a simple quadratic form where Note that the J 0 term is constant with respect to u and has no effect on the location of the minimum of J.
Adding equality and inequality dynamic equilibrium constraints For some constrained states, c 0 in X b 0 in can be transformed into Finally, the problem of finding CoM trajectory in a period of time NT is converted into a QP. There are many methods can be used to solve this standard QP, such as interior points, active sets, conjugant gradient, or simplex methods 31 Walking planning with adjustable step duration The MPC scheme adopted by Kajita et al. 4 is basic in generating stable CoM trajectories. Its only mandatory feature is to adjust the magnitude of the motion derivative of CoM in the prediction horizon. However, any control variables that help to suppress CoM mutations contribute to more stable walking. Objective function. For the dynamic balance of biped robot, the form of function will be different under different circumstances. In case of steady walking, this article takes the following form of objective function where ðC x ; C y Þ, ð _ C x ; _ C y Þ, and ð € C x ; € C y Þ are the vectors of CoM position, velocity, and acceleration in 2-D over the next N time steps, respectively. ðC target  Figure 5, in the calculation at a time, the robot can walk one step or walk M > 1 steps. So here, we chose to give it three steps,

As shown in
The reference value is selected as d ¼ 0:2signðy s 0 À y f 0 Þ, ðx s 0 ; y s 0 Þ is the position of initial swing foot, ðx f 0 ; y f 0 Þ is the position of initial support foot, v xref and v yref are the reference velocity of CoM, t step is a known deterministic value. Note: The ðx f ; y f Þ, T xstep , and T ystep are unknown in advance, the objective function J can be written as a function of u After solving the QP, the value of u is obtained. The CoM and CoP trajectories can be got from equation (4) and initial status x 0 . Take the minimum values in T xstep and T ystep for the next QP, that is, T ¼ minðT xstep ; T ystep Þ. Although we worked out the CoM and CoP trajectories in the M-walk process, only the first foothold is taken as the current destination, the QP is recomputed in next NT time.
Constraints. Constraints play a major role in the computation of the optimal trajectory. In the robot walking process, CoP directly determines the balance state of the robot. It should also be noted that since the locations of the footsteps are variables in optimization process, the constraints on the CoP after the first step are unknown, and thus, it is necessary to dynamically set CoP constraints according to the optimized planning footstep locations. In addition, in order to maintain a simple linear form and use a fast QP solver, simple conservative constraints are chosen to approximate the real constraints. The conservative constraints represent smaller regions than the real constraints. Another advantage of this approach is that it introduces a safety margin and enhances the stability of optimization.

CoP constraints
Since robot is in a different supporting phase at different times, the CoP constraints could be divided into a set Z S cop consisting of multiple single support constraint blocks and a set Z D cop consisting of multiple double support constraint blocks throughout the NT time (see Figure 5).

Single support phase (during t single )
When in single support, the CoP can only be within the range of the support foot, so the feasible range of the CoP is where U S i 2 R NÂ1 be defined as a vector of zeros and ones with the ones corresponding to the time steps of the ith single phase, with the first phase being the initial double support phase. D 1 ; D 2 ; D 3 , and D 4 are the stable ranges, which must be less than half of the foot size. From equation (4), we get B x cop and A y cop ; B y cop are the corresponding line in A; B. Finally, we have Double support phase (during t double ) When in double support, establish the coordinate system o 0 as shown in Figure 6(a) in the middle of the feet. o 0 rotates q around the z-axis relative to o. Hence, for any CoP ðx cop ; y cop Þ in planning trajectory within coordinate o 0 , it is ðx 0 cop ; y 0 cop Þ, and the constraints of double support are as follows x 0 where D 0 x ; D 0 y are the stability margin in the direction of x 0 and y 0 as illustrated in Figure 6(a), and x 0 cop ¼ ðx cop À x r Þcosq À ðy cop À y r Þsinq Because l pace ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi ðx l À x r Þ 2 þ ðy l À y r Þ 2 q , let l 0 ¼ 1=l pace , the equation (18) can be converted to x 0 cop ¼ l 0 ½ðx cop À x r Þðy l À y r Þ À ðy cop À y r Þðx l À x r Þ y 0 cop ¼ l 0 ½ðy cop À y r Þðy l À y r Þ À ðx cop À x r Þðx l À x r Þ À Then combine equation (15) and (19), equation (17) are transformed into where U D i 2 R N Â1 be defined as a vector of zeros and ones with the ones corresponding to the time steps of the ith double phase, Hence I N Â1 is a N-dimensional vector with all elements being 1.
The constraints of CoP in double support can also be selected in other forms. The optimal results obtained in different forms are slightly different in the macroscopical attitude of the robot. For example (as shown in Figure 6 Alternatively, the CoP constraints of the double support and the single support can be directly set up to coincide with each other as shown in Figure 6(c).
Step distance constraints Because of the limitation of mechanical structure, the step distance needs to be constrained where D dx ; D dy are the maximum step distances of the robot.

Foot placement constraints
When walking steadily, the landing point swings back and forth in the direction of the y-axis, yet too small swing amplitude may cause interference between the legs. Hence, in the y-axis, we set the distance between the two landing points is greater than a threshold.
where D miny is the minimum step distance of the robot in the y-axis. In the next few steps, the landing point y f i should be determined according to the size of initial conditions y f 0 and y s 0 . No crossing of feet allowed.
Step duration constraints In the case of large disturbance, humanoid robot is limited by mechanical structure. The maximum distance of step is equal to D dx ; D dy under the condition of fixed step timing. If it is not enough to overcome disturbance, it will lead to body forward and CoM divergence. If step duration is changeable, the robot can choose the maximum step distance to land in a shorter step time to quickly enter the next step. Through several rapid steps, it has more advantages in dealing with high-speed and large disturbances, which is similar to human beings. So when the initial velocity is obtained, the distance of one step should not exceed The above method finds the real-time optimal trajectory by rolling the QP within the constrained framework, in NT time and minimizes CoM position error, velocity error, acceleration error, landing point error, and step timing error. In addition, the variation of CoP could also be minimized. Therefore, the robot can change its speed and walk with variable step size, and automatically adjust the step size and time according to the required speed or disturbance. As weight coefficients of the objective function, the parameters ½a 1 ; a 2 ; a 3 ; a 4 ; a 5 ; a 6 determines the constraint strength of the corresponding physical quantity. The larger the coefficient, the smaller the error between the trajectory output and the target. While if the robot is disturbed, different initial state x 0 appears in each QP solution. Because of the online operation characteristics of the algorithm, the optimal output results are also changed to generate the optimal target trajectory under the disturbance.

Push recovery planning
In the same way, if the cost function and all kinds of constraints can be found manually, the rolling optimization method can be used to carry out the real-time gait planning. For push recovery, the large disturbance would grant the robot an initial velocity, and thus, the robot needs to be stabilized by step, which can be stabilized at one step or several times.
Adopt the following form of objective function The variable is in line with the previous section (as shown in Figure 7(a)), or In the case of one-step stability, the robot reaches a stable state through one foot support, and the double support constraints can be selected in the form of Figure 7(a). While for multiple stride, the CoP constraint given by equation (22) in the double support, as shown in Figure 7(b), could be selected, and the optimization method of push recovery is set manually. Other structural constraints are identical to those in the "Walking planning with adjustable step duration" section.

Results and discussions
In this section, we present simulation results using the proposed walking pattern generation method. In the first scenario, the results show that the model can recover from the larger pushes when the step time adaptive controller is used. In the second scenario, we compare the maximize anti-disturbance ability of the step duration adjustable method with that of the fixed step duration method in Diedam et al., 2 Kajita et al., 4 and Herdt et al. 13,25 Walking planning simulation In this scenario, we compare our controller with the one that uses fixed step durations. Walking simulation was carried out with Matlab (version R2017b) on a personal computer with x64 Win10 platform (Intel(R) Core(TM) i5-7200U CPU, 8G RAM). At the beginning, the robot is in a static state in the flat ground. Using LIP model, the height of CoM remains unchanged. The state space equation (3) is used, the structural parameters of the robot (see Figure 3) are shown in Table 2. The total weight of the robot is 25.5 kg and it has 12 degrees of freedom. Each leg has three degrees of freedom (pitch, roll, yam) on the hip joint, one pitch on the knee, the ankle joint has pitch and roll degrees of freedom. Each joint is equipped with an angular displacement and torque sensors to achieve position or force closed-loop. Inertial measurement unit (IMU) units are installed on floating base coordinates. Figures 8 to 11 show the simulation results of fixed step timing and adaptive step timing of the humanoid under disturbance. In the dual-support phase, the CoP constraint selection equation (20). x 0 ¼ ½0; 0:2; 0; 0; 0; 0; 0; 0; 0:10; À0:10, v xref ¼ 0:5 m/s, v yref ¼ 0:1 m/s. At the initial time of 5th step, the x-direction is added a disturbance of 0.3 m/s, the y-direction is added a disturbance of 0.1 m/s. For the case when step timing is adjustable, as shown in Figures 8 and 10, the system shortens the step timing, has a recovery motion that starts with a large recovery step, and then converges to the reference motion in 1 or 2 steps. However, for the case when the step timing cannot be adjusted, as shown in Figures 9 and 11, because of the excessive disturbance, the humanoid robots uses the maximum step distance at each step, but it is still not enough to suppress the CoM divergence, which eventually leads to the robot dumping.

Comparisons with the existing methods
In the second scenario, we compare the robustness of the proposed approach with that of Herdt et al. 25 The proposed walking pattern generation approach by Herdt et al. 25 and variations 2,4,13 of this approach are standard walking pattern generators. Herdt et al. 25 calculates CoM trajectory and real-time landing position in a fixed prediction horizon and realizes the travel of predetermined speed. Here, we apply the same parameters to both methods and calculate the maximum perturbation velocities that each method can recover from different directions (see Figure 12). The measured data are based on the system parameters in the previous section. The robot has a ground contact as shown in Figure 12(a) at the beginning of the disturbance action of the 5th step, when the humanoid is in double support. At the beginning of the walking, both methods set the step timing to 0.5 s, v xref ¼ 0:4 m/s, and the disturbance in the forward field of vision ð0; 180 Þ was applied when the humanoid reached the 5th step. That is to say, the forward is 90 direction of forward vision (positive direction of the x-axis), the 0 of forward vision is negative direction of the y-axis, and 180 is positive direction of the y-axis.
As can be observed in Figure 12, our approach with time adjustment is able to recover from much more severe pushes compared with Herdt et al. 25 When the step time is fixed, because the robot is in double support, the resistance to disturbance in the vertical direction of the two-foot line is the weakest, and that in the direction of the two-foot line is the strongest, as shown in the green line in the Figure 12(b). With the method proposed in this article, the step timing can be adjusted, and the speed increases when the x-direction is disturbed. Under the restriction of equation (25), the step duration decreases, and the robot moves frequently, which enhances the anti-disturbance ability in the x-direction. However, in the y-direction, the minimum distance between the feet is within the range from 0.2 m to 0.3 m, because cross feet is not allowed. Even if the step time is reduced to a very small amount, it still can not produce a large CoP movement in space, as shown in Figure 13. So in the y-axis direction, the feasible area is more limited than the other direction. Figure 12(b) compares the advantages of this method presented in this article in terms of spatial. Figure 12(c) shows the advantages of this method in terms of time. In the forward 0-180 field of view, the common recoverable disturbance velocity is as shown in the shaded area in Figure 12(b). The recovery time corresponding to the two methods is shown in Figure 12(c). It can be seen that the recovery time used in this method is less than Herdt et al. 25 when the two methods executed in the same disturbance speed in all directions. This is because the method reduces the stride time after the disturbance, but the number of steps used for recovery does not change. This in turn leads to a reduction in the recovery time used. Overall, the variable step duration method has an improved robustness as compared with the fixed step timing method.

Push recovery simulation
For push recovery, similar to human, the robot could take a big step to restore stability, or step by step to stability. As shown in Figure 14, the CoM has a initial velocity of 0.5 m/ s in the x-direction and a initial velocity of 0.2 m/s in the y-direction, D dx ¼ 0:5; D dy ¼ 0:3. By utilizing the scheme shown in Figure 7(a) under the disturbance, it can be seen that the CoM and CoP position appear overshoot, then gradually tend to stabilize. At the same initial condition, using the scheme shown by Figure 7(b), the CoM gradually stabilizes after many steps, as shown in Figure 15, without overshoot. Hence, it could be known that multi-step recovery is helpful to overcome the large disturbance and is more robust than one-step method.

Discussion
Through the above simulation and comparative analysis, the following advantages could be achieved with the proposed method, as compared to the other existing methods, 1. Flexibility: The proposed method provides a variety of schemes to deal with the QP problem in the dynamic balance for humanoid robots. For the same task, there are different optimization models. While for different tasks, only minor modifications are needed to use similar optimization models with strong flexibilities. 2. Robustness: The simulation results show that the variable step duration method is more robust than the MPC approach with several preview steps without timing adjustment. In our method, a nominal step duration is set to allow the next prediction horizon to deviate from it in a linear constraint after disturbances, and then solve a convex optimization problem by looking at the next step location and timing. 3. Similarity with people: The gait generated by the method, in the single support phase, the CoP moves stably forward in the forward direction of the foot, which is quite similar to human walking. A similar approach to human is also used for the perturbation process, which results in larger CoP movements in a shorter period of time, and suppresses CoM divergence, while minimizes the speed and acceleration tracking errors.
It is worth noting that, since the main focus of this article is to test the feasibility of the proposed MPC algorithm onto  the robot, we verified its robustness on flat ground only. When considering some other much more complicated forms of disturbances, like the uneven plane or walking on the slope, however, there may exist various other factors influencing on the performances of the robot. Those factors could impose a huge burden on the performance analysis of biped robots. Therefore, one aspect of our next-step work is to explore and improve the online anti-disturbance performances of the proposed algorithm under complex terrain.

Conclusion
In summary, we proposed an MPC method, which uses LIP as the motion model and the change of CoP as the input to minimize the variation of the CoP in the objective function, and thus guarantees the stable CoM and CoP trajectories, realizes the speed change and step duration change under the action of disturbance for humanoid robots. Compared with the fixed step duration method, the variable step distance and step timing method could further improve the compensation ability of the perturbations. Meanwhile, the flexible task of the robot is realized owing to the diversity of constraint settings. Moreover, the objective function can choose a different form by modifying some of the constraints that match it.
This study can be regarded as preliminary work for biped robots to enter human life. Our next-step work focuses mainly on the following aspects. First, we are to verify such proposed method on our lab-customized humanoid robot as shown in Figure 3, and then we would like to further extend our method to a three-dimensional space by considering the optimization in z-axis direction. Last but not least, the method could also be extended to uneven ground, within the slope or step environment to mimic the real human living environment, to further improve its adaptability.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.