Efficient neural learning control of nonlinear dynamics with applications

Control of nonlinear dynamics is attracting increasing attention because many practical systems exhibit such characteristics. To deal with system uncertainty, this paper proposes an efficient neural-network learning control for nonlinear strict-feedback systems. The overall scheme follows the back-stepping design, while a novel learning law is proposed for updating the neural-network weights. A robust term is added to handle the approximation error. The stability of the closed-loop dynamics is analysed, and the effectiveness of the design is verified through flight simulation.


Introduction
Nonlinear dynamics exist in many practical systems such as robots, 1 manipulators, 2 flight vehicles, 3 quadrotors, 4 and MEMS (microelectromechanical system) gyroscopes. 5 Control of nonlinear dynamics 6-8 is challenging because the design must follow the system structure while the nonlinearity is difficult to handle. Nonlinear designs are typically based on Lyapunov theory, which drives an energy-like function to decrease. One important concern is how to deal with unknown nonlinear functions: some designs rely on knowledge of an upper bound, while others are based on a linearized parametric model. Another concern is the form of the dynamics; for example, the controllability canonical form differs from the strict-feedback and pure-feedback forms. For dynamics in controllability canonical form, the design can be built on an error surface followed by a robust design. Control of strict-feedback systems is well studied using back-stepping and dynamic surface design. Although such designs may be efficient, handling the unknown dynamics remains difficult, since there may not be enough information to construct a linearized parametric model.
Intelligent control 9-11 provides a learning-based structure and makes the design more convenient. Many such works exist, including approximate control, reinforcement learning control, adaptive dynamic programming control, 9 fault-tolerant control, 12 and so on. Although the motivations differ, the common idea is that the neural network (NN) serves as a bridge between the known and the unknown. Some works further incorporate disturbance observers, 13,14 sliding mode design, and H∞ design. 15 For systems with unknown nonlinear functions, the NN can be used for approximation. Typically, two kinds of designs are widely used: one approximates the ideal control input, while the other approximates the nonlinear functions directly. Such designs can be found in the literature, 16-18 with applications in robot, ship, flight, and mechanical systems. The more precisely the nonlinear function is approximated, the better the tracking performance can be. 19 However, most neural control focuses on closed-loop stability and uses only the tracking error to tune the NN weights. The system can thus be made stable, but a careful check shows that the approximation may remain far from the true nonlinear function. Recently, some works have improved the learning through composite learning, 20,21 in which the theoretical analysis is rigorously presented and a prediction error is constructed from the dynamics and the approximation. In practice, more effective designs are possible if additional system information, such as the derivative of the system state, can be obtained.
As discussed above, many works on intelligent control exist, but most designs focus on function approximation, after which the controller is constructed to guarantee system stability. During this process, the attention is on closed-loop stability with the tracking error tuning the weights, and the adaptation capability of the intelligent system is not sufficiently considered. Thus, in this paper, the approximation performance is addressed directly: a new prediction error is defined using the derivative of the system state, an efficient learning update law is constructed, and the closed-loop stability is analysed.
The structure of the paper is as follows. Section 'Model dynamics and problem formulation' presents the nonlinear strict-feedback dynamics. Sections 'Efficient learning control' and 'Stability analysis' present the learning control and the closed-loop stability analysis, respectively. The verification is presented in section 'Simulation'. Section 'Conclusions and future works' concludes the paper and discusses future work.

Model dynamics and problem formulation
In this paper, the following dynamics in strict-feedback form is considered:

$$\dot{\xi}_i = f_i(\bar{\xi}_i) + g_i(\bar{\xi}_i)\,\xi_{i+1}, \quad i = 1, \ldots, n-1$$
$$\dot{\xi}_n = f_n(\bar{\xi}_n) + g_n(\bar{\xi}_n)\,u$$
$$y = \xi_1 \quad (1)$$

where $f_i$ are the nonlinear system functions, $g_i$ are the control gain functions, $\bar{\xi}_i = [\xi_1, \ldots, \xi_i]^T$ are the system states, $y$ is the output, and $u$ is the control input.

Assumption 1. The system states $\xi_i$ and their derivatives $\dot{\xi}_i$ can be obtained.
The functions $f_i$ are unknown, while $g_i$ are known. The control goal is to design an efficient learning algorithm for the back-stepping control so that the output $y$ tracks the reference signal $y_r$.
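To make the strict-feedback structure concrete, the following sketch simulates a second-order instance by forward Euler integration. The particular functions $f_1$, $f_2$, the unit gains, and the simple stabilizing feedback are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal second-order strict-feedback example:
#   x1' = f1(x1) + g1*x2,   x2' = f2(x1, x2) + g2*u,   y = x1
# f1, f2, the gains, and the feedback law are illustrative only.
def f1(x1):
    return 0.5 * np.sin(x1)

def f2(x1, x2):
    return -0.2 * x1 * x2

g1, g2 = 1.0, 1.0

def simulate(u_func, x0=(0.1, 0.0), ts=0.01, steps=1000):
    x1, x2 = x0
    ys = []
    for _ in range(steps):
        u = u_func(x1, x2)
        dx1 = f1(x1) + g1 * x2          # first strict-feedback equation
        dx2 = f2(x1, x2) + g2 * u       # second equation, driven by u
        x1, x2 = x1 + ts * dx1, x2 + ts * dx2
        ys.append(x1)
    return np.array(ys)

# exercise the model with a simple stabilizing state feedback
y = simulate(lambda x1, x2: -2.0 * x1 - 1.5 * x2)
```

With this feedback the output decays toward the origin, which only serves to show the model form; the learning controller of the following section replaces the hand-tuned feedback.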

Efficient learning control
For the strict-feedback design, the back-stepping scheme is of great interest since it breaks the complex dynamics into several simple subsystems. The main difficulty is the so-called 'explosion of complexity'. Several designs, such as dynamic surface control and command-filtered back-stepping, can be introduced for simplification. In this paper, the derivative of the virtual control signal is obtained using the backward difference

$$\dot{\xi}_i^d \approx \frac{\xi_i^d(k) - \xi_i^d(k-1)}{t_s}$$

where $\xi_i^d$ is the virtual control signal and $t_s$ is the sample period.
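The backward-difference approximation above can be sketched as follows; the test signal and sample period are illustrative, chosen so the estimate can be checked against a known analytic derivative.

```python
import numpy as np

# Backward-difference estimate of the virtual-control derivative:
#   xd_dot(k) ~= (xd(k) - xd(k-1)) / ts
ts = 0.001
t = np.arange(0.0, 1.0, ts)
xd = np.sin(2.0 * t)                 # stand-in virtual control signal
xd_dot = np.zeros_like(xd)           # no history at the first sample
xd_dot[1:] = np.diff(xd) / ts        # backward difference

# worst-case deviation from the analytic derivative 2*cos(2t)
err = float(np.max(np.abs(xd_dot[1:] - 2.0 * np.cos(2.0 * t[1:]))))
```

For a smooth signal the first-order error scales with $t_s$, so a small sample period keeps the estimate close to the true derivative.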
Step 1. For the first equation of dynamics (1), using a NN to approximate $f_1(\xi_1)$ gives

$$\dot{\xi}_1 = f_1(\xi_1) + g_1\xi_2 = \omega_1^{*T}\theta_1(\xi_1) + \varepsilon_1 + g_1\xi_2 \quad (2)$$

where $\omega_1^*$ is the unknown ideal weight vector and $\varepsilon_1$ is the approximation error satisfying $|\varepsilon_1| \le \varepsilon_1^m$, with $\varepsilon_1^m$ the upper bound of $\varepsilon_1$. Define the tracking error

$$e_1 = \xi_1 - \xi_1^d \quad (3)$$

where $\xi_1^d = y_r$. Design the virtual control $\xi_2^d$ as

$$\xi_2^d = \frac{1}{g_1}\left(-\hat{\omega}_1^T\theta_1(\xi_1) - k_1 e_1 + \dot{\xi}_1^d - \varepsilon_1^m\,\mathrm{sign}(e_1)\right) \quad (4)$$

where $\hat{\omega}_1$ is the NN weight estimate and $k_1 > 0$ is a design constant. Define $e_2 = \xi_2 - \xi_2^d$. Then the derivative of $e_1$ is obtained as

$$\dot{e}_1 = \tilde{\omega}_1^T\theta_1(\xi_1) + \varepsilon_1 - k_1 e_1 + g_1 e_2 - \varepsilon_1^m\,\mathrm{sign}(e_1) \quad (5)$$

where $\tilde{\omega}_1 = \omega_1^* - \hat{\omega}_1$. Define the prediction error as

$$z_1 = \dot{\xi}_1 - \hat{\dot{\xi}}_1 \quad (6)$$

Since $f_1$ is unknown, $\dot{\xi}_1$ is unknown; here, since the dynamics does not contain noise, this information is obtained from $\dot{\xi}_1 \approx (\xi_1(k+1) - \xi_1(k))/t_s$, while the predicted derivative is calculated as

$$\hat{\dot{\xi}}_1 = \hat{\omega}_1^T\theta_1(\xi_1) + g_1\xi_2 \quad (7)$$

Then the following equality can be obtained

$$z_1 = \tilde{\omega}_1^T\theta_1(\xi_1) + \varepsilon_1 \quad (8)$$

The NN weight update is given as

$$\dot{\hat{\omega}}_1 = \gamma_1\left(e_1 + \gamma_{z1} z_1\right)\theta_1(\xi_1) \quad (9)$$

where $\gamma_1$ and $\gamma_{z1}$ are positive design constants.
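One iteration of the step-1 computation might look as follows. The Gaussian RBF regressor, all gains, and the sampled states are illustrative assumptions for demonstration; the update uses the composite tracking-plus-prediction-error form, discretized by Euler's method.

```python
import numpy as np

# Illustrative step-1 iteration: RBF regressor, virtual control,
# prediction error, and composite weight update. All numeric values,
# the basis, and the gains are assumptions, not from the paper.
centers = np.linspace(-2.0, 2.0, 7)

def theta(x):
    # Gaussian RBF feature vector
    return np.exp(-(x - centers) ** 2 / 0.5)

g1, k1, eps_m = 1.0, 2.0, 0.01         # gain, design constant, error bound
gamma1, gamma_z1, ts = 5.0, 1.0, 0.001 # learning rates, sample period

x1, x2 = 0.3, -0.1                     # current states
x1_prev = 0.299                        # previous sample of x1
x1d, x1d_dot = 0.0, 0.0                # reference and its derivative
w1 = np.zeros_like(centers)            # NN weight estimate

e1 = x1 - x1d                          # tracking error
# virtual control: (1/g1)(-w1^T theta - k1*e1 + x1d_dot - eps_m*sign(e1))
x2d = (-w1 @ theta(x1) - k1 * e1 + x1d_dot - eps_m * np.sign(e1)) / g1
# prediction error: differenced derivative minus predicted derivative
x1_dot = (x1 - x1_prev) / ts
z1 = x1_dot - (w1 @ theta(x1) + g1 * x2)
# composite update, driven by both tracking and prediction errors
w1 = w1 + ts * gamma1 * (e1 + gamma_z1 * z1) * theta(x1)
```

Both $e_1$ and $z_1$ are positive at this sample, so every weight moves in the direction of the (everywhere-positive) RBF features, illustrating how the prediction error accelerates the adaptation beyond what the tracking error alone provides.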
Step i. For the $i$th equation of dynamics (1), using a NN to approximate $f_i(\bar{\xi}_i)$ gives

$$\dot{\xi}_i = f_i(\bar{\xi}_i) + g_i\xi_{i+1} = \omega_i^{*T}\theta_i(\bar{\xi}_i) + \varepsilon_i + g_i\xi_{i+1} \quad (10)$$

where $\omega_i^*$ is unknown and $\varepsilon_i$ is the approximation error with $|\varepsilon_i| \le \varepsilon_i^m$, where $\varepsilon_i^m$ is the upper bound of $\varepsilon_i$.

Design the virtual control $\xi_{i+1}^d$ as

$$\xi_{i+1}^d = \frac{1}{g_i}\left(-\hat{\omega}_i^T\theta_i(\bar{\xi}_i) - k_i e_i - g_{i-1}e_{i-1} + \dot{\xi}_i^d - \varepsilon_i^m\,\mathrm{sign}(e_i)\right) \quad (11)$$

where $\hat{\omega}_i$ is the NN weight estimate, $k_i > 0$ is a design constant, and $\dot{\xi}_i^d \approx (\xi_i^d(k) - \xi_i^d(k-1))/t_s$. Defining $e_{i+1} = \xi_{i+1} - \xi_{i+1}^d$, the derivative of $e_i$ is obtained as

$$\dot{e}_i = \tilde{\omega}_i^T\theta_i(\bar{\xi}_i) + \varepsilon_i - k_i e_i - g_{i-1}e_{i-1} + g_i e_{i+1} - \varepsilon_i^m\,\mathrm{sign}(e_i) \quad (12)$$

where $\tilde{\omega}_i = \omega_i^* - \hat{\omega}_i$. The prediction error is constructed as

$$z_i = \dot{\xi}_i - \hat{\dot{\xi}}_i \quad (13)$$

Since $f_i$ is unknown, $\dot{\xi}_i$ is unknown; since the dynamics does not contain noise, this information is obtained from $\dot{\xi}_i \approx (\xi_i(k+1) - \xi_i(k))/t_s$, while the predicted derivative is calculated as

$$\hat{\dot{\xi}}_i = \hat{\omega}_i^T\theta_i(\bar{\xi}_i) + g_i\xi_{i+1} \quad (14)$$

Then the following equality can be obtained

$$z_i = \tilde{\omega}_i^T\theta_i(\bar{\xi}_i) + \varepsilon_i \quad (15)$$

The NN weight update is given as

$$\dot{\hat{\omega}}_i = \gamma_i\left(e_i + \gamma_{zi} z_i\right)\theta_i(\bar{\xi}_i) \quad (16)$$

where $\gamma_i$ and $\gamma_{zi}$ are positive design constants.
Step n. For the $n$th equation of dynamics (1), using a NN to approximate $f_n(\bar{\xi}_n)$ gives

$$\dot{\xi}_n = f_n(\bar{\xi}_n) + g_n u = \omega_n^{*T}\theta_n(\bar{\xi}_n) + \varepsilon_n + g_n u \quad (17)$$

where $\omega_n^*$ is unknown and $\varepsilon_n$ is the approximation error with $|\varepsilon_n| \le \varepsilon_n^m$, where $\varepsilon_n^m$ is the upper bound of $\varepsilon_n$. The final control signal $u$ is designed as

$$u = \frac{1}{g_n}\left(-\hat{\omega}_n^T\theta_n(\bar{\xi}_n) - k_n e_n - g_{n-1}e_{n-1} + \dot{\xi}_n^d - \varepsilon_n^m\,\mathrm{sign}(e_n)\right) \quad (18)$$

where $\hat{\omega}_n$ is the NN weight estimate, $k_n > 0$ is a design constant, and $\dot{\xi}_n^d \approx (\xi_n^d(k) - \xi_n^d(k-1))/t_s$. The derivative of $e_n$ is obtained as

$$\dot{e}_n = \dot{\xi}_n - \dot{\xi}_n^d = \tilde{\omega}_n^T\theta_n(\bar{\xi}_n) + \varepsilon_n - k_n e_n - g_{n-1}e_{n-1} - \varepsilon_n^m\,\mathrm{sign}(e_n) \quad (19)$$

where $\tilde{\omega}_n = \omega_n^* - \hat{\omega}_n$. The prediction error is constructed as

$$z_n = \dot{\xi}_n - \hat{\dot{\xi}}_n \quad (20)$$

Since $f_n$ is unknown, $\dot{\xi}_n$ is unknown; since the dynamics does not contain noise, this information is calculated using $\dot{\xi}_n \approx (\xi_n(k+1) - \xi_n(k))/t_s$. The predicted derivative $\hat{\dot{\xi}}_n$ is calculated as

$$\hat{\dot{\xi}}_n = \hat{\omega}_n^T\theta_n(\bar{\xi}_n) + g_n u \quad (21)$$

Then the following equality can be obtained

$$z_n = \tilde{\omega}_n^T\theta_n(\bar{\xi}_n) + \varepsilon_n \quad (22)$$

The NN weight update is given as

$$\dot{\hat{\omega}}_n = \gamma_n\left(e_n + \gamma_{zn} z_n\right)\theta_n(\bar{\xi}_n) \quad (23)$$

where $\gamma_n$ and $\gamma_{zn}$ are positive design constants.
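The final-step control law of equation (18) is a simple algebraic expression once the weight estimate and errors are available; the sketch below evaluates it for illustrative numbers (the weight vector, regressor values, gains, and errors are all assumptions):

```python
import numpy as np

# Final-step control law, in the same form as equation (18):
#   u = (-w_n^T theta_n - k_n*e_n - g_{n-1}*e_{n-1}
#        + xdn_dot - eps_m*sign(e_n)) / g_n
# All numeric values below are illustrative.
def control(w_n, theta_n, e_n, e_nm1, xdn_dot,
            k_n=2.0, g_nm1=1.0, g_n=1.0, eps_m=0.01):
    return (-w_n @ theta_n - k_n * e_n - g_nm1 * e_nm1
            + xdn_dot - eps_m * np.sign(e_n)) / g_n

w_n = np.array([0.1, -0.2, 0.05])      # current NN weight estimate
theta_n = np.array([0.5, 0.3, 0.8])    # regressor evaluated at the state
u = control(w_n, theta_n, e_n=0.2, e_nm1=0.1, xdn_dot=0.0)
# w_n @ theta_n = 0.03, so u = (-0.03 - 0.4 - 0.1 + 0.0 - 0.01) / 1.0 = -0.54
```

The `-g_nm1 * e_nm1` term is the back-stepping cross-term that cancels the coupling introduced at step $n-1$, and the `sign` term is the robust item bounding the residual approximation error.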
Stability analysis

Theorem 1. Consider the strict-feedback dynamics (1) under Assumption 1, with the virtual controls, the control law (18), and the NN weight update laws designed above. Then all closed-loop signals are bounded and the output tracks the reference signal.

Proof. The Lyapunov function is selected as

$$L_v = \sum_{i=1}^{n}\frac{1}{2}e_i^2 + \sum_{i=1}^{n}\frac{1}{2\gamma_i}\tilde{\omega}_i^T\tilde{\omega}_i \quad (24)$$

The derivative of $L_v$ is derived as

$$\dot{L}_v = \sum_{i=1}^{n-1} e_i\left(\tilde{\omega}_i^T\theta_i + \varepsilon_i - \varepsilon_i^m\,\mathrm{sign}(e_i) - k_i e_i - g_{i-1}e_{i-1} + g_i e_{i+1}\right) + e_n\left(\tilde{\omega}_n^T\theta_n + \varepsilon_n - \varepsilon_n^m\,\mathrm{sign}(e_n) - k_n e_n - g_{n-1}e_{n-1}\right) - \sum_{i=1}^{n}\tilde{\omega}_i^T\left(e_i + \gamma_{zi}z_i\right)\theta_i \quad (25)$$

with $g_0 e_0 = 0$. The cross terms $g_i e_i e_{i+1}$ cancel between adjacent steps, and $e_i\tilde{\omega}_i^T\theta_i$ cancels with $-\tilde{\omega}_i^T e_i\theta_i$. Using $\tilde{\omega}_i^T\theta_i = z_i - \varepsilon_i$, it is then calculated as

$$\dot{L}_v = \sum_{i=1}^{n}\left(-k_i e_i^2 + e_i\left(\varepsilon_i - \varepsilon_i^m\,\mathrm{sign}(e_i)\right) - \left(z_i - \varepsilon_i\right)\gamma_{zi}z_i\right)$$

Since $e_i(\varepsilon_i - \varepsilon_i^m\,\mathrm{sign}(e_i)) \le 0$ and, by Young's inequality, $\gamma_{zi}\varepsilon_i z_i \le \mu\gamma_{zi}^2(\varepsilon_i^m)^2 + z_i^2/(4\mu)$ for any scalar $\mu > 0$, the inequality is further obtained as

$$\dot{L}_v \le -\sum_{i=1}^{n} k_i e_i^2 - k_g\sum_{i=1}^{n} z_i^2 + P$$

where $k_g = \min_i \gamma_{zi} - \frac{1}{4\mu}$ and $P = \sum_{i=1}^{n}\mu\gamma_{zi}^2(\varepsilon_i^m)^2$. Choosing $\mu$ such that $k_g > 0$, all the signals in equation (24) are bounded. It is then concluded that, under the proposed method, the system tracks the reference well. This completes the proof.

Simulation
The flight dynamics 22 are presented with angle of attack α, flight path angle (FPA) γ, and pitch rate q. The control input is the elevator deflection δe. Define X = [ξ1, ξ2, ξ3]^T with ξ1 = γ, ξ2 = θp, and ξ3 = q, where θp = α + γ is the pitch angle. The attitude dynamics can then be written in the general strict-feedback form (1).

The design that uses only the tracking error to update the neural weights is denoted 'Method 1', while the design in this paper, the predictor-based update, is named the 'Proposed Method'. To compare performance, the index J_i = ∫|e_i| dt, i = 1, 2, 3, is selected. The parameters are selected as in Zhang et al., 23 while γ_i, i = 1, 2, 3, are selected as 1.5. Define f̂_i = ω̂_i^T θ_i, i = 1, 2, 3. In the simulation, the altitude climbs from 86,000 to 87,000 ft in 50 s and then decreases from 87,000 to 85,000 ft in the next 50 s. Given the altitude reference signal, the flight path angle reference is generated in a similar way as in Zhang et al. 23 In Example 1 there is no noise, while in Example 2 noise is added to α and q.

Example 1. The proposed method achieves higher precision for system-state tracking, as shown in Figures 1 and 2, and the control input responds more smoothly, as shown in Figure 3. The tracking performance is demonstrated in Figure 4, the neural approximation is depicted in Figure 5, and the trajectories of the NN weights are shown in Figure 6. Overall, the proposed method achieves better convergence and higher tracking accuracy.

Example 2. Random noises with amplitudes 0.0001 and 0.001 are added to α and q, respectively. The system response is demonstrated in Figure 7, the elevator deflection is depicted in Figure 8, and the NN response is shown in Figure 9. It is interesting to see that the proposed method achieves much better performance in the presence of measurement noise, although the elevator deflection and the NN weights exhibit more chattering in this case.
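The performance index J_i = ∫|e_i| dt is straightforward to evaluate from a logged error trace; the sketch below uses the trapezoidal rule on a synthetic exponential trace (an assumption, chosen only so the result can be checked against the analytic integral).

```python
import numpy as np

# Performance index J = integral of |e| dt via the trapezoidal rule.
# The exponential error trace is synthetic, for checking purposes only.
ts = 0.01
t = np.arange(0.0, 5.0, ts)
e = 0.5 * np.exp(-t)                      # stand-in tracking-error trace
abs_e = np.abs(e)
J = float(np.sum(abs_e[1:] + abs_e[:-1]) * ts / 2.0)
# analytic value over [0, 4.99]: 0.5*(1 - exp(-4.99)) ~= 0.4966
```

A smaller J indicates better cumulative tracking, which is how 'Method 1' and the 'Proposed Method' are compared in the simulation.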

Conclusions and future works
An efficient learning-based control has been designed for strict-feedback systems. The design constructs a prediction-error signal for the neural-weight update. The closed-loop stability is analysed, and the control performance is verified through nonlinear flight simulation.
For future work, the output-feedback design can be studied. In reality, time-varying disturbances exist in the dynamics, and new estimation designs can be analysed. For practical applications, the method can be applied to manipulators, underwater vehicles, quadrotors, and automobile dynamics for experimental evaluation.