A dual adaptive robust control for nonlinear systems with parameter and state estimation

Stabilization and learning are both imperative to high-performance feedback control of nonlinear systems. A dual adaptive robust control (DARC) scheme is proposed for nonlinear systems with model uncertainties to achieve a desired level of performance. Only the output of the nonlinear system is accessible in this work; all the states and parameters are learned online. Firstly, the DARC uses prior physical bounds of the system to design a discontinuous projection with update-rate limits, which confines the parameter and state estimates to known bounds. Robustness of the nonlinear system can then be guaranteed by the deterministic robust control (DRC) method. Secondly, a dual adaptive estimation mechanism (DAEM) is developed to learn the unknown parameters and states of the system. One part of the DAEM is the bounded gain forgetting (BGF) estimator, developed to handle inaccurate parameters and parametric variations. The other is the adaptive unscented Kalman filter (AUKF), synthesized for state estimation. The AUKF contains a statistic estimator based on the maximum a posteriori (MAP) rule to estimate the unknown covariance matrices. Finally, simulation results illustrate the effectiveness of the suggested method.


Introduction
High-performance feedback control of nonlinear systems is impeded by inaccurate dynamical models. [10,11] However, simultaneously estimating the parameters and states of nonlinear systems is still a challenging problem. [13,14] Nevertheless, these methods suffer from the drawbacks of the heavy computation of the Jacobian matrix and the limited precision of a first-order approximation. [15,16] With higher accuracy, the unscented Kalman filter (UKF) has been implemented for parameter and state estimation of nonlinear systems. [17,18] Both the joint UKF and dual UKF methods operate within the Kalman framework, which requires exact knowledge of prior statistics. In practice, statistics are hard to obtain and often inaccurate, so the estimation performance may degrade or even diverge. [19] To avoid performance degradation, many different types of adaptive unscented Kalman filter (AUKF) with statistic estimators have been applied in industrial areas. [20,21] Though the adaptive mechanism of the AUKF is effective, this method assumes that the parameters are constants disturbed by small Gaussian noises, which is usually not satisfied. The AUKF with Gaussian distribution assumptions also has a limited ability to track parameters, while the actual operating environment of nonlinear systems is complex and parameters are usually time-varying. This work integrates the bounded gain forgetting (BGF) estimator [1] and the AUKF to form a dual adaptive estimation mechanism (DAEM). The BGF estimator, which tracks parameters while avoiding unbounded adaptation gains, is designed for parameter estimation. Simultaneously, the AUKF is used for state estimation based on the nonlinear model updated by the BGF. The proposed DAEM can acquire a precise estimation of the parameters and states even in the presence of parametric variations.
[23,24] As a model learning methodology, the DAEM can improve the performance of control systems by adjusting the feedback control law. However, this brings the same weaknesses as adaptive control: poor transient performance and vulnerability to external disturbances. Therefore, a novel control policy that retains the learning ability of the DAEM while overcoming its lack of robustness must be developed. Deterministic robust control (DRC) [25,26] has been widely used in the control of nonlinear systems because of its capability to achieve not only robust stability but also robust performance. However, the design conservativeness of DRC leads to poor steady-state accuracy. Both the DAEM and the DRC method have their own benefits and limitations. An adaptive robust control (ARC) framework with learning ability that retains the robustness of the system has been developed. [22] This paper utilizes the ARC framework to construct a dual adaptive robust control (DARC) for nonlinear systems that bridges the gap between DRC and the DAEM. Unlike conventional ARC, both states and parameters are learned online. Moreover, the state estimation error is also taken into account in the stability analysis. The poor transients and non-robust performance of the DAEM are superseded by guaranteed transient and robust performance, owing to the DRC philosophy. Owing to the learning power of the DAEM, the poor steady-state performance of DRC is replaced by excellent final tracking accuracy. The proposed DARC approach thus preserves the merits and overcomes the limitations of both the DAEM and DRC. Furthermore, the ARC framework utilizes the backstepping technique to ensure the stability of the closed-loop system. However, the derivative signals in backstepping heavily increase the computational burden.
[27] For high-order nonlinear systems, the calculation of derivatives becomes prohibitive. Therefore, the proposed DARC employs the command filter technique [28] to relieve the calculation complexity. The computational complexity of DARC is reduced significantly compared with the traditional ARC algorithm.
This work contributes a new dual adaptive robust control scheme for nonlinear systems that achieves high-performance feedback control. This is achieved by: 1. A DAEM method, which integrates the BGF estimator and the AUKF, is proposed to estimate the unknown states and parameters simultaneously. 2. In conventional ARC control, [29,30] the states of the system are assumed known, or the state estimation error is ignored when proving the stability of the whole closed-loop system. This work takes both state and parameter estimation errors into consideration. The DAEM is brought into the ARC framework to form a novel DARC method. The stability of the whole system is proved by the Lyapunov method; in addition, asymptotic stability can be achieved under specific conditions. 3. The proposed DARC combines the command filter technique with the design procedure to avoid the heavy computation of the partial derivatives of the virtual control laws in the backstepping design.
The rest of the work is organized as follows. Section Problem formulation states the problem. In section Dual adaptive robust control, the DARC control law, which ensures stability while learning, and the learning mechanism DAEM are formulated. In section Simulation example, a second-order nonlinear system with uncertainties is simulated in different scenarios, and the results show the effectiveness of the DARC method. Section Conclusion concludes this work.

Problem formulation
This work considers nonlinear systems that can be transformed to the form below.
where $\bar{x}_i = [x_1, \ldots, x_i]^T \in \mathbb{R}^i$ is the vector of the first $i$ states of the system, $x = \bar{x}_n \in \mathbb{R}^n$, $Y_i(\bar{x}_i, t) \in \mathbb{R}^p$, $i = 1, \ldots, n$, are known function vectors representing the nonlinear features of the system, $a \in \mathbb{R}^p$ and $b_i$, $i = 1, \ldots, n$, are the unknown parameters of the nonlinear features, $w = [w_1, \ldots, w_n]^T \in \mathbb{R}^n$ and $v_1 \in \mathbb{R}$ are non-parametric uncertainties caused by modeling errors, disturbances, and noises, and $y_1 \in \mathbb{R}$ is the output measurement. For notational simplicity, the unknown parameters are collected into a single vector $\theta = [a^T, b_1, \ldots, b_n]^T$. The following nomenclature is used throughout this work: for a vector $\varrho$, $\hat{\varrho}$ and $\tilde{\varrho} = \hat{\varrho} - \varrho$ are the estimate and estimation error of $\varrho$, $\varrho_{\min}$ and $\varrho_{\max}$ represent the minimum and maximum values of $\varrho$, $\bar{\varrho}_i$ is the vector formed by the first $i$ elements of $\varrho$, $\varrho_i$ is the $i$th element of $\varrho$, and $\varrho_{i,\min}$ and $\varrho_{i,\max}$ represent the minimum and maximum values of $\varrho_i$. Let $x_{1d}$ be the desired trajectory; the control targets of this work are: (1) design a control law $u$ that guarantees a prescribed tracking accuracy; (2) obtain a precise estimation of the unknown parameters and states so that the desired trajectory $x_{1d}$ can be tracked exactly after the transient phase.
The following practical assumptions are made. Assumption 1. The unknown parameter vector $\theta$ lies in a known convex region $C_\theta$. $b_i$, $i = 1, \ldots, n$, are nonzero with known sign. Without loss of generality, we suppose $a_{i,\min} \le a_i \le a_{i,\max}$, $i = 1, \ldots, p$, and $0 < b_{i,\min} \le b_i \le b_{i,\max}$, $i = 1, \ldots, n$. Assumption 2. The non-parametric uncertainties of the nonlinear system $w_i$, $i = 1, \ldots, n$, and $v_1$ are bounded. To be specific, $|w_i| \le p_i(x,t)$, $i = 1, \ldots, n$, and $|v_1| \le r_1(x,t)$, where $p_i(x,t)$ and $r_1(x,t)$ are known functions. [22] Assumption 3. The reference trajectory $x_{1d}$, the nonlinear function vectors $Y_i$, $i = 1, \ldots, n$, and their derivatives are smooth and bounded. Assumption 4. The non-parametric uncertainties $w_i$, $i = 1, \ldots, n$, and $v_1$ can be viewed as truncated Gaussian distributions with zero mean and unknown covariance matrices $W$ and $V$.
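Assumption 4 models the non-parametric uncertainties as zero-mean Gaussian noise truncated to the known bounds of Assumption 2. As a minimal sketch (the standard deviation and bound values below are illustrative, not taken from the source), such noise can be generated by rejection sampling:

```python
import numpy as np

def truncated_gaussian(std, bound, size=1, rng=None):
    """Sample zero-mean Gaussian noise truncated to [-bound, bound],
    matching the bounded-uncertainty model of Assumptions 2 and 4.
    Uses simple rejection sampling; std and bound are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(size)
    for i in range(size):
        s = rng.normal(0.0, std)
        while abs(s) > bound:  # reject samples outside the known bound
            s = rng.normal(0.0, std)
        out[i] = s
    return out
```

In a simulation, such samples would play the roles of $w_i$ and $v_1$ while still satisfying the hard bounds $p_i(x,t)$ and $r_1(x,t)$ required by the robust control design.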

Dual adaptive robust control
The dual adaptive robust control policy proposed in this work is shown in Figure 1. The parameters and states are estimated by the dual adaptive learning mechanism consisting of the BGF estimator and the AUKF method. A discontinuous projection then confines the parameter and state estimates to a convex region constructed from prior knowledge of the bounds of the parametric and non-parametric uncertainties. With this projection-type adaptation law for parameter and state estimation, the stability of the whole system can be ensured by leveraging the DRC philosophy.

A projection type adaptation law
To improve the robustness of the nonlinear system while retaining its learning capability, a discontinuous projection $\mathrm{proj}_{\hat{a}}(z)$, which keeps the estimated parameter $\hat{a}$ within a known bound, [30] is defined as below, where $z \in \mathbb{R}$. A saturation function for $z$ is then defined to limit the upper bound of the estimation rate.
where $\dot{a}_{\max}$ is the preset limit of the adaptation rate for $z$. With the projection-type adaptation law and saturation function above, the following lemma can be verified.
Lemma 1. Suppose that the parameters $\hat{\theta}_i$, $i = 1, \ldots, n+p$, or states $\hat{x}_i$, $i = 1, \ldots, n$, are updated by the discontinuous projection-type adaptation law below.
With the proposed projection-type adaptation law (4), the estimate of $\theta$ always remains in a bounded region. To be specific, $a_{i,\min} \le \hat{a}_i \le a_{i,\max}$, $i = 1, \ldots, p$, $0 < b_{i,\min} \le \hat{b}_i \le b_{i,\max}$, $i = 1, \ldots, n$, and $x_{i,\min} \le \hat{x}_i \le x_{i,\max}$, $i = 1, \ldots, n$. The bounds $x_{i,\min}$ and $x_{i,\max}$ for $\hat{x}_i$ can be derived from the bounds of $\theta$, $w$, $v_1$, and the dynamic model (1). In addition, the adaptation rate is uniformly bounded by the preset limit, $|\dot{\hat{a}}| \le \dot{a}_{\max}$.
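A minimal sketch of the projection and rate-saturation pair described above. The elementwise vector form and the norm-based saturation are assumptions for illustration; the paper defines the scalar case:

```python
import numpy as np

def proj(theta_hat, z, theta_min, theta_max):
    """Discontinuous projection: pass the update z through unless it would
    push theta_hat outside [theta_min, theta_max] (elementwise)."""
    z = np.asarray(z, dtype=float).copy()
    at_max = (theta_hat >= theta_max) & (z > 0.0)
    at_min = (theta_hat <= theta_min) & (z < 0.0)
    z[at_max | at_min] = 0.0  # block updates that would leave the bound
    return z

def sat(z, rate_max):
    """Saturation limiting the magnitude of the adaptation rate,
    as in Lemma 1's bound on the update speed."""
    norm = np.linalg.norm(z)
    return z if norm <= rate_max else z * (rate_max / norm)
```

Applying `sat(proj(theta_hat, z, lo, hi), rate_max)` at each step keeps the estimates inside their prior bounds with a uniformly bounded adaptation rate, which is exactly the property Lemma 1 asserts.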

Dual adaptive robust control law
In the last subsection, a discontinuous projection that ensures the boundedness of the parameter and state estimates was designed. The parameter and state estimation algorithms for $\theta$ and $x$ are described in the next subsection. Under the discontinuous projection, the DARC control law for the nonlinear system (1), which maintains stability while learning, is constructed based on the backstepping design procedure as follows.
To avoid the complex computation of the derivative of the virtual control law in the backstepping design, the command filter technique is used. We define the intermediate virtual control function vector as below, where $\omega_{i,c}$ is the cutoff frequency of the command filter and the intermediate virtual control function vector $u_d$ is designed in the following steps.
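The idea of the command filter can be sketched as follows. The paper's filter may be of higher order; a first-order low-pass with cutoff frequency $\omega_{i,c}$ is assumed here purely to illustrate how $u_r$ and $\dot{u}_r$ are obtained without differentiating the virtual control input $u_d$:

```python
def command_filter_step(u_r, u_d, omega_c, dt):
    """One Euler step of a first-order command filter
    u_r_dot = omega_c * (u_d - u_r); returns (u_r_next, u_r_dot).
    The filter supplies both u_r and its derivative, so the analytic
    differentiation of u_d required by classical backstepping is avoided."""
    u_r_dot = omega_c * (u_d - u_r)
    return u_r + dt * u_r_dot, u_r_dot
```

Increasing `omega_c` shrinks the filtering error $u_r - u_d$, which mirrors the role of $\omega_{i,c}$ in making $\epsilon_{i,u}$ arbitrarily small in Remark 1.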
Remark 1. The filtering error $u_{i,r} - u_{i,d}(\hat{\bar{x}}_{i-1}, \hat{a}, \hat{b}, t)$, $i = 2, \ldots, n$, is bounded and converges exponentially to a neighborhood of the positive real number $\epsilon_{i,u}$, $i = 2, \ldots, n$, during the initial phase $t \in [0, T_1]$. [27] In the post-initial phase, $|u_{i,r} - u_{i,d}(\hat{\bar{x}}_{i-1}, \hat{a}, \hat{b}, t)| \le \epsilon_{i,u}$. The positive real number $\epsilon_{i,u}$ can be chosen arbitrarily small by selecting $\omega_{i,c}$, $i = 2, \ldots, n$, properly.
Step 1. Define $u_{1,r} = x_{1d}$ and $\dot{u}_{1,r} = \dot{x}_{1d}$. Consider the first formula of equation (1). In equation (6), let $x_2$ be the virtual input, and synthesize a virtual control law $u_{2,r}$ for $x_2$ such that $x_1$ tracks the desired trajectory $x_{1d}$. $u_{2,r}$ is derived from the following command filter.
where $u_{2,d}$, the input of the command filter, is designed as follows, where $\hat{z}_1$ is the tracking error estimated with $\hat{x}_1$, $h_{1,s1} > 0$ is a positive constant, and $u_{2,s2}(\hat{x}_1, t)$ can be chosen as any function that satisfies the following robust performance conditions. In equation (10), $-(Y_1^T \tilde{a} + \tilde{b}_1 u_{2,m})$ is related to the parameter uncertainties; the bound of $|u_{2,r} - u_{2,d}(\hat{x}_1, \hat{a}, \hat{b}, t)|$ can be obtained directly from Remark 1, and the boundedness of the remaining terms follows from Assumption 3 and the boundedness of the state estimate $\hat{x}_1$.
Remark 2. One design of $u_{2,s2}(\hat{x}_1, t)$ that satisfies the conditions of equation (9) is as below, where $[\,|u_{2,m}|, 0, \ldots, 0]^T \in \mathbb{R}^{p+n}$, and $E_{1,f1}$, $E_{1,f2}$, $E_{1,f3}$ are positive constants. With equations (10) and (11), the conditions can be shown to hold.
Step i. We design a virtual control law $u_{i+1,r}$ for $x_{i+1}$ such that $x_i$ tracks the virtual control law $u_{i,r}$ designed in step $i-1$. The same design procedure as in step 1 is used to construct the DARC virtual control law. Define $z_i = x_i - u_{i,r}$ and $\hat{z}_i = \hat{x}_i - u_{i,r}$. Then the error dynamics of $z_i$ follow. To avoid the complex computation of conventional adaptive backstepping, the command filter designed in step $i-1$ is used to acquire $\dot{u}_{i,r}$. Similar to step 1, the virtual control law $u_{i+1,r}$ for $x_{i+1}$ is obtained through the following command filter, whose input is designed as below, where $\hat{x}_i$ is estimated by the learning mechanism in the next subsection, $u_{i+1,s1}(\hat{\bar{x}}_i, t)$ is the linear feedback, $u_{i+1,s2}(\hat{\bar{x}}_i, t)$ is the nonlinear robust feedback, $h_{i,s1} > 0$ is a positive constant, and $u_{i+1,s2}(\hat{\bar{x}}_i, t)$ can be chosen as any function that satisfies the following robust performance conditions.
where $E_{i,t}$ is a positive constant. In equation (17), $-(Y_i(\bar{x}_i, t)^T \tilde{a} + \tilde{b}_i u_{i+1,m} + \tilde{b}_{i-1} z_{i-1})$ is related to the parametric uncertainties, and $w_i + b_i(u_{i+1,r} - u_{i+1,d})$ constitutes the non-parametric uncertainties. Then, $u_{i+1,s2}(\hat{\bar{x}}_i, t)$ can be designed as in Remark 2.
where $E_{i,f1}$, $E_{i,f2}$, $E_{i,f3}$ are positive real numbers. Step n. This is the last step. Let $u = u_{n+1,d}(\hat{\bar{x}}_n, \hat{a}, \hat{b}, t)$ be designed in the same way as $u_{i+1,d}(\hat{\bar{x}}_i, \hat{a}, \hat{b}, t)$. Theorem 1. With the DARC control law and the discontinuous projection-type adaptation law (4), all the signals of the closed-loop system are bounded. Furthermore, the tracking errors of the closed-loop system are bounded by (19), where $z = [z_1, z_2, \ldots, z_n]^T$, $E_{bn} = \sum_{i=1}^{n} E_{i,t}$, and $h_{bn} = \min(h_{1,s1}, \ldots, h_{n,s1})$. Thus, a prescribed transient performance can be guaranteed.
Proof. Define the Lyapunov function $V_{bn}$ as below. Considering equations (10) and (17), the derivative of $V_{bn}$ is given by the following, where $z_0$, $\tilde{b}_0$, and $z_{n+1}$ are defined to be zero.
Taking condition 1 of equations (9) and (16) leads to (19). In addition, the output tracking error $z_1$ can be made arbitrarily small by increasing $h_{bn}$ or decreasing $E_{bn}$. Thus, a prescribed transient performance is guaranteed. From (19), we obtain that $z$ is bounded. Considering Assumption 3 and the discontinuous projection (4), it is easy to see that $x_{1d}$, $x_1$, $\hat{\theta}$, and $\hat{x}_1$ are bounded. Therefore, the intermediate virtual control $u_{2,d}(\hat{x}_1, \hat{a}, \hat{b}, t)$ is bounded. Then the boundedness of $u_{2,r}$ and $\dot{u}_{2,r}$ is ensured, since they are derived from a stable low-pass filter with a bounded input. The boundedness of $u_{i,r}$ for $i = 1, \ldots, n$ follows recursively. Thus, all the signals of the closed-loop system are bounded.

Dual adaptive estimation mechanism
In the above subsection, we synthesized the DARC control law to obtain a guaranteed transient performance for nonlinear systems with model uncertainties. In the design procedure, only $y_1$, the measurement of state $x_1$, is assumed accessible, and both the parameter and state estimation errors are taken into consideration. This subsection introduces the learning mechanism, called the DAEM, which estimates the unknown parameters and states with the BGF estimator and the AUKF, respectively. An improved tracking performance can therefore be achieved as learning proceeds.
Bounded gain forgetting estimator. The essence of parameter estimation is to extract useful information from measured empirical data, which requires a model relating the parameters to the data. The model for parameter estimation need not be the same as the control system model. Assume for now that the uncertain nonlinearities $w_i$, $i = 1, \ldots, n$, and $v_1$ are negligible and that $x$ is accessible. A conventional model for parameter estimation is used as follows, where the output vector lies in $\mathbb{R}^n$ and $F_f \in \mathbb{R}^{(p+n) \times n}$ is the signal matrix described below.
The derivative of $x$ appears in equation (23). To avoid acquiring $\dot{x}$, both sides of (23) are filtered by a stable low-pass filter $H_{lf}$ [31] and the formula is rearranged.
where $o \in \mathbb{R}^n$ and $F \in \mathbb{R}^{(p+n) \times n}$. The parameter estimate is computed by minimizing the target function below, where $\lambda > 0$ is the forgetting factor. Solving equation (26) gives (27), where $P$ is the gain matrix, $e = F^T \hat{\theta} - o$, and $P(0)$ can be chosen as any positive definite matrix. Exponential forgetting of data is a very effective way to handle time-varying parameters. However, if the signal matrix $F$ is not persistently exciting, [32] $P^{-1}$ will diminish and the gain matrix will grow unbounded. To avoid the unboundedness of the gain matrix, a tuning technique is designed as below, where $\lambda_0 > 0$ is the maximum forgetting rate and $P_0 > 0$ is the upper bound on the norm of the gain matrix $P$. If the signal $F$ is strongly persistently exciting, $\|P\|$ will be small and $\lambda$ will be larger, yielding a faster forgetting speed and hence a strong ability to track time-varying parameters. If the signal $F$ is not persistently exciting and $\|P\|$ reaches the upper bound $P_0$, the forgetting process is suspended.
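The BGF recursion above can be sketched in code. This is a continuous-time Euler discretization for a scalar-output regression, assuming the standard least-squares-with-forgetting form $\dot{P} = \lambda P - P \varphi \varphi^T P$, $\dot{\hat{\theta}} = -P \varphi e$ with $\lambda = \lambda_0 (1 - \|P\| / P_0)$; the exact update in the paper may differ in detail:

```python
import numpy as np

def bgf_step(theta_hat, P, phi, o, lam0, P0, dt, rate_max):
    """One Euler step of a bounded-gain-forgetting estimator (sketch,
    scalar-output case). lam0 is the maximum forgetting rate and P0
    bounds ||P||: as ||P|| approaches P0, forgetting is suspended."""
    lam = lam0 * (1.0 - np.linalg.norm(P, 2) / P0)
    e = phi @ theta_hat - o                    # prediction error
    P_dot = lam * P - P @ np.outer(phi, phi) @ P
    theta_dot = -P @ phi * e
    # rate saturation, as in the projection-type adaptation law
    n = np.linalg.norm(theta_dot)
    if n > rate_max:
        theta_dot *= rate_max / n
    return theta_hat + dt * theta_dot, P + dt * P_dot
```

Under persistent excitation the estimate converges while $\|P\|$ stays below $P_0$; when excitation vanishes, $\lambda \to 0$ and the gain matrix stops growing.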
Remark 3. The gain matrix $P$ and the estimation error $\tilde{\theta}$ in the BGF estimator are upper bounded. [1] Moreover, if $F$ is persistently exciting, $\hat{\theta}$ converges to its true value exponentially. Theorem 2. Suppose the non-parametric uncertainties $w_i = 0$, $i = 1, \ldots, n$, $v_1 = 0$, and all states $x$ are directly accessible. If the regressor $F$ satisfies the persistent excitation condition, then with the control law, the discontinuous projection (4), and the BGF estimator (27), the tracking error $z_1$ asymptotically converges to zero after the initial phase described in Remark 1.
Proof. Use the same Lyapunov function $V_{bn}$ and take its derivative. Since the regressor $F$ satisfies the persistent excitation condition, we can derive that $\tilde{a} \to 0$ and $\tilde{b}_i \to 0$, $i = 1, \ldots, n$, as $t \to \infty$. Considering Remark 1 and $\hat{x} = x$, we know $|u_{i+1,r} - u_{i+1,d}| < \epsilon_{i,u}$ for any $\epsilon_{i,u}$ after the initial phase, and we suppose $\epsilon_{i,u} \le h_{i,s2} |z_i|$. Substituting condition 2 of (9) and (16) into (29) yields (30), where $h_i$ involves $b_{i,\min} - h_{i,s2}$, and $h_i > 0$ can be guaranteed by selecting $h_{i,s1}$ properly. It is easy to verify that $\dot{\tilde{a}}$, $\dot{\tilde{b}}_i$, and $\dot{z}_i$ are bounded from Lemma 1 and Theorem 1, which leads to the boundedness of $\ddot{V}_{bn}$. Since $\dot{V}_{bn} \le 0$ and $\ddot{V}_{bn}$ is bounded, we obtain $\dot{V}_{bn} \to 0$ as $t \to \infty$ by Barbalat's lemma. Therefore, the tracking error asymptotically converges to zero.
Adaptive unscented Kalman filter. State estimation is an indispensable part of control engineering. The UKF is a popular method for estimating the states of nonlinear systems from noisy measurements; it offers higher accuracy than the EKF and is easier to implement. However, the performance of the UKF suffers when precise statistical information is lacking. To improve the estimation accuracy of the UKF, an AUKF with a statistics estimator based on the MAP rule is proposed. For simplicity, the dynamic system (1) can be written in the following form.
where $f$ is the nonlinear dynamics with parameter $\theta$ and is assumed known in this subsection. The values of the variables are taken at each discrete time $kT_s$, where $T_s$ is the sampling period. For convenience, the subscript $k$ denotes the $k$th sampling value of a variable. Let us consider the UKF method first. [9] Initialization. The initial state expectation $\hat{x}(0|0) = E[x(0)]$ and covariance $Z_{xx}(0|0) = E[(x(0) - \hat{x}(0|0))(x(0) - \hat{x}(0|0))^T]$ are supposed to be known.
Suppose $\hat{x}(k|k)$ and $Z_{xx}(k|k)$ have already been calculated. The sigma points are generated as below, where the matrix square root is the Cholesky factor of $(n + g) Z_{xx}(k|k)$, and $g = \zeta^2 (n + \kappa) - n$ is a scaling factor. The parameter $\zeta$, $0 < \zeta < 1$, determines the spread of the sigma points and is usually set to a small number. $\kappa \ge 0$ is another parameter, which guarantees the positive semi-definiteness of the covariance matrix. Then, each of the generated sigma points is propagated through the nonlinear function $f$.
The prior estimate is given as below, where the weights $w_i^m$ and $w_i^z$ are defined with a parameter $m > 0$ related to prior knowledge of the distribution of $x$; $m = 2$ is optimal for a Gaussian distribution. [15] Then, each sigma point of $x(k+1|k)$ is propagated through the measurement function $h$.
The prior estimate of $y$ and the covariance matrices $Z_{yy}(k+1|k)$ and $Z_{xy}(k+1|k)$ can then be calculated. Post estimation. With the prior estimates, the Kalman gain is acquired as below. Finally, the posterior estimates of $x$ and the covariance matrix are obtained.
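The sigma-point generation and unscented transform described above can be sketched as follows. Since the paper's weight formulas are not reproduced here, the common scaled unscented transform (with the Gaussian-optimal second-moment correction of 2) is assumed:

```python
import numpy as np

def sigma_points(x, Pxx, zeta=1e-3, kappa=0.0):
    """Generate 2n+1 sigma points and weights for mean (wm) and
    covariance (wc). g = zeta^2 (n + kappa) - n is the scaling factor;
    the standard scaled-UT weights are assumed."""
    n = x.size
    g = zeta**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + g) * Pxx)      # Cholesky factor
    pts = np.vstack([x, x + S.T, x - S.T])     # shape (2n+1, n)
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + g)))
    wc = wm.copy()
    wm[0] = g / (n + g)
    wc[0] = g / (n + g) + (1 - zeta**2 + 2.0)  # 2 is optimal for Gaussians
    return pts, wm, wc

def unscented_transform(pts, wm, wc, f):
    """Propagate sigma points through f and recover mean and covariance."""
    Y = np.array([f(p) for p in pts])
    mean = wm @ Y
    d = Y - mean
    cov = (wc[:, None] * d).T @ d
    return mean, cov
```

The same transform is applied twice per UKF cycle: once through the dynamics $f$ for the prior state statistics, and once through the measurement function $h$ for $Z_{yy}$ and $Z_{xy}$.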
Since the UKF is a modified Kalman filter, prior knowledge such as $\hat{x}(0|0)$, $Z_{xx}(0|0)$, $W$, and $V$ should be known precisely. The influence of the initial conditions $\hat{x}(0|0)$ and $Z_{xx}(0|0)$ decreases as more data are processed, but the process covariance $W$ and measurement covariance $V$ affect the precision of the UKF throughout the estimation. However, accurate values of $W$ and $V$ are usually hard to acquire. Therefore, the AUKF with a statistic estimator for the process and measurement covariances is proposed.
Suppose the values of $W$ and $V$ are unknown. The MAP estimator aims at finding the $\hat{W}$, $\hat{V}$, and $\hat{x}(i|l)$, $i = 1, \ldots, l$, that maximize the objective function $J_a$ below. Equivalently, the estimator can be calculated by maximizing the new objective function $J_b$, whose prior term involves $\|x(0) - \hat{x}(0|0)\|^2_{Z_{xx}(0|0)^{-1}}$ and the prior density $p[W, V]$, with the notation $\|v\|_M^2 = v^T M v$. For simplicity, $W$ and $V$ are treated as Gaussian-distributed above.
Note that the extreme points of $J_b$ coincide with those of $\ln(J_b)$; hence we take the derivative of $\ln(J_b)$.
where $\hat{f}(\hat{x}(i-1|i-1))$, the posterior mean of $\hat{x}(i-1|i-1)$ transformed by the nonlinear function $f$, can be calculated by the unscented transform; the measurement counterpart follows similarly. For ease of implementation, equation (44) can be written in recursive form.
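The recursive form of the covariance estimator can be illustrated with a simple running-average update of the filter residuals. This is a stand-in sketch, not the paper's exact MAP recursion: the residual outer product $r r^T$ plays the role of the innovation statistics, and the optional fading-memory weight is an assumption:

```python
import numpy as np

def recursive_cov_update(V_hat, residual, k, forgetting=None):
    """Recursive sample-covariance update standing in for the MAP
    statistic estimator: V_hat(k) = ((k-1) V_hat(k-1) + r r^T) / k.
    With `forgetting` in (0, 1), a fading-memory weight d_k emphasizes
    recent residuals instead (an assumption for time-varying noise)."""
    rrT = np.outer(residual, residual)
    if forgetting is None:
        return ((k - 1) * V_hat + rrT) / k
    d = (1 - forgetting) / (1 - forgetting**k)  # fading-memory weight
    return (1 - d) * V_hat + d * rrT
```

Feeding the update with UKF residuals at every step lets the noise covariance estimate track the true statistics without storing the whole residual history, which is the point of the recursive form of equation (44).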

Simulation example
Simulations of the following second-order nonlinear system with model uncertainties are carried out to evaluate the performance of the proposed DARC. The dynamic model of the system, which is often used to represent a servo motor, [34] is described below, where $x_1$ and $x_2$ represent the unknown states, $a_1$, $a_2$, $a_3$, and $b_1$ are the unknown parameters, $w_1$ and $v_1$ are the uncertain nonlinearities, set to truncated Gaussian noise, and $S(\dot{x}_1) = \tanh(1000 \dot{x}_1)$. $a_1$, $a_2$, $a_3$, $b_1$, $w_1$, and $v_1$ are confined to a known region as below.
The desired trajectory is $x_{1d} = 0.2 \sin(0.5 \pi t)$. To verify the effectiveness of the proposed method, indirect adaptive control with a nonlinear tracking differentiator (IARC-NTD) [31,35] is used as the comparative method. The initial values and true values of the parameters in the simulation example are the same for both methods, as shown in Table 1. All the design parameters have been tuned carefully for both methods. Simulations of three different scenarios have been carried out. In case 1, the unknown parameter $a_3$ is a constant. To observe the response of the proposed method to parameter variations, the unknown parameter $a_3$ is made to change suddenly at 8 s and 16 s in case 2, and to change with the slowly time-varying signal $1 + 0.05 \sin(0.0625 \pi t)$ in case 3. Several error indexes are used to evaluate the results, where $z_1 = x_1 - x_{1d}$, $E_2 = \hat{x}_2 - x_2$, and $T_f$ is the total simulation running time.
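The error indexes used in Table 2 are not spelled out above; common choices for tracking and estimation errors such as $z_1$ and $E_2$ are the maximum absolute error, the average absolute error, and the RMS error over the run. A minimal sketch under that assumption:

```python
import numpy as np

def error_indexes(e):
    """Common performance indexes over an error trace e sampled along
    the run of duration T_f: maximum absolute error, average absolute
    error, and RMS error. The exact indexes of Table 2 are assumed."""
    e = np.asarray(e, dtype=float)
    return {
        "max": np.max(np.abs(e)),
        "avg": np.mean(np.abs(e)),
        "rms": np.sqrt(np.mean(e**2)),
    }
```

Applying the same indexes to $z_1$ and $E_2$ for both DARC and IARC-NTD gives directly comparable rows for a table of results.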
Figure 2 and Table 2 show that DARC achieves a higher tracking accuracy. Figure 3 and Table 2 indicate that our method is more accurate and reliable for the estimation of $x_2$, especially during the transient phase. As can be seen in Figure 4, all the parameter estimates converge quickly to their true values. Figure 5 and Table 2 show that DARC maintains a higher tracking accuracy even when sudden changes of $a_3$ occur; the superiority is most remarkable at the moments the sudden variations of $a_3$ happen. Figure 6 and Table 2 indicate that DARC obtains a higher state estimation accuracy in case 2. Both Figures 5 and 6 indicate that DARC is more robust to unpredictable parameter changes. Figure 7 shows that when sudden changes of $a_3$ occur, the parameter estimates converge to their true values after a short learning period. Figure 8 and Table 2 show that the tracking performance of DARC is better when the parameter $a_3$ varies sinusoidally. Figure 9 and Table 2 indicate that DARC achieves a better state estimation accuracy for $x_2$ even when $a_3$ changes continuously. Figure 10 shows that the parameter estimation accuracy of both methods degrades slightly compared with case 1. In general, the DARC algorithm shows strong robustness in the presence of unpredictable disturbances and time-varying parameters, and high tracking accuracy is achieved in the different scenarios owing to the precise estimation of the parameters and unknown states.

Figure 1. The schematic of the proposed dual adaptive robust control policy.

Figure 2. Response and tracking error of case 1: (a) system response, (b) control input, and (c) tracking error.

Figure 3. The estimation error of state $x_2$ in case 1.

Figure 4. The estimation of parameters in case 1: (a) estimation of $b_1$, (b) estimation of $a_1$, (c) estimation of $a_2$, and (d) estimation of $a_3$.

Figure 5. Response and tracking error of case 2: (a) system response, (b) control input, and (c) tracking error.

Figure 6. The estimation error of state $x_2$ in case 2.

Figure 7. The estimation of parameters in case 2: (a) estimation of $b_1$, (b) estimation of $a_1$, (c) estimation of $a_2$, and (d) estimation of $a_3$.

Figure 9. The estimation error of state $x_2$ in case 3.

Table 1. Initial and true values of the parameters.

Table 2. Performance of the comparative experiments.