International Journal of Intelligent Robotics and Applications (2021) 5:89–100
https://doi.org/10.1007/s41315-020-00159-8

 REGULAR PAPER

Adaptive dynamic programming enhanced admittance control
for robots with environment interaction and actuator saturation
Hong Zhan1 · Dianye Huang2 · Chenguang Yang3

Received: 29 November 2019 / Accepted: 6 December 2020 / Published online: 2 February 2021
© The Author(s) 2021

Abstract
This paper focuses on the optimal tracking control problem for robot systems with environment interaction and actuator
saturation. A control scheme that combines admittance adaptation with adaptive dynamic programming (ADP) is developed.
The unknown environment is modelled as a linear system, and an admittance controller is derived to achieve compliant
behaviour of the robot. In the ADP framework, the cost function is defined in a non-quadratic form and the critic network
is designed with a radial basis function neural network, which is introduced to obtain an approximate optimal control from the
Hamilton–Jacobi–Bellman equation and thereby guarantees optimal trajectory tracking. The system stability is analysed by the
Lyapunov theorem, and simulations demonstrate the effectiveness of the proposed strategy.

Keywords Adaptive dynamic programming · Admittance control · Robot-environment interaction · Actuator saturation ·
Optimal control · Neural network

* Chenguang Yang
  cyang@ieee.org

1 Key Lab. of Autonomous Systems and Networked Control, Ministry of Education, South China University of Technology, Guangzhou 510640, China
2 Department of Informatics, Technical University of Munich, 85748 München, Germany
3 Bristol Robotics Laboratory, University of the West of England, Bristol BS16 1QY, UK

1 Introduction

In recent decades, robots have been widely applied in industrial automation, for example as assembling robots, handling robots and welding robots. They can not only cooperate with human partners on certain tasks, but can also complete some tasks independently, or even replace human beings in hazardous environments with high temperature, pressure and radiation. However, in some practical applications, robots will unavoidably interact with the external environment, which will not only affect the execution of the work, but also directly threaten the safety of human partners and of the robots themselves. Consequently, interaction control between the robot and the environment has become an important research topic.

It is noted that two main approaches are applied in current robotics research to ensure compliant behaviour, i.e., hybrid position/force control proposed by Raibert and Craig (1981) and impedance control proposed by Hogan (1981). The former requires decomposition into position and force subspaces and control-law switching during implementation. Since the dynamic coupling between the robot and the external environment is not considered, the accuracy of this approach is difficult to guarantee. Comparatively, the latter establishes the relationship between the robot and the environment and achieves compliant behaviour by adjusting the mechanical impedance to a target value when interaction occurs, which guarantees interaction safety. Impedance control has two execution methods according to the controller causality, i.e., impedance control and admittance control. In an impedance control system, the external force imposed by the environment is obtained from the desired trajectory and the impedance model, while in an admittance control system, the modified motion trajectory is derived from the measured interaction force and the expected admittance model. Therefore, we adopt admittance control to deal with the robot-environment interaction problem.

The interaction force and the admittance model are significant parts of admittance control. If interaction between the robot and the environment occurs, the interaction force can be measured by force sensors mounted at the

end-effector of the robot. However, due to the complexity of the environment, it is often very hard to obtain the desired admittance model, which is critical for an admittance control system. In addition, a fixed model cannot satisfy the requirements of all situations. Consequently, Braun et al. (2012) took human-robot cooperation as an example and proposed that it is essential to adopt a variable admittance model to improve system efficiency. For variable admittance control, iterative learning has been studied in the robot intelligent control field to derive admittance parameters that adapt to an unknown environment. To complete a wall-following task, Cohen and Flash (1991) proposed an impedance learning strategy with an associative search network. Tsuji et al. (1996) introduced neural networks into impedance control to tune the model parameters. However, the iterative learning approach requires the robot to perform the same task repeatedly, which is not feasible in some practical applications. Therefore, researchers have adopted adaptation methods to solve this problem, such as Love and Book (2004), Uemura and Kawamura (2009), Stanisic and Fernández (2012), Landi et al. (2017) and Yao et al. (2018).

Tracking control is a very important research topic in the robot intelligent control area, and many control methods have been applied to robot systems. Cervantes and Alvarez-Ramirez (2001) and Parra-Vega et al. (2003) applied classic proportional-integral-derivative (PID) control to robot systems with satisfactory tracking performance. PID control is often used in industry owing to its simple structure and good performance, but for complex systems it is very difficult to choose appropriate PID parameters, which normally depends on the experience of the operator. In recent years, neural network (NN) control has been investigated and applied to robot systems because of its strong approximation capability for unknown systems (Yang et al. 2017). In Zhang et al. (2018), NN control was employed to improve the tracking performance of a robot system with uncertainties. In Yang et al. (2019), an NN-based controller combined with admittance adaptation was proposed to tackle the robot-environment interaction problem. However, these control methods only deal with the stabilization problem of the system without considering optimality. Based on optimal control theory, we expect to find a control strategy that enables the system to reach the target in an optimal manner. To achieve this goal, it is usually required to minimize a specified cost function by solving the Hamilton–Jacobi–Bellman (HJB) equation. The HJB equation for a nonlinear system is a nonlinear partial differential equation, so its analytical solution is non-trivial to derive. Dynamic programming, proposed by Bellman (1957), provides a useful method for solving the HJB equation. However, since this method is based on a backward numerical process, it suffers from the well-known curse of dimensionality as the system dimension increases. To overcome this problem, Werbos (1992) proposed the adaptive dynamic programming (ADP) strategy, which uses NNs to approximate the cost function forward in time and then obtain the solution of the HJB equation. During the past few years, great efforts have been made on ADP to deal with control issues for nonlinear systems (Liu et al. 2014; Jiang and Jiang 2015), such as systems with dynamic uncertainties (Wang et al. 2018) and disturbances (Cui et al. 2017).

In practical control systems, actuator saturation is a common phenomenon which may degrade system performance or even result in instability. Therefore, it is essential and challenging to derive optimal control strategies for nonlinear systems with actuator saturation. Wenzhi and Selmic (2006) proposed an NN-based feed-forward saturation compensation strategy for nonlinear systems in Brunovsky canonical form. In Wen et al. (2011), the Nussbaum function was employed to compensate for the nonlinear term caused by input saturation. To handle the control issue for nonlinear systems with unknown saturation, an auxiliary system was proposed in He et al. (2016) and Peng et al. (2020) to tackle the actuator saturation, and in Zhao et al. (2018) a control strategy consisting of an ADP-based nominal control and an NN-based compensator was proposed. In Abu-Khalaf and Lewis (2005), the HJB equation was formulated with a non-quadratic cost and an NN least-squares method was proposed to obtain its solution.

In Peng et al. (2020), robot-environment interaction and actuator saturation are considered, while optimal control is not. However, for robot systems it is worthwhile to investigate how to realize tracking control in an optimal manner. Therefore, based on our previous work, the optimal tracking control issue for robot systems with environment interaction and actuator saturation is studied in this paper. Inspired by Abu-Khalaf and Lewis (2005), Lyshevski (1998) and Jiang and Jiang (2012), a control scheme based on admittance control and the ADP method is employed to improve the control performance of robot systems. The main contributions of this paper are summarized as follows:
(i) To solve the interaction problem, the unknown environment is regarded as a linear system and an admittance adaptation approach based on an iterative linear quadratic regulator (LQR) is adopted to obtain compliant behaviour of the robot.
(ii) To tackle the optimal tracking problem, an ADP-based controller is designed. The cost function is defined in a non-quadratic form, and a critic network with an RBFNN is developed to derive an approximate solution to the minimum cost of the HJB equation, from which the corresponding optimal control is obtained.
The rest of this paper is arranged as follows. In Sect. 2, the robot system with actuator saturation and the environment dynamics are described, and the control objective is provided. In Sect. 3, the control strategy based on admittance


adaptation and the ADP-based optimal controller is proposed. In Sect. 4, simulation studies are performed on a 2-DOF planar manipulator. In Sect. 5, the conclusion is drawn. The system stability is discussed and proved in the Appendix.

2 Preliminaries and problem formulation

2.1 Robot dynamics

The dynamics of an n-link robot manipulator subjected to actuator saturation is described as
$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) = \tau \tag{1}$$
where q ∈ ℝⁿ, q̇ ∈ ℝⁿ and q̈ ∈ ℝⁿ denote the position, velocity and acceleration vectors in the joint space of the robot, respectively. τ, Ω and A denote the joint torque, the admissible control set and the constant saturation bound, respectively, and τ ∈ Ω, Ω = {τ ∈ ℝⁿ : |τᵢ| ≤ A} is satisfied. For the sake of brevity, we use M, C and G to denote the known inertia matrix M(q) ∈ ℝⁿˣⁿ, Coriolis/centrifugal matrix C(q, q̇) ∈ ℝⁿˣⁿ and gravity vector G(q) ∈ ℝⁿ, respectively.

If we define the reference trajectory as qᵣ ∈ ℝⁿ, the tracking error qₑ ∈ ℝⁿ is given as qₑ = q − qᵣ. Define the sliding motion surface as ξ = Λqₑ + q̇ₑ, where Λ ∈ ℝⁿˣⁿ is a constant positive matrix; then we have
$$\dot{q} = \xi - \Lambda q_e + \dot{q}_r, \qquad \ddot{q} = \dot{\xi} - \Lambda\dot{q}_e + \ddot{q}_r \tag{2}$$
According to (1) and (2), the error dynamics is derived as
$$\dot{\xi} = -M^{-1}C\left(\xi - \Lambda q_e + \dot{q}_r\right) - M^{-1}G - \ddot{q}_r + \Lambda\dot{q}_e + M^{-1}\tau \tag{3}$$
Consequently, we can obtain the following system:
$$\dot{\xi} = f(\xi) + g(\xi)\tau \tag{4}$$
where f : ℝⁿ → ℝⁿ and g : ℝⁿ → ℝⁿˣⁿ are nonlinear functions described as
$$f(\xi) = -M^{-1}C\left(\xi - \Lambda q_e + \dot{q}_r\right) - M^{-1}G - \ddot{q}_r + \Lambda\dot{q}_e, \qquad g(\xi) = M^{-1} \tag{5}$$
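To make this transformation concrete, the following is a minimal Python sketch (not the authors' MATLAB implementation) of how the error dynamics (3)-(5) could be evaluated, assuming the dynamics terms M, C and G of the two-link arm in Sect. 4 are available as callables and that Λ and the reference signals are supplied by the user:

```python
import numpy as np

def error_dynamics(q, dq, qr, dqr, ddqr, tau, M, C, G, Lam):
    """Evaluate the sliding-variable dynamics (3)-(5):
    xi = Lam @ (q - qr) + (dq - dqr),   dxi = f(xi) + g(xi) @ tau."""
    qe, dqe = q - qr, dq - dqr
    xi = Lam @ qe + dqe
    Minv = np.linalg.inv(M(q))
    f = -Minv @ (C(q, dq) @ (xi - Lam @ qe + dqr)) - Minv @ G(q) - ddqr + Lam @ dqe
    g = Minv                      # g(xi) = M^{-1}, eq. (5)
    return xi, f + g @ tau
```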
2.2 Environment dynamics

In this paper, we consider an unknown interaction environment, which is regarded as a damping-stiffness model as in Ge et al. (2014), given by
$$C_E\dot{x} + G_E x = -F \tag{6}$$
where C_E and G_E are the unknown damping and stiffness of the environment, respectively, F represents the interaction force measured by the force sensor, and x is the end-effector position of the robot in Cartesian space.

We define x_d as the corresponding desired trajectory, and U_d ∈ ℝ^{m×m} is a known matrix; then x_d is expressed as
$$\dot{x}_d = U_d x_d \tag{7}$$
Consequently, defining ζ = [x, x_d]ᵀ, the dynamics of the environment and the desired trajectory can be derived as
$$\dot{\zeta} = \begin{bmatrix} -C_E^{-1}G_E & 0 \\ 0 & U_d \end{bmatrix}\zeta + \begin{bmatrix} -C_E^{-1} \\ 0 \end{bmatrix}F = A_e\zeta + B_eF \tag{8}$$
Therefore, (8) can be regarded as a linear system, where F is the control input and ζ is the controlled state. F = −K_eζ is the corresponding optimal feedback control law, and the objective is to minimize the cost function given as
$$\Gamma_1 = \int_0^{\infty}\left( x_e^{T}Q_{E1}x_e + F^{T}R_EF \right)dt \tag{9}$$
From (9), we can see that the purpose of modifying the trajectory x_d is to balance the interaction force F and the tracking error x_e defined as x_e = x − x_d, which can be realized by adjusting the user-defined matrices Q_{E1} and R_E.

The robot dynamics with saturated actuator and the unknown environment dynamics have been described in this section. Next, an ADP enhanced admittance control scheme will be designed to ensure compliant behaviour and optimal trajectory tracking under robot-environment interaction.
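For orientation only, the following Python sketch shows the LQR structure behind (8)-(9) for the scalar case m = 1 used later in Sect. 4, assuming — unlike the setting of this paper — that C_E and G_E were known. The value Ud = −0.5 is an assumption matching the decaying reference 0.3e^{−0.5t}, and the standard convention F = −K_e ζ with K_e = R_E^{−1}B_eᵀP is used:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def model_based_gain(CE=0.1, GE=1.0, Ud=-0.5, QE1=1.0, RE=1.0):
    """Model-based baseline gain Ke for the augmented system (8) with cost (9)-(10)."""
    Ae = np.array([[-GE / CE, 0.0],
                   [0.0,      Ud ]])              # eq. (8), zeta = [x, xd]^T
    Be = np.array([[-1.0 / CE],
                   [0.0      ]])
    QE = QE1 * np.array([[1.0, -Ud],
                         [-Ud,  Ud * Ud]])        # eq. (10), scalar QE1 and Ud
    R = np.array([[RE]])
    P = solve_continuous_are(Ae, Be, QE, R)       # Riccati equation of (11)
    Ke = np.linalg.solve(R, Be.T @ P)             # F = -Ke @ zeta
    return Ke, P
```

In the paper this gain is instead learned from measured data via (12)-(14), precisely because C_E and G_E are unknown.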


Fig. 1 An illustration of the proposed control scheme

3 Control strategy

As shown in Fig. 1, the control scheme designed in this section, inspired by Zhan et al. (2020), consists of three parts: an optimal trajectory modifier using admittance control to modify the user-desired trajectory x_d into the modified trajectory x_r; a closed-loop inverse kinematics (CLIK) solver to transform x_r in Cartesian space into q_r in joint space; and an optimal trajectory tracking controller based on ADP whose output torque acts on the robot manipulator to ensure optimal tracking performance.

3.1 Trajectory modifier using admittance control

Formula (9) can be written in the following form by transformation, and the system counterpart is consistent with system (8):
$$\Gamma = \int_0^{\infty}\left( \zeta^{T}Q_E\zeta + F^{T}R_EF \right)dt, \qquad Q_E = \begin{bmatrix} Q_{E1} & -Q_{E1}U_d \\ -U_d^{T}Q_{E1} & U_d^{T}Q_{E1}U_d \end{bmatrix} \tag{10}$$
It is noted that solving (10) can be regarded as a process similar to the LQR problem. The algebraic Riccati equation (ARE) associated with (9) and (10) is given in (11), and in this subsection an algorithm proposed by Jiang and Jiang (2012) is employed to solve the ARE and obtain the feedback gain K_e in (11):
$$PA_e + A_e^{T}P + Q_E - PB_eR_E^{-1}B_e^{T}P = 0, \qquad K_e = -R_E^{-1}B_e^{T}P \tag{11}$$
Now, we list the matrices built from sampled signals as follows
$$\hat{p} = \left[ p_{11}, 2p_{12}, \ldots, 2p_{1n}, p_{22}, 2p_{23}, \ldots, p_{nn} \right]^{T}$$
$$\bar{\zeta} = \left[ \zeta_1^{2}, \zeta_1\zeta_2, \ldots, \zeta_1\zeta_n, \zeta_2^{2}, \zeta_2\zeta_3, \ldots, \zeta_n^{2} \right]^{T}$$
$$d\bar{\zeta} = \left[ \bar{\zeta}(t_1)-\bar{\zeta}(t_0),\; \bar{\zeta}(t_2)-\bar{\zeta}(t_1),\; \ldots,\; \bar{\zeta}(t_d)-\bar{\zeta}(t_{d-1}) \right]^{T}$$
$$I_{\zeta\zeta} = \left[ \int_{t_0}^{t_1}\zeta\otimes\zeta\,dt,\; \int_{t_1}^{t_2}\zeta\otimes\zeta\,dt,\; \ldots,\; \int_{t_{d-1}}^{t_d}\zeta\otimes\zeta\,dt \right]^{T}$$
$$I_{\zeta F} = \left[ \int_{t_0}^{t_1}\zeta\otimes F\,dt,\; \int_{t_1}^{t_2}\zeta\otimes F\,dt,\; \ldots,\; \int_{t_{d-1}}^{t_d}\zeta\otimes F\,dt \right]^{T} \tag{12}$$
where n, m and d denote the lengths of ζ and F and the number of sampling intervals, respectively, and p_{ij} and ζ_i represent the entries of P and ζ, respectively. In addition, in (12), ⊗ represents the Kronecker product, and p̂ ∈ ℝ^{n(n+1)/2}, ζ̄ ∈ ℝ^{n(n+1)/2}, dζ̄ ∈ ℝ^{d×n(n+1)/2}, I_{ζζ} ∈ ℝ^{d×n²}, I_{ζF} ∈ ℝ^{d×nm}.

Let ‖∗‖ and vec(∗) denote the 2-norm and the column vectorization of ∗, respectively, and let k and Iₙ ∈ ℝⁿˣⁿ denote the iteration index and an identity matrix, respectively. If the sampled data set is large enough and the rank condition in (13) is satisfied, K_e can be solved by iteratively calculating (14) until ‖p̂^{(k)} − p̂^{(k−1)}‖ < ε, where ε is an acceptable tolerance.
$$\operatorname{rank}\!\left( \left[ I_{\zeta\zeta},\; I_{\zeta F} \right] \right) = \frac{n(n+1)}{2} + nm \tag{13}$$
$$Q_E^{(k)} = Q_E + K_e^{(k)T}R_EK_e^{(k)}, \qquad \Theta^{(k)} = \left[ d\bar{\zeta},\; -2I_{\zeta\zeta}\!\left( I_n \otimes K_e^{(k)T}R_E \right) - 2I_{\zeta F}\!\left( I_n \otimes R_E \right) \right],$$
$$\Xi^{(k)} = -I_{\zeta\zeta}\operatorname{vec}\!\left( Q_E^{(k)} \right), \qquad \begin{bmatrix} \hat{p}^{(k)} \\ \operatorname{vec}\!\left( K_e^{(k+1)} \right) \end{bmatrix} = \left( \Theta^{(k)T}\Theta^{(k)} \right)^{-1}\Theta^{(k)T}\Xi^{(k)} \tag{14}$$
Once the optimal feedback gain K_e is obtained, the modified trajectory x_r, which is to be tracked and is equal to x in (15), can be calculated by (16), where K_{e1} and K_{e2} are compatible sub-matrices of K_e:
$$F = -K_e\zeta = -\left[ K_{e1}\;\; K_{e2} \right]\begin{bmatrix} x \\ x_d \end{bmatrix} \tag{15}$$
$$x_r = -K_{e1}^{-1}F - K_{e1}^{-1}K_{e2}x_d \tag{16}$$
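The following Python sketch shows one iteration of (14), assuming the data matrices dζ̄, I_ζζ and I_ζF have already been assembled from sampled ζ and F over the intervals [t_j, t_{j+1}]; column-major vectorization is assumed, and the function name pi_step is ours, not from the paper:

```python
import numpy as np

def pi_step(dzbar, Izz, IzF, Ke, QE, RE):
    """One policy-iteration step of (14) (after Jiang and Jiang 2012).
    dzbar: (d, n(n+1)/2), Izz: (d, n*n), IzF: (d, n*m), Ke: (m, n)."""
    m, n = Ke.shape
    Qk = QE + Ke.T @ RE @ Ke                                   # Q_E^(k)
    Theta = np.hstack([dzbar,
                       -2.0 * Izz @ np.kron(np.eye(n), Ke.T @ RE)
                       - 2.0 * IzF @ np.kron(np.eye(n), RE)])  # Theta^(k)
    Xi = -Izz @ Qk.reshape(-1, order="F")                      # -I_zz vec(Q_E^(k))
    sol, *_ = np.linalg.lstsq(Theta, Xi, rcond=None)           # least-squares form of (14)
    p_hat = sol[: n * (n + 1) // 2]
    Ke_next = sol[n * (n + 1) // 2:].reshape(m, n, order="F")  # vec^{-1}
    return p_hat, Ke_next
```

Iterating pi_step until ‖p̂^{(k)} − p̂^{(k−1)}‖ falls below the chosen tolerance yields the gain K_e used in (15)-(16) without knowledge of C_E and G_E.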
3.2 CLIK solver

We adopt the CLIK algorithm proposed by Siciliano (1990) to transform the reference trajectory x_r in Cartesian space into q_r in joint space. Let φ(∗) and K_f represent the forward kinematics and a positive user-defined matrix, respectively. Define e := φ(q_r) − x_r, ė = −K_f e, ẋ = J_{co}q̇, J_{co} = ∂φ(q)/∂q; then
$$\dot{q}_r = J_{co}^{\dagger}\left( \dot{x}_r - K_f\left( \varphi(q_r) - x_r \right) \right) \tag{17}$$
Integrating both sides of the above equation, q_r can be obtained as follows
$$q_r = \int_0^{t}\left( J_{co}^{\dagger}\dot{x}_r - J_{co}^{\dagger}K_f\left( \varphi(q_r) - x_r \right) \right)dt \tag{18}$$
where q(0) = φ^{-1}(x_r(0)), J_{co}^{†} = J_{co}^{T}(J_{co}J_{co}^{T} + λIₙ)^{-1}, and λ ∈ ℝ. Note that λ is used to prevent the singularity problem, and it is also required to be small enough to preserve the accuracy of the solution.
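A possible Python sketch of the CLIK recursion (17)-(18) with a damped pseudo-inverse and simple Euler integration; fwd_kin and jacobian are assumed to be supplied for the manipulator at hand, and the scalar Kf and λ values follow Sect. 4:

```python
import numpy as np

def clik(xr, dxr, q0, fwd_kin, jacobian, Kf=30.0, lam=1e-6, dt=0.01):
    """Integrate qr_dot = J_pinv @ (xr_dot - Kf*(phi(qr) - xr)), eqs. (17)-(18).
    xr, dxr: arrays of shape (T, task_dim) with the reference and its rate."""
    qr = np.zeros((len(xr), len(q0)))
    qr[0] = np.asarray(q0, dtype=float)
    for k in range(len(xr) - 1):
        J = jacobian(qr[k])                                               # task_dim x n
        J_pinv = J.T @ np.linalg.inv(J @ J.T + lam * np.eye(J.shape[0]))  # damped pseudo-inverse
        dqr = J_pinv @ (dxr[k] - Kf * (fwd_kin(qr[k]) - xr[k]))           # eq. (17)
        qr[k + 1] = qr[k] + dt * dqr                                      # Euler form of (18)
    return qr
```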


3.3 Optimal control using ADP

The objective of this section is to find the stabilizing control input of the robot system (4) that minimizes the defined cost function. According to optimal control theory, the optimal feedback control of system (4) can be obtained by solving the HJB equation in the ADP framework. The structure diagram of the ADP-based tracking controller is given in Fig. 2.

We assume that system (4) is controllable and that the nonlinear functions f(ξ) and g(ξ) are Lipschitz continuous and differentiable in ℝ²ⁿ. In order to deal with actuator saturation of the robot system, inspired by Abu-Khalaf and Lewis (2005) and Lyshevski (1998), we define the cost function as follows
$$J(\xi(t)) = \int_t^{\infty}\left[ \iota(\xi(s)) + U(\xi(s),\tau(\xi(s))) \right]ds \tag{19}$$
where
$$\iota(\xi(s)) = \xi(s)^{T}Q\,\xi(s) \tag{20}$$
$$U(\xi(s),\tau(\xi(s))) = 2A\int_0^{\tau}\left( \beta^{-1}(v/A) \right)^{T}R\,dv \tag{21}$$
It is noted that Q ∈ ℝⁿˣⁿ in (20) is symmetric positive definite. In (21), β⁻¹(v/A) = [β⁻¹(v₁/A), β⁻¹(v₂/A), ⋯, β⁻¹(vₙ/A)]ᵀ, v ∈ ℝⁿ. β(⋅) is a strictly monotonic odd function and its first derivative is bounded by a constant B. Meanwhile, R is also a symmetric and positive definite matrix. Therefore, U(ξ(s), τ(ξ(s))) is also positive definite. Without loss of generality, we select β(⋅) = tanh(⋅) and R = rIₙ, with r a positive constant and Iₙ the identity matrix of dimension n.
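Since β = tanh admits a closed-form antiderivative, the penalty (21) can be evaluated exactly. A small Python helper illustrating this (the values of A and r are taken from the simulation settings of Sect. 4 purely for illustration):

```python
import numpy as np

def control_cost(tau, A=6.0, r=0.006):
    """Closed-form value of (21) with beta = tanh and R = r*I:
    U(tau) = 2*A*r * sum_i [ tau_i*atanh(tau_i/A) + (A/2)*ln(1 - (tau_i/A)^2) ]."""
    tau = np.asarray(tau, dtype=float)
    if np.any(np.abs(tau) >= A):
        raise ValueError("U is finite only for |tau_i| < A (saturation bound)")
    inner = tau * np.arctanh(tau / A) + 0.5 * A * np.log(1.0 - (tau / A) ** 2)
    return 2.0 * A * r * float(np.sum(inner))
```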
If J(ξ(t)) defined in (19) is continuously differentiable, by taking the time derivative of (19) we obtain the following nonlinear Lyapunov equation with J(0) = 0, which is an infinitesimal form of (19):
$$0 = \iota(\xi) + 2Ar\int_0^{\tau}\left( \tanh^{-1}(v/A) \right)^{T}dv + (\nabla J(\xi))^{T}\left( f(\xi) + g(\xi)\tau(\xi) \right) \tag{22}$$
where J(ξ) denotes J(ξ(t)) and ∇∗ ≜ ∂∗/∂ξ denotes the partial derivative of ∗ with respect to ξ for convenience.

Therefore, the Hamiltonian function and the optimal cost function are described as
$$H(\xi,\tau(\xi),\nabla J(\xi)) = \iota(\xi) + 2Ar\int_0^{\tau}\left( \tanh^{-1}(v/A) \right)^{T}dv + (\nabla J(\xi))^{T}\left( f(\xi) + g(\xi)\tau(\xi) \right) \tag{23}$$
$$J^{*}(\xi) = \min_{\tau\in\Omega}\int_t^{\infty}\left[ \iota(\xi(s)) + U(\xi(s),\tau(\xi(s))) \right]ds \tag{24}$$
We can derive the HJB equation as
$$0 = \min_{\tau\in\Omega} H(\xi,\tau(\xi),\nabla J^{*}(\xi)) \tag{25}$$
Suppose that the minimum on the right side of formula (25) exists and is unique; then from ∂H(ξ, τ(ξ), ∇J*(ξ))/∂τ = 0, we can obtain the optimal control τ*(ξ) as
$$\tau^{*}(\xi) = -A\tanh\!\left( \frac{1}{2Ar}g^{T}(\xi)\nabla J^{*}(\xi) \right) \tag{26}$$
Substituting (26) into (22), another form of the HJB equation related to ∇J*(ξ) is derived as
$$H(\xi,\tau^{*}(\xi),\nabla J^{*}(\xi)) = 0 \tag{27}$$
Then, from (26) and (27), the HJB equation for the robot system with actuator saturation becomes
$$H(\xi,\tau^{*}(\xi),\nabla J^{*}(\xi)) = (\nabla J^{*}(\xi))^{T}f(\xi) - 2A^{2}rD^{T}(\xi)\tanh(D(\xi)) + \iota(\xi) + 2Ar\int_0^{-A\tanh(D(\xi))}\tanh^{-T}(v/A)\,dv = 0 \tag{28}$$
where D(ξ) = (1/(2Ar)) gᵀ(ξ)∇J*(ξ).

Fig. 2 Structure diagram of the ADP-based tracking controller


Applying the integral formula of the inverse hyperbolic tangent, we have
$$2Ar\int_0^{-A\tanh(D(\xi))}\tanh^{-T}(v/A)\,dv = \sum_{i=1}^{n}2Ar\int_0^{-A\tanh(D_i(\xi))}\tanh^{-1}(v_i/A)\,dv_i = 2A^{2}rD^{T}(\xi)\tanh(D(\xi)) + A^{2}r\sum_{i=1}^{n}\ln\!\left[ 1-\tanh^{2}(D_i(\xi)) \right] \tag{29}$$
where D(ξ) = (D₁(ξ), …, Dₙ(ξ))ᵀ with Dᵢ(ξ) ∈ ℝ, i = 1, …, n. Substituting (29) into (28), (28) can be rewritten as
$$H(\xi,\tau^{*}(\xi),\nabla J^{*}(\xi)) = (\nabla J^{*}(\xi))^{T}f(\xi) + \iota(\xi) + A^{2}r\sum_{i=1}^{n}\ln\!\left[ 1-\tanh^{2}(D_i(\xi)) \right] = 0 \tag{30}$$
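As a quick check of the integration step used in (29) (a verification added here, not part of the original derivation), the scalar antiderivative gives, with u = −A tanh(D_i(ξ)):

```latex
\int_0^{u}\tanh^{-1}\!\left(\tfrac{v}{A}\right)dv
  = u\,\tanh^{-1}\!\left(\tfrac{u}{A}\right) + \tfrac{A}{2}\ln\!\left(1-\tfrac{u^{2}}{A^{2}}\right)
  = A\,D_i(\xi)\tanh\!\big(D_i(\xi)\big) + \tfrac{A}{2}\ln\!\left(1-\tanh^{2}\!\big(D_i(\xi)\big)\right),
```

so multiplying by 2Ar and summing over i reproduces the right-hand side of (29).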
However, (30) is a nonlinear partial differential equation with regard to J*(ξ), and it is very difficult, if not impossible, to obtain J*(ξ) from it directly.

Suppose J*(ξ) is continuously differentiable; it can then be constructed by an RBFNN and described by
$$J^{*}(\xi) = w^{T}S(\xi) + \varepsilon(\xi) \tag{31}$$
where w ∈ ℝˡ and S : ℝ²ⁿ → ℝˡ represent the ideal constant weight and the activation function, respectively, and l and ε(ξ) denote the number of hidden-layer nodes and the unknown approximation error of the critic NN, respectively. Consequently, the derivative of (31) with respect to ξ is obtained as
$$\nabla J^{*}(\xi) = (\nabla S(\xi))^{T}w + \nabla\varepsilon(\xi) \tag{32}$$
From (26) and (32), and using a Taylor series expansion, τ* can be written as
$$\tau^{*}(\xi) = -A\tanh\!\left( \frac{1}{2Ar}g^{T}(\xi)(\nabla S(\xi))^{T}w \right) + \varepsilon_{\tau^{*}} \tag{33}$$
$$\varepsilon_{\tau^{*}} = -\frac{1}{2}\left( \mathbf{1} - \tanh^{2}(\theta) \right)g^{T}(\xi)\nabla\varepsilon(\xi) \tag{34}$$
where 𝟏 = (1, …, 1)ᵀ ∈ ℝⁿ and θ ∈ ℝⁿ is selected between (1/(2Ar))gᵀ(ξ)(∇S(ξ))ᵀw and (1/(2Ar))gᵀ(ξ)((∇S(ξ))ᵀw + ∇ε(ξ)).
Then, substituting (32) into (30), (30) can be written as
$$H^{*}(\xi,\tau^{*}(\xi),\nabla J^{*}(\xi)) = w^{T}(\nabla S(\xi))f(\xi) + \iota(\xi) + A^{2}r\sum_{i=1}^{n}\ln\!\left[ 1-\tanh^{2}(B_{1i}(\xi)) \right] + \varepsilon_{HJB} = 0 \tag{35}$$
$$B_1(\xi) = \frac{1}{2Ar}g^{T}(\xi)(\nabla S(\xi))^{T}w \tag{36}$$
where B₁(ξ) = (B₁₁(ξ), …, B₁ₙ(ξ))ᵀ, B₁ᵢ ∈ ℝ, and ε_HJB is the HJB approximation error.

In practice, the ideal w and J*(ξ) in (31) are not available, so we use the estimated weight and cost function, denoted by ŵ and Ĵ(ξ) respectively, given by the constructed critic NN described as
$$\hat{J}(\xi) = \hat{w}^{T}S(\xi) \tag{37}$$
Then, the partial derivative of Ĵ(ξ) with respect to ξ and the approximate optimal control τ̂(ξ) can be obtained as follows
$$\nabla\hat{J}(\xi) = (\nabla S(\xi))^{T}\hat{w} \tag{38}$$
$$\hat{\tau}(\xi) = -A\tanh\!\left( \frac{1}{2Ar}g^{T}(\xi)(\nabla S(\xi))^{T}\hat{w} \right) \tag{39}$$
According to (23), (38) and (39), we can obtain the approximate Hamiltonian function Ĥ(ξ, τ̂(ξ), ∇Ĵ(ξ)) as
$$\hat{H}(\xi,\hat{\tau}(\xi),\nabla\hat{J}(\xi)) = \hat{w}^{T}(\nabla S(\xi))f(\xi) + \iota(\xi) + A^{2}r\sum_{i=1}^{n}\ln\!\left[ 1-\tanh^{2}(B_{2i}(\xi)) \right] \tag{40}$$
$$B_2(\xi) = \frac{1}{2Ar}g^{T}(\xi)(\nabla S(\xi))^{T}\hat{w} \tag{41}$$
where B₂(ξ) = (B₂₁(ξ), …, B₂ₙ(ξ))ᵀ, B₂ᵢ(ξ) ∈ ℝ. Now we define the neural network weight approximation error as w̃ = w − ŵ and the error between Ĥ and H* as E_H; then we have
$$E_H = \hat{H}(\xi,\hat{\tau}(\xi),\nabla\hat{J}(\xi)) - H^{*}(\xi,\tau^{*}(\xi),\nabla J^{*}(\xi)) = \hat{H}(\xi,\hat{\tau}(\xi),\nabla\hat{J}(\xi)) = -\tilde{w}^{T}\nabla S(\xi)f(\xi) + A^{2}r\sum_{i=1}^{n}\left[ \Phi(B_{2i}(\xi)) - \Phi(B_{1i}(\xi)) \right] - \varepsilon_{HJB} \tag{42}$$
where Φ(B_{ρi}(ξ)) = ln[1 − tanh²(B_{ρi}(ξ))], ρ = 1, 2 and i = 1, …, n. Note that Φ(B_{ρi}(ξ)) can be expressed as
$$\Phi(B_{\rho i}(\xi)) = \begin{cases} \ln 4 - 2B_{\rho i}(\xi) - 2\ln\!\left( 1+\exp(-2B_{\rho i}(\xi)) \right), & B_{\rho i}(\xi) > 0 \\ \ln 4 + 2B_{\rho i}(\xi) - 2\ln\!\left( 1+\exp(2B_{\rho i}(\xi)) \right), & B_{\rho i}(\xi) < 0 \end{cases} \tag{43}$$
For convenience, it can be written as
$$\Phi(B_{\rho i}(\xi)) = \ln 4 - 2B_{\rho i}(\xi)\operatorname{sgn}(B_{\rho i}(\xi)) - 2\ln\!\left[ 1+\exp\!\left( -2B_{\rho i}(\xi)\operatorname{sgn}(B_{\rho i}(\xi)) \right) \right] \tag{44}$$
where sgn(B_{ρi}(ξ)) is the sign function.
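A minimal Python sketch of the critic side of (37)-(41) and of a numerically stable evaluation of (44), assuming Gaussian RBF features with the centres and width used in Sect. 4 and a user-supplied g(ξ); the function names are ours, not from the paper:

```python
import numpy as np

def rbf_features(xi, centers, width=0.6):
    """Gaussian RBF activations S(xi) and the Jacobian dS/dxi (rows = features)."""
    diff = centers - xi                                   # (l, dim)
    S = np.exp(-np.sum(diff ** 2, axis=1) / width ** 2)   # (l,)
    dS = (2.0 / width ** 2) * diff * S[:, None]           # (l, dim)
    return S, dS

def approximate_control(xi, w_hat, g_xi, centers, A=6.0, r=0.006, width=0.6):
    """Approximate optimal control (39) and the argument B2 of (41);
    g_xi is the matrix g(xi) of system (4) evaluated at the current state."""
    _, dS = rbf_features(xi, centers, width)
    B2 = (g_xi.T @ (dS.T @ w_hat)) / (2.0 * A * r)        # eq. (41)
    return -A * np.tanh(B2), B2                           # eq. (39)

def log_one_minus_tanh_sq(b):
    """Stable evaluation of ln(1 - tanh^2(b)) via the identity (44)."""
    b = np.abs(np.asarray(b, dtype=float))
    return np.log(4.0) - 2.0 * b - 2.0 * np.log1p(np.exp(-2.0 * b))
```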


To train the critic NN, inspired by Liu et al. (2017) and Yang et al. (2013), a suitable weight updating law ŵ̇ is designed, which can minimize the objective function E_c = ½E_H² and also ensure that ŵ converges to w:
$$\dot{\hat{w}} = -\alpha_H\bar{\sigma}\left( \hat{w}^{T}\nabla S(\xi)f(\xi) + \iota(\xi) + A^{2}r\sum_{i=1}^{n}\ln\!\left[ 1-\tanh^{2}(B_{2i}(\xi)) \right] \right) + \frac{\alpha_H}{2}h\,\nabla S(\xi)g(\xi)\left[ I_n - Z(B_2(\xi)) \right]g^{T}(\xi)\nabla V_s(\xi) + \frac{\alpha_H}{m_s}\left( A\nabla S(\xi)g(\xi)\left[ \tanh(B_2(\xi)) - \operatorname{sgn}(B_2(\xi)) \right]\hat{\sigma}^{T}\hat{w} - \left( F_2\hat{w} - F_1\hat{\sigma}^{T}\hat{w} \right) \right) \tag{45}$$
where σ̄ = σ/m_s², σ̂ = σ/m_s, m_s = 1 + σᵀσ, σ = ∇S(ξ)f(ξ) − A∇S(ξ)g(ξ)tanh(B₂(ξ)), α_H > 0 is a designed parameter, Z(B₂(ξ)) = diag[tanh²(B₂₁(ξ)), …, tanh²(B₂ₙ(ξ))], and F₁ and F₂ are tuning parameters with suitable dimensions. In (45), h is described as follows:
$$h = \begin{cases} 0, & \text{if } (\nabla V_s(\xi))^{T}\left( f(\xi) - Ag(\xi)\tanh(B_2(\xi)) \right) < 0 \\ 1, & \text{else} \end{cases} \tag{46}$$
where V_s(ξ) is chosen as a continuously differentiable Lyapunov function candidate. Suppose that a positive definite matrix N exists; then the following formula is satisfied:
$$\dot{V}_s(\xi) = (\nabla V_s(\xi))^{T}\left( f(\xi) + g(\xi)\tau^{*} \right) = -(\nabla V_s(\xi))^{T}N\nabla V_s(\xi) < 0 \tag{47}$$
Here, V_s(ξ) is a polynomial with regard to the state variable ξ, which can be appropriately selected, such as V_s(ξ) = ½ξᵀkξ.

Remark 1 The update law ŵ̇ in (45) is composed of two parts: the first term is based on the standard gradient descent algorithm, and the others are introduced to ensure the stability of the robot system during the critic NN learning process. Note that, according to (46), if (∇V_s(ξ))ᵀ(f(ξ) − Ag(ξ)tanh(B₂(ξ))) ≥ 0 the system is unstable, so h = 1 and the second term in (45) is activated, which improves the learning process. Therefore, the initial stabilizing control requirement is relaxed.

Remark 2 From (40) and (45), we can see that if ξ = 0 and f(ξ) = 0, then Ĥ(ξ, τ̂(ξ), ∇Ĵ(ξ)) = 0. If F₂ = F₁σ̂ᵀ, then ŵ̇ = 0, the critic NN will not be updated, and the optimal control may not be obtained. Consequently, persistence of excitation is required.
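The following Python sketch implements only the gradient-descent core of the update law (45) (its first term), which reduces E_c = E_H²/2; the stabilising terms involving V_s, F₁ and F₂ are deliberately omitted here, so this is an illustration of the learning mechanism rather than the complete law:

```python
import numpy as np

def critic_update(w_hat, xi, f_xi, g_xi, centers, Q, A=6.0, r=0.006,
                  width=0.6, alpha=30.0, dt=0.01):
    """One Euler step of the normalised gradient term of (45)."""
    diff = centers - xi
    S = np.exp(-np.sum(diff ** 2, axis=1) / width ** 2)
    dS = (2.0 / width ** 2) * diff * S[:, None]                    # Jacobian of S(xi)
    B2 = (g_xi.T @ (dS.T @ w_hat)) / (2.0 * A * r)                 # eq. (41)
    sigma = dS @ f_xi - A * (dS @ g_xi) @ np.tanh(B2)              # regressor sigma
    ms = 1.0 + sigma @ sigma
    ln_term = (np.log(4.0) - 2.0 * np.abs(B2)
               - 2.0 * np.log1p(np.exp(-2.0 * np.abs(B2))))        # ln(1 - tanh^2(B2i))
    EH = w_hat @ (dS @ f_xi) + xi @ (Q @ xi) + A * A * r * np.sum(ln_term)   # eq. (40)
    w_dot = -alpha * (sigma / ms ** 2) * EH                        # first term of (45)
    return w_hat + dt * w_dot
```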
3.4 Stability analysis

We will discuss the system stability of the robot and give a detailed proof that the estimated weight error w̃ and the system state are uniformly ultimately bounded.

Now we give the necessary assumption as follows:

Assumption There exist known positive constants w_m, ε_M and ε_N such that ‖w‖ ≤ w_m, ‖ε‖ ≤ ε_M and ‖ε_{τ*}‖ ≤ ε_N, respectively. The term g(ξ) in (4) is bounded over a compact set, i.e., there exist positive constants g_m and g_M such that g_m ≤ ‖g(ξ)‖ ≤ g_M.

Theorem Consider the robot system (1) subject to actuator saturation, the corresponding HJB equation (30) and the Assumption. If the control law is designed as (39) and the critic NN weights are updated according to (45), then the critic NN weight approximation error w̃ and the state ξ are guaranteed to be uniformly ultimately bounded (UUB).

Proof See the Appendix. □

4 Simulation study

4.1 Simulation settings

A two-link manipulator, constructed with the robotics toolbox of Corke (2017) and shown in Fig. 3, is employed to verify the proposed control strategy; its dynamic parameters are given in Table 1. The simulation runs in Matlab 2018a with an ode3 solver and a fixed time step of 0.01 s. The robot manipulator is required to track a reference trajectory and simultaneously interact with a virtual environment governed by
$$F = \begin{cases} -C_E\dot{x} - G_E(x - x_0), & x \le x_0 \\ 0, & x > x_0 \end{cases} \tag{48}$$
where C_E = 0.1, G_E = 1.0, x₀ denotes the contour of an object, and F denotes the reactive force due to penetration into the object. For simplicity and generality, only the trajectory along the x-axis is modified and disturbed by the external interaction forces.

Table 1 Parameters of the robot manipulator

| Item                   | Value of link 1 | Value of link 2 |
| ---------------------- | --------------- | --------------- |
| Length of the link     | 0.50 m          | 0.50 m          |
| Mass of the link       | 5.00 kg         | 5.00 kg         |
| Initial joint position | 0.08211 rad     | 1.897 rad       |


Fig. 3 An illustration of the simulation settings

Parameters of the proposed control scheme are selected as follows. For the "Optimal Trajectory Modifier" block in Fig. 1, in (10), Q_{E1} = 1.0, R_E = 1.0, and the reference trajectory is x_d = [0.3e^{−0.5t}, 0.5]ᵀ m, where U_d = 0.3; the feedback gain of the inverse kinematics in (18) is K_f = 30, with λ = 1e−6. An RBFNN is selected to approximate the cost function in (31), where S(ξ) = exp(−(ξ − c)ᵀ(ξ − c)/σ_N²), with ŵ ∈ ℝ⁴⁹ and S(ξ) ∈ ℝ⁴⁹. For the controller in (39), A = 6 N·m, the centres and variance of the RBFNN are c ∈ [−1.5, −0.5, −0.1, 0, 0.1, 0.5, 1.5] × [−1.5, −0.5, −0.1, 0, 0.1, 0.5, 1.5], σ_N = 0.6, and ŵ(0) = 0. For the update law in (45), V_s = 2ξᵀξ, α_H = 30, Q = 200, R = 0.006, F₁ = 1e−6 and F₂ = 1e−8.
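For reference, the virtual environment (48) can be simulated with a few lines of Python; the contour value x0 below is an assumed placeholder, as its numerical value is not stated in the text:

```python
def environment_force(x, dx, x0=0.2, CE=0.1, GE=1.0):
    """Reactive contact force of (48): active only while the end-effector
    penetrates the object contour (x <= x0); zero otherwise."""
    if x <= x0:
        return -CE * dx - GE * (x - x0)
    return 0.0
```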
Fig. 4 Tracking performance of the proposed control scheme. Up: reference and modified trajectory during the interaction. Down: tracking errors

4.2 Results analysis

The control performance is shown in Fig. 4, from which we can see that at the beginning of the control there is a large transient error, since the weights of the RBFNN have not yet converged. However, before the trajectory starts to be modified at t = 4.2 s, the tracking error has already been reduced to an acceptable range. Subsequently, the actual trajectory gradually converges to the desired trajectory. Fig. 5 shows the control signals during the control process. In this figure, we can clearly see that the control input stays within the limits of the actuator and that the weights of the RBFNN eventually converge to constant values. These observations demonstrate the effectiveness of the ADP-based controller under the saturation effect.

To show the effectiveness of the optimal admittance adaptation control, the control performance under two different feedback gains K_e, which affect the trajectory modification in (16), is compared: K_e^opt is obtained by assuming that the dynamic parameters of the environment in (6) are exactly known, while K_e^pro is calculated by the algorithm presented in (14). Note that, unlike the virtual environment used in (48), the environment dynamics (6) adopted for the theoretical design does not take the contour x₀ of the environment into consideration; thus K_e^opt is sub-optimal. The results are shown in Fig. 6. We can notice that both the tracking error and the value of the cost function in (9) under K_e^pro are smaller than those under K_e^opt, which shows the superiority of the proposed method when the dynamics of the environment are unknown.

5 Conclusion

In this paper, the optimal tracking control issue for robot systems with environment interaction and actuator saturation is addressed. An ADP-based controller enhanced with admittance adaptation is developed. The unknown environment is considered as a linear system, and admittance adaptation control ensures compliant behaviour of the robot. In the ADP-based controller, to guarantee optimal tracking performance, an RBFNN is used to approximate the minimum cost function and to obtain an approximate optimal control from the HJB equation. The system stability is analysed, and simulation studies are performed to demonstrate the effectiveness of this control scheme.

Other input constraints, such as dead zones and hysteresis, and dynamic uncertainties are also very common in actual robotic systems. These constraints not only reduce system performance but can also affect system stability. Consequently, under the ADP framework, optimal control with other constraints and dynamic uncertainties will be considered in our future work.


Fig. 5 Control signals of the proposed control scheme. Up: control input of the controller. Down: weights of the RBFNN

Fig. 6 A comparison of the modified tracking performance under the optimal feedback gain K_e^opt in (11) and the feedback gain K_e^pro obtained from the proposed control scheme. Up: time series of the cost function in (9). Down: tracking performance

Appendix

Stability analysis

This appendix demonstrates the stability of the ADP-based controller proposed in this paper for robot systems with actuator saturation. The Lyapunov candidate is selected as follows (Liu et al. 2017)
$$V(\xi) = V_s(\xi) + \frac{1}{2}\tilde{w}^{T}\alpha_H^{-1}\tilde{w} \tag{49}$$
From (49) and (39), the derivative of V(ξ) can be derived as
$$\dot{V}(\xi) = (\nabla V_s(\xi))^{T}\left( f(\xi) - Ag(\xi)\tanh(B_2(\xi)) \right) + \dot{\tilde{w}}^{T}\alpha_H^{-1}\tilde{w} \tag{50}$$
Next we will calculate the last term in (50). Note that
$$\sum_{i=1}^{n}\Phi(B_{\rho i}(\xi)) = n\ln 4 - 2B_{\rho}^{T}(\xi)\operatorname{sgn}(B_{\rho}(\xi)) - 2\sum_{i=1}^{n}\ln\!\left[ 1+\exp\!\left( -2B_{\rho i}(\xi)\operatorname{sgn}(B_{\rho i}(\xi)) \right) \right] \tag{51}$$
From (40) and (51), we have
$$\hat{H}(\xi,\hat{\tau}(\xi),\nabla\hat{J}(\xi)) = 2A^{2}r\left[ B_1^{T}(\xi)\operatorname{sgn}B_1(\xi) - B_2^{T}(\xi)\operatorname{sgn}B_2(\xi) \right] - \tilde{w}^{T}\nabla S(\xi)f(\xi) + A^{2}r\,\Delta_B - \varepsilon_{HJB}$$
$$= A\left[ w^{T}\nabla S(\xi)g(\xi)\operatorname{sgn}B_1(\xi) - \hat{w}^{T}\nabla S(\xi)g(\xi)\operatorname{sgn}B_2(\xi) \right] - \tilde{w}^{T}\nabla S(\xi)f(\xi) + A^{2}r\,\Delta_B - \varepsilon_{HJB}$$
$$= -\tilde{w}^{T}\nabla S(\xi)f(\xi) + A\tilde{w}^{T}\nabla S(\xi)g(\xi)\operatorname{sgn}B_2(\xi) + D_1(\xi) \tag{52}$$
where
$$\Delta_B = 2\sum_{i=1}^{n}\ln\frac{1+\exp\!\left( -2B_{1i}(\xi)\operatorname{sgn}(B_{1i}(\xi)) \right)}{1+\exp\!\left( -2B_{2i}(\xi)\operatorname{sgn}(B_{2i}(\xi)) \right)} \tag{53}$$
 i=1 i=0 1 + exp −2B2i ( )sgn(B2i ( ))

 13
98 H. Zhan et al.

 [ ]
D1 ( ) = AwT ∇S( )g( ) sgn(B1 ( )) − sgn(B2 ( )) w̃̇ T H−1 w̃
 (54)
 + A2 r B − HJB = − T W1 + T W2
 1 (59)
From given in (45), we have − h(∇Vs ( ))T g( )[In − Z(B2 ( ))]
 2
 = ∇S( )f ( ) − A∇S( )g( ) tanh(B2 ( )) . Then, (52)
 gT ( )(∇S( ))T w̃
becomes [ ] [ ]
 I − 12 F1 T D̄1 ( )
̂ ( ),
H( , ̂ ̂
 ∇J( )) where W1 = , W2 = ̄ .
 (55) − 21 F1 F2 D2 ( ) + F2 w − F1 T w
 T T
 = −w̃ + Aw̃ ∇S( )g( )T( ) + D1 ( ) From (59) and (50), if we choose appropriate F1 and F2 to
 make W1 positive definite, the following result will be
where T( ) = sgn(B2 ( )) − tanh(B2 ( )).
 derived.
 Based on (40), (45) and (55), we have
 ̇
 V( ) ≤ (∇Vs ( ))T (f ( )
w̃̇ = H [−w̃ T + Aw̃ T ∇S( )g( )T( ) + D1 ( )]
 ms − Ag( )tanh(B2 ( ))) − min (W1 )‖ ‖2
 
 − H h∇S( )g( )[In − Z(B2 ( ))]gT ( )∇Vs ( ) (56) + bm ‖ ‖− (60)
 2
 1
 T h(∇Vs ( ))T g( )[Im − Z(B2 ( ))]
 + H [A∇S( )g( )T( ) ŵ + (F2 ŵ − F1 T w)] ̂ 2
 ms
 gT ( )(∇S( ))T w̃
Consequently, the last term in (50) can be expressed as
 where min (∗) denotes the minimum eigenvalue of matrix ∗
follows
 and bm is the upper bound of ‖W2 ‖.
w̃̇ T H−1 w̃ =[−w̃ T Case One: h = 0 , that is (∇Vs ( ))T (f ( ) − Ag( )
 tanh(B2 ( ))) < 0 . Since ‖ ‖ > 0 , then there exist a constant
 + Aw̃ T ∇S( )g( )T( ) ̇ implies ∇Vs ( ))T ̇ ≤ −as ‖∇Vs ( )‖.
 as such that 0 < as ≤ ‖ ‖
 T Consequently, we can obtain
 + D1 ( )] w̃
 ≤ − as ‖∇Vs ( )‖
 ms
 ̇
 V( )
 1
 − h(∇Vs ( ))T bm
 2 − min (W1 )(‖ ‖ − )2
 g( )[Im − Z(B2 ( ))]gT ( )(∇S( ))T w̃ 2 min (W1 ) (61)
 T b2m
 + Aw̃ T ∇S( )g( )T( ) ŵ +
 ms (57) 4 min (W1 )
 + w̃ T (F2 ŵ − F1 T w)
 ̂
 From (61), we can see that if one of the following conditions
 = − w̃ T T w̃ is satisfied, then V( )
 ̇ < 0 will be obtained.
 + D̄1 ( ) T w̃ + w̃ T D̄2 ( )
 bm 2
 1 ‖∇Vs ( )‖ > ,
 − h(∇Vs ( ))T g( )[In − Z(B2 ( ))] 4as min (W1 )
 2 (62)
 bm
 gT ( )(∇S( ))T w̃ or ‖ ‖ >
 min (W1 )
 + w̃ T (F2 ŵ
 − F1 T w) Note that (1+a)
 a
 ≤ 41 , ∀a, while ‖ ‖2 = T 
 , then we have
 ≤ 21 .
 ̂ 2 (1+ T )2
 T ‖ ‖ From the definition of , we can obtain
where D̄1 ( ) = , D̄2 ( ) = A∇S( )g( )T( ) m w.
 D1 ( )

 ‖ ‖ ≤ 1 + ‖ ‖2 ‖w‖
 ̃ ≤
 � √
 ms
 Applying ŵ = w − w̃ , we have
 s
 2
 5
 ̃ .
 ‖w‖ Consequently, From (62),
 we have
w̃ T (F2 ŵ − F1 T w)
 ̂ = w̃ T F2 w − w̃ T F2 w̃ − w̃ T F1 T w + w̃ T F1 T w̃
 (58) 2bm
 ‖w‖
 ̃ >√ (63)
Substituting (58) into (57) and defining T = [w̃ T , w̃ T ], (57) 5 min (W1 )
can be written as
 Case Two: h = 1, that is (∇Vs ( ))T (f ( ) − Ag( ) tanh(B2 ( )))
 ≥ 0, then (60) becomes


$$\dot{V}(\xi) \le (\nabla V_s(\xi))^{T}f(\xi) - A(\nabla V_s(\xi))^{T}g(\xi)\left( \tanh(B_2(\xi)) + \frac{1}{2A}\left[ I_m - Z(B_2(\xi)) \right]g^{T}(\xi)(\nabla S(\xi))^{T}\tilde{w} \right) - \lambda_{\min}(W_1)\left( \|\eta\| - \frac{b_m}{2\lambda_{\min}(W_1)} \right)^{2} + \frac{b_m^{2}}{4\lambda_{\min}(W_1)} \tag{64}$$
Using the Taylor series expansion, we have
$$\tanh(B_1(\xi)) - \tanh(B_2(\xi)) = \tanh'(B_2(\xi))\left( B_1(\xi) - B_2(\xi) \right) + O\!\left( (B_1(\xi)-B_2(\xi))^{2} \right) = \frac{1}{2A}\left[ I_n - Z(B_2(\xi)) \right]g^{T}(\xi)(\nabla S(\xi))^{T}\tilde{w} + O\!\left( (B_1(\xi)-B_2(\xi))^{2} \right) \tag{65}$$
Then, we can get
$$\tanh(B_2(\xi)) + \frac{1}{2A}\left[ I_n - Z(B_2(\xi)) \right]g^{T}(\xi)(\nabla S(\xi))^{T}\tilde{w} = \tanh(B_1(\xi)) - O\!\left( (B_1(\xi)-B_2(\xi))^{2} \right) \tag{66}$$
Substituting (66) into (64), we have
$$\dot{V}(\xi) \le (\nabla V_s(\xi))^{T}\left( f(\xi) + g(\xi)\tau^{*} \right) - (\nabla V_s(\xi))^{T}g(\xi)\varepsilon_{\tau^{*}} + A(\nabla V_s(\xi))^{T}g(\xi)O\!\left( (B_1(\xi)-B_2(\xi))^{2} \right) - \lambda_{\min}(W_1)\left( \|\eta\| - \frac{b_m}{2\lambda_{\min}(W_1)} \right)^{2} + \frac{b_m^{2}}{4\lambda_{\min}(W_1)} \tag{67}$$
According to the Assumption, (67) can be rewritten as
$$\dot{V}(\xi) \le -\lambda_{\min}(N)\left( \|\nabla V_s(\xi)\| - \frac{\mu}{2\lambda_{\min}(N)} \right)^{2} - \lambda_{\min}(W_1)\left( \|\eta\| - \frac{b_m}{2\lambda_{\min}(W_1)} \right)^{2} + \mu_0 \tag{68}$$
where μ = g_M(ε_N + Aε_m), ε_m is the upper bound of O((B₁(ξ) − B₂(ξ))²), and μ₀ is given by
$$\mu_0 = \frac{\mu^{2}}{4\lambda_{\min}(N)} + \frac{b_m^{2}}{4\lambda_{\min}(W_1)} \tag{69}$$
From (68), we can see that if one of the following conditions is satisfied, then V̇(ξ) < 0 is obtained:
$$\|\nabla V_s(\xi)\| > \frac{\mu}{2\lambda_{\min}(N)} + \sqrt{\frac{\mu_0}{\lambda_{\min}(N)}}, \qquad \text{or} \qquad \|\eta\| > \frac{b_m}{2\lambda_{\min}(W_1)} + \sqrt{\frac{\mu_0}{\lambda_{\min}(W_1)}} \tag{70}$$
From ‖η‖ ≤ (√5/2)‖w̃‖ and (70), we have
$$\|\tilde{w}\| > \frac{2}{\sqrt{5}}\left( \frac{b_m}{2\lambda_{\min}(W_1)} + \sqrt{\frac{\mu_0}{\lambda_{\min}(W_1)}} \right) \tag{71}$$
According to the Lyapunov theorem, and combining Case One and Case Two, it is concluded that the NN weight approximation error w̃ and the function V_s(ξ) are UUB. Since V_s(ξ) is a selected polynomial with regard to ξ, we can conclude that the state ξ is also UUB. This completes the stability analysis.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Abu-Khalaf, M., Lewis, F.L.: Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5), 779–791 (2005)
Bellman, R.: Dynamic programming. Princeton University Press, Princeton (1957)
Braun, D., Petit, F., Huber, F., Haddadin, S., van der Smagt, P., Albu-Schaffer, A., Vijayakumar, S.: Optimal torque and stiffness control in compliantly actuated robots. pp. 2801–2808 (2012)
Cervantes, I., Alvarez-Ramirez, J.: On the PID tracking control of robot manipulators. Syst. Control Lett. 42(1), 37–46 (2001)
Cohen, M., Flash, T.: Learning impedance parameters for robot control using an associative search network. IEEE Trans. Robot. Autom. 7, 382–390 (1991)
Corke, P.: Robotics, vision and control: fundamental algorithms in MATLAB, 2nd edn., vol. 118. Springer, New York (2017)
Cui, X., Zhang, H., Luo, Y., Jiang, H.: Adaptive dynamic programming for tracking design of uncertain nonlinear systems with disturbances and input constraints. Int. J. Adapt. Control Signal Process. 31(11), 1567–1583 (2017)
Ge, S.S., Li, Y., Wang, C.: Impedance adaptation for optimal robot-environment interaction. Int. J. Control 87(2), 249–263 (2014)
He, W., Dong, Y., Sun, C.: Adaptive neural impedance control of a robotic manipulator with input saturation. IEEE Trans. Syst. Man Cybern. Syst. 46(3), 334–344 (2016)
Hogan, N.: Impedance control: an approach to manipulation—part I: theory; part II: implementation; part III: applications. Trans. ASME J. Dyn. Syst. Meas. Control 107(2), 1–24 (1981)
Jiang, Y., Jiang, Z.P.: Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 48(10), 2699–2704 (2012)


Jiang, Y., Jiang, Z.: Global adaptive dynamic programming for continuous-time nonlinear systems. IEEE Trans. Autom. Control 60(11), 2917–2929 (2015)
Landi, C.T., Ferraguti, F., Sabattini, L., Secchi, C., Fantuzzi, C.: Admittance control parameter adaptation for physical human-robot interaction. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 2911–2916 (2017)
Liu, D., Wang, D., Wang, F., Li, H., Yang, X.: Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems. IEEE Trans. Cybern. 44(12), 2834–2847 (2014)
Liu, D., Wei, Q., Wang, D., Yang, X., Li, H.: Adaptive dynamic programming with applications in optimal control. Springer, New York (2017)
Love, L., Book, W.: Force reflecting teleoperation with adaptive impedance control. IEEE Trans. Syst. Man Cybern. Part B Cybern. 34, 159–165 (2004)
Lyshevski, S.E.: Optimal control of nonlinear continuous-time systems: design of bounded controllers via generalized nonquadratic functionals. pp. 205–209 (1998)
Parra-Vega, V., Arimoto, S., Yun-Hui, L., Hirzinger, G., Akella, P.: Dynamic sliding PID control for tracking of robot manipulators: theory and experiments. IEEE Trans. Robot. Autom. 19(6), 967–976 (2003)
Peng, G., Yang, C., He, W., Chen, C.L.P.: Force sensorless admittance control with neural learning for robots with actuator saturation. IEEE Trans. Ind. Electron. 67(4), 3138–3148 (2020)
Raibert, M.H., Craig, J.J.: Hybrid position/force control of manipulators. J. Dyn. Syst. Meas. Control 103(2), 126–133 (1981)
Siciliano, B.: A closed-loop inverse kinematic scheme for on-line joint-based robot control. Robotica 8, 231–243 (1990)
Stanisic, R.Z., Fernández, N.V.: Adjusting the parameters of the mechanical impedance for velocity, impact and force control. Robotica 30(4), 583–597 (2012)
Tsuji, T., Ito, K., Morasso, P.: Neural network learning of robot arm impedance in operational space. IEEE Trans. Syst. Man Cybern. Part B Cybern. 26, 290–298 (1996)
Uemura, M., Kawamura, S.: Resonance-based motion control method for multi-joint robot through combining stiffness adaptation and iterative learning control. pp. 1543–1548 (2009)
Wang, D., Liu, D., Mu, C., Zhang, Y.: Neural network learning and robust stabilization of nonlinear systems with dynamic uncertainties. IEEE Trans. Neural Netw. Learn. Syst. 29(4), 1342–1351 (2018)
Wen, C., Zhou, J., Liu, Z., Su, H.: Robust adaptive control of uncertain nonlinear systems in the presence of input saturation and external disturbance. IEEE Trans. Autom. Control 56(7), 1672–1678 (2011)
Wenzhi, G., Selmic, R.R.: Neural network control of a class of nonlinear systems with actuator saturation. IEEE Trans. Neural Netw. 17(1), 147–156 (2006)
Werbos, P.: Approximate dynamic programming for real-time control and neural modeling. Van Nostrand Reinhold, New York (1992)
Yang, X., Liu, D., Huang, Y.: Neural-network-based online optimal control for uncertain non-linear continuous-time systems with control constraints. IET Control Theory Appl. 7(17), 2037–2047 (2013)
Yang, C., Peng, G., Li, Y., Cui, R., Cheng, L., Li, Z.: Neural networks enhanced adaptive admittance control of optimized robot-environment interaction. IEEE Trans. Cybern. 49(7), 2568–2579 (2019)
Yang, C., Teng, T., Xu, B., Li, Z., Na, J., Su, C.Y.: Global adaptive tracking control of robot manipulators using neural networks with finite-time learning convergence. Int. J. Control Autom. Syst. 15(4), 1916–1924 (2017)
Yao, B., Zhou, Z., Wang, L., Xu, W., Liu, Q., Liu, A.: Sensorless and adaptive admittance control of industrial robot in physical human-robot interaction. Robot. Comput.-Integr. Manuf. 51, 158–168 (2018)
Zhan, H., Huang, D., Chen, Z., Wang, M., Yang, C.: Adaptive dynamic programming-based controller with admittance adaptation for robot-environment interaction. Int. J. Adv. Robot. Syst. 17(3) (2020)
Zhang, S., Dong, Y., Ouyang, Y., Yin, Z., Peng, K.: Adaptive neural control for robotic manipulators with output constraints and uncertainties. IEEE Trans. Neural Netw. Learn. Syst. 29(11), 5554–5564 (2018)
Zhao, B., Jia, L., Xia, H., Li, Y.: Adaptive dynamic programming-based stabilization of nonlinear systems with unknown actuator saturation. Nonlinear Dyn. 93(4), 2089–2103 (2018)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Hong Zhan received the M.S. degree in automation from South China University of Technology, Guangzhou, China, in 2012. She then joined the School of Automation Science and Engineering, South China University of Technology, where she is currently pursuing the Ph.D. degree. Her research interests include robotics, intelligent control and human-robot interaction.

Dianye Huang received the B.Eng. and M.Eng. degrees in automation from South China University of Technology, Guangzhou, China, in 2017 and 2020, respectively. His research interests include robotics, intelligent control and human-robot interaction.

Chenguang Yang received the Ph.D. degree in control engineering from the National University of Singapore, Singapore, in 2010 and performed postdoctoral research in human robotics at Imperial College London, London, UK from 2009 to 2010. He has been awarded the EU Marie Curie International Incoming Fellowship, the UK EPSRC UKRI Innovation Fellowship, and the Best Paper Award of the IEEE Transactions on Robotics as well as over ten conference Best Paper Awards. His research interest lies in human-robot interaction and intelligent system design.