Keywords:
Reinforcement learning, Actor-critic, PID control, Neural network, Robot manipulators
Abstract:
In this paper, a reinforcement learning structure is proposed to auto-tune
PID gains by solving an optimal tracking control problem for robot manipulators. Taking advantage of the actor-critic framework implemented by
neural networks, optimal tracking performance is achieved while unknown
system dynamics are estimated. The critic network learns the optimal cost-to-go function, while the actor network converges to it and learns the
optimal PID gains. Furthermore, Lyapunov's direct method is utilized to
prove the stability of the closed-loop system. By that means, an analytical
procedure is delivered for a stable robot manipulator system to systematically adjust PID gains without an ad-hoc, painstaking tuning process. The
resulting actor-critic PID-like controller exhibits stable adaptive and learning
capabilities, while retaining a simple structure and inexpensive online
computational demands. Numerical simulation is performed to illustrate the
effectiveness and advantages of the proposed actor-critic neural network PID
control.
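To make the idea concrete, the following is a minimal, hypothetical sketch of actor-critic PID gain tuning, not the paper's exact algorithm: the plant is a simplified unit-mass joint standing in for a manipulator, the "actor" holds the mean PID gains (perturbed for exploration), and the "critic" is a learned scalar baseline estimating the expected tracking cost. All names, the plant model, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

def run_episode(gains, ref=1.0, dt=0.02, steps=200):
    """Simulate PID control of a unit-mass joint (x'' = u); return tracking cost.

    A stand-in plant, not a real manipulator model.
    """
    kp, ki, kd = gains
    x = v = integ = 0.0
    prev_e = ref - x
    cost = 0.0
    for _ in range(steps):
        e = ref - x
        integ += e * dt
        u = kp * e + ki * integ + kd * (e - prev_e) / dt  # PID control law
        v += u * dt          # semi-implicit Euler integration
        x += v * dt
        cost += e * e * dt   # quadratic tracking cost
        prev_e = e
    return min(cost, 100.0)  # clip so unstable rollouts don't dominate updates

def tune_pid(iters=300, sigma=0.2, lr_actor=0.02, lr_critic=0.1, seed=0):
    """Actor-critic-style gain tuning: perturb gains, score by rollout cost,
    update the actor toward below-baseline samples (policy-gradient step)."""
    rng = np.random.default_rng(seed)
    theta = np.array([1.0, 0.0, 0.0])  # actor: mean (Kp, Ki, Kd)
    baseline = run_episode(theta)      # critic: scalar cost-to-go estimate
    for _ in range(iters):
        g = np.clip(theta + sigma * rng.standard_normal(3), 0.0, 20.0)
        c = run_episode(g)
        adv = c - baseline                                 # advantage vs. critic
        theta -= lr_actor * adv * (g - theta) / sigma**2   # actor update
        theta = np.clip(theta, 0.0, 20.0)                  # keep gains feasible
        baseline += lr_critic * (c - baseline)             # critic update
    return theta
```

With the default seed, `tune_pid()` discovers a positive derivative gain that damps the initially undamped closed loop and lowers the tracking cost relative to the starting gains; the paper replaces this crude baseline critic with a neural-network approximation of the cost-to-go and proves closed-loop stability via Lyapunov's direct method.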
Scopus® citations (without self-citations): 33