Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61034002, 61233001, 61273140, 61304086, and 61374105) and the Beijing Natural Science Foundation, China (Grant No. 4132078).
Abstract: A policy iteration algorithm of adaptive dynamic programming (ADP) is developed to solve the optimal tracking control problem for a class of discrete-time chaotic systems. By system transformations, the optimal tracking problem is transformed into an optimal regulation problem. The policy iteration algorithm for discrete-time chaotic systems is first described. Then, the convergence and admissibility properties of the developed policy iteration algorithm are presented, which show that the transformed chaotic system can be stabilized under an arbitrary iterative control law and that the iterative performance index function simultaneously converges to the optimum. The policy iteration algorithm is implemented via neural networks, and the developed optimal tracking control scheme for chaotic systems is verified by a simulation.
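The evaluate-then-improve loop described in this abstract can be illustrated on a much simpler problem. The sketch below is not the paper's algorithm: it assumes a scalar linear regulation problem x_{k+1} = a*x_k + b*u_k with cost sum(q*x^2 + r*u^2) in place of the transformed chaotic system, and it evaluates each control law in closed form rather than with neural networks. All names (`policy_iteration`, `a`, `b`, `q`, `r`, `k0`) are hypothetical.

```python
def policy_iteration(a, b, q, r, k0, iters=50, tol=1e-12):
    """Policy iteration for the assumed scalar system x_{k+1} = a*x_k + b*u_k.

    The control law is u_k = -k * x_k and the performance index of a law is
    V(x) = p * x**2. Returns the final gain, its value coefficient, and the
    list of value coefficients produced at each iteration.
    """
    k = k0
    history = []
    for _ in range(iters):
        closed_loop = a - b * k
        # Admissibility check: the law must stabilize the system,
        # otherwise its performance index is infinite.
        if abs(closed_loop) >= 1.0:
            raise ValueError("control law is not admissible")
        # Policy evaluation: solve p = q + r*k^2 + (a - b*k)^2 * p for p.
        p = (q + r * k * k) / (1.0 - closed_loop ** 2)
        history.append(p)
        # Policy improvement: minimize q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u.
        k_new = a * b * p / (r + b * b * p)
        if abs(k_new - k) < tol:
            k = k_new
            break
        k = k_new
    return k, p, history
```

Under these assumptions, calling `policy_iteration(1.2, 1.0, 1.0, 1.0, k0=1.0)` starts from a stabilizing gain; every later gain remains stabilizing and the value coefficients do not increase, which mirrors the admissibility and convergence properties the abstract refers to.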
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61304079 and 61374105), the Beijing Natural Science Foundation, China (Grant Nos. 4132078 and 4143065), the China Postdoctoral Science Foundation (Grant No. 2013M530527), the Fundamental Research Funds for the Central Universities, China (Grant No. FRF-TP-14-119A2), and the Open Research Project from State Key Laboratory of Management and Control for Complex Systems, China (Grant No. 20150104).
Abstract: This paper presents an off-policy integral reinforcement learning (IRL) algorithm to obtain the optimal tracking control of unknown chaotic systems. Off-policy IRL can learn the solution of the Hamilton–Jacobi–Bellman (HJB) equation from system data generated by an arbitrary control. Moreover, off-policy IRL can be regarded as a direct learning method, which avoids the identification of system dynamics. In this paper, the performance index function is first given based on the system tracking error and control error. An off-policy IRL algorithm is then proposed to solve the HJB equation. It is proven that the iterative control makes the tracking error system asymptotically stable and that the iterative performance index function is convergent. A simulation study demonstrates the effectiveness of the developed tracking control method.
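To make concrete how data generated by an arbitrary control can be used without a model, the standard off-policy IRL relation for a regulation problem is sketched below. This is an illustration under stated assumptions, not the paper's tracking formulation: it assumes input-affine dynamics $\dot{x} = f(x) + g(x)u$, a state cost $Q(x)$, a control weight $R$, an interval length $T$, and iteration index $i$, none of which are given in the abstract.

```latex
% Behavior control u generates the data; u_i is the policy being evaluated.
% Rewrite the dynamics around u_i:
%   \dot{x} = f(x) + g(x) u_i + g(x) (u - u_i).
% Policy evaluation and improvement:
%   \nabla V_i^{\top} (f + g u_i) = -Q(x) - u_i^{\top} R u_i,
%   u_{i+1} = -\tfrac{1}{2} R^{-1} g^{\top} \nabla V_i .
% Integrating dV_i/dt along the measured trajectory over [t, t+T] gives
\begin{equation}
V_i\bigl(x(t+T)\bigr) - V_i\bigl(x(t)\bigr)
  = -\int_{t}^{t+T} \bigl[ Q(x) + u_i^{\top} R\, u_i \bigr]\, d\tau
    \; - \; 2 \int_{t}^{t+T} u_{i+1}^{\top} R\, (u - u_i)\, d\tau .
\end{equation}
```

Neither $f$ nor $g$ appears in the final relation, so $V_i$ and $u_{i+1}$ can in principle be fitted from measured $x$ and $u$ alone, which is the sense in which the method avoids identifying the system dynamics.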
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61304079, 61673054, and 61374105), the Fundamental Research Funds for the Central Universities, China (Grant No. FRF-TP-15-056A3), and the Open Research Project from SKLMCCS, China (Grant No. 20150104).
Abstract: We develop an optimal tracking control method for chaotic systems with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to update synchronously. The augmented system is constructed from the tracking error and the reference dynamics, and the optimal tracking control problem is then defined. Policy iteration (PI) is introduced to solve the min-max optimization problem. An off-policy adaptive dynamic programming (ADP) algorithm is then proposed to find the solution of the tracking Hamilton-Jacobi-Isaacs (HJI) equation online, using only measured data and without any knowledge of the system dynamics. A critic neural network (CNN), an action neural network (ANN), and a disturbance neural network (DNN) are used to approximate the cost function, the control, and the disturbance, respectively. The weights of these networks compose the augmented weight matrix, which is proven to be uniformly ultimately bounded (UUB). The convergence of the tracking error system is also proven. Two examples are given to show the effectiveness of the proposed synchronous solution method for the chaotic system tracking problem.
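The min-max problem mentioned in this abstract is commonly written as a zero-sum game between the control and the disturbance. The block below is a generic sketch of that structure, not the paper's exact formulation: the augmented dynamics $\dot{X} = F(X) + G(X)u + K(X)d$, the weights $Q$ and $R$, and the attenuation level $\gamma$ are assumed symbols, and any discount factor the paper may use is omitted.

```latex
% Zero-sum value: the control minimizes, the disturbance maximizes.
\begin{equation}
V^{*}(X(t)) = \min_{u} \max_{d} \int_{t}^{\infty}
  \bigl[ X^{\top} Q X + u^{\top} R\, u - \gamma^{2} d^{\top} d \bigr]\, d\tau .
\end{equation}
% Stationarity of the Hamiltonian
%   H = X^{\top} Q X + u^{\top} R u - \gamma^{2} d^{\top} d
%       + \nabla V^{\top} (F + G u + K d)
% with respect to u and d gives the saddle-point policies
\begin{equation}
u^{*} = -\tfrac{1}{2} R^{-1} G^{\top} \nabla V^{*}, \qquad
d^{*} = \tfrac{1}{2\gamma^{2}} K^{\top} \nabla V^{*} .
\end{equation}
```

In this picture, the critic, action, and disturbance networks named in the abstract would approximate $V^{*}$, $u^{*}$, and $d^{*}$, respectively.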
Funding: Project supported by the Open Research Project from the SKLMCCS (Grant No. 20120106), the Fundamental Research Funds for the Central Universities of China (Grant No. FRF-TP-13-018A), the Postdoctoral Science Foundation of China (Grant No. 2013M530527), the National Natural Science Foundation of China (Grant Nos. 61304079 and 61374105), and the Natural Science Foundation of Beijing, China (Grant Nos. 4132078 and 4143065).
Abstract: We develop an online adaptive dynamic programming (ADP) based optimal control scheme for continuous-time chaotic systems. The idea is to use the ADP algorithm to obtain the optimal control input that makes the performance index function reach an optimum. The expression of the performance index function for the chaotic system is first presented. The online ADP algorithm is then presented to achieve optimal control. In the ADP structure, neural networks are used to construct a critic network and an action network, which approximate the performance index function and the control input, respectively. It is proven that the critic parameter error dynamics and the closed-loop chaotic systems are uniformly ultimately bounded exponentially. Our simulation results illustrate the performance of the established optimal control method.
Funding: Supported by the Open Research Project from SKLMCCS (Grant No. 20120106), the Fundamental Research Funds for the Central Universities of China (Grant No. FRF-TP-13-018A), the Postdoctoral Science Foundation of China (Grant No. 2013M530527), and the National Natural Science Foundation of China (Grant Nos. 61304079, 61125306, and 61034002).
Abstract: In this paper, an optimal tracking control scheme is proposed for a class of discrete-time chaotic systems using the approximation-error-based adaptive dynamic programming (ADP) algorithm. Via a system transformation, the optimal tracking problem is transformed into an optimal regulation problem, and the novel optimal tracking control method is then proposed. It is shown that, for the iterative ADP algorithm with finite approximation error, the iterative performance index functions converge to a finite neighborhood of the greatest lower bound of all performance index functions under some convergence conditions. Two examples are given to demonstrate the validity of the proposed optimal tracking control scheme for chaotic systems.
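The idea of converging to a neighborhood of the optimum, rather than to the optimum itself, can be seen in a toy setting. The sketch below is not the paper's algorithm: it assumes a scalar linear regulation problem x_{k+1} = a*x_k + b*u_k with quadratic cost, runs value iteration on the coefficient p of V(x) = p*x^2, and models the approximation error as a bounded multiplicative perturbation of each update. The names `riccati_step`, `noisy_value_iteration`, `exact_value`, and `sigma` are hypothetical.

```python
import random


def riccati_step(p, a, b, q, r):
    """One exact value-iteration update of p, where V(x) = p * x**2."""
    # min over u of q*x^2 + r*u^2 + p*(a*x + b*u)^2, divided by x^2.
    return q + a * a * r * p / (r + b * b * p)


def noisy_value_iteration(a, b, q, r, sigma, iters=200, seed=0):
    """Value iteration where every update is off by a relative error <= sigma."""
    rng = random.Random(seed)
    p = 0.0  # zero initial performance index function
    for _ in range(iters):
        eps = rng.uniform(-sigma, sigma)  # assumed approximation error model
        p = (1.0 + eps) * riccati_step(p, a, b, q, r)
    return p


def exact_value(a, b, q, r, iters=1000):
    """Error-free value iteration, used as the reference optimum."""
    p = 0.0
    for _ in range(iters):
        p = riccati_step(p, a, b, q, r)
    return p
```

With `sigma = 0` the noisy iteration coincides with the exact one; with a small positive `sigma` the final coefficient does not settle at a single value but stays close to the reference optimum, at a distance that shrinks as `sigma` shrinks.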