Abstract
Short-term task re-planning for a space station is characterized by a short planning horizon, numerous real-time constraints, and complex constraint propagation. Drawing on the strengths of deep reinforcement learning in learning and decision-making, a state-space encoding of space station tasks suited to deep reinforcement learning is proposed, and a task re-planning method based on the Deep Deterministic Policy Gradient (DDPG) algorithm is implemented. Through learning, the method resolves constraint conflicts autonomously, removing the reliance on manually predefined conflict-resolution strategies. Simulation analysis shows that the method, by continually learning and evolving, can find near-optimal solutions to the space station task re-planning problem. Compared with traditional approaches, it shows strong intelligence and adaptability, and it offers a new approach to space station task planning.
Authors
史兼郡
张进
罗亚中
郭帅
李智远
李大鹏
SHI Jianjun; ZHANG Jin; LUO Yazhong; GUO Shuai; LI Zhiyuan; LI Dapeng (College of Aerospace Science, National University of Defense Technology, Changsha 410073, China; China Xi'an Satellite Control Center, Xi'an 710043, China)
Source
《载人航天》
CSCD
Peking University Core Journals (北大核心)
2020, No. 4, pp. 469-476 (8 pages)
Manned Spaceflight
Funding
Manned Spaceflight Advance Research Project (010201).
Keywords
space station
task re-planning
deep reinforcement learning
constraint satisfaction