期刊文献+

Tube-based robust reinforcement learning for autonomous maneuver decision for UCAVs

原文传递
导出
摘要 Reinforcement Learning(RL)algorithms enhance intelligence of air combat AutonomousManeuver Decision(AMD)policy,but they may underperform in target combat environmentswith disturbances.To enhance the robustness of the AMD strategy learned by RL,thisstudy proposes a Tube-based Robust RL(TRRL)method.First,this study introduces a tube todescribe reachable trajectories under disturbances,formulates a method for calculating tubes basedon sum-of-squares programming,and proposes the TRRL algorithm that enhances robustness byutilizing tube size as a quantitative indicator.Second,this study introduces offline techniques forregressing the tube size function and establishing a tube library before policy learning,aiming toeliminate complex online tube solving and reduce the computational burden during training.Furthermore,an analysis of the tube library demonstrates that the mitigated AMD strategy achievesgreater robustness,as smaller tube sizes correspond to more cautious actions.This finding highlightsthat TRRL enhances robustness by promoting a conservative policy.To effectively balanceaggressiveness and robustness,the proposed TRRL algorithm introduces a“laziness factor”as aweight of robustness.Finally,combat simulations in an environment with disturbances confirm thatthe AMD policy learned by the TRRL algorithm exhibits superior air combat performance comparedto selected robust RL baselines.
出处 《Chinese Journal of Aeronautics》 SCIE EI CAS CSCD 2024年第7期391-405,共15页 中国航空学报(英文版)
  • 相关文献

参考文献3

二级参考文献14

共引文献49

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部