Funding: supported by the Key Research and Development Program of Shaanxi (2022GY-089) and the Natural Science Basic Research Program of Shaanxi (2022JQ-593).
Abstract: The deep deterministic policy gradient (DDPG) algorithm is an off-policy method that combines the two mainstream reinforcement learning approaches, one based on value iteration and the other on policy iteration. Using the DDPG algorithm, agents can explore the environment and reach autonomous decisions in continuous state and action spaces. In this paper, a cooperative defense scheme based on DDPG for swarms of unmanned aerial vehicles (UAVs) is developed and validated, showing promising practical value for defense tasks. We address the sparse-reward problem that reinforcement learning faces in long-horizon tasks by constructing a reward function for the UAV swarm, and we optimize the training process of the neural networks in the DDPG algorithm to reduce oscillation during learning. The experimental results show that the DDPG algorithm can guide the UAV swarm to perform the defense task efficiently while meeting the swarm's requirements for decentralization and autonomy, promoting the intelligent development of UAV swarms and their decision-making processes.
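The two mechanisms the abstract relies on, the off-policy TD target with target networks and the slow tracking of target weights that damps learning oscillation, can be sketched as follows. This is a minimal illustration with hypothetical stand-in callables for the actor and critic (the paper itself uses neural networks; none of these names come from the original text):

```python
def ddpg_td_target(r, s_next, gamma, target_actor, target_critic):
    """Off-policy TD target used by the DDPG critic:
    y = r + gamma * Q'(s', mu'(s')),
    where mu' is the deterministic target actor and Q' the target critic."""
    a_next = target_actor(s_next)                 # action chosen by target policy
    return r + gamma * target_critic(s_next, a_next)

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging of target-network weights. A small tau makes the
    targets move slowly, which damps oscillation in the learning process."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]
```

With toy linear stand-ins such as `target_actor = lambda s: s` and `target_critic = lambda s, a: s + a`, `ddpg_td_target` reproduces the textbook target; in practice both are trained networks and `soft_update` is applied after each gradient step.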
Funding: supported by the Aeronautical Science Foundation (2017ZC53033).
Abstract: Unmanned aerial vehicle (UAV) swarm technology has been a research hotspot in recent years. As the autonomous intelligence of UAVs continues to improve, swarm technology will become one of the main trends in future UAV development. This paper studies the behavior decision-making process for a UAV swarm rendezvous task based on the double deep Q-network (DDQN) algorithm. We design a guided reward function that effectively addresses the convergence problems caused by sparse rewards in deep reinforcement learning (DRL) for long-horizon tasks. We also propose the concept of a temporary storage area, which optimizes the experience replay unit of the traditional DDQN algorithm, improves its convergence speed, and accelerates training. Unlike traditional task environments, this paper establishes a continuous state-space environment model to improve the validation of the UAV task environment. Based on the DDQN algorithm, the collaborative tasks of the UAV swarm are trained in different task scenarios. The experimental results validate that the DDQN algorithm efficiently trains the UAV swarm to complete the given collaborative tasks while meeting the swarm's requirements for centralization and autonomy, and improves the intelligence of collaborative task execution. The simulation results show that, after training, the proposed UAV swarm carries out the rendezvous task well, with a mission success rate of 90%.
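The defining step of DDQN, decoupling action selection from action evaluation to curb Q-value overestimation, can be sketched as below. This is a generic illustration of the standard double-DQN target, not the paper's specific implementation, and the function name and arguments are hypothetical:

```python
import numpy as np

def ddqn_target(r, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN bootstrap target:
    y = r + gamma * Q_target(s', argmax_a Q_online(s', a)).
    The online network selects the next action; the target network
    evaluates it, which reduces the overestimation bias of plain DQN."""
    if done:
        return r
    a_star = int(np.argmax(q_online_next))    # selection: online network
    return r + gamma * q_target_next[a_star]  # evaluation: target network
```

For example, if the online network prefers action 1 but the target network scores it modestly, the target uses the target network's (lower) value for that action rather than the online maximum, keeping the bootstrap estimate conservative.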
Abstract: Objective To explore the factors influencing coronary slow flow (CSF) in patients with chronic coronary syndrome (CCS) through clinical analysis combined with a bibliometric study. Methods CCS patients who visited the Department of Cardiology of Dongzhimen Hospital from September 2021 to July 2022 were selected. According to the inclusion and exclusion criteria, 37 patients were finally enrolled in the CSF group and 40 in the normal-coronary group. The correlations of general and clinical data with CSF were analyzed in the two groups. Taking Web of Science as the literature source, CSF-related studies published from 2002 to 2022 were retrieved; CiteSpace and VOSviewer software were used to perform co-occurrence, clustering, and burst analyses with keywords as nodes, and the corresponding visualization maps were drawn and interpreted. Results In the clinical study, univariate analysis followed by multivariable logistic regression showed that a high hemoglobin (HGB) level (OR=1.103, P=0.001), atrial fibrillation (AF) (OR=19.791, P=0.010), and a family history of coronary heart disease (OR=3.811, P=0.046) were independent risk factors for CSF in CCS patients. In the bibliometric study, 1367 CSF-related publications were retrieved. Keyword co-occurrence and clustering analyses showed that the disease-related research hotspots of CSF mainly concerned angina pectoris, myocardial infarction, percutaneous coronary intervention, and arterial disease; the imaging hotspots concerned intravascular ultrasound, thrombolysis in myocardial infarction (TIMI) flow count, and angiography; and the mechanism hotspots mainly concerned atherosclerosis, endothelial dysfunction, and inflammation. Over the past five years, CSF research hotspots have shifted toward clinical management and prognosis. Conclusion The independent risk factors for CSF in CCS patients are a high HGB level, AF, and a family history of coronary heart disease. In the bibliometric study, the mechanism-related research hotspots of CSF mainly concerned atherosclerosis, endothelial dysfunction, and inflammation.
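The odds ratios reported above come from multivariable logistic regression, where OR = exp(β) for a one-unit increase in the predictor. The tiny sketch below only illustrates that relation; the coefficients are back-computed from the reported ORs for illustration and are not taken from the study itself:

```python
import math

def odds_ratio(beta):
    """In logistic regression, exp(beta) is the multiplicative change in
    the odds of the outcome (here, CSF) per one-unit increase of the
    predictor, holding the other covariates fixed."""
    return math.exp(beta)

# Implied coefficients, back-computed from the reported ORs:
beta_hgb = math.log(1.103)   # per-unit hemoglobin increase
beta_af = math.log(19.791)   # presence of atrial fibrillation
```

An OR of 1.103 per unit of HGB compounds over larger differences: a 10-unit higher HGB multiplies the odds by 1.103**10, roughly 2.7, under the model's linearity assumption.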