
Extended Dyna-Q Algorithm for Path Planning of Mobile Robots

Abstract: This paper presents an extended Dyna-Q algorithm to improve the efficiency of the standard Dyna-Q algorithm. In the first episodes of the standard Dyna-Q algorithm, the agent travels blindly to find the goal position. To overcome this weakness, our approach uses a maximum likelihood model of all state-action pairs to choose actions and update Q-values in the first few episodes. Our algorithm is compared with the one-step Q-learning algorithm and the standard Dyna-Q algorithm on the path planning problem in maze environments. Experimental results show that the proposed algorithm is more efficient than both the one-step Q-learning algorithm and the standard Dyna-Q algorithm, especially in environments with a large number of states.
Source: Journal of Measurement Science and Instrumentation, CAS, 2011, No. 3, pp. 283-287 (5 pages)
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2010-0012609)
Keywords: reinforcement learning; Dyna-Q; path planning; mobile robots
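
The abstract describes the extension only at a high level, so the following is a minimal Python sketch of one plausible reading: a tabular Dyna-Q agent on a deterministic grid maze that, during the first few episodes, sweeps every recorded state-action pair of a maximum-likelihood model when updating Q-values, instead of sampling a fixed number of planning steps. All names (GridMaze, extended_dyna_q), the maze layout, and the parameter values are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict

class GridMaze:
    """Deterministic grid maze: reward 1 on reaching the goal, 0 elsewhere."""
    def __init__(self, width=9, height=6, walls=(), start=(0, 2), goal=(8, 4)):
        self.width, self.height = width, height
        self.walls, self.start, self.goal = set(walls), start, goal

    def step(self, state, action):
        dx, dy = [(0, 1), (0, -1), (-1, 0), (1, 0)][action]   # up, down, left, right
        nxt = (min(max(state[0] + dx, 0), self.width - 1),
               min(max(state[1] + dy, 0), self.height - 1))
        if nxt in self.walls:
            nxt = state                                        # bump into a wall
        reward = 1.0 if nxt == self.goal else 0.0
        return nxt, reward, nxt == self.goal

def epsilon_greedy(Q, state, n_actions, eps):
    if random.random() < eps:
        return random.randrange(n_actions)
    values = [Q[(state, a)] for a in range(n_actions)]
    best = max(values)
    return random.choice([a for a, v in enumerate(values) if v == best])

def extended_dyna_q(env, episodes=50, model_episodes=5, planning_steps=10,
                    alpha=0.1, gamma=0.95, eps=0.1, n_actions=4):
    Q = defaultdict(float)   # tabular Q-values for state-action pairs
    model = {}               # (s, a) -> (r, s'): maximum-likelihood model; in a
                             # deterministic maze this reduces to the last observation
    steps_per_episode = []
    for episode in range(episodes):
        state, done, steps = env.start, False, 0
        while not done:
            a = epsilon_greedy(Q, state, n_actions, eps)
            nxt, r, done = env.step(state, a)
            # Direct RL update (one-step Q-learning)
            best_next = max(Q[(nxt, b)] for b in range(n_actions))
            Q[(state, a)] += alpha * (r + gamma * best_next - Q[(state, a)])
            model[(state, a)] = (r, nxt)
            # Planning: in the first few episodes, sweep ALL recorded
            # state-action pairs (the extension sketched here); afterwards,
            # sample a fixed number of pairs as in standard Dyna-Q.
            if episode < model_episodes:
                pairs = list(model.keys())
            else:
                pairs = random.sample(list(model.keys()),
                                      min(planning_steps, len(model)))
            for (ps, pa) in pairs:
                pr, pn = model[(ps, pa)]
                best = max(Q[(pn, b)] for b in range(n_actions))
                Q[(ps, pa)] += alpha * (pr + gamma * best - Q[(ps, pa)])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return Q, steps_per_episode

if __name__ == "__main__":
    maze = GridMaze(walls={(2, 2), (2, 3), (2, 4), (5, 1), (7, 3), (7, 4), (7, 5)})
    _, curve = extended_dyna_q(maze)
    print("steps per episode:", curve)
```

In this sketch, setting model_episodes=0 recovers standard Dyna-Q; the early full sweeps over the learned model are what stand in for the paper's use of the maximum likelihood model during the first episodes.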
