
Applications of XCSG in Multi-robot Reinforcement Learning (cited by: 2)
Abstract: The XCS classifier system has shown strong capability in robot reinforcement learning. In the multi-robot domain, however, it is limited to environments modeled as Markov decision processes (MDPs) and can only handle learning problems with small state spaces. This paper proposes XCSG, a learning technique that combines XCS with gradient descent, to solve multi-robot reinforcement learning problems. XCSG builds a low-dimensional approximation of the Q-function, and gradient descent uses knowledge acquired online to keep that approximation stable, so that the Q-table remains in a stable, low-dimensional state. The approximated Q-function not only requires less storage but also lets a robot generalize online from the knowledge it has already acquired. Simulation experiments show that XCSG effectively addresses the large learning space, slow learning speed, and uncertain learning outcomes of multi-robot learning.
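The idea of replacing a Q-table with a low-dimensional function fitted by gradient descent can be illustrated with a minimal sketch. The code below is generic semi-gradient Q-learning with a linear approximator, not the XCSG algorithm itself: it has no classifier population, no fitness weighting, and only a single agent. The corridor task, the two-element feature vector, and all function names and parameter values are assumptions introduced for illustration.

```python
import random


def q_value(weights, phi, action):
    """Linear approximation: Q(s, a) is the dot product of weights[a] and phi(s)."""
    return sum(w * x for w, x in zip(weights[action], phi))


def td_update(weights, phi, action, reward, phi_next, done, alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step; updates weights in place, returns the TD error."""
    if done:
        best_next = 0.0
    else:
        best_next = max(q_value(weights, phi_next, a) for a in range(len(weights)))
    delta = reward + gamma * best_next - q_value(weights, phi, action)
    # For a linear approximator, the gradient of Q(s, a) w.r.t. weights[a] is phi(s).
    for i, x in enumerate(phi):
        weights[action][i] += alpha * delta * x
    return delta


def features(pos, n_cells):
    """Hypothetical features: a bias term plus the normalized position."""
    return [1.0, pos / (n_cells - 1)]


def train_corridor(n_cells=10, episodes=200, epsilon=0.1, seed=0):
    """Toy 1-D corridor (assumed task): start at cell 0, reward 1 on reaching the last cell."""
    rng = random.Random(seed)
    # 2 actions x 2 features = 4 numbers, versus 2 * n_cells entries in a full Q-table.
    weights = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        pos = 0
        for _ in range(100):  # cap on steps per episode
            phi = features(pos, n_cells)
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                qs = [q_value(weights, phi, a) for a in range(2)]
                action = qs.index(max(qs))  # exploit (ties go to action 0)
            # Action 0 moves left (clamped at 0), action 1 moves right.
            next_pos = max(0, pos - 1) if action == 0 else pos + 1
            done = next_pos == n_cells - 1
            reward = 1.0 if done else 0.0
            td_update(weights, phi, action, reward, features(next_pos, n_cells), done)
            pos = next_pos
            if done:
                break
    return weights
```

The storage point from the abstract shows up in `weights`: its size depends on the number of features, not on the number of states, and each update to a shared weight changes the estimate for every state at once, which is the sense in which the approximator generalizes.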
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 8, pp. 249-251, 292 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (90820004)
Keywords: Reinforcement learning; Multi-robot; Accuracy-based learning classifier system (XCS); Accuracy-based learning classifier system with gradient descent method (XCSG)


