Abstract
Non-player characters (NPCs) in games acquire intelligence through learning, so the design of the learning algorithm is a key issue. This paper proposes an improved Q-learning algorithm (SA-QL) that builds on the simulated annealing algorithm to address the shortcomings of Q-learning in its state space, exploration strategy, and reward function. The algorithm is applied to behavior tree design so that an NPC can learn in real time during play and adjust the best execution point of the logical behaviors in its behavior tree, producing appropriate behavioral responses. Experimental results show that SA-QL is more efficient than traditional Q-learning and controls NPCs more effectively.
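The abstract does not spell out how SA-QL combines the two techniques, so the following is only a minimal sketch of one common way to pair them: a Metropolis-style acceptance rule with a cooling temperature replaces the exploration step of ordinary tabular Q-learning. The function names, the five-state corridor environment, and all parameter values (`t0`, `cooling`, `alpha`, `gamma`) are assumptions made for illustration; the paper's state-space and reward-function changes and its behavior tree integration are not modeled here.

```python
import math
import random
from collections import defaultdict


def sa_select_action(q_row, actions, temperature, rng):
    """Pick an action with a Metropolis-style acceptance rule (assumed scheme).

    A random candidate replaces the greedy action with probability
    exp((Q[candidate] - Q[greedy]) / temperature), which shrinks as the
    temperature cools, so exploration fades over time.
    """
    greedy = max(actions, key=lambda a: q_row[a])
    candidate = rng.choice(actions)
    delta = q_row[candidate] - q_row[greedy]  # never positive
    if temperature > 0 and rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update."""
    best_next = max(q[next_state][a] for a in actions)
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(episodes=200, t0=1.0, cooling=0.97, seed=0):
    """Hypothetical toy task: corridor of states 0..4, reward 1.0 at state 4."""
    rng = random.Random(seed)
    actions = [-1, 1]  # step left or step right
    q = defaultdict(lambda: {a: 0.0 for a in actions})
    temperature = t0
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on steps per episode
            action = sa_select_action(q[state], actions, temperature, rng)
            next_state = min(max(state + action, 0), 4)
            reward = 1.0 if next_state == 4 else 0.0
            q_update(q, state, action, reward, next_state, actions)
            state = next_state
            if state == 4:
                break
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return q
```

Calling `train()` returns the learned Q-table; comparing `q[3][1]` with `q[3][-1]` shows which move the agent has come to prefer next to the goal. Early on, while the temperature is high, almost any candidate is accepted and the agent explores widely; as the temperature falls, the rule approaches purely greedy selection.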
Source
Computer Applications and Software (《计算机应用与软件》)
2017, No. 12, pp. 235-239 (5 pages)
Funding
National Natural Science Foundation of China (61472294)
Fundamental Research Funds for the Central Universities (15521004)