Abstract
Predicting in advance whether pedestrians at the roadside intend to cross the street, or whether they will begin crossing after a period of time, is one of the key challenges facing autonomous vehicles. Effectively fusing heterogeneous information from different modalities is central to predicting pedestrian crossing intention accurately. This paper therefore proposes a multi-information fusion prediction model based on a hybrid attention mechanism. An image feature fusion network based on a cross-attention mechanism extracts the complementary information between the original image and the semantic image, and directs the model's attention to the image regions relevant to pedestrian crossing behavior. In addition, a hierarchical gated recurrent unit (GRU) module incorporating an attention mechanism is proposed to capture the influence of non-visual information from different modalities on pedestrian crossing intention. Comparative experiments on the PIE and JAAD datasets show that the proposed model outperforms comparable methods, and extensive ablation experiments on the proposed modules demonstrate their effectiveness.
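The abstract does not give the layer-level design of the cross-attention fusion network, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each image stream has already been reduced to a list of feature vectors ("tokens"), uses plain scaled dot-product attention with no learned projections, and fuses the two attended streams by per-token concatenation. The names `cross_attention` and `fuse`, and the concatenation rule, are hypothetical choices made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query row attends over all key rows."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value rows, one output dimension at a time.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse(rgb_tokens, sem_tokens):
    """Hypothetical fusion rule: attend in both directions, then concatenate.

    Assumes both streams have the same number of tokens.
    """
    rgb_ctx = cross_attention(rgb_tokens, sem_tokens, sem_tokens)  # RGB queries semantic
    sem_ctx = cross_attention(sem_tokens, rgb_tokens, rgb_tokens)  # semantic queries RGB
    return [r + s for r, s in zip(rgb_ctx, sem_ctx)]


# Toy features: two tokens of dimension 2 per stream.
rgb = [[1.0, 0.0], [0.0, 1.0]]
sem = [[0.5, 0.5], [1.0, 0.0]]
fused = fuse(rgb, sem)
```

Here `fused` holds one vector per token, with twice the input feature dimension. A trainable version would add learned query, key, and value projections and typically multiple attention heads.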
Authors
桑海峰
刘玉龙
刘泉恺
SANG Hai-feng; LIU Yu-long; LIU Quan-kai (School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110870, China)
Source
《控制与决策》
EI
CSCD
PKU Core (Peking University Core Journals)
2024, No. 12, pp. 3946-3954 (9 pages)
Control and Decision
Funding
National Natural Science Foundation of China (62173078)
Natural Science Foundation of Liaoning Province (2022-MS-268)
Keywords
pedestrian crossing intention prediction
cross-attention mechanism
autonomous driving
video analysis
computer vision
multi-information fusion