Abstract
Feature fusion methods based on the concatenation (concat) operation fuse only features of adjacent scales and do not fully use the output features of other scales. Moreover, concatenation merely joins features of different scales along the channel dimension, so it cannot reflect the correlation and importance of features across channels. To address these problems, a feature fusion algorithm based on the attention mechanism is proposed. The algorithm uses the attention mechanism to fuse features of different scales and learns the correlation between features of different channels by assigning a weight to the features of each channel. A multi-scale target detector is built by combining the proposed feature fusion algorithm with YOLO V3, and the detector's loss function is designed using Focal loss and GIOU loss. Comparative experiments on the PASCAL VOC and KITTI datasets show that the multi-scale target detector achieves higher detection accuracy while maintaining a relatively fast detection speed.
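The two loss terms named in the abstract can be illustrated with a minimal sketch. It uses the commonly published definitions of GIoU and Focal loss, not the paper's actual loss function, whose exact form and weighting are not given here. The function names, the (x1, y1, x2, y2) box layout, and the defaults alpha=0.25 and gamma=2.0 are assumptions made for illustration.

```python
import math


def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area (hypothetical input convention).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Regression loss in [0, 2]; 0 for identical boxes."""
    return 1.0 - giou(box_a, box_b)


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p and a label y in {0, 1}.

    alpha and gamma are assumed defaults, not values taken from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # clamp to avoid log(0)
    # The (1 - p_t)^gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

Unlike plain IoU, the GIoU term stays informative for non-overlapping boxes, and the focal term reduces the contribution of easy examples, which is the usual motivation for pairing them in a detector loss.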
Authors
鞠默然
罗江宁
王仲博
罗海波
Ju Moran; Luo Jiangning; Wang Zhongbo; Luo Haibo (Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning 110016, China; Institute of Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, Liaoning 110016, China; University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, Liaoning 110016, China; Liaoning Key Laboratory of Image Understanding and Computer Vision, Shenyang, Liaoning 110016, China; McGill University, Quebec H3A 0G4, Canada)
Source
《光学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2020, Issue 13, pp. 126-134 (9 pages)
Acta Optica Sinica