Efficient detection method for young apples based on the fusion of convolutional neural network and visual attention mechanism

Cited by: 14
Abstract: High-throughput, automatic acquisition of fruit phenotypic data underpins the breeding of new fruit tree varieties, and accurate detection of young fruits is key to obtaining growth data. Young fruits are small and close in color to the leaves, which makes them hard to detect. To detect young apples efficiently in natural environments, this study proposes an improved YOLOv4 network model (YOLOv4-SENL) that fuses two visual attention mechanisms, the Squeeze-and-Excitation block (SE block) and the Non-Local block (NL block). After the YOLOv4 backbone extracts high-level visual features, the SE block reorganizes and consolidates them along the channel dimension to strengthen channel information. An NL block is added to each of the three paths of the improved Path Aggregation Network (PAN), combining non-local information with the local information obtained by convolution to enhance the features. Together, the two attention mechanisms re-integrate high-level features from the channel and non-local perspectives, emphasizing channel information and long-range dependencies and improving the network's ability to capture the characteristics of background and fruit. Finally, feature maps of different sizes are used to compute the coordinates and classes of young apples of different sizes.

For training, the backbone was initialized with weights pre-trained on the MS COCO dataset, and parameters were updated by stochastic gradient descent with an initial learning rate of 0.01, 350 training epochs, a weight decay of 0.000484, and a momentum factor of 0.937. A total of 3000 images were collected in the natural environment, covering young fruits at different growth periods and under different interference factors. Four metrics were used to evaluate the models: precision, recall, F1 score, and average precision. Trained on 1920 images, the network reached an average precision of 96.9% on the 600-image test set, which is 6.9, 1.5, and 0.2 percentage points higher than SSD, Faster R-CNN, and YOLOv4, respectively, indicating that the method can accurately detect apples at the young-fruit stage. The YOLOv4-SENL model was 69 M larger than the SSD model, 59 M smaller than the Faster R-CNN model, and 11 M larger than the YOLOv4 model. Ablation experiments on the 480-image validation set showed that, relative to YOLOv4, precision improved by 3.8 percentage points when only the SE block of YOLOv4-SENL was retained, by 2.7 percentage points when only the three NL blocks were retained, and by 4.1 percentage points when the SE block and the NL blocks in YOLOv4-SENL were swapped. These results indicate that the two visual attention mechanisms significantly improve the network's perception of young apples while adding only a small number of parameters. The findings can serve as a reference for obtaining fruit information in fruit tree breeding research.
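The two attention mechanisms named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy feature map, and the weights are hypothetical, and the NL block is deliberately simplified by replacing its learned embeddings and output projection with identity maps. The SE part follows the usual squeeze (global average pooling), excitation (two small dense layers with ReLU and sigmoid), and channel rescaling steps.

```python
import math


def squeeze(feature_map):
    """Squeeze step: global average pooling, one scalar per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(weights, vector):
    """Plain matrix-vector product used for the two excitation layers."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def se_block(feature_map, w_reduce, w_expand):
    """Rescale each channel of a C x H x W map by a gate in (0, 1)."""
    pooled = squeeze(feature_map)
    hidden = [max(0.0, h) for h in dense(w_reduce, pooled)]  # ReLU
    gates = [1.0 / (1.0 + math.exp(-g)) for g in dense(w_expand, hidden)]  # sigmoid
    return [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]


def non_local_block(positions):
    """Simplified NL block over N positions, each a C-dimensional vector.

    Assumption: the learned embeddings and the output projection are replaced
    by identity maps, so only the softmax-weighted sum over all positions and
    the residual connection remain.
    """
    output = []
    for x_i in positions:
        scores = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in positions]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        aggregated = [
            sum(w / total * x_j[k] for w, x_j in zip(weights, positions))
            for k in range(len(x_i))
        ]
        output.append([a + b for a, b in zip(x_i, aggregated)])  # residual
    return output


if __name__ == "__main__":
    # Hypothetical 2-channel, 2 x 2 feature map reduced to 1 hidden unit.
    fmap = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
    print(se_block(fmap, w_reduce=[[0.1, 0.2]], w_expand=[[0.3], [-0.3]]))
    print(non_local_block([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

The SE gate acts per channel, while the NL step lets every spatial position attend to every other one, which is the channel versus long-range distinction the abstract draws.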
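The training hyperparameters reported in the abstract can be read against a generic stochastic gradient descent update with momentum. The sketch below is an assumption about the form of the update (weight decay added to the gradient as an L2 term, then a momentum buffer), since the abstract gives neither the framework nor the learning-rate schedule; the function name and the toy values are hypothetical.

```python
# Hyperparameters as reported in the abstract.
LEARNING_RATE = 0.01
MOMENTUM = 0.937
WEIGHT_DECAY = 0.000484


def sgd_momentum_step(weights, grads, velocity,
                      lr=LEARNING_RATE, momentum=MOMENTUM,
                      weight_decay=WEIGHT_DECAY):
    """One SGD step with momentum and L2 weight decay (assumed formulation)."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w   # weight decay as an L2 penalty gradient
        v = momentum * v + g       # momentum buffer accumulates gradients
        new_velocity.append(v)
        new_weights.append(w - lr * v)
    return new_weights, new_velocity


if __name__ == "__main__":
    # Hypothetical single parameter with a constant gradient.
    w, v = [1.0], [0.0]
    for step in range(3):
        w, v = sgd_momentum_step(w, [0.5], v)
        print(step, w, v)
```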
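The four evaluation metrics listed in the abstract can be sketched as follows. The counts and detection scores are hypothetical, and the average precision here is the area under the precision-recall curve with a monotone precision envelope, which is one common convention; the abstract does not state the exact protocol (for example the IoU threshold used to decide whether a detection is a true positive).

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 score from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of annotated objects of that class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, is_true_positive in ranked:
        if is_true_positive:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Make precision non-increasing from left to right (envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    area, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        area += (r - previous_recall) * p
        previous_recall = r
    return area


if __name__ == "__main__":
    # Hypothetical counts: 90 correct detections, 10 false alarms, 30 misses.
    print(precision_recall_f1(tp=90, fp=10, fn=30))
    # Hypothetical ranked detections for an image set with 2 annotated fruits.
    print(average_precision([(0.9, True), (0.8, False), (0.7, True)], 2))
```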
Authors: Song Huaibo (宋怀波); Jiang Mei (江梅); Wang Yunfei (王云飞); Song Lei (宋磊) (College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China; Key Laboratory of Agricultural Internet of Things, Ministry of Agriculture and Rural Affairs, Yangling 712100, China; Shaanxi Key Laboratory of Agricultural Information Perception and Intelligent Services, Yangling 712100, China)
Source: Transactions of the Chinese Society of Agricultural Engineering (《农业工程学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2021, No. 9, pp. 297-303 (7 pages)
Funding: National Key Research and Development Program of China (2019YFD1002401); National Natural Science Foundation of China (31701326); National High Technology Research and Development Program of China (863 Program) (2013AA10230402).
Keywords: machine vision; image processing; young apples; fruit detection; YOLOv4; convolutional neural network; visual attention mechanism