Abstract: By analyzing the gradient update process of scale estimation models based on Intersection over Union (IoU) prediction, we find that they rely on IoU as the only metric during both training and inference and impose no constraint on the distance between the centers of the predicted box and the ground-truth box. As a result, the template becomes contaminated when the appearance model is updated, and localization drifts when classifying foreground and background. Based on this observation, we construct NDIoU (Normalization distance IoU), a new metric that combines IoU with the center-point distance, propose a new scale estimation method built on it, and embed the method in a discriminative tracking framework. Specifically, during training NDIoU serves as the label and a loss function with a center-distance constraint supervises the learning of the network; during online inference the target scale is refined by maximizing NDIoU, which supplies more accurate samples for updating the appearance model. Compared with mainstream methods on seven datasets, the proposed method outperforms all competing algorithms in overall performance. In particular, on the GOT-10k dataset it reaches 65.4%, 78.7%, and 53.4% in AO, SR_(0.50), and SR_(0.75), exceeding the baseline model by 4.3%, 7.0%, and 4.2%, respectively.
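The abstract does not give the exact NDIoU formula, only that it couples IoU with a normalized center-point distance. The following is a minimal sketch of that general idea, using a DIoU-style penalty (center distance normalized by the diagonal of the smallest enclosing box); the function name and the specific penalty term are assumptions, not the paper's definition.

```python
import numpy as np

def iou_with_center_penalty(pred, gt, eps=1e-9):
    """Sketch of an IoU metric penalized by normalized center distance.

    Boxes are (x1, y1, x2, y2). The penalty term here is an assumption
    (DIoU-style); the paper's NDIoU may normalize the distance differently.
    """
    # Intersection area
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter + eps)

    # Squared center distance, normalized by the enclosing-box diagonal
    cp = np.array([(pred[0] + pred[2]) / 2.0, (pred[1] + pred[3]) / 2.0])
    cg = np.array([(gt[0] + gt[2]) / 2.0, (gt[1] + gt[3]) / 2.0])
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps
    dist2 = float(np.sum((cp - cg) ** 2))

    return iou - dist2 / diag2
```

Unlike plain IoU, such a metric still changes as the predicted center moves, which is what allows the scale refinement step to distinguish boxes that overlap the target equally well but are centered differently.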
Abstract: The bounding-box regression branch is a key module of deep trackers, and its performance directly affects tracking accuracy. One metric for evaluating accuracy is Intersection over Union (IoU). IoU-based loss functions have replaced the l_n-norm loss as the mainstream bounding-box regression loss, yet the IoU loss has two inherent flaws: 1) when the predicted box does not intersect the ground-truth box, IoU is a constant 0, so gradient descent cannot update the box parameters; 2) at the optimum of IoU the gradient does not exist, making it hard for the box to converge to the IoU optimum. We reveal the quantitative relations among the parameters of an IoU-optimal bounding box during regression, and point out that when the box center lies at certain positions there exist multiple boxes of different sizes that all make the IoU loss optimal, which increases the uncertainty of box-size regression. Viewing bounding-box regression as optimizing the divergence between two statistical distributions, we propose the Smooth-IoU (SIoU) loss, i.e., a loss that is globally smooth (continuously differentiable) and has a unique extremum. This loss naturally encodes the specific optimal relations among the box parameters, and the box attaining its unique extremum also makes IoU optimal. Smoothness guarantees that gradients exist everywhere so the box regresses more easily to the extremum, and the uniqueness of the extremum guarantees that gradient descent can update the parameters everywhere, thereby avoiding the inherent flaws of the IoU loss. The proposed smooth loss can easily replace the IoU loss in existing deep trackers for training bounding-box regression; results on the LaSOT, GOT-10k, TrackingNet, OTB2015, and VOT2018 benchmarks verify the ease of use and effectiveness of the Smooth-IoU loss.
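The abstract does not spell out the SIoU construction, so the sketch below does not reproduce it; it only demonstrates flaw 1) of the plain IoU loss (1 - IoU) that SIoU is designed to avoid: for non-overlapping boxes the loss is the constant 1 and its gradient with respect to the predicted box vanishes.

```python
import torch

def iou_loss(pred, gt, eps=1e-9):
    """Plain IoU loss (1 - IoU) for axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = torch.max(pred[0], gt[0])
    iy1 = torch.max(pred[1], gt[1])
    ix2 = torch.min(pred[2], gt[2])
    iy2 = torch.min(pred[3], gt[3])
    # Intersection is clamped to zero when the boxes do not overlap
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter + eps)
    return 1.0 - iou

# Predicted box far from the ground truth: no overlap at all.
pred = torch.tensor([10.0, 10.0, 20.0, 20.0], requires_grad=True)
gt = torch.tensor([100.0, 100.0, 120.0, 120.0])
loss = iou_loss(pred, gt)
loss.backward()
print(loss.item())  # 1.0, the same for any non-overlapping prediction
print(pred.grad)    # all zeros: no signal to move the box toward the target
```

A globally smooth loss with a unique extremum, as proposed in the paper, keeps a usable gradient in exactly this regime and removes the ambiguity among differently sized boxes that tie at the IoU optimum.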
Abstract: Medical image segmentation has become a cornerstone for many healthcare applications, allowing for the automated extraction of critical information from images such as Computed Tomography (CT) scans, Magnetic Resonance Imaging (MRI), and X-rays. The introduction of U-Net in 2015 has significantly advanced segmentation capabilities, especially for small datasets commonly found in medical imaging. Since then, various modifications to the original U-Net architecture have been proposed to enhance segmentation accuracy and tackle challenges like class imbalance, data scarcity, and multi-modal image processing. This paper provides a detailed review and comparison of several U-Net-based architectures, focusing on their effectiveness in medical image segmentation tasks. We evaluate performance metrics such as Dice Similarity Coefficient (DSC) and Intersection over Union (IoU) across different U-Net variants including HmsU-Net, CrossU-Net, mResU-Net, and others. Our results indicate that architectural enhancements such as transformers, attention mechanisms, and residual connections improve segmentation performance across diverse medical imaging applications, including tumor detection, organ segmentation, and lesion identification. The study also identifies current challenges in the field, including data variability, limited dataset sizes, and issues with class imbalance. Based on these findings, the paper suggests potential future directions for improving the robustness and clinical applicability of U-Net-based models in medical image segmentation.
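The review compares U-Net variants using DSC and IoU. As a reference point, here is a minimal sketch of how these two overlap metrics are typically computed for binary masks; the function name and toy masks are illustrative assumptions, and the surveyed papers may use soft or per-class variants.

```python
import numpy as np

def dice_and_iou(pred_mask, gt_mask, eps=1e-7):
    """Dice Similarity Coefficient and IoU for binary segmentation masks.

    pred_mask, gt_mask: {0, 1} or boolean arrays of the same shape.
    """
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, gt).sum() + eps)
    return dice, iou

# Toy example: two partially overlapping square masks.
a = np.zeros((8, 8), dtype=np.uint8); a[1:5, 1:5] = 1
b = np.zeros((8, 8), dtype=np.uint8); b[2:6, 2:6] = 1
print(dice_and_iou(a, b))  # Dice = 0.5625, IoU ≈ 0.391
```

Note that DSC weights the intersection more heavily than IoU (DSC = 2·IoU / (1 + IoU)), which is why the two scores reported for the same model always differ in a predictable way.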