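Abstract: To address students' difficulty in allocating attention and its effect on learning, a precise attention-tracking system based on machine vision is proposed. The system comprises an image acquisition device and a precise attention-tracking algorithm. The image acquisition device captures clearer images of the eye region. The pupil-center localization algorithm replaces VGG16 (Visual Geometry Group network) with the lightweight MobileNet v3 and adopts two-level feature fusion and center-keypoint prediction, which improves both detection speed and accuracy. The algorithm reaches a detection speed of 36 frames/s with an accuracy of 97.42%. The gaze-tracking algorithm is designed to compensate for head displacement and achieve accurate gaze tracking. An interactive reading-cognition assessment software for school-age children was also developed. The software computes eye-movement metrics from the collected gaze coordinates and then evaluates the thinking and cognitive abilities of school-age children through analysis and modeling grounded in psychological theory, providing a reference for research in psychology, education and related fields.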
Funding: Supported by the National Natural Science Foundation of China (No. 61702466) and the “Double Tops” Discipline Construction Project.
Abstract: A robust TV logo detection method based on a modified single shot multibox detector (SSD) is presented. Unlike most existing methods, which can only detect TV logos in video frames, the proposed method can also detect TV logos in photographs taken by smartphones or other smart terminals. First, a large-scale TV logo dataset for training the detection model is built using a simple and effective procedure for collecting and labelling TV logos. Then, the parameters and loss function of SSD are modified to better suit the task of TV logo detection. Moreover, a soft-NMS algorithm is introduced to remove redundant overlapping boxes and obtain the final output box. In addition, a hard example mining approach is designed to improve detection accuracy. Finally, extensive comparison experiments are carried out that take into account the different image resolutions, logo positions and environmental factors found in real-world applications. Experimental results demonstrate that the proposed method achieves superior robustness compared with other state-of-the-art methods.
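The soft-NMS step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which soft-NMS variant or settings the authors use, so the Gaussian score decay below, the `sigma` and `score_thresh` values, and the function names `iou` and `soft_nms` are all assumptions made for illustration, not the paper's implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (assumed variant).

    Instead of deleting every box that overlaps the current best box,
    each remaining score is multiplied by exp(-iou^2 / sigma). Boxes whose
    score falls below score_thresh are dropped.
    Returns a list of (box, score) pairs in the order they were selected.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))
        # Decay the scores of the remaining boxes according to their overlap.
        rescored = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((box, score))
        candidates = rescored
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for box, score in soft_nms(demo_boxes, demo_scores):
        print(box, round(score, 4))
```

In this toy input the second box overlaps the first heavily, so its score is reduced rather than discarded, while the distant third box keeps its score. Taking the first returned pair gives a single final output box.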
Funding: Supported by the Beijing Natural Science Foundation, China (No. 4182020); the Open Fund of State Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, China (No. 17E01); and the Key Laboratory for Health Monitoring and Control of Large Structures, Shijiazhuang, China (No. KLLSHMC1901).
Abstract: A multi-block Single Shot MultiBox Detector (SSD) method for small object detection is proposed for railway scenes under unmanned aerial vehicle surveillance. To address the limitations of small object detection, a multi-block SSD mechanism consisting of three steps is designed. First, the original input images are segmented into several overlapping patches. Second, each patch is separately fed into an SSD to detect objects. Third, the patches are merged in two stages. In the first stage, truncated objects in the sub-layer detection results are spliced together. In the second stage, a sub-layer suppression and filtering algorithm based on the concept of non-maximum suppression is used to remove the overlapping boxes of the sub-layers. Boxes that are not detected in the main layer are retained. In addition, sufficient labeled training samples of railway scenes are not available, which hinders the deployment of SSD. A two-stage training strategy leveraging transfer learning is adopted to solve this issue. The deep learning model is first trained on a large amount of labeled auxiliary data and then refined using only a few samples of the railway scene. A railway spot in China that is easily damaged by landslides is investigated as a case study. Experimental results show that the proposed multi-block SSD method achieves an overall accuracy of 96.6%, an improvement of up to 9.2% over the traditional SSD.
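The first step above (segmenting the input image into overlapping patches) and the mapping of patch-level boxes back to image coordinates can be sketched as follows. The abstract gives no patch size or overlap, so the values used here, as well as the function names `split_into_patches` and `to_image_coords`, are hypothetical. The paper's splicing and sub-layer suppression stages are not reproduced.

```python
def split_into_patches(width, height, patch, overlap):
    """Return (x1, y1, x2, y2) windows that cover a width x height image.

    Neighbouring windows share `overlap` pixels; the last window on each
    axis is shifted back so that it ends exactly at the image border.
    """
    if not 0 <= overlap < patch:
        raise ValueError("overlap must satisfy 0 <= overlap < patch")
    stride = patch - overlap

    def starts(length):
        if length <= patch:
            return [0]
        positions = list(range(0, length - patch + 1, stride))
        if positions[-1] != length - patch:
            positions.append(length - patch)  # final window flush with border
        return positions

    return [
        (x, y, min(x + patch, width), min(y + patch, height))
        for y in starts(height)
        for x in starts(width)
    ]


def to_image_coords(box, patch_window):
    """Shift a box detected inside a patch back into full-image coordinates."""
    px, py = patch_window[0], patch_window[1]
    return (box[0] + px, box[1] + py, box[2] + px, box[3] + py)


if __name__ == "__main__":
    windows = split_into_patches(1000, 600, patch=512, overlap=112)
    for window in windows:
        print(window)
    # A box found at (10, 20, 30, 40) inside the last patch:
    print(to_image_coords((10, 20, 30, 40), windows[-1]))
```

Each window would be passed to the detector on its own, and the shifted boxes from all windows would then go through the merging stages described in the abstract.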
Funding: Supported by the National Key R&D Program of China (Grant number 2018YFC0705604).
Abstract: To achieve automatic, fast, efficient and high-precision pavement distress classification and detection, road surface distress image classification and detection models based on deep learning are trained. First, a pavement distress image dataset is built, comprising 9,017 pictures with distress and 9,620 pictures without distress. These pictures were captured on 4 asphalt highways in 3 provinces of China. Each pavement distress image contains one or more types of distress, including alligator crack, longitudinal crack, block crack, transverse crack, pothole and patch. The distresses are labeled with rectangular bounding boxes on the pictures. ResNet and VGG networks are then used both as binary classification models, to separate distressed from non-distressed images, and as multi-label classification models, to classify the six types of distress. Training techniques such as data augmentation, batch normalization, dropout, momentum, weight decay, transfer learning and discriminative learning rates are used in training the models. Among the 4 CNNs considered in this study, namely ResNet 34, ResNet 50, VGG 16 and VGG 19, for binary classification ResNet 50 has the highest Accuracy (96.243%) and Precision (95.183%), while ResNet 34 has the highest Recall (97.824%) and F2 score (97.052%). For multi-label classification, ResNet 50 performs best, with the highest Accuracy of 90.257% (above the 90% required by the Chinese standard JTG H20-2018 for road distress detection), an F2 score of 82.231% and a Precision of 76.509%, while ResNet 34 has the highest Recall of 87.32%. To locate and quantify the distress areas in the images, a single shot multibox detector (SSD) model is developed in which ResNet 50 is used as the base network to extract features. When the intersection over union (IoU) threshold is set to 0, 0.25, 0.50 and 0.75, the mean average precision (mAP) of the model is found to be 74.881%, 50.511%, 28.432% and 3.969%, respectively.
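The abstract reports F2 scores without defining them. A minimal sketch, assuming the standard F-beta definition with beta = 2 (which weights recall more heavily than precision), is given below. The function name `f_beta` and the input values are illustrative only, and how the authors aggregate the score across labels in the multi-label setting is not stated in the abstract.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R).

    With beta = 2 this is the F2 score, which rewards recall more than
    precision. Returns 0.0 when both inputs are zero.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Illustrative values, not taken from the paper.
    print(f_beta(0.5, 1.0))  # high recall, low precision
    print(f_beta(1.0, 0.5))  # high precision, low recall
```

With these illustrative inputs the high-recall case scores higher than the high-precision case, which is why F2 suits a task such as distress detection where missing a defect is costlier than a false alarm.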