To address the problem that existing multivariate grey incidence models cannot be applied to time series on different scales, a new model based on spatial pyramid pooling is proposed. First, local features of multivariate time series on different scales are pooled and aggregated by spatial pyramid pooling to construct n levels of feature pooling matrices on the same scale. Second, Deng's multivariate grey incidence model is introduced to measure the degree of incidence between the feature pooling matrices at each level. Third, the grey incidence degrees at each level are integrated into a global incidence degree. Finally, the proposed model is evaluated on two data sets against a variety of algorithms. The results indicate that the proposed model is more effective and efficient than the other similarity measure algorithms.
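The pooling and incidence steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes univariate series, max pooling, pyramid levels of 1, 2, and 4 bins, and the standard form of Deng's grey relational degree with distinguishing coefficient rho = 0.5, and it computes a single degree over the pooled vector rather than one degree per level. The function names `pyramid_pool` and `deng_degree` are hypothetical.

```python
def pyramid_pool(series, levels=(1, 2, 4)):
    """Max-pool a 1-D series into sum(levels) values, whatever its length."""
    n = len(series)
    pooled = []
    for bins in levels:
        for b in range(bins):
            # Floor for the start and ceiling for the end keep every bin non-empty.
            start = (b * n) // bins
            end = -(-((b + 1) * n) // bins)  # ceiling division
            pooled.append(max(series[start:end]))
    return pooled


def deng_degree(reference, other, rho=0.5):
    """Deng's grey relational degree between two equal-length sequences."""
    deltas = [abs(a - b) for a, b in zip(reference, other)]
    d_min, d_max = min(deltas), max(deltas)
    if d_max == 0:  # identical sequences
        return 1.0
    coefficients = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return sum(coefficients) / len(coefficients)


# Two series of different lengths become comparable after pooling.
short_series = [1, 3, 2, 5]
long_series = [1, 2, 3, 3, 2, 4, 5, 5]
pooled_short = pyramid_pool(short_series)
pooled_long = pyramid_pool(long_series)
degree = deng_degree(pooled_short, pooled_long)
```

Because every pyramid level yields a fixed number of bins, both series map to vectors of length 7, which is what allows a point-wise measure such as Deng's degree to be applied. In practice the series would usually be normalized before the degree is computed; that step is omitted here.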
Pulmonary nodules represent an early manifestation of lung cancer. However, pulmonary nodules occupy only a small portion of the overall image, which makes image interpretation difficult for physicians and can lead to false positives or missed detections. To solve these problems, the YOLOv8 network is enhanced by adding deformable convolution and atrous spatial pyramid pooling (ASPP), along with a coordinate attention (CA) mechanism. This allows the network to focus on small targets while expanding the receptive field without losing resolution. At the same time, context information on the target is gathered and feature expression is enhanced by attention modules in different directions. The method improves positioning accuracy and achieves good results on the LUNA16 dataset. Compared with other detection algorithms, it improves the accuracy of pulmonary nodule detection to a certain extent.
Accurate prediction of formation pore pressure is essential for predicting fluid flow and managing hydrocarbon production in petroleum engineering. Deep learning techniques have recently received growing interest because of their potential for pore pressure prediction. However, most traditional deep learning models generalize poorly. To fill this technical gap, we developed a new adaptive physics-informed deep learning model with high generalization capability that predicts pore pressure values directly from seismic data. Specifically, the new model, named CGP-NN, consists of a novel parametric feature extraction approach (1DCPP), a stacked multilayer gated recurrent model (multilayer GRU), and an adaptive physics-informed loss function. Through training, the model can automatically select the optimal physical model to constrain the results of each pore pressure prediction. The CGP-NN model generalizes best when the physics-related metric λ = 0.5. A hybrid approach combining the Eaton and Bowers methods is also proposed to build machine-learnable labels and so address the scarcity of labels. To validate the developed model and methodology, a case study on a complex reservoir in the Tarim Basin was performed; it demonstrates high accuracy in pore pressure prediction for new wells along with strong generalization ability. The adaptive physics-informed deep learning approach presented here has potential application in predicting, from seismic data, pore pressures that arise from multiple coupled genesis mechanisms.
With the successful application of deep learning to image segmentation, seismic facies interpretation using convolutional neural networks has developed continuously. These automated methods significantly reduce manual labor, particularly the laborious task of manually labeling seismic facies. However, their extensive demand for training data limits their wider application. To overcome this challenge, we adopt the UNet architecture as the foundational network structure for seismic facies classification, since it has demonstrated effective segmentation results even with small-sample training data. Additionally, we integrate spatial pyramid pooling and dilated convolution modules into the network architecture to enhance the perception of spatial information across a broader range. A seismic facies classification test on the public data from the F3 block verifies the superior performance of the improved network structure in delineating seismic facies boundaries. Comparative analysis against the traditional UNet model shows that our method achieves more accurate classification results, as evidenced by various evaluation metrics for image segmentation; the classification accuracy reaches 96%. Furthermore, the classification results in the seismic slice dimension provide further confirmation of the method's performance, accurately defining the range of the different seismic facies. This approach holds significant potential for analyzing geological patterns and extracting valuable depositional information.
Because Advanced Persistent Threat (APT) attacks evolve rapidly, new, rare, and even previously unseen attack samples keep emerging, which makes it difficult for traditional rule-based detection methods to extract universal rules for effective detection. With progress in techniques such as transfer learning and meta-learning, few-shot network attack detection has advanced. However, challenges remain: time-sequence flow features cannot adapt to the fixed-length input requirement of deep learning, rich information is hard to capture from the original flow when samples are insufficient, and high-level abstract representation is difficult. To address these challenges, a few-shot network attack detection method based on NFHP (Network Flow Holographic Picture)-RN (ResNet) is proposed. Specifically, leveraging inherent properties of images such as translation invariance, rotation invariance, scale invariance, and illumination invariance, network attack traffic features and contextual relationships are intuitively represented in the NFHP. In addition, an improved RN network model is employed for high-level abstract feature extraction, ensuring that the extracted features retain the detailed characteristics of the original traffic behavior regardless of changes in background traffic. Finally, a meta-learning model based on the self-attention mechanism is constructed, which detects novel few-shot APT network attacks through empirical generalization of the high-level abstract feature representations of known-class network attack behaviors. Experimental results demonstrate that the proposed method can learn high-level abstract features of network attacks across different traffic detail granularities. Compared with state-of-the-art methods, it achieves favorable accuracy, precision, recall, and F1 scores for the identification of unknown-class network attacks in cross-validation on multiple datasets.
To detect bull's-eye anomalies in low-frequency seismic inversion models, the study proposes a method based on an optimized You Only Look Once version 7 (YOLOv7) model. The model is enhanced by integrating several modules: the bidirectional feature pyramid network (BiFPN), weighted intersection-over-union (wise-IoU), efficient channel attention (ECA), and atrous spatial pyramid pooling (ASPP). BiFPN facilitates robust feature extraction by enabling bidirectional information flow across network scales, which enhances the ability of the model to capture complex patterns in seismic inversion models. Wise-IoU improves the precision and fineness of reservoir feature localization through its weighted approach to IoU. Meanwhile, ECA optimizes interactions between channels, which promotes effective information exchange and enhances the overall response of the model to subtle inversion details. Lastly, the ASPP module addresses spatial dependencies at multiple scales, which further enhances the ability of the model to identify complex reservoir structures. By integrating these modules, the proposed model not only demonstrates superior performance in detecting bull's-eye anomalies but also marks a pioneering step in using deep learning to enhance the accuracy and reliability of seismic reservoir prediction in oil and gas exploration. The results meet scientific literature standards and provide new perspectives on methodology, contributing to ongoing efforts to refine accurate and efficient prediction models for oil and gas exploration.
In cornfields, factors such as the similarity between corn seedlings and weeds and the blurring of plant edge details pose challenges to corn and weed segmentation. In addition, remote areas such as farmland are usually constrained by limited computational resources and limited collected data. It is therefore necessary to lighten the model so that it adapts better to complex cornfield scenes and makes full use of the limited data. In this paper, we propose an improved image segmentation algorithm based on U-Net. First, the inverted residual structure is introduced into the contraction path to reduce the number of parameters in training and improve feature extraction. Second, the pyramid pooling module is introduced to enhance the network's ability to acquire contextual information and to handle the loss of small targets. Finally, to further enhance the segmentation capability of the model, the squeeze-and-excitation mechanism is introduced in the expansion path. We used images of corn seedlings collected in the field and publicly available corn weed datasets to evaluate the improved model. The improved model has 3.79 M parameters in total and achieves an mIoU of 87.9%. It runs at about 58.9 FPS on a single 3050 Ti video card. The experimental results show that the proposed network can quickly segment corn and weeds in a cornfield scenario with good segmentation accuracy.
To address the problems of artifacts and noise in low-dose computed tomography (CT) images in clinical medical diagnosis, an improved image denoising algorithm based on the generative adversarial network (GAN) architecture was proposed. First, a noise model based on StyleGAN2 was constructed to estimate the real noise distribution, and noise similar to that distribution was generated as the experimental noise data set. Then, a GAN-based network model with an encoder-decoder architecture at its core was constructed and trained with the generated noise data set until it reached the optimal value. Finally, the noise and artifacts in low-dose CT images could be removed by feeding the images into the denoising network. The experimental results showed that the constructed GAN-based network model improved the utilization of noise feature information and the stability of network training, removed image noise and artifacts, and reconstructed images with rich texture and realistic visual effect.
Infrared target detection models increasingly need to be deployed on embedded platforms, which calls for lower memory consumption and better real-time performance without sacrificing accuracy. To address these challenges, we propose a modified You Only Look Once (YOLO) algorithm, PF-YOLOv4-Tiny. The algorithm incorporates spatial pyramid pooling (SPP) and squeeze-and-excitation (SE) visual attention modules to enhance target localization. PANet-based feature pyramid networks (P-FPN) are proposed to transfer semantic information and location information simultaneously and so improve detection accuracy. To lighten the network, the standard convolutions outside the backbone network are replaced with depthwise separable convolutions. In post-processing, the soft non-maximum suppression (soft-NMS) algorithm is employed to mitigate the missed and false detections caused by occlusion between targets. The accuracy of our model reaches 61.75%, while the total number of parameters is only 9.3 M and the cost is 11 GFLOPs. The inference speed reaches 87 FPS on an NVIDIA GeForce GTX 1650 Ti, which meets the requirements of infrared target detection for embedded deployment.
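The soft-NMS post-processing step mentioned above can be sketched as follows. The abstract does not say which soft-NMS variant is used, so this sketch assumes the Gaussian decay form with `sigma = 0.5` and a hypothetical `score_threshold`; boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (box, score) pairs kept by Gaussian soft-NMS, best first."""
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        box, score = candidates.pop(best)
        kept.append((box, score))
        # Decay, rather than delete, the scores of boxes that overlap the winner.
        decayed = []
        for other, s in candidates:
            s *= math.exp(-iou(box, other) ** 2 / sigma)
            if s >= score_threshold:
                decayed.append((other, s))
        candidates = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

Unlike hard NMS, the heavily overlapping second box is not discarded; its score is only reduced, so a partly occluded target behind the first detection can still survive a later confidence cut.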
Multi-license plate detection in complex scenes remains challenging because images with complex backgrounds contain multiple vehicle license plates of different sizes and classes. The densely distributed edge features and the high-curvature features at the stroke turns of Chinese characters are important cues for distinguishing Chinese license plates from other objects. To accurately detect multiple vehicle license plates of different sizes and classes in complex scenes, this research proposes a multi-object Chinese license plate detection method based on an improved YOLOv3 network. The improvements include replacing the residual block of the YOLOv3 backbone network with the Inception-ResNet-A block, embedding the SPP block into the detection network, pruning redundant Inception-ResNet-A blocks to suit the multi-license plate detection task, and clustering the ground truth boxes of license plates to obtain a new set of anchor boxes. A Chinese vehicle license plate image dataset was built for training and testing the improved network, and the location and class of the license plates in each image were accurately labeled. The dataset contains 62,153 images and 4 classes of Chinese vehicle license plates, and almost all images contain multiple license plates of different sizes. Experiments demonstrated that the multi-license plate detection method obtained 83.4% mAP, 98.88% precision, 98.17% recall, 98.52 F1 score, 89.196 BFLOPS, and 22 FPS on the test dataset, and its overall performance was better than that of the five compared networks: YOLOv3, SSD, Faster-RCNN, EfficientDet, and RetinaNet.
Objective: For computer-aided Chinese medical diagnosis, and to address the problem of insufficient segmentation, a novel multi-level method based on the multi-scale fusion residual neural network (MF2ResU-Net) model is proposed. Methods: To obtain refined features of retinal blood vessels, three cascade-connected UNet networks are employed. To deal with the difference between the encoder and decoder parts, shortcut connections are used in MF2ResU-Net to combine the encoder and decoder layers in the blocks. To refine the segmentation features, atrous spatial pyramid pooling (ASPP) is embedded to obtain multi-scale features for the final segmentation networks. Results: The MF2ResU-Net was superior to the existing methods on the criteria of sensitivity (Sen), specificity (Spe), accuracy (ACC), and area under curve (AUC), with values of 0.8013 and 0.8102, 0.9842 and 0.9809, 0.9700 and 0.9776, and 0.9797 and 0.9837 for DRIVE and CHASE DB1, respectively. The experimental results demonstrated the effectiveness and robustness of the model in segmenting vessels of complex curvature and small blood vessels. Conclusion: Based on residual connections and multi-feature fusion, the proposed method can obtain accurate segmentation of retinal blood vessels by refining the segmentation features, which can provide another diagnosis method for computer-aided Chinese medical diagnosis.
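For readers unfamiliar with the reported criteria, sensitivity, specificity, and accuracy can be computed from a binary vessel mask as in the minimal sketch below. It uses the standard confusion-matrix definitions; the function name `vessel_metrics` and the toy masks are illustrative and are not taken from the paper. AUC is omitted because it needs continuous scores rather than a thresholded mask.

```python
def vessel_metrics(predicted, truth):
    """Sensitivity, specificity, and accuracy for flat binary masks (1 = vessel)."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # share of vessel pixels found
    specificity = tn / (tn + fp) if tn + fp else 0.0  # share of background kept clean
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


# Toy flattened masks: 3 vessel pixels and 5 background pixels in the ground truth.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
sen, spe, acc = vessel_metrics(predicted, truth)
```

Because vessel pixels are a small minority of a retinal image, accuracy and specificity tend to be high even for weak models, which is why sensitivity is usually the most informative of the three for thin vessels.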
Moving vehicles appear at different scales in an image because of the perspective effect of different viewpoint distances. Early identification of vehicle targets in front of the ego vehicle is the premise of an advanced driver assistance system (ADAS) for safety surveillance and safe driving. Recognizing the same vehicle at different scales requires feature learning with scale invariance. Unlike existing feature vector methods, the normalized PCA eigenvalues calculated from feature maps are used to extract scale-invariant features. This study proposes a convolutional neural network (CNN) structure with an embedded multi-pooling-PCA module for scale-variant object recognition. The proposed network structure is validated on a scale-variant vehicle image dataset. Compared with the scale-invariant algorithms scale-invariant feature transform (SIFT) and FSAF, as well as miscellaneous other networks, the proposed network achieves the best recognition accuracy on the scale-variant vehicle dataset. To test the practicality of the modified network, it was also evaluated on the public ImageNet dataset, and the comparable results demonstrate its effectiveness in general-purpose applications.
X-ray ptychographic tomography is a nondestructive method for three-dimensional (3D) imaging with nanometer-sized resolvable features. The size of the volume that can be imaged is almost arbitrary, limited only by the penetration depth and the available scanning time. Here we present a method that greatly accelerates the imaging of a given volume by acquiring a limited set of data through a large angular reduction and compensating for the resulting ill-posedness with deeply learned priors. The proposed 3D reconstruction method, "RAPID", relies initially on a subset of the object measured with the nominal number of required illumination angles and treats the reconstructions from the conventional two-step approach as ground truth. It is then trained to reproduce equal fidelity from far fewer angles. After training, it performs with similar fidelity, from a limited set of acquisitions, on portions of the object not seen during training. In our experimental demonstration, the nominal number of angles was 349 and the reduced number of angles was 21, resulting in a ×140 aggregate speedup over a volume of 4.48 × 93.18 × 3.92 μm^3 with (14 nm)^3 feature size, i.e., ~10^8 voxels. RAPID's key distinguishing feature over earlier attempts is the incorporation of atrous spatial pyramid pooling modules into the deep neural network framework in an anisotropic way. We found that adjusting the atrous rate improves reconstruction fidelity because it expands the convolutional kernels' range to match the physics of multi-slice ptychography without significantly increasing the number of parameters.
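The closing remark, that a larger atrous rate widens a kernel's range without adding parameters, can be seen in a minimal one-dimensional sketch. This is only an illustration of atrous (dilated) convolution in general, not of RAPID's anisotropic 3D modules; the function names and the rates 1, 2, and 3 are assumptions made for the example.

```python
def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


def dilated_conv1d(signal, kernel, rate):
    """'Valid' 1-D cross-correlation with kernel taps spaced `rate` samples apart."""
    span = receptive_field(len(kernel), rate)
    return [
        sum(w * signal[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


signal = list(range(10))
kernel = [1, 0, -1]  # the same three weights are reused at every rate

# An ASPP-style module runs several rates in parallel over the same input.
branches = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
```

With three weights, the span grows from 3 samples at rate 1 to 7 samples at rate 3, while the parameter count stays at three. Choosing a different rate along each axis is one way such a module could be made anisotropic.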
Automatic segmentation of pulmonary vessels is a fundamental task in the diagnosis of various pulmonary vascular diseases, but segmentation accuracy suffers from the complex vascular structure. In this paper, an Improved Residual Attention U-Net (IRAU-Net) for segmenting pulmonary vessels in 3D is proposed. To extract more vessel structure information, the Squeeze and Excitation (SE) block is embedded in the downsampling stage. In the upsampling stage, the global attention module (GAM) is used to capture target features at both high and low levels. These two stages are connected by Atrous Spatial Pyramid Pooling (ASPP), which samples at various receptive fields with a low computational cost. Evaluation experiments indicate that IRAU-Net performs better on the segmentation of terminal vessels. It is expected to provide robust support for clinical diagnosis and treatment.
Funding (grey incidence model based on spatial pyramid pooling): Supported by the National Natural Science Foundation of China (71401052) and the Fundamental Research Funds for the Central Universities (2019B19514).
Funding (pore pressure prediction with CGP-NN): Funded by the National Natural Science Foundation of China (General Program: No. 52074314, No. U19B6003-05) and the National Key Research and Development Program of China (2019YFA0708303-05).
Funding (seismic facies classification with UNet): Funded by the Fundamental Research Project of CNPC Geophysical Key Lab (2022DQ0604-4) and the Strategic Cooperation Technology Projects of China National Petroleum Corporation and China University of Petroleum-Beijing (ZLZX 202003).
Funding (few-shot APT network attack detection): Supported by the National Natural Science Foundation of China (Nos. U19A208162202320), the Fundamental Research Funds for the Central Universities (No. SCU2023D008), the Science and Engineering Connotation Development Project of Sichuan University (No. 2020SCUNG129), and the Key Laboratory of Data Protection and Intelligent Management (Sichuan University), Ministry of Education.
文摘To detect bull’s-eye anomalies in low-frequency seismic inversion models,the study proposed an advanced method using an optimized you only look once version 7(YOLOv7)model.This model is enhanced by integrating advanced modules,including the bidirectional feature pyramid network(BiFPN),weighted intersection-over-union(wise-IoU),efficient channel attention(ECA),and atrous spatial pyramid pooling(ASPP).BiFPN facilitates robust feature extraction by enabling bidirectional information fl ow across network scales,which enhances the ability of the model to capture complex patterns in seismic inversion models.Wise-IoU improves the precision and fineness of reservoir feature localization through its weighted approach to IoU.Meanwhile,ECA optimizes interactions between channels,which promotes eff ective information exchange and enhances the overall response of the model to subtle inversion details.Lastly,the ASPP module strategically addresses spatial dependencies at multiple scales,which further enhances the ability of the model to identify complex reservoir structures.By synergistically integrating these advanced modules,the proposed model not only demonstrates superior performance in detecting bull’s-eye anomalies but also marks a pioneering step in utilizing cutting-edge deep learning technologies to enhance the accuracy and reliability of seismic reservoir prediction in oil and gas exploration.The results meet scientific literature standards and provide new perspectives on methodology,which makes significant contributions to ongoing eff orts to refine accurate and efficient prediction models for oil and gas exploration.
Abstract: In cornfields, factors such as the similarity between corn seedlings and weeds and the blurring of plant edge details pose challenges to corn and weed segmentation. In addition, remote areas such as farmland are usually constrained by limited computational resources and limited collected data. Therefore, it becomes necessary to lighten the model so that it adapts better to complex cornfield scenes and makes full use of the limited data. In this paper, we propose an improved image segmentation algorithm based on U-Net. Firstly, the inverted residual structure is introduced into the contraction path to reduce the number of parameters in the training process and improve the feature extraction ability; secondly, the pyramid pooling module is introduced to enhance the network’s ability to acquire contextual information and to deal with the small-target loss problem; finally, to further enhance the segmentation capability of the model, the squeeze-and-excitation mechanism is introduced in the expansion path. We used images of corn seedlings collected in the field and publicly available corn weed datasets to evaluate the improved model. The improved model has 3.79 M parameters in total and achieves an mIoU of 87.9%. It runs at about 58.9 FPS on a single 3050 Ti graphics card. The experimental results show that the network proposed in this paper can quickly segment corn and weeds in a cornfield scenario with good segmentation accuracy.
Funding: Supported by the National Natural Science Foundation of China (No. 11802272) and the China Postdoctoral Science Foundation (No. 2019M651085).
Abstract: To address artifacts and noise in low-dose computed tomography (CT) images used in clinical diagnosis, an improved image denoising algorithm under the generative adversarial network (GAN) architecture was proposed. First, a noise model based on StyleGAN2 was constructed to estimate the real noise distribution, and noise information similar to the real noise distribution was generated as the experimental noise data set. Then, a network model built around an encoder-decoder architecture and based on the GAN idea was constructed, and it was trained with the generated noise data set until it reached the optimal value. Finally, the noise and artifacts in low-dose CT images could be removed by inputting the images into the denoising network. The experimental results showed that the constructed GAN-based network model improved the utilization rate of noise feature information and the stability of network training, removed image noise and artifacts, and reconstructed images with rich texture and realistic visual effect.
Funding: Supported by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 19JKB520031).
Abstract: Infrared target detection models increasingly need to be deployed on embedded platforms, which calls for models with lower memory consumption and better real-time performance without sacrificing accuracy. To address these challenges, we propose a modified You Only Look Once (YOLO) algorithm, PF-YOLOv4-Tiny. The algorithm incorporates spatial pyramidal pooling (SPP) and squeeze-and-excitation (SE) visual attention modules to enhance the target localization capability. PANet-based feature pyramid networks (P-FPN) are proposed to transfer semantic information and location information simultaneously and thereby improve detection accuracy. To lighten the network, the standard convolutions outside the backbone network are replaced with depthwise separable convolutions. In post-processing, the soft non-maximum suppression (soft-NMS) algorithm is employed to reduce the missed and false detections caused by occlusion between targets. The accuracy of our model reaches 61.75%, while the total parameter count is only 9.3 M and the computational cost is 11 GFLOPs. At the same time, the inference speed reaches 87 FPS on an NVIDIA GeForce GTX 1650 Ti, which can meet the requirements of infrared target detection algorithms for embedded deployment.
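The soft-NMS step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the Gaussian score-decay variant, boxes in `(x1, y1, x2, y2)` form, and hypothetical values for `sigma` and `score_thresh`; the abstract does not state which variant or settings PF-YOLOv4-Tiny uses, so none of this should be read as the authors' implementation.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (sigma and score_thresh are illustrative assumptions).

    Instead of deleting boxes that overlap the current best box, their
    scores are decayed by exp(-IoU^2 / sigma). Returns a list of
    (box_index, rescored_value) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda p: remaining[p][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes according to their overlap.
        rescored = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((idx, score))
        remaining = rescored
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
for index, score in soft_nms(boxes, scores):
    print(index, round(score, 4))
```

In this toy input the second box overlaps the first heavily, so it is kept but ranked last with a reduced score, whereas hard NMS with a typical threshold would discard it. That retention is what helps with partially occluded targets.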
Funding: Supported by the China Sichuan Science and Technology Program under Grant 2019YFG0299, the Fundamental Research Funds of China West Normal University under Grant 19B045, and the Research Foundation for Talents of China Normal University under Grant 17YC163.
Abstract: Multi-license plate detection in complex scenes is still a challenging task because images with complex backgrounds contain multiple vehicle license plates of different sizes and classes. The high-density distribution of edge features and the high-curvature features at the stroke turns of Chinese characters are important cues for distinguishing Chinese license plates from other objects. To accurately detect multiple vehicle license plates with different sizes and classes in complex scenes, a multi-object Chinese license plate detection method based on an improved YOLOv3 network was proposed in this research. The improvements include replacing the residual block of the YOLOv3 backbone network with the Inception-ResNet-A block, embedding the SPP block into the detection network, pruning the redundant Inception-ResNet-A blocks to suit the multi-license plate detection task, and clustering the ground truth boxes of license plates to obtain a new set of anchor boxes. A Chinese vehicle license plate image dataset was built for training and testing the improved network, and the location and class of the license plates in each image were accurately labeled. The dataset has 62,153 images and 4 classes of Chinese vehicle license plates, and almost all images contain multiple license plates of different sizes. Experiments demonstrated that the multi-license plate detection method obtained 83.4% mAP, 98.88% precision, 98.17% recall, an F1 score of 98.52, 89.196 BFLOPS, and 22 FPS on the test dataset, and its overall performance was better than that of the five compared networks: YOLOv3, SSD, Faster-RCNN, EfficientDet, and RetinaNet.
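The anchor-box clustering step mentioned in this abstract can be sketched as follows. The abstract does not specify the clustering procedure, so this sketch assumes k-means over box (width, height) pairs with IoU as the similarity measure, a common choice for YOLO-family anchors. The deterministic initialization from the first `k` boxes and the toy box sizes are illustrative assumptions only.

```python
def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at a common corner.

    Assumes strictly positive widths and heights.
    """
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100):
    """Cluster (width, height) pairs into k anchors using IoU similarity.

    Initialization uses the first k boxes so the result is reproducible;
    this is a simplification for the sketch, not a recommended practice.
    """
    centroids = [tuple(map(float, b)) for b in boxes[:k]]
    for _ in range(iters):
        # Assign every box to the centroid it overlaps most.
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda j: wh_iou(b, centroids[j]))
            clusters[best].append(b)
        # Recompute each centroid as the mean width and height of its members.
        updated = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(m[0] for m in members) / len(members)
                mean_h = sum(m[1] for m in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    # Sort anchors by area, smallest first.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Toy plate sizes in pixels: small, distant plates and large, close ones.
plate_sizes = [(10, 4), (100, 40), (12, 5), (96, 38), (11, 4), (104, 42)]
print(kmeans_anchors(plate_sizes, k=2))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when plate sizes vary as widely as the abstract describes.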
Funding: Key R&D Projects in Hebei Province (22370301D); Scientific Research Foundation of Hebei University for Distinguished Young Scholars (521100221081); Scientific Research Foundation of Colleges and Universities in Hebei Province (QN2022107).
Abstract: Objective: To address insufficient segmentation in computer-aided Chinese medical diagnosis, a novel multi-level method based on the multi-scale fusion residual neural network (MF2ResU-Net) model is proposed. Methods: To obtain refined features of retinal blood vessels, three cascade-connected U-Net networks are employed. To deal with the differences between the encoder and decoder parts, MF2ResU-Net uses shortcut connections to combine the encoder and decoder layers in the blocks. To refine the segmentation features, atrous spatial pyramid pooling (ASPP) is embedded to provide multi-scale features for the final segmentation networks. Results: MF2ResU-Net was superior to the existing methods on the criteria of sensitivity (Sen), specificity (Spe), accuracy (ACC), and area under curve (AUC), with values of 0.8013 and 0.8102, 0.9842 and 0.9809, 0.9700 and 0.9776, and 0.9797 and 0.9837 for DRIVE and CHASE DB1, respectively. The experimental results demonstrated the effectiveness and robustness of the model in segmenting vessels with complex curvature and small blood vessels. Conclusion: Based on residual connections and multi-feature fusion, the proposed method can obtain accurate segmentation of retinal blood vessels by refining the segmentation features, which can provide another diagnosis method for computer-aided Chinese medical diagnosis.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51875340).
Abstract: Moving vehicles appear at different scales in an image because of the perspective effect of different viewing distances. Early identification of vehicle targets in front of the ego vehicle is the premise of an advanced driver assistance system (ADAS) for safety surveillance and safe driving. Recognizing the same vehicle at different scales requires feature learning with scale invariance. Unlike existing feature vector methods, the normalized PCA eigenvalues calculated from feature maps are used to extract scale-invariant features. This study proposed a convolutional neural network (CNN) structure embedded with a multi-pooling-PCA module for scale-variant object recognition. The proposed network structure is validated on a scale-variant vehicle image dataset. Compared with the scale-invariant algorithms scale-invariant feature transform (SIFT) and FSAF, as well as miscellaneous networks, the proposed network achieves the best recognition accuracy on the scale-variant vehicle dataset. To test the practicality of the modified network, it was also evaluated on the public ImageNet dataset, and the comparable results demonstrated its effectiveness in general-purpose applications.
Funding: Funded by the Intelligence Advanced Research Projects Activity, Office of the Director of National Intelligence (IARPA-ODNI) under contract FA8650-17-C-9113.
Abstract: X-ray ptychographic tomography is a nondestructive method for three-dimensional (3D) imaging with nanometer-sized resolvable features. The size of the volume that can be imaged is almost arbitrary, limited only by the penetration depth and the available scanning time. Here we present a method that rapidly accelerates the imaging operation over a given volume by acquiring a limited set of data via large angular reduction and compensating for the resulting ill-posedness through deeply learned priors. The proposed 3D reconstruction method “RAPID” relies initially on a subset of the object measured with the nominal number of required illumination angles and treats the reconstructions from the conventional two-step approach as ground truth. It is then trained to reproduce equal fidelity from far fewer angles. After training, it performs with similar fidelity on the hitherto unexamined portions of the object, not shown during training, with a limited set of acquisitions. In our experimental demonstration, the nominal number of angles was 349 and the reduced number of angles was 21, resulting in a ×140 aggregate speedup over a volume of 4.48 × 93.18 × 3.92 μm^(3) and with (14 nm)^(3) feature size, i.e., ~10^(8) voxels. RAPID’s key distinguishing feature over earlier attempts is the incorporation of atrous spatial pyramid pooling modules into the deep neural network framework in an anisotropic way. We found that adjusting the atrous rate improves reconstruction fidelity because it expands the convolutional kernels’ range to match the physics of multi-slice ptychography without significantly increasing the number of parameters.
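The closing claim, that a larger atrous rate widens a kernel's range without adding parameters, follows from the standard dilated-convolution relation: a kernel of size k with rate r spans k + (k - 1)(r - 1) samples per axis while keeping the same weights. The sketch below works this out for a 3D kernel with a different rate per axis. The kernel size, the per-axis rates, and the channel counts are hypothetical values chosen for illustration and are not taken from RAPID.

```python
def effective_extent(kernel_size, rates):
    """Span (in samples) covered by a dilated kernel along each axis.

    A kernel with `kernel_size` taps and dilation rate r covers
    kernel_size + (kernel_size - 1) * (r - 1) samples on that axis.
    """
    return tuple(kernel_size + (kernel_size - 1) * (r - 1) for r in rates)


def weight_count(kernel_size, in_channels, out_channels, ndim=3):
    """Number of convolution weights (bias excluded); independent of dilation."""
    return kernel_size ** ndim * in_channels * out_channels


# Hypothetical anisotropic settings: one dilation rate per axis of a 3D kernel.
for rates in [(1, 1, 1), (1, 2, 4), (2, 4, 8)]:
    print(rates, effective_extent(3, rates), weight_count(3, 16, 16))
```

Every row reports the same 6,912 weights, while the covered span grows from 3 × 3 × 3 to 5 × 9 × 17 samples, so the receptive field can be stretched along one axis more than the others at no parameter cost.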
Abstract: Automatic segmentation of pulmonary vessels is a fundamental and essential task for the diagnosis of various pulmonary vessel diseases. Segmentation accuracy suffers from the complex vascular structure. In this paper, an Improved Residual Attention U-Net (IRAU-Net) for segmenting pulmonary vessels in 3D is proposed. To extract more vessel structure information, the Squeeze and Excitation (SE) block is embedded in the downsampling stage. In the upsampling stage, the global attention module (GAM) is used to capture target features at both high and low levels. These two stages are connected by Atrous Spatial Pyramid Pooling (ASPP), which can sample in various receptive fields at a low computational cost. Evaluation experiments indicate that IRAU-Net performs better on the segmentation of terminal vessels. It is expected to provide robust support for clinical diagnosis and treatment.