A new method for detecting multiple outliers in multivariate data is proposed. It combines minimum subsets, resampling, and the self-organizing map (SOM) algorithm introduced by Kohonen, which gives a robust neural-network-based approach. The number and organization of the neurons are chosen according to the characteristics of the spectra: for example, spectral data often vary linearly with the concentration of the components and are often measured repeatedly. The spatial distribution of the neurons can therefore be arranged to match these characteristics. The method detects all outliers in the spectra, which the traditional method cannot do, and it computes faster than the traditional neural network method. Simulation and experimental results show that the method is simple, effective, and intuitive, and that all outliers in the spectra are detected in a short time. It is useful in combination with regression models in near-infrared research.
Recent financial and debt crises have made credit risk management one of the most important issues in financial research. Reliable credit scoring models are crucial for financial agencies evaluating credit applications and have been widely studied in machine learning and statistics. In this paper, a novel feature-weighted support vector machine (SVM) credit scoring model is presented for credit risk assessment, in which an F-score is adopted for feature importance ranking. To account for interactions among the modeling features, random forest is further introduced to measure relative feature importance. These two feature-weighted versions of SVM are tested against the traditional SVM on two real-world datasets, and the results support the validity of the proposed method.
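The F-score ranking mentioned in the credit scoring abstract can be illustrated with a short sketch. The abstract does not give the formula, so the code below assumes the commonly used two-class F-score (between-class separation of the feature means divided by the summed within-class sample variances). The function names `f_scores` and `weight_features`, the toy data, and the choice to apply the weights by rescaling the input columns are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def f_scores(X, y):
    """Return one F-score per feature column.

    Assumed definition (not confirmed by the abstract): squared distance of
    each class mean from the overall mean, divided by the sum of the two
    within-class sample variances.
    X: list of rows (lists of floats); y: list of 0/1 class labels.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        all_mean = mean(row[j] for row in X)
        pos_mean = mean(row[j] for row in pos)
        neg_mean = mean(row[j] for row in neg)
        between = (pos_mean - all_mean) ** 2 + (neg_mean - all_mean) ** 2
        within = (
            sum((row[j] - pos_mean) ** 2 for row in pos) / (len(pos) - 1)
            + sum((row[j] - neg_mean) ** 2 for row in neg) / (len(neg) - 1)
        )
        # A constant feature has zero within-class variance; score it as 0.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def weight_features(X, weights):
    """Rescale each column by its weight (hypothetical way to apply weights)."""
    return [[w * v for w, v in zip(weights, row)] for row in X]


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [
    [1.0, 5.0],
    [2.0, 3.0],
    [1.5, 4.0],
    [8.0, 4.5],
    [9.0, 3.5],
    [8.5, 4.0],
]
y = [0, 0, 0, 1, 1, 1]

scores = f_scores(X, y)
total = sum(scores)
weights = [s / total for s in scores]    # normalize so the weights sum to 1
X_weighted = weight_features(X, weights)  # would then be passed to an SVM
```

On this toy data the discriminative first feature receives essentially all of the weight. In practice the weighted matrix would be handed to an SVM trainer; whether the paper rescales inputs or modifies the kernel is not stated in the abstract.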
Funding: Project supported by the National Basic Research Program (973) of China (No. 2011CB706506), the National Natural Science Foundation of China (No. 50905159), the Natural Science Foundation of Jiangsu Province (No. BK2010261), and the Fundamental Research Funds for the Central Universities (No. 2011XZZX005), China