
Research on the Classification of Imbalanced Data Based on the GBMTS Algorithm (cited by: 6)
Abstract Solving the classification problem for imbalanced data is of great practical significance. The Mahalanobis-Taguchi system (MTS) uses a single normal class to construct the reference space and the measurement scale, and builds the data classification model from them, which makes it well suited to imbalanced data classification. Building on the traditional MTS method, this paper combines the signal-to-noise ratio with classification accuracy indicators such as F-value and G-mean to establish a genetic-algorithm-based reference space optimization model, and applies the Bagging ensemble algorithm to construct an improved MTS algorithm, GBMTS. Experimental analysis across different classification methods and related data sets shows that GBMTS handles imbalanced data classification more effectively than the other classification algorithms.
Source Journal of Applied Statistics and Management (《数理统计与管理》), CSSCI, Peking University Core Journal, 2016, No. 6, pp. 1016-1027 (12 pages)
Funding National Natural Science Foundation of China (71271114)
Keywords Mahalanobis-Taguchi system, imbalanced data, classification, genetic algorithm, bagging algorithm
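
The abstract describes building a reference space from the normal class only and scoring classifiers with F-value and G-mean. The sketch below illustrates that baseline idea under stated assumptions: it uses the commonly described scaled Mahalanobis distance of the traditional MTS, the usual definitions of F-value (harmonic mean of precision and recall) and G-mean (geometric mean of the per-class recalls), synthetic data, and a hand-picked threshold. The function names and the threshold are hypothetical, and the sketch does not reproduce the paper's genetic-algorithm optimization or its Bagging ensemble.

```python
import numpy as np


def fit_reference_space(normal):
    """Estimate the reference space from normal-class samples only."""
    mean = normal.mean(axis=0)
    std = normal.std(axis=0, ddof=1)
    corr = np.corrcoef(normal, rowvar=False)
    return mean, std, np.linalg.inv(corr)


def mahalanobis_distance(x, mean, std, corr_inv):
    """Scaled Mahalanobis distance z^T R^{-1} z / k for each row of x."""
    z = (x - mean) / std
    k = x.shape[1]
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k


def f_value_and_g_mean(y_true, y_pred):
    """F-value and G-mean with the minority (abnormal) class labeled 1."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f_value = 2 * precision * recall / denom if denom else 0.0
    g_mean = float(np.sqrt(recall * specificity))
    return float(f_value), g_mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(0.0, 1.0, size=(200, 3))    # majority class (synthetic)
    abnormal = rng.normal(4.0, 1.0, size=(20, 3))   # minority class (synthetic)

    mean, std, corr_inv = fit_reference_space(normal)
    x = np.vstack([normal, abnormal])
    y_true = np.concatenate([np.zeros(200, dtype=int), np.ones(20, dtype=int)])

    threshold = 3.0  # hypothetical cut-off, chosen by hand for illustration
    md = mahalanobis_distance(x, mean, std, corr_inv)
    y_pred = (md > threshold).astype(int)
    print(f_value_and_g_mean(y_true, y_pred))
```

Because the reference space is estimated from the normal class alone, the minority class is needed only for choosing the threshold and evaluating the result, which is the property the abstract points to as making MTS suitable for imbalanced data.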
