Abstract
Classifying imbalanced data is a problem of considerable practical significance. The Mahalanobis-Taguchi system (MTS) uses only the normal class to construct a reference space and a measurement scale, and builds its classification model on that basis, which makes it well suited to imbalanced data. Building on the traditional MTS method, this paper combines the signal-to-noise ratio with classification accuracy indicators such as F-value and G-mean to construct a genetic-algorithm-based reference space optimization model, and then applies the Bagging ensemble algorithm to obtain an improved MTS algorithm, GBMTS. Experiments comparing several classification methods on a number of data sets indicate that GBMTS handles imbalanced data classification more effectively than the other methods tested.
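The abstract names F-value and G-mean as its accuracy indicators but does not define them. The sketch below uses the definitions that are standard in the imbalanced-learning literature (F-value as the weighted harmonic mean of precision and recall, G-mean as the geometric mean of sensitivity and specificity); whether the paper uses exactly these forms is an assumption. The function names and the toy labels are hypothetical and exist only for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f_value(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta=1 gives F1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy example (made-up labels): 2 minority samples among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print(accuracy, f_value(tp, fp, fn), g_mean(tp, fp, tn, fn))
```

In this toy case plain accuracy is 0.8 even though only half of the minority samples are found, while the F-value is 0.5. That gap is why indicators of this kind are preferred over raw accuracy on imbalanced data.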
Source
《数理统计与管理》
CSSCI
PKU Core Journals (北大核心)
2016, No. 6, pp. 1016-1027 (12 pages)
Journal of Applied Statistics and Management
Funding
Supported by the National Natural Science Foundation of China (71271114)
Keywords
Mahalanobis-Taguchi system, imbalanced data, classification, genetic algorithm, bagging algorithm