Abstract
The naive Bayes text classifier is simple and efficient, but its attribute independence assumption prevents it from representing the dependencies that exist among attributes in real-world data, which degrades its classification performance. This paper proposes an improved text classification model based on Bayes' theorem, called the Stump Network, and compares it experimentally with the naive Bayes text classifier and the TAN (Tree Augmented Naive Bayes) text classifier. The results show that the Stump Network achieves higher classification accuracy on most data sets.
Source
Journal of Xingtai Polytechnic College (《邢台职业技术学院学报》)
2006, No. 1, pp. 19-21 (3 pages)