Abstract
After analyzing common Bayesian classification methods and their implementation models, this paper proposes a classification algorithm suited to Chinese e-mail: a minimum-risk Bayes method based on a hybrid model. The hybrid model combines the Binary Independence Model with the Multinomial Model to raise the recall of mail classification. Applying the minimum-risk Bayes method on top of the hybrid model further raises precision. Experiments show that the improved method classifies mail more accurately.
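The two ideas named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two models are combined, so the code below assumes a simple weighted average of the two posteriors. The function names, the `weight` parameter, the toy corpus, and the loss values are all hypothetical, and the input is assumed to be already word-segmented. It is not the paper's implementation.

```python
import math
from collections import Counter

def train(docs, labels, vocab):
    """Estimate priors plus Laplace-smoothed Bernoulli and multinomial parameters."""
    model = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        n_docs = len(class_docs)
        doc_freq = Counter(w for d in class_docs for w in set(d))   # documents containing w
        term_freq = Counter(w for d in class_docs for w in d)       # occurrences of w
        total_terms = sum(term_freq[w] for w in vocab)
        model[c] = {
            "log_prior": math.log(n_docs / len(docs)),
            "bern": {w: (doc_freq[w] + 1) / (n_docs + 2) for w in vocab},
            "multi": {w: (term_freq[w] + 1) / (total_terms + len(vocab)) for w in vocab},
        }
    return model

def _normalize(log_scores):
    """Turn per-class log scores into probabilities that sum to 1."""
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def bernoulli_posterior(model, doc, vocab):
    """Binary independence model: every vocabulary word counts, present or absent."""
    present = set(doc)
    scores = {}
    for c, p in model.items():
        s = p["log_prior"]
        for w in vocab:
            q = p["bern"][w]
            s += math.log(q if w in present else 1.0 - q)
        scores[c] = s
    return _normalize(scores)

def multinomial_posterior(model, doc):
    """Multinomial model: each token occurrence contributes; unknown words are skipped."""
    scores = {}
    for c, p in model.items():
        s = p["log_prior"]
        for w in doc:
            if w in p["multi"]:
                s += math.log(p["multi"][w])
        scores[c] = s
    return _normalize(scores)

def hybrid_posterior(model, doc, vocab, weight=0.5):
    """ASSUMPTION: combine the two models by a weighted average of posteriors."""
    b = bernoulli_posterior(model, doc, vocab)
    m = multinomial_posterior(model, doc)
    return {c: weight * b[c] + (1.0 - weight) * m[c] for c in model}

def min_risk_decision(posterior, loss):
    """Pick the action with the smallest expected loss; loss[action][true_class]."""
    risks = {a: sum(loss[a][c] * posterior[c] for c in posterior) for a in loss}
    return min(risks, key=risks.get), risks

# Toy, pre-segmented corpus (hypothetical data).
docs = [["免费", "中奖", "免费"], ["会议", "报告"], ["中奖", "点击"], ["会议", "安排", "报告"]]
labels = ["spam", "ham", "spam", "ham"]
vocab = sorted({w for d in docs for w in d})
model = train(docs, labels, vocab)

# Hypothetical losses: marking legitimate mail as spam costs 10x the reverse error.
loss = {"spam": {"spam": 0.0, "ham": 10.0}, "ham": {"spam": 1.0, "ham": 0.0}}
posterior = hybrid_posterior(model, ["免费", "中奖"], vocab)
decision, risks = min_risk_decision(posterior, loss)
```

With an asymmetric loss table like this one, a message is labeled spam only when the spam posterior is high enough to outweigh the cost of a false positive, which is one way a minimum-risk rule can trade recall for precision.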
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2006, No. 31, pp. 97-100, 113 (5 pages)
Funding
Tianjin Informatization Funding Project (042023012)
Keywords
mail classification
Chinese word segmentation
minimum risk
hybrid model
Bayes