Abstract
Blog posts often cover multiple topics and have no obvious category membership, so conventional text classification methods perform poorly when applied to them directly. To address these problems, a tag-based blog post classification method is proposed, which uses social tags as semantic features to construct a relational graph of blog posts and tags. The method casts blog post classification as an optimization problem on this graph, and an iterative algorithm is proposed to compute the probability that each node belongs to each category. Experimental results show that the proposed method improves blog post classification performance compared with conventional text classification methods.
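The abstract does not state the objective function or the update rule, so the following is only a minimal sketch of the general idea, assuming a generic label-propagation scheme on a bipartite post-tag graph. It is not the authors' algorithm. The function name `propagate_labels`, the averaging update, the clamping of labeled posts, and the toy data are all assumptions made for illustration.

```python
def propagate_labels(post_tags, seed_labels, categories, iterations=20):
    """Generic label propagation on a bipartite post-tag graph (illustrative only)."""
    uniform = {c: 1.0 / len(categories) for c in categories}

    def one_hot(label):
        return {c: 1.0 if c == label else 0.0 for c in categories}

    def average(dists):
        # Mean of several category distributions; uniform if there are none.
        if not dists:
            return dict(uniform)
        return {c: sum(d[c] for d in dists) / len(dists) for c in categories}

    # Invert the post -> tags mapping to get tag -> posts edges.
    tag_posts = {}
    for post, tags in post_tags.items():
        for tag in tags:
            tag_posts.setdefault(tag, []).append(post)

    # Labeled posts start as one-hot; everything else starts uniform.
    post_prob = {
        p: one_hot(seed_labels[p]) if p in seed_labels else dict(uniform)
        for p in post_tags
    }
    tag_prob = {t: dict(uniform) for t in tag_posts}

    for _ in range(iterations):
        # Each tag takes the mean distribution of the posts that carry it.
        for tag, posts in tag_posts.items():
            tag_prob[tag] = average([post_prob[p] for p in posts])
        # Each unlabeled post takes the mean distribution of its tags;
        # labeled posts stay clamped to their known category.
        for post, tags in post_tags.items():
            if post not in seed_labels:
                post_prob[post] = average([tag_prob[t] for t in tags])
    return post_prob


# Hypothetical toy data: two labeled posts and two unlabeled ones.
posts = {
    "p1": ["python", "code"],
    "p2": ["recipe", "dinner"],
    "p3": ["python", "tutorial"],
    "p4": ["dinner", "dessert"],
}
seeds = {"p1": "tech", "p2": "food"}
result = propagate_labels(posts, seeds, ["tech", "food"])
```

In this toy run, `p3` shares the tag `python` with the labeled post `p1`, so its probability mass moves toward `tech`, while `p4` moves toward `food` through the shared tag `dinner`. Each node keeps a full probability distribution over categories, which matches the abstract's description of computing per-category probabilities for every node.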
Source
《上海电力学院学报》
CAS
2013, Issue 6, pp. 544-548 (5 pages)
Journal of Shanghai University of Electric Power
Keywords
blog post classification
social tag
graph optimization
iterative algorithm