
Algorithms for Blog Post Classification Based on Blog Tag
Abstract: Blog posts often cover multiple topics and have no obvious category membership, and traditional text classification methods perform poorly when applied to them directly. To address these problems, a tag-based blog post classification method is proposed. The method uses social tags as semantic features to construct a relational graph of blog posts and tags, and casts blog post classification as an optimization problem on that graph. An iterative algorithm is then proposed to compute the probability that each node in the graph belongs to each category. The experimental results show that the proposed method improves blog post classification performance compared with traditional text classification methods.
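The abstract does not give the objective function or the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic label-propagation scheme on a bipartite post-tag graph: labeled posts are clamped to their known category, and every other node repeatedly averages the category distributions of its neighbors. The function name `propagate_labels`, the damping parameter `alpha`, and the toy data are all hypothetical.

```python
def propagate_labels(post_tags, seed_labels, categories, alpha=0.8, iterations=50):
    """Estimate per-category probabilities for every post and tag node.

    post_tags:   dict mapping post id -> list of tags on that post
    seed_labels: dict mapping post id -> known category (labeled posts)
    alpha:       assumed weight of the neighbor average vs. the node's prior
    """
    # Build an undirected bipartite graph: posts on one side, tags on the other.
    neighbors = {}
    for post, tags in post_tags.items():
        neighbors.setdefault(("post", post), set())
        for tag in tags:
            neighbors[("post", post)].add(("tag", tag))
            neighbors.setdefault(("tag", tag), set()).add(("post", post))

    # Priors: one-hot for labeled posts, uniform for everything else.
    uniform = {c: 1.0 / len(categories) for c in categories}
    prior = {}
    for kind, name in neighbors:
        if kind == "post" and name in seed_labels:
            prior[(kind, name)] = {
                c: 1.0 if c == seed_labels[name] else 0.0 for c in categories
            }
        else:
            prior[(kind, name)] = dict(uniform)

    scores = {node: dict(dist) for node, dist in prior.items()}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            is_seed = node[0] == "post" and node[1] in seed_labels
            if is_seed or not nbrs:
                # Labeled posts stay clamped; isolated nodes keep their prior.
                updated[node] = dict(prior[node])
                continue
            # Average the neighbors' distributions, then mix with the prior.
            avg = {c: sum(scores[n][c] for n in nbrs) / len(nbrs) for c in categories}
            mixed = {c: alpha * avg[c] + (1 - alpha) * prior[node][c] for c in categories}
            total = sum(mixed.values())
            updated[node] = {c: v / total for c, v in mixed.items()}
        scores = updated
    return scores


# Toy data (made up for illustration): two labeled and two unlabeled posts.
posts = {
    "p1": ["python", "code"],
    "p2": ["recipe", "dinner"],
    "p3": ["python", "code"],
    "p4": ["recipe"],
}
seeds = {"p1": "tech", "p2": "food"}
scores = propagate_labels(posts, seeds, ["tech", "food"])
best = {p: max(scores[("post", p)], key=scores[("post", p)].get) for p in posts}
```

In this sketch an unlabeled post receives evidence only through the tags it shares with labeled posts, which is one simple way to read "tags as semantic features"; the paper's actual graph weights and convergence criterion may differ.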
Authors: 卢露 (LU Lu), 魏登月 (WEI Dengyue)
Source: Journal of Shanghai University of Electric Power (《上海电力学院学报》), CAS, 2013, No. 6, pp. 544-548 (5 pages)
Keywords: blog classification; social tag; graph optimization; iterative algorithm


