
The design of the network public opinion monitoring system based on cluster analysis
Cited by: 5
Abstract

Aim: To design a network public opinion monitoring system that draws on Chinese information processing techniques, and to propose an improved K-means algorithm that addresses open problems in network public opinion mining and supports text clustering and theme discovery on top of Web mining.

Methods: A network public opinion monitoring system based on cluster analysis is constructed, and the key technologies used in each of its modules are described in detail. An improved K-means algorithm is proposed that refines two key steps of K-means: the selection of the initial cluster values and the removal of isolated points (outliers).

Results: The system monitors network public opinion by accurately collecting network resources such as web pages, forums, blogs, and news comments, and by applying Chinese information processing techniques including web page cleaning, Chinese word segmentation, vector model construction, feature selection, dimensionality reduction, and text clustering. The general idea of the improved algorithm is that the user supplies an initial number of clusters k and a maximum value kmax, and the algorithm determines the final number of clusters k automatically during computation.

Conclusion: A network public opinion monitoring system based on cluster analysis is designed, and an improved K-means algorithm is proposed. Implementing the algorithm concretely and integrating these key technologies into an automated system for collecting, analyzing, monitoring, and issuing early warnings on network public opinion is the next focus of this research.
Author: 黄美璇 (Huang Meixuan)
Source: Journal of Baoji University of Arts and Sciences (Natural Science Edition) (《宝鸡文理学院学报(自然科学版)》), CAS, 2011, No. 4, pp. 40-44 (5 pages)
Funding: Liming Vocational University (黎明职业大学) 2010 Research Planning Project (LZ201002)
Keywords: public opinion monitoring; K-means; text clustering; theme discovery
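
The abstract names the two K-means steps that are improved (initial value selection and isolated point removal) and the inputs k and kmax, but it does not give the concrete rules. The following is only a minimal sketch of how such a scheme could look, not the author's algorithm. Every function name and parameter (`remove_outliers`, `initial_centers`, `cluster_auto_k`, `factor`, `min_gain`) is hypothetical, and three rules are assumptions made for illustration: the mean-distance outlier test, the farthest-first seeding, and the "stop when the error no longer drops much" choice of k.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def remove_outliers(points, factor=2.0):
    """Drop isolated points (assumed rule): a point is removed when its mean
    distance to all other points exceeds `factor` times the average of that
    quantity over the whole data set."""
    n = len(points)
    if n < 3:
        return list(points)
    mean_d = [sum(dist(p, q) for q in points) / (n - 1) for p in points]
    limit = factor * sum(mean_d) / n
    return [p for p, d in zip(points, mean_d) if d <= limit]


def initial_centers(points, k):
    """Farthest-first seeding (assumed rule): start from the point closest to
    the overall mean, then keep adding the point farthest from all centers."""
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    centers = [min(points, key=lambda p: dist(p, mean))]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Standard K-means iterations; returns (centers, labels, sse)."""
    centers = [list(c) for c in initial_centers(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, sse


def cluster_auto_k(points, k_init, k_max, min_gain=0.5):
    """Grow k from k_init towards k_max while the sum of squared errors still
    drops by at least `min_gain` (relative). The stopping rule is an assumption."""
    clean = remove_outliers(points)
    k_max = min(k_max, len(clean))
    best_k = k_init
    best = kmeans(clean, best_k)
    for k in range(k_init + 1, k_max + 1):
        trial = kmeans(clean, k)
        if best[2] == 0 or (best[2] - trial[2]) / best[2] < min_gain:
            break
        best_k, best = k, trial
    return best_k, best[0], best[1], clean
```

In the system described in the abstract, the input vectors would be document vectors produced by the segmentation, feature selection, and dimensionality reduction stages rather than raw coordinates. Euclidean distance is used here only to keep the sketch short; a cosine-based distance may suit text vectors better.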