Abstract
Current ways of presenting text clustering results convey little of the semantic relationships among texts. To address this, the paper proposes a new visualization method that draws on people's ability to grasp models and structures when they are presented visually. An improved Force-directed model lays out the clustered texts on a plane so that the layout reflects the semantic similarity between texts. A contour-generation algorithm then builds a hierarchical theme map that aggregates and distills the themes of the texts. Together these give an intuitive, semantic presentation of the clustering results. Experiments show that the method not only presents clustering results effectively and reflects the degree of semantic relatedness between clusters and between texts, but also helps uncover hidden information and supports effective information navigation through the associations between clusters.
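The layout step described in the abstract can be illustrated with a small sketch. The abstract does not specify the improved Force-directed model, so the code below is only a plain spring-based layout under stated assumptions: the input is a hypothetical symmetric similarity matrix `sim` with values in [0, 1], each pair of texts is joined by a spring with rest length `1 - sim[i][j]`, and the function name `force_directed_layout` and its parameters are illustrative, not taken from the paper.

```python
import math

def force_directed_layout(sim, iterations=200, step=0.05):
    """Place n items in 2D so that similar items end up close together.

    sim: symmetric n x n list of similarities in [0, 1] (assumed input).
    Each pair is joined by a spring whose rest length is 1 - sim[i][j].
    This is a generic sketch, not the improved model from the paper.
    """
    n = len(sim)
    # Deterministic start: spread the items evenly on the unit circle.
    pos = [[math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i in range(n)]
    for _ in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                # Positive force pulls i toward j, negative pushes it away.
                force = dist - (1.0 - sim[i][j])
                disp[i][0] += force * dx / dist
                disp[i][1] += force * dy / dist
        # Move every item a small step along its net force.
        for i in range(n):
            pos[i][0] += step * disp[i][0]
            pos[i][1] += step * disp[i][1]
    return pos

# Hypothetical example: texts 0 and 1 form one cluster, texts 2 and 3 another.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
pos = force_directed_layout(sim)
```

In the resulting layout, texts from the same cluster sit near each other and the two clusters are pushed apart, which is the kind of plane a contour algorithm could then turn into a theme map.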
Source
《情报学报》
CSSCI
PKU Core Journal (北大核心)
2011, No. 2, pp. 115-120 (6 pages)
Journal of the China Society for Scientific and Technical Information
Funding
National Natural Science Foundation of China; National High-Tech Research and Development Program of China (863 Program)
Keywords
visualization
placement algorithm
text clustering
theme map