摘要
【目的】发现领域文献中主题的新生、消亡、继承、分裂和合并的演化轨迹。【方法】根据文献出版时间划分多个时间窗口,通过LDA主题模型识别各个时间窗口中的主题;利用主题关联(Topic Association)过滤规则确定相邻时间窗口主题间的演化关系;形成连续时间段内主题新生、消亡、继承、分裂和合并的演化轨迹。【结果】在保证主题延续性的条件下,更准确地识别主题的新生、消亡、继承、分裂和合并的演化类型。【局限】固定的时间窗口,未考虑主题演化周期的多样性。【结论】该方法可以有效降低LDA主题模型中相似度较小主题的干扰,提升主题演化关系识别的准确性。
[Objective] To detect the birth, extinction, development, merge and split of topic evolution of the literatures in a certain field. [Methods] This paper divides time windows according to the publication data of the literatures, and LDA model is applied to extract topics from each time window automatically. The topic association filter rules are used to determine evolution relationships between topics in adjacent time windows. Form a topic evolution path in a continuous time period. [Results] Considering the continuity of the topics, different types of topic evolution could be detected with high accuracy. [Limitations] This method fixes the size of time windows without considering the diversity of topic evolution cycles. [Conclusions] This method can effectively reduce the interference of topics with smaller similarity in LDA, and enhance accuracy of evolution relation recognition.
出处
《现代图书情报技术》
CSSCI
2015年第3期18-25,共8页
New Technology of Library and Information Service
基金
国家科技支撑计划子课题"基于文献知识网络的领域学术关系研究与示范"(项目编号:2011BAH10B06-04)的研究成果之一