Abstract
[Purpose/Significance] Topic-based semantic similarity has received little study in the medical field, and existing work is insufficient to reveal how topics relate at the semantic level. This paper proposes a method for calculating semantic similarity between topics, so that relationships between topics can be judged from a semantic perspective and used as a reference for topic novelty assessment and topic association research. [Method/Process] Taking the MeSH thesaurus as the basis for semantic computation, the paper analyzes the structure of the thesaurus and existing research, measures semantic similarity between topics along three dimensions (entry terms, semantic distance, and annotations), and conducts an empirical study on stem cell literature from PubMed for 2011-2014. [Result/Conclusion] The validity of the three proposed dimensions is verified with commonly used validation pairs of subject headings. The semantic similarity calculation indicates that the comparatively novel topic in the stem cell field during 2011-2014 is stem cell research involving minors. Follow-up work needs to incorporate statistics-based topic similarity in order to reveal relationships between topics more comprehensively and to identify novel research topics at the semantic level.
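The abstract names three dimensions (entry terms, semantic distance, annotations) but gives no formulas. The following is a minimal sketch of how such a combined score might look, not the authors' method. It assumes Jaccard overlap for entry terms and annotation words, a path-based score over dot-separated tree numbers for semantic distance, and equal weights. The `Descriptor` class, all function names, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Descriptor:
    """Hypothetical, simplified stand-in for a thesaurus descriptor."""
    entry_terms: Set[str]   # synonyms / entry terms
    tree_number: str        # dot-separated hierarchy path, e.g. "X01.100.200"
    annotation: str         # free-text note describing the descriptor


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def tree_similarity(tree_a: str, tree_b: str) -> float:
    """Assumed path-based score: 1 / (1 + edges between the two nodes).

    Nodes that share no leading segment are treated as unrelated (0.0).
    """
    pa, pb = tree_a.split("."), tree_b.split(".")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    if common == 0:
        return 0.0
    distance = (len(pa) - common) + (len(pb) - common)
    return 1.0 / (1.0 + distance)


def topic_similarity(d1: Descriptor, d2: Descriptor) -> float:
    """Equal-weight mean of the three dimension scores (weights are an assumption)."""
    s_entry = jaccard(d1.entry_terms, d2.entry_terms)
    s_tree = tree_similarity(d1.tree_number, d2.tree_number)
    s_note = jaccard(set(d1.annotation.lower().split()),
                     set(d2.annotation.lower().split()))
    return (s_entry + s_tree + s_note) / 3.0


# Toy records, invented for illustration only (not real MeSH data).
parent = Descriptor({"stem cell", "progenitor cell"}, "X01.100",
                    "cells that self renew")
child = Descriptor({"stem cell", "mother cell"}, "X01.100.200",
                   "cells that differentiate")
score = topic_similarity(parent, child)
```

Under these assumptions a low score against all earlier topics would flag a topic as a candidate for novelty. The weighting and the distance function are the parts most likely to differ from the paper.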
Source
《图书情报工作》
CSSCI
Peking University Core Journal (北大核心)
2017, Issue 8, pp. 96-105 (10 pages)
Library and Information Service
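Funding
This work is one of the research outputs of the National Natural Science Foundation of China project "Research on Semantics-Based Discovery and Evolution Mechanisms of Frontier Knowledge in the Medical Field" (Grant No. 71303259)
and of the Fundamental Research Funds for Central Public Welfare Research Institutes project "Research on Methods for Detecting Topic Novelty in Medical Literature Based on Statistics and Semantics" (Grant No. 2016RC330004)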