
Text Similarity Calculation Based on Multi-Model Weighted Fusion (多模型加权融合的文本相似度计算)

Cited by: 7
Abstract: Most traditional text similarity methods do not consider semantic and structural information and tend to overlook fine-grained details of text features. To address these problems, a text similarity algorithm based on multi-model weighted fusion is proposed. Three features (word frequency, part of speech, and word and sentence position) are used jointly to compute sentence similarity. To capture the structural information of the text, a hierarchical pooling method, IIG-SIF, is proposed for computing text similarity. The similarity models from these two stages are then combined into a linear weighted model, bringing the two algorithms together to make the result more accurate. Experimental results show that the proposed algorithm improves accuracy and recall and achieves better results on datasets of different languages and granularities.
Authors: TIAN Hong-peng (田红鹏), MA Bo (马博), FENG Jian (冯健). College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an 710600, China
Source: Computer Engineering and Design (《计算机工程与设计》), PKU Core Journal, 2021, No. 11, pp. 3239-3245 (7 pages)
Funding: Natural Science Basic Research Program of Shaanxi Province (2020JM-533).
Keywords: text similarity; feature fusion; word mover's distance; hierarchical pooling; sentence vector
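
The linear weighted model mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting form or the weight values, so the convex combination below, the function name `fuse_similarity`, and the parameter `alpha` are assumptions made only for illustration; the two input scores stand in for the feature-based sentence similarity and the IIG-SIF similarity, which are not implemented here.

```python
def fuse_similarity(sim_features: float, sim_structure: float, alpha: float = 0.5) -> float:
    """Linearly fuse two similarity scores (hypothetical helper, not the paper's code).

    sim_features:  score from the feature-based model (word frequency, POS, position).
    sim_structure: score from the structural model (IIG-SIF in the paper).
    alpha:         assumed weight in [0, 1] given to the feature-based score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Convex combination: the result stays within the range of the two inputs.
    return alpha * sim_features + (1.0 - alpha) * sim_structure


if __name__ == "__main__":
    # Example with made-up scores; real scores would come from the two models.
    print(fuse_similarity(0.8, 0.6, alpha=0.5))
```

In practice the weight would presumably be tuned on a validation set; the value 0.5 here is only a placeholder.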
