
Research on Author Attribution Based on Core Topic Features (基于核心主题特征的作者身份识别研究)
Abstract [Purpose/Significance] This study examines how topic features can be used for author attribution of Chinese social media texts. Word2vec is used to compensate for the shortcomings of topic models in extracting topic features, and a strategy is developed to identify and screen the core topics among those features, with the aim of making topic features more effective for author attribution. [Methods/Process] The LDA topic model is first used to extract the academic topics and social topics of the candidate authors. Word2vec is then used to build a merge-and-screen strategy for identifying and representing the core topics. Finally, these are combined with N-gram features and similarity calculation to attribute authorship. [Results/Conclusion] Core topic features have a positive effect on author attribution of researchers' social media texts, and the proposed core-topic strategy improves the effectiveness of topic features. When core topic features are combined with stylistic features, the highest recognition rate reaches 83%.
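The final step named in the abstract, N-gram features plus similarity calculation, can be illustrated with a minimal sketch. The abstract does not say which N-gram unit or similarity measure the authors used, so character bigrams and cosine similarity are assumptions here, and the LDA and Word2vec stages are left out. The function names (`char_ngrams`, `cosine`, `attribute`) and the toy texts are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count overlapping character n-grams, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute(text: str, profiles: dict, n: int = 2):
    """Return the candidate whose n-gram profile is most similar to `text`."""
    query = char_ngrams(text, n)
    scores = {author: cosine(query, profile) for author, profile in profiles.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus: a few known texts per candidate author.
known_texts = {
    "author_a": [
        "topic models extract latent topics from documents",
        "latent topics describe document collections",
    ],
    "author_b": [
        "the weather today is sunny and warm",
        "sunny weather makes the weekend pleasant",
    ],
}

# One aggregated n-gram profile per candidate (assumed design, not from the paper).
profiles = {
    author: sum((char_ngrams(t) for t in texts), Counter())
    for author, texts in known_texts.items()
}

best, scores = attribute("latent topic models for documents", profiles)
print(best, scores)
```

Character n-grams need no word segmentation, so the same functions apply unchanged to Chinese text. In a fuller pipeline, the similarity score from this step would be combined with a topic-based score rather than used alone.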
Authors: Meng Xu (孟旭); Xie Jing (谢靖); Li Chunwang (李春旺) (National Science Library, Chinese Academy of Sciences, Beijing 100190; Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
Source: Knowledge Management Forum (《知识管理论坛》), 2023, No. 5, pp. 351-364 (14 pages)
Keywords: author attribution; topic characteristics; N-gram; scientific research authors; social media text