Abstract
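[Purpose/Significance] This study aims to measure the influence of technical topics accurately using PhraseLDA-SNA and machine learning methods, in order to provide a theoretical reference for formulating science and technology policy and optimizing resource allocation. [Method/Process] The study first analyzes the explicit and implicit determinants of technical topic influence and, on that basis, constructs an indicator system for measuring it. Next, the indicators are analyzed with PhraseLDA-SNA and machine learning methods to measure technical topic influence. Finally, an empirical study in the field of cellulose biodegradation validates the effectiveness of the method. [Results/Conclusions] Compared with traditional methods, the proposed method for measuring technical topic influence, based on PhraseLDA-SNA and machine learning, is significantly less affected by time lags in patent grant and citation.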
[Purpose/Significance] Accurately measuring the influence of technical topics is crucial for decision-makers seeking to understand development trends in the technology sector. It is also an important step in identifying emerging, cutting-edge, and disruptive technical topics. Traditional methods of measuring technical topic influence are strongly affected by time lags in patent grant and citation, lack a forward-looking view of the potential influence of technical topics, and extract topics with insufficient semantic richness. This paper presents a method for measuring technical topic influence based on PhraseLDA-SNA and machine learning. It aims to mitigate the impact of lags in patent grant and citation while improving the interpretability and accuracy of the resulting influence assessment. [Method/Process] In this study, the explicit and implicit determinants of technical topic influence were first analyzed, and an indicator system for measuring technical topic influence was constructed on that basis. Then, the PhraseLDA model was used to extract semantically rich technical topics from a large corpus of pre-processed patent texts and to compute the topic-patent association probabilities. PhraseLDA-SNA enhances the semantic richness of technical topic extraction and deepens the analysis of topic content. Machine learning methods are used to predict the high-citation potential of the patents related to each topic. The study integrates PhraseLDA-SNA and machine learning to measure the importance and the advanced nature of technical topics in promoting the development of the field, and thereby to measure their influence. Finally, an empirical study was conducted in the field of cellulose biodegradation to compare the high-impact technical topics identified by the proposed method with those identified by the traditional method. Several experts with high academic influence and extensive experience in cellulose biodegradation research were invited to evaluate the high-impact technical topics identified in this study, thus validating the effectiveness of the proposed method. [Results/Conclusions] Compared with the traditional method, the technical topic influence measurement approach based on PhraseLDA-SNA and machine learning reveals more in-depth topic content. The method also analyzes the importance and the leading nature of technical topics, which gives it an advantage in quantitative analysis. When the distributions across years of the patents related to the high-impact topics identified by the two methods were compared, the topics identified by the proposed method had a higher association ratio in the most recent data, indicating a significant reduction in the impact of patent grant and citation lags.
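As a rough illustration of how the two ingredients named in the abstract could be combined, the sketch below weights each patent's predicted high-citation probability by its topic-patent association probability, and also computes the share of a topic's association mass that falls in recent years. The scoring formula, the function names (`topic_influence`, `recent_association_ratio`), and the toy numbers are all assumptions made for illustration; the abstract does not state the paper's actual indicator formulas.

```python
from typing import List

def topic_influence(theta: List[List[float]], p_high: List[float]) -> List[float]:
    """Hypothetical score: association-weighted mean of the predicted
    high-citation probability, computed per topic.

    theta[d][k] is the topic-patent association probability of patent d
    with topic k; p_high[d] is a classifier's predicted probability that
    patent d becomes highly cited.
    """
    n_topics = len(theta[0])
    scores = []
    for k in range(n_topics):
        mass = sum(row[k] for row in theta)
        weighted = sum(row[k] * p for row, p in zip(theta, p_high))
        scores.append(weighted / mass if mass > 0 else 0.0)
    return scores

def recent_association_ratio(theta: List[List[float]], years: List[int],
                             topic: int, since: int) -> float:
    """Share of a topic's total association mass contributed by patents
    filed in or after the year `since`."""
    total = sum(row[topic] for row in theta)
    recent = sum(row[topic] for row, y in zip(theta, years) if y >= since)
    return recent / total if total > 0 else 0.0

# Toy data (invented for illustration): three patents, two topics.
theta = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
p_high = [0.8, 0.1, 0.4]
years = [2015, 2022, 2023]

scores = topic_influence(theta, p_high)
ratio_topic0 = recent_association_ratio(theta, years, topic=0, since=2022)
ratio_topic1 = recent_association_ratio(theta, years, topic=1, since=2022)
```

Under these assumptions, a topic whose associated patents are mostly recent can still score well, because the predicted citation potential stands in for citations that have not yet accumulated.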
Authors
项芮
孙巍
XIANG Rui; SUN Wei (Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing 100081; Key Laboratory of Agricultural Big Data, Ministry of Agriculture and Rural Affairs, Beijing 100081)
Source
《农业图书情报学报》
2024, Issue 4, pp. 45-62 (18 pages)
Journal of Library and Information Science in Agriculture
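Funding
National Key R&D Program of China project "Key Technologies and Software for Deep Content Mining and Intelligent Analysis of Scientific and Technical Literature" (2022YFF0711900).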
Keywords
topic mining
patent
influence measurement
machine learning
intellectual property
technology forecasting