期刊文献+

Identification of Topics from Scientific Papers through Topic Modeling

Identification of Topics from Scientific Papers through Topic Modeling
在线阅读 下载PDF
导出
摘要 Topic modeling is a probabilistic model that identifies topics covered in text(s). In this paper, topics were loaded from two implementations of topic modeling, namely, Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA). This analysis was performed in a corpus of 1000 academic papers written in English, obtained from PLOS ONE website, in the areas of Biology, Medicine, Physics and Social Sciences. The objective is to verify if the four academic fields were represented in the four topics obtained by topic modeling. The four topics obtained from Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) did not represent the four academic fields. Topic modeling is a probabilistic model that identifies topics covered in text(s). In this paper, topics were loaded from two implementations of topic modeling, namely, Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA). This analysis was performed in a corpus of 1000 academic papers written in English, obtained from PLOS ONE website, in the areas of Biology, Medicine, Physics and Social Sciences. The objective is to verify if the four academic fields were represented in the four topics obtained by topic modeling. The four topics obtained from Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) did not represent the four academic fields.
作者 Denis Luiz Marcello Owa Denis Luiz Marcello Owa(Pontifical Catholic University of Sao Paulo, Sao Paulo, Brazil)
出处 《Open Journal of Applied Sciences》 2021年第4期541-548,共8页 应用科学(英文)
关键词 Topic Modeling Corpus Linguistics Gensim LSI LDA Topic Modeling Corpus Linguistics Gensim LSI LDA
  • 相关文献

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部