CBLGA and CBLCA Hybrid Models for Long and Short Text Classification (cited by: 2)
Abstract: With the development of information technology, many industries need to classify large volumes of text. To improve both the accuracy and the efficiency of text classification, an attention-based CNN-BiLSTM/BiGRU hybrid text classification model (CBLGA) is constructed. CNNs (Convolutional Neural Networks) with different convolution window sizes are first connected in parallel to extract several kinds of local features at once. Their output is then fed into a parallel combination of BiLSTM and BiGRU, which extracts global features closely tied to the context of the text. Finally, the features produced by the two models are fused and an attention mechanism is applied. A second attention-based hybrid model, CNN-BiLSTM/CNN (CBLCA), is also constructed. Its distinguishing feature is that the CNN output is split into two parts: one part is fed into the BiLSTM network, and the other is fused directly with the BiLSTM output. The model therefore retains the local features of the text sequence extracted by the CNN while also using the global features extracted by the BiLSTM. Experiments show that both the CBLGA and CBLCA models achieve effective improvements in accuracy and efficiency. Finally, a classification workflow is established that applies length-appropriate preprocessing and subsequent classification to texts of different lengths, so that the models improve accuracy and efficiency together on both long and short texts.
Authors: WANG Deqiang (王得强); WU Jun (吴军); WANG Liping (王立平) (Department of Mechanical Engineering, Tsinghua University, Beijing 100084, China)
Source: Journal of Jilin University (Information Science Edition) (《吉林大学学报(信息科学版)》), CAS, 2021, No. 5, pp. 553-561 (9 pages)
Funding: Supported by the National Key Research and Development Program of China (2018YFB1703502).
Keywords: CBLGA model; CBLCA model; attention mechanism; hybrid model; text classification
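The fusion-then-attention step described in the abstract for CBLCA can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not state how features are fused or which form of attention is used, so the sketch assumes concatenation per time step and dot-product scoring against a single query vector. The function names (`concat_features`, `attention_pool`), the `query` vector, and all numeric values are hypothetical stand-ins for outputs and parameters that a trained network would supply.

```python
import math


def concat_features(local_feats, global_feats):
    """Fuse two per-time-step feature sequences by concatenation (assumed fusion rule)."""
    if len(local_feats) != len(global_feats):
        raise ValueError("feature sequences must have the same number of time steps")
    return [loc + glob for loc, glob in zip(local_feats, global_feats)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Score each time step against `query`, normalise, and return the weighted sum."""
    scores = [sum(f * q for f, q in zip(step, query)) for step in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * step[d] for w, step in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Hypothetical stand-in values: 3 time steps.
local_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # pretend CNN branch output (2 dims)
global_feats = [[0.5], [0.5], [2.0]]                # pretend BiLSTM output (1 dim)
query = [0.0, 0.0, 1.0]                             # hypothetical learned attention vector

fused = concat_features(local_feats, global_feats)  # 3 steps x 3 dims
pooled, weights = attention_pool(fused, query)
print(weights)
print(pooled)
```

In a real model, the pooled vector would feed a classification layer, and the query vector would be learned during training rather than fixed by hand.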