
Research on Lip-Reading Recognition Based on CNN and Bi-LSTM (cited by: 2)
Abstract To address problems in lip feature extraction and temporal modeling in lip-reading, this paper proposes a deep learning model that combines a convolutional neural network (CNN) with a bi-directional long short-term memory network (Bi-LSTM). The CNN learns lip features, the Bi-LSTM encodes those features over time, and a Softmax layer performs classification. To address the lack of Chinese lip-reading data, the authors built two large Chinese datasets, NUMBER DATASET and PHRACE DATASET. In experiments comparing the model with traditional lip-reading methods on both datasets, recognition accuracy reaches 81.3% on NUMBER DATASET, 8.1% higher than the traditional method, and 83.5% on PHRACE DATASET, 9% higher than the traditional method. The experimental results show that the model effectively improves lip-reading recognition accuracy.
Authors LUO Tian-yi; LIU Da-yun; LI Xiu-zheng; FANG Guo-zhi; AN Xin; WEI Hua-jie; HU Cheng (School of Automation, Harbin University of Science and Technology; School of Computer Science and Technology, Harbin University of Science and Technology; College of Measurement and Control Technology and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, China)
Source Software Guide (《软件导刊》), 2019, No. 10, pp. 36-39 (4 pages)
Funding Heilongjiang Province College Students' Innovation and Entrepreneurship Project (20180214007)
Keywords lip-reading; convolutional neural network; bi-directional long short-term memory; deep learning; sequential coding
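
The pipeline described in the abstract (per-frame CNN features, Bi-LSTM temporal encoding, Softmax classification) can be illustrated with a minimal NumPy forward pass. This is only a sketch and not the authors' implementation: the single convolution layer, the global average pooling, the use of each direction's final hidden state, the frame size, the hidden size, the 10 output classes, and all function names are assumptions made for illustration, and the weights are random and untrained.

```python
import numpy as np

def conv2d_relu_gap(frame, kernels):
    """Valid 2-D cross-correlation + ReLU + global average pooling.
    frame: (H, W); kernels: (C, kH, kW) -> feature vector of shape (C,)."""
    C, kH, kW = kernels.shape
    H, W = frame.shape
    out = np.empty((C, H - kH + 1, W - kW + 1))
    for i in range(H - kH + 1):
        for j in range(W - kW + 1):
            patch = frame[i:i + kH, j:j + kW]
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0).mean(axis=(1, 2))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(xs, W, U, b):
    """Run one LSTM direction over xs (T, D) and return the final hidden state.
    W: (4*Hd, D), U: (4*Hd, Hd), b: (4*Hd,); gate order is i, f, g, o."""
    Hd = U.shape[1]
    h, c = np.zeros(Hd), np.zeros(Hd)
    for x in xs:
        z = W @ x + U @ h + b
        i, f = sigmoid(z[:Hd]), sigmoid(z[Hd:2 * Hd])
        g, o = np.tanh(z[2 * Hd:3 * Hd]), sigmoid(z[3 * Hd:])
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def init_params(rng, n_kernels=4, k=3, hidden=8, n_classes=10):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    def lstm():
        return (rng.normal(0, 0.1, (4 * hidden, n_kernels)),
                rng.normal(0, 0.1, (4 * hidden, hidden)),
                np.zeros(4 * hidden))
    return {"kernels": rng.normal(0, 0.1, (n_kernels, k, k)),
            "fwd": lstm(), "bwd": lstm(),
            "W_out": rng.normal(0, 0.1, (n_classes, 2 * hidden)),
            "b_out": np.zeros(n_classes)}

def predict_proba(clip, p):
    """clip: (T, H, W) sequence of grayscale lip-region frames."""
    # CNN stage: one feature vector per frame -> (T, C)
    feats = np.stack([conv2d_relu_gap(f, p["kernels"]) for f in clip])
    # Bi-LSTM stage: read the sequence forwards and backwards
    h_fwd = lstm_last_state(feats, *p["fwd"])
    h_bwd = lstm_last_state(feats[::-1], *p["bwd"])
    # Softmax stage: class probabilities from the concatenated states
    return softmax(p["W_out"] @ np.concatenate([h_fwd, h_bwd]) + p["b_out"])

rng = np.random.default_rng(0)
params = init_params(rng)
clip = rng.random((12, 16, 16))  # 12 frames of 16x16 pixels (made-up sizes)
probs = predict_proba(clip, params)
print(probs.shape, probs.argmax())
```

Reading the feature sequence in both directions lets the classifier draw on frames before and after any given mouth shape, which is the motivation for a bi-directional rather than a one-directional LSTM.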