Abstract
The basic units used in Mandarin speech recognition are usually phonemes, syllables, or semi-syllables (initials and finals). A system built on semi-syllable units needs relatively few HMMs and little computation, which makes it suitable for real-time implementation. However, because the models are trained in isolation, they describe the acoustic properties of the speech signal less precisely, so the recognition rate is generally lower than that of a syllable-based system. Systems based on syllables or phonemes (triphones or diphones), in turn, require many more HMMs and heavy computation in both training and recognition, which hampers real-time operation. This paper proposes a compromise. Semi-syllables remain the basic units, but in the HMM training phase the parameters are estimated over the entire syllable sequence, so that the states in the transition region between the initial and the final are smoothed. The transition probability between the initial and the final of each syllable is also computed and stored, and at recognition time the two semi-syllable HMMs are dynamically assembled into a complete syllable HMM. The method reduces the misrecognition rate while keeping the number of HMMs small, and it is suited to real-time connected-word speech recognition systems built around a DSP.
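The assembly step described in the abstract can be illustrated with a minimal sketch. It assumes each semi-syllable HMM is a left-to-right model without skip transitions, and that the stored per-syllable transition probability replaces the exit probability of the last state of the initial. The abstract does not specify the topology, so these choices, and the names `left_to_right`, `assemble_syllable`, and `junction`, are hypothetical and not taken from the paper.

```python
def left_to_right(self_loops):
    """Build a left-to-right transition matrix (no skips) from per-state
    self-loop probabilities. The last state's remaining mass is its exit
    probability and is not stored in the matrix. (Assumed topology.)"""
    n = len(self_loops)
    A = [[0.0] * n for _ in range(n)]
    for i, stay in enumerate(self_loops):
        A[i][i] = stay
        if i + 1 < n:
            A[i][i + 1] = 1.0 - stay
    return A


def assemble_syllable(A_initial, A_final, junction):
    """Join an initial HMM and a final HMM into one syllable-level matrix.

    junction is the stored per-syllable probability of moving from the last
    state of the initial to the first state of the final (hypothetical name).
    """
    n_i, n_f = len(A_initial), len(A_final)
    n = n_i + n_f
    A = [[0.0] * n for _ in range(n)]
    # Copy the initial's block into the top-left corner.
    for r in range(n_i):
        for c in range(n_i):
            A[r][c] = A_initial[r][c]
    # Copy the final's block into the bottom-right corner.
    for r in range(n_f):
        for c in range(n_f):
            A[n_i + r][n_i + c] = A_final[r][c]
    # Assumption: the junction probability overrides the self-loop/exit
    # split of the last state of the initial.
    last = n_i - 1
    A[last][last] = 1.0 - junction
    A[last][n_i] = junction
    return A


# Example with made-up probabilities: a 2-state initial and a 3-state final.
initial = left_to_right([0.6, 0.5])
final = left_to_right([0.7, 0.8, 0.9])
syllable = assemble_syllable(initial, final, junction=0.3)
```

Under these assumptions only the semi-syllable models and one junction probability per syllable need to be stored; the syllable-level matrix is built on demand at recognition time, which is the storage saving the abstract points to.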
Source
《北京航空航天大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2001, No. 2, pp. 146-149 (4 pages)
Journal of Beijing University of Aeronautics and Astronautics
Funding
Supported by the Natural Science Foundation of Guangdong Province (960631)
Keywords
Speech recognition
Markov process
HMM
Semi-syllable (initial/final) units
Smoothed semi-syllable units
Algorithm
Mandarin speech
Acoustic properties
Learning algorithms
Markov processes
Mathematical models
Probability
Real time systems
Speech processing