Psychoacoustical enhancement of speech based on multitaper spectrum

Original title: 基于多窗谱的心理声学语音增强 (cited by 12)
Abstract: The multitaper spectrum has a lower estimation variance than the traditional periodogram. The noise spectrum and the noise-to-noisy-signal ratio (NNSR) are estimated from the multitaper spectrum of the noisy speech. A pre-enhanced speech signal, used to compute the auditory masking threshold, is obtained by spectral amplitude subtraction whose gain is a function of the NNSR. The final enhanced speech is obtained by suppressing the Fourier spectrum of the noisy signal with a psychoacoustical weighting rule that incorporates the masking threshold. To account for the low variance of the multitaper spectrum, a modified offset formula is proposed for computing the masking threshold; speech reconstructed with this modification shows some improvement in the objective measure MBSD (Modified Bark Spectral Distortion). When the psychoacoustical weighting rule is further limited to a maximum value below one, the higher the input SNR (above 0 dB), the larger the gains in segmental SNR and overall SNR. Informal listening tests indicate that the enhanced speech has little distortion, that background noise is greatly reduced, and that no musical noise remains.
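Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2007, No. 3, pp. 275-281 (7 pages)
Funding: National 973 Program (2002CB312102); National Natural Science Foundation of China (60272044, 60472058); Soochow University Young Teachers Research Fund (Q3119610)
Keywords: speech enhancement; psychoacoustics; masking threshold; background noise; estimation variance; weighting rule; signal-to-noise ratio; acoustic noise; distortion (waves); psychophysiology; spectrum analysis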
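
The abstract's opening claim, that a multitaper spectrum estimate has lower variance than a periodogram, can be illustrated with a minimal sketch. It is not the authors' implementation. Thomson's method uses Slepian (DPSS) tapers; this sketch swaps in orthonormal sine tapers so that it needs only the Python standard library. The function names (`sine_tapers`, `multitaper_spectrum`, `compare_variance`), the white Gaussian test signal, the frame length, the taper count, and the inspected bin are all assumptions chosen for illustration.

```python
import cmath
import math
import random


def dft_power(x):
    """Return |DFT(x)|^2 at every bin, using a direct O(N^2) transform."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n)
    ]


def sine_tapers(n, count):
    """Orthonormal sine tapers (assumed stand-in for Slepian/DPSS tapers)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [
        [scale * math.sin(math.pi * (k + 1) * (t + 1) / (n + 1)) for t in range(n)]
        for k in range(count)
    ]


def periodogram(x):
    """Classical periodogram: squared DFT magnitude divided by frame length."""
    n = len(x)
    return [p / n for p in dft_power(x)]


def multitaper_spectrum(x, tapers):
    """Average the eigenspectra obtained with each unit-energy taper."""
    total = [0.0] * len(x)
    for taper in tapers:
        eigen = dft_power([w * s for w, s in zip(taper, x)])
        total = [a + b for a, b in zip(total, eigen)]
    return [v / len(tapers) for v in total]


def sample_variance(values):
    """Population-style variance of a list of estimates."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def compare_variance(frame_len=16, taper_count=4, trials=200, bin_index=4, seed=0):
    """Estimate the variance of both estimators at one bin over many noise frames."""
    rng = random.Random(seed)
    tapers = sine_tapers(frame_len, taper_count)
    per_values, mt_values = [], []
    for _ in range(trials):
        frame = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
        per_values.append(periodogram(frame)[bin_index])
        mt_values.append(multitaper_spectrum(frame, tapers)[bin_index])
    return sample_variance(per_values), sample_variance(mt_values)


if __name__ == "__main__":
    per_var, mt_var = compare_variance()
    print("periodogram variance:", per_var)
    print("multitaper variance: ", mt_var)
```

For white noise and orthonormal tapers, the eigenspectra are approximately uncorrelated, so averaging several of them should reduce the variance roughly in proportion to the number of tapers. The price is a wider main lobe, that is, coarser frequency resolution.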
