November 19, 2019
Evaluation of the Lombard Effect Model on Synthesizing Lombard Speech in Varying Noise Level Environments with Limited Data
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
- Event date
- November 19, 2019 - November 19, 2019
- Language
- English
- Conference type
- Organizer
- Institute of Electrical and Electronics Engineers (IEEE)
Lombard speech is the more intelligible speech that humans produce in noisy environments. In this study, we focus on mimicking Lombard speech from natural neutral speech against backgrounds with varying noise levels, in order to increase its intelligibility in that noise. Other approaches map speech features from neutral speech to the corresponding Lombard speech features; such a mapping applies only to a single noise level and cannot reveal feature tendencies. Instead, we implement a Lombard effect model that continuously estimates feature values as the noise level varies. Techniques based on coarticulation, a source-filter model with MRTD, and spectral-GMM are used to modify the features of the neutral speech so that they follow the modeled tendencies. Finally, the modified features are synthesized with the STRAIGHT vocoder to obtain Lombard speech. The mimicking quality is evaluated in subjective listening experiments on similarity, naturalness, and intelligibility. The results show that the proposed method can convert neutral speech into Lombard speech at varying noise levels, with results comparable to those of the state-of-the-art method.
- Link information