Lectures, oral presentations, etc.

November 19, 2019

Evaluation of the Lombard Effect Model on Synthesizing Lombard Speech in Varying Noise Level Environments with Limited Data

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  • Ngo Thuan Van
  • Kubo Rieko
  • Akagi Masato

Date held
November 19, 2019 - November 19, 2019
Language
English
Conference type
Organizer
Institute of Electrical and Electronics Engineers (IEEE)

Lombard speech is the intelligible speech that humans produce in noise. In this study, we focus on mimicking Lombard speech from natural neutral speech against backgrounds with varying noise levels, in order to increase its intelligibility in that noise. Other approaches map speech features from neutral speech to the corresponding features of Lombard speech; such mappings apply only to an individual noise level and cannot reveal feature tendencies. Instead, we implement a Lombard effect model that continuously estimates feature values as the noise level varies. Techniques based on coarticulation, a source-filter model with MRTD, and spectral-GMM are used to modify the features of the neutral speech easily so that they follow the estimated tendencies. Finally, the modified features are synthesized with the STRAIGHT vocoder to obtain Lombard speech. The mimicking quality is evaluated in subjective listening experiments on similarity, naturalness, and intelligibility. The evaluation results show that the proposed method can convert neutral speech into Lombard speech at varying noise levels, with results comparable to those of the state-of-the-art method.
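
The contrast the abstract draws between per-noise-level mappings and a continuous model can be illustrated with a minimal sketch. It is not the authors' Lombard effect model: it assumes simple piecewise-linear interpolation, and the anchor noise levels, the feature (mean F0), its values, and the function name estimate_feature are all hypothetical placeholders chosen for illustration only.

```python
import numpy as np

# Hypothetical anchor points (NOT taken from the paper): noise levels in dB
# at which a feature was observed, and made-up values of that feature
# (here imagined as mean F0 in Hz).
ANCHOR_NOISE_DB = np.array([40.0, 60.0, 80.0])
ANCHOR_F0_HZ = np.array([120.0, 130.0, 150.0])


def estimate_feature(noise_db):
    """Estimate the feature value at an arbitrary noise level.

    Piecewise-linear interpolation between the anchors is an assumption
    made for this sketch; outside the anchor range np.interp holds the
    nearest anchor value constant.
    """
    return float(np.interp(noise_db, ANCHOR_NOISE_DB, ANCHOR_F0_HZ))


if __name__ == "__main__":
    # A mapping trained per noise level would only cover 40, 60, and 80 dB;
    # the continuous estimate also covers the levels in between.
    for level in (40, 50, 70, 80):
        print(level, estimate_feature(level))
```

Any smooth function fitted to the observed tendencies could take the place of the interpolation; the point of the sketch is only that a single model yields a target feature value for every noise level rather than for one fixed level.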

Link information
URL
http://hdl.handle.net/10119/16659