2017
Combined multi-channel NMF-based robust beamforming for noisy speech recognition
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
- Volume
- 2017-
- Issue
- First page
- 2451
- Last page
- 2455
- Language
- English
- Publication type
- Research paper (international conference proceedings)
- DOI
- 10.21437/Interspeech.2017-642
- Publisher
- International Speech Communication Association
We propose a novel acoustic beamforming method using blind source separation (BSS) techniques based on non-negative matrix factorization (NMF). In conventional mask-based approaches, hard or soft masks are estimated and beamforming is performed using speech and noise spatial covariance matrices calculated from masked noisy observations, but the phase information of the target speech is not adequately preserved. In the proposed method, we perform complex-domain source separation based on multi-channel NMF with rank-1 spatial model (rank-1 MNMF) to obtain a speech spatial covariance matrix for estimating a steering vector for the target speech utilizing the separated speech observation in each time-frequency bin. This accurate steering vector estimation is effectively combined with our novel noise mask prediction method using multi-channel robust NMF (MRNMF) to construct a Maximum Likelihood (ML) beamformer that achieved a better speech recognition performance than a state-of-the-art DNN-based beamformer with no environment-specific training. Superiority of the phase-preserving source separation to real-valued masks in beamforming is also confirmed through ASR experiments.
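The beamforming stage described in the abstract can be illustrated with a minimal NumPy sketch for a single frequency bin. It is not the authors' implementation: the rank-1 MNMF separation and the MRNMF noise mask are not reproduced and are simply assumed to be available as inputs. The function names (`spatial_covariance`, `steering_vector`, `beamformer_weights`), the choice of the principal eigenvector as the steering vector, the reference-channel normalization, and the diagonal loading are all assumptions made for illustration. The weight formula used, w = R_n^{-1} h / (h^H R_n^{-1} h), is one common distortionless form associated with ML beamforming under a Gaussian noise model; the paper's exact formulation may differ.

```python
import numpy as np


def spatial_covariance(X, mask=None):
    """Weighted spatial covariance for one frequency bin.

    X:    (channels, frames) complex STFT coefficients.
    mask: optional (frames,) non-negative weights (hypothetical mask input).
    """
    if mask is None:
        mask = np.ones(X.shape[1])
    # sum_t mask[t] * x_t x_t^H, normalized by the total mask weight
    return ((X * mask) @ X.conj().T) / max(mask.sum(), 1e-12)


def steering_vector(R_speech):
    """Assumed estimator: principal eigenvector of the speech covariance,
    scaled so the reference channel (index 0) has unit gain."""
    _, vecs = np.linalg.eigh(R_speech)  # eigenvalues in ascending order
    h = vecs[:, -1]
    return h / h[0]  # assumes a non-zero reference-channel entry


def beamformer_weights(R_noise, h, diag_load=1e-6):
    """w = R_n^{-1} h / (h^H R_n^{-1} h), with light diagonal loading."""
    M = R_noise.shape[0]
    R = R_noise + diag_load * np.trace(R_noise).real / M * np.eye(M)
    Rinv_h = np.linalg.solve(R, h)
    return Rinv_h / (h.conj() @ Rinv_h)


# Synthetic usage: one source observed through a known steering vector.
rng = np.random.default_rng(0)
M, T = 4, 500
a = np.exp(-1j * 0.7 * np.arange(M))  # true steering vector, a[0] == 1
s = rng.standard_normal(T) + 1j * rng.standard_normal(T)
noise = 0.5 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))
speech_image = np.outer(a, s)  # stands in for the separated speech
X = speech_image + noise

h = steering_vector(spatial_covariance(speech_image))
w = beamformer_weights(spatial_covariance(noise), h)
y = w.conj() @ X  # beamformer output, shape (frames,)
```

In this toy setting the separated speech and the noise are known exactly, so the sketch only shows how the two covariance matrices feed the steering vector and the weights; in practice both would come from the separation and mask estimation stages.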
- ID information
- DOI : 10.21437/Interspeech.2017-642
- ISSN : 1990-9772
- ISSN : 2308-457X
- SCOPUS ID : 85039163147