Combining feature space discriminative training with long-term spectro-temporal features for noise-robust speech recognition

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5

Takashi Fukuda
Osamu Ichikawa
Masafumi Nishimura

Start page: 236
End page: 239
Language: English
Publication type: Research paper (international conference proceedings)
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC

Discriminative training of the feature space using the maximum mutual information objective function (fMMI) has been shown to yield remarkable accuracy improvements. In noisy environments, fMMI can be regarded as an effective noise compensation algorithm and can play a significant role in noise robustness. Feature space speaker adaptation techniques such as feature space maximum likelihood linear regression (fMLLR) are also well known and are suitable for mismatched test data. These feature space transform algorithms are essential for modern speech recognition but still need further improvement under low SNR conditions. Separately, long-term spectro-temporal information has received attention as a complement to traditional short-term features. We previously proposed long-term temporal features to improve ASR accuracy for low SNR speech. In this paper, we show that long-term temporal features can be combined with fMMI to build more discriminative models for noisy speech, and that the proposed method performs favorably under low SNR conditions.

Links

Web of Science: https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000316502200061&DestApp=WOS_CPL

ID information

Web of Science ID: WOS:000316502200061
