November 2005

Effects of speaker normalization based on vocal tract length ratios on word recognition using compound parameters

Systems and Computers in Japan

Naomitsu Ikeda
Tadashi Sakata
Tomoaki Hirayama
Yuichi Ueda
Akira Watanabe

Volume: 36
Issue: 12
First page: 51
Last page: 62
Language: English
Publication type: Research paper (academic journal)
DOI: 10.1002/scj.20339

This paper examines the effectiveness of applying speaker normalization, based on the vocal tract length ratio between two speakers, to spoken word recognition. One of the two speakers utters the unknown words to be recognized; the other is a standard speaker. The vocal tract length ratio between them is estimated, using a method we proposed previously, from formant trajectories of the same words uttered by both. Using the estimated ratio, the speech parameters of the speaker to be recognized are normalized to correspond to the standard speaker's vocal tract length. The speech recognition system in this study is characterized by its use of compound parameters. When words uttered by speakers of diverse age and sex are recognized using a phoneme template built from a mixed speaker set (adults and children), the recognition rates after normalization are somewhat higher than those obtained without normalization using the more advantageous templates constructed separately from the adult and child sets. The same tendency is observed for both a single parameter and compound parameters. This confirms that the proposed normalization method is effective for recognition when the speakers' age and sex are unknown. In addition, the use of compound parameters proves very effective regardless of whether vocal tract length ratio normalization is applied. With the IPA 5000-word dictionary, compound parameters improve the recognition rate by approximately 7% or more compared with a single parameter. © 2005 Wiley Periodicals, Inc.
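
The abstract does not give the estimation procedure itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a uniform-tube approximation in which formant frequencies scale inversely with vocal tract length, and it assumes the two formant trajectories are already time-aligned and flattened into equal-length lists of frequencies in Hz. The function names `estimate_length_ratio` and `normalize_to_standard` are hypothetical, and the least-squares fit stands in for whatever estimator the paper actually uses.

```python
def estimate_length_ratio(speaker_formants, standard_formants):
    """Estimate r = L_speaker / L_standard from paired formant values (Hz).

    Assumption (not from the paper): formants scale as 1/L, so
    standard ~= r * speaker. The least-squares solution for r is
    sum(s * t) / sum(s * s) over all paired samples.
    """
    if not speaker_formants or len(speaker_formants) != len(standard_formants):
        raise ValueError("trajectories must be non-empty and time-aligned")
    numerator = sum(s * t for s, t in zip(speaker_formants, standard_formants))
    denominator = sum(s * s for s in speaker_formants)
    if denominator == 0:
        raise ValueError("speaker formants must not all be zero")
    return numerator / denominator


def normalize_to_standard(frequencies_hz, ratio):
    """Map the speaker's frequencies onto the standard speaker's scale."""
    return [ratio * f for f in frequencies_hz]
```

In this sketch a speaker with a shorter vocal tract than the standard speaker (for example, a child) has higher formants, so the estimated ratio is below 1 and normalization scales the frequencies down.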

Links

DOI: https://doi.org/10.1002/scj.20339
J-GLOBAL: https://jglobal.jst.go.jp/detail?JGLOBAL_ID=201502843105883664

Identifiers

DOI : 10.1002/scj.20339
ISSN : 0882-1666
J-Global ID : 201502843105883664
SCOPUS ID : 27144479874
