Paper

Peer-reviewed
2007

PLSA-based Topic Detection in Meetings for Adaptation of Lexicon and Language Model

INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4
  • Yuya Akita
  • Yusuke Nemoto
  • Tatsuya Kawahara

First page
1321
Last page
1324
Language
English
Publication type
Research paper (international conference proceedings)
Publisher
ISCA-INST SPEECH COMMUNICATION ASSOC

A topic detection approach based on a probabilistic framework is proposed to realize topic adaptation of speech recognition systems for long speech archives such as meetings. Unlike news stories, topics in such speech are not clearly defined, so we adopt a probabilistic representation of topics based on probabilistic latent semantic analysis (PLSA). A topical sub-space is constructed by PLSA, and speech segments are projected onto it; each segment is then represented by a vector of the topic probabilities obtained by the projection. Topic detection is performed by clustering these vectors, and topic adaptation is done by collecting relevant texts based on similarity in this probabilistic representation. In experimental evaluations, the proposed approach significantly reduced perplexity and out-of-vocabulary rates and was robust against ASR errors.
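
As a rough illustration of the projection and retrieval steps described in the abstract, the following minimal pure-Python sketch trains a small PLSA model with EM, folds a new speech segment into the topic space, and ranks texts by the similarity of their topic vectors. The function names (`train_plsa`, `fold_in`, `rank_texts`), the document-seeded initialization, the iteration counts, and the choice of cosine similarity are assumptions made for illustration and are not taken from the paper. The clustering step is omitted.

```python
import math

def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if the sum is zero)."""
    total = sum(values)
    if total <= 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]

def posterior(p_z, p_w_z, w):
    """E-step: P(z | d, w) proportional to P(z | d) * P(w | z)."""
    return normalize([p_z[z] * p_w_z[z][w] for z in range(len(p_z))])

def train_plsa(counts, n_topics, n_iter=50):
    """Fit PLSA by EM on a document-by-word count matrix (list of lists).

    Assumes n_topics <= number of documents. Topics are seeded from evenly
    spaced documents (an illustrative choice, not the paper's procedure).
    Returns (p_w_z, p_z_d): P(w | z) per topic and P(z | d) per document.
    """
    n_docs, n_words = len(counts), len(counts[0])
    seeds = [counts[(z * n_docs) // n_topics] for z in range(n_topics)]
    p_w_z = [normalize([c + 0.1 for c in doc]) for doc in seeds]
    p_z_d = [[1.0 / n_topics] * n_topics for _ in range(n_docs)]
    for _ in range(n_iter):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d, doc in enumerate(counts):
            new_z = [0.0] * n_topics
            for w, c in enumerate(doc):
                if c == 0:
                    continue
                post = posterior(p_z_d[d], p_w_z, w)
                for z in range(n_topics):
                    new_w_z[z][w] += c * post[z]
                    new_z[z] += c * post[z]
            p_z_d[d] = normalize(new_z)           # M-step for P(z | d)
        p_w_z = [normalize(row) for row in new_w_z]  # M-step for P(w | z)
    return p_w_z, p_z_d

def fold_in(segment_counts, p_w_z, n_iter=30):
    """Project a new segment onto the topic space, keeping P(w | z) fixed."""
    n_topics = len(p_w_z)
    p_z = [1.0 / n_topics] * n_topics
    for _ in range(n_iter):
        new_z = [0.0] * n_topics
        for w, c in enumerate(segment_counts):
            if c == 0:
                continue
            post = posterior(p_z, p_w_z, w)
            for z in range(n_topics):
                new_z[z] += c * post[z]
        p_z = normalize(new_z)
    return p_z

def cosine(a, b):
    """Cosine similarity between two topic-probability vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0

def rank_texts(segment_topics, text_topics):
    """Return text indices ordered from most to least similar to the segment."""
    sims = [cosine(segment_topics, t) for t in text_topics]
    return sorted(range(len(text_topics)), key=lambda i: sims[i], reverse=True)
```

In this sketch, a segment's word counts (for example, taken from an ASR transcript) would be passed to `fold_in`, and the top-ranked texts from `rank_texts` would be the candidates for adapting the lexicon and language model.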

Links
Web of Science
https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000269998600331&DestApp=WOS_CPL
Identifiers
  • Web of Science ID : WOS:000269998600331
