Paper

Peer-reviewed
October 2012

A monotonic statistical machine translation approach to speaking style transformation

COMPUTER SPEECH AND LANGUAGE
  • Graham Neubig
  • Yuya Akita
  • Shinsuke Mori
  • Tatsuya Kawahara

Volume
26
Issue
5
Start page
349
End page
370
Language
English
Publication type
Research paper (academic journal)
DOI
10.1016/j.csl.2012.02.003
Publisher
ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

This paper presents a method for automatically transforming faithful transcripts or ASR results into clean transcripts for human consumption using a framework we label speaking style transformation (SST). We perform a detailed analysis of the types of corrections performed by human stenographers when creating clean transcripts, and propose a model that is able to handle the majority of the most common corrections. In particular, the proposed model uses a framework of monotonic statistical machine translation to perform not only the deletion of disfluencies and insertion of punctuation, but also correction of colloquial expressions, insertions of omitted words, and other transformations. We provide a detailed description of the model implementation in the weighted finite state transducer (WFST) framework. An evaluation of the proposed model on both faithful transcripts and speech recognition results of parliamentary and lecture speech demonstrates the effectiveness of the proposed model in performing the wide variety of corrections necessary for creating clean transcripts. (C) 2012 Elsevier Ltd. All rights reserved.

Links
DOI
https://doi.org/10.1016/j.csl.2012.02.003
J-GLOBAL
https://jglobal.jst.go.jp/detail?JGLOBAL_ID=201202281447754750
CiNii Articles
http://ci.nii.ac.jp/naid/120004247251
Web of Science
https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000305307300004&DestApp=WOS_CPL
Identifiers
  • DOI : 10.1016/j.csl.2012.02.003
  • ISSN : 0885-2308
  • eISSN : 1095-8363
  • J-Global ID : 201202281447754750
  • CiNii Articles ID : 120004247251
  • Web of Science ID : WOS:000305307300004
