Lectures and Oral Presentations

International conference
October 25, 2019

Automatic Conversion of Written Language into Spoken Language Using a Sequence-to-Sequence Model Trained with a Parallel Corpus

Proceedings of The 22nd Conference of the Oriental COCOSDA (Oriental-COCOSDA2019)
  • 小橋 優矢
  • 西村 良太
  • 北岡 教英

Language
English
Presentation type
Oral presentation (general)
Venue
Cebu

In this study, we proposed using a sequence-to-sequence, RNN-based model to convert Japanese written language into a text representation of Japanese spoken language. If this process could be accomplished accurately and efficiently, it would become possible to create a large spoken-language text corpus for improving the accuracy of speech recognition. We first created a parallel corpus of written and spoken language based on the Nagoya University Conversation Corpus and the Transcribed Corpus of Elderly Dialog. Using this manually constructed parallel corpus, we devised a conversion model and converted the BCCWJ corpus into spoken-language text. Although conversion accuracy was not impressive overall, some short sentences were converted accurately. Moreover, even when whole sentences could not be converted accurately, the statistics of the spoken language were well expressed in the converted sentences. Thus, a language model trained with a corpus of spoken language, created by converting written language into spoken language, was shown to be effective for speech recognition. Because of an insufficient amount of training data, we were only able to accurately convert some of the shorter sentences in the BCCWJ corpus, which account for only a small percentage of the data. This prevented the creation of a large corpus of spoken-language text data. Therefore, it will be necessary to devise a method that can correctly convert longer sentences as well. As a solution to this problem, we plan to introduce an attention mechanism into our sequence-to-sequence model.

Links
URL
https://web.db.tokushima-u.ac.jp/cgi-bin/edb_browse?EID=370057