September 2010

HMM-Based Voice Conversion Using Quantized F0 Context

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS

Takashi Nose
Yuhei Ota
Takao Kobayashi

Volume: E93D
Issue: 9
First page: 2483
Last page: 2490
Language: English
DOI: 10.1587/transinf.E93.D.2483
Publisher: IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG

We propose a segment-based voice conversion technique using hidden Markov model (HMM)-based speech synthesis with nonparallel training data. In the proposed technique, the phoneme information with durations and a quantized F0 contour are extracted from the input speech of a source speaker, and are transmitted to a synthesis part. In the synthesis part, the quantized F0 symbols are used as prosodic context. A phonetically and prosodically context-dependent label sequence is generated from the transmitted phoneme and the F0 symbols. Then, converted speech is generated from the label sequence with durations using the target speaker's pre-trained context-dependent HMMs. In the model training, the models of the source and target speakers can be trained separately, hence there is no need to prepare parallel speech data of the source and target speakers. Objective and subjective experimental results show that the segment-based voice conversion with phonetic and prosodic contexts works effectively even if the parallel speech data is not available.
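The analysis-side step described in the abstract (phonemes with durations plus a quantized F0 contour, turned into a context-dependent label sequence) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Segment` type, the use of per-phoneme mean log F0, uniform quantization over the utterance's own range, the number of levels, and the `prev-cur+next/F0:sym/D:frames` label layout are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Segment:
    """One phoneme segment from the source speech (hypothetical structure)."""
    phoneme: str
    duration_frames: int      # segment length in frames
    f0_hz: Sequence[float]    # frame-level F0; 0.0 marks an unvoiced frame


def mean_log_f0(f0_hz: Sequence[float]) -> Optional[float]:
    """Mean log F0 over voiced frames, or None if no frame is voiced."""
    voiced = [math.log(f) for f in f0_hz if f > 0.0]
    return sum(voiced) / len(voiced) if voiced else None


def quantize(value: Optional[float], lo: float, hi: float, levels: int) -> str:
    """Map a log-F0 value to a symbol '1'..'levels'; 'x' means unvoiced."""
    if value is None:
        return "x"
    if hi <= lo:
        return "1"  # degenerate range: only one distinct value observed
    idx = int((value - lo) / (hi - lo) * levels)
    # Clamp so that value == hi falls into the top bin.
    return str(min(max(idx, 0), levels - 1) + 1)


def make_labels(segments: List[Segment], levels: int = 4) -> List[str]:
    """Build one label per segment from phoneme context and an F0 symbol.

    Assumption: the quantization range is the min/max of the per-segment
    mean log F0 within this utterance.
    """
    means = [mean_log_f0(s.f0_hz) for s in segments]
    voiced = [m for m in means if m is not None]
    lo, hi = (min(voiced), max(voiced)) if voiced else (0.0, 0.0)
    symbols = [quantize(m, lo, hi, levels) for m in means]

    labels = []
    for i, seg in enumerate(segments):
        prev_p = segments[i - 1].phoneme if i > 0 else "sil"
        next_p = segments[i + 1].phoneme if i + 1 < len(segments) else "sil"
        labels.append(
            f"{prev_p}-{seg.phoneme}+{next_p}"
            f"/F0:{symbols[i]}/D:{seg.duration_frames}"
        )
    return labels


if __name__ == "__main__":
    utterance = [
        Segment("k", 5, [0.0] * 5),
        Segment("a", 10, [100.0] * 10),
        Segment("i", 8, [200.0] * 8),
    ]
    for label in make_labels(utterance):
        print(label)
```

In this sketch only the phoneme identity, the duration, and a coarse F0 symbol cross from the analysis side to the synthesis side, which is why, as the abstract notes, the source and target models can be trained separately. The synthesis step itself (generating speech from the target speaker's pre-trained context-dependent HMMs) is outside the scope of the example.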

Links

DOI: https://doi.org/10.1587/transinf.E93.D.2483
CiNii Articles: http://ci.nii.ac.jp/naid/10027640446
Web of Science: https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000282245100015&DestApp=WOS_CPL

Identifiers

DOI : 10.1587/transinf.E93.D.2483
ISSN : 0916-8532
CiNii Articles ID : 10027640446
Web of Science ID : WOS:000282245100015

