MISC

1995

SPEED INVARIANT SPEECH RECOGNITION USING VARIABLE VELOCITY DELAY-LINES

NEURAL NETWORKS
  • K YAMAUCHI
  • M FUKUDA
  • K FUKUSHIMA

Volume
8
Issue
2
Start page
167
End page
177
Language
English
Publication type
DOI
10.1016/0893-6080(94)00069-X
Publisher
PERGAMON-ELSEVIER SCIENCE LTD

A neural network model for speech recognition is proposed, based on neurophysiological findings of the auditory system. The first stage of the system is a feature-extracting module that is a model of the auditory pathway between the cochlea and the auditory cortex. The feature-extracting module extracts constant-frequency (CF), FM-ascending (FM-A), and FM-descending (FM-D) components. The second stage is a recognition module that is able to perform time-distortion invariant recognition without ignoring information concerning the relative lengths of each feature. This module consists of a main block and two subblocks. The recognition results are obtained from the main block. The two subblocks are used for monitoring the speed of the input pattern. Each block is a neocognitron-like network for which the first layer consists of variable-velocity delay lines. The propagation velocities of the delay lines of the upper and lower blocks are faster and slower, respectively, than that of the main block. The propagation velocities of these delay lines are controlled in such a way that the duration of the feature on the delay line of the main block is the same as the duration of a similar feature of a training pattern. This velocity control is accomplished by comparing the outputs of the two subblocks. The propagation velocities of these three delay lines are variable but the ratio of velocities is kept constant. The computer-simulated system was trained using several Japanese words. After the training was completed, the system recognized each of the words correctly without being affected by their spoken speeds.
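The velocity-control idea in the abstract can be illustrated with a small numerical sketch. This is not the authors' network: it assumes that a feature of duration `duration` occupies `duration * v` cells on a delay line with propagation velocity `v`, that each subblock responds with a Gaussian tuning curve on log-length (`match`), and that the velocity is adjusted by a simple multiplicative update. The names `match`, `control_velocity`, `ratio`, `gain`, and `width` are hypothetical and chosen only for illustration.

```python
import math


def match(length, ref_length, width=1.0):
    """Assumed subblock response: Gaussian tuning on log-length.

    Returns 1.0 when the feature length on the delay line equals the
    length of the corresponding feature in the training pattern.
    """
    return math.exp(-(math.log(length / ref_length) / width) ** 2)


def control_velocity(duration, ref_length, ratio=1.5, gain=0.5, steps=200):
    """Adjust the main-line velocity v by comparing two subblocks.

    The fast and slow subblocks run at v * ratio and v / ratio, so the
    ratio between the three velocities stays constant while v changes.
    """
    v = 1.0
    for _ in range(steps):
        fast = match(duration * v * ratio, ref_length)
        slow = match(duration * v / ratio, ref_length)
        # If the fast subblock matches better, the feature on the main
        # line is too short, so speed up; otherwise slow down.
        v *= math.exp(gain * (fast - slow))
    return v


if __name__ == "__main__":
    ref_length = 10.0  # feature length on the main line during training
    for duration in (5.0, 10.0, 20.0, 40.0):
        v = control_velocity(duration, ref_length)
        print(f"duration={duration:5.1f}  v={v:.4f}  length={duration * v:.4f}")
```

Under these assumptions the two subblock outputs balance only when the feature length on the main line equals the trained length, so inputs spoken faster or slower settle to the same representation on the main delay line.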

Links
DOI
https://doi.org/10.1016/0893-6080(94)00069-X
Web of Science
https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:A1995QN84400001&DestApp=WOS_CPL
Identifiers
  • DOI : 10.1016/0893-6080(94)00069-X
  • ISSN : 0893-6080
  • Web of Science ID : WOS:A1995QN84400001
