1995

SPEED INVARIANT SPEECH RECOGNITION USING VARIABLE VELOCITY DELAY-LINES

NEURAL NETWORKS

K YAMAUCHI
M FUKUDA
K FUKUSHIMA

Volume: 8
Issue: 2
First page: 167
Last page: 177
Language: English
DOI: 10.1016/0893-6080(94)00069-X
Publisher: PERGAMON-ELSEVIER SCIENCE LTD

A neural network model for speech recognition is proposed, based on neurophysiological findings of the auditory system. The first stage of the system is a feature-extracting module that is a model of the auditory pathway between the cochlea and the auditory cortex. The feature-extracting module extracts constant-frequency (CF), FM-ascending (FM-A), and FM-descending (FM-D) components. The second stage is a recognition module that is able to perform time-distortion invariant recognition without ignoring information concerning the relative lengths of each feature. This module consists of a main block and two subblocks. The recognition results are obtained from the main block. The two subblocks are used for monitoring the speed of the input pattern. Each block is a neocognitron-like network for which the first layer consists of variable-velocity delay lines. The propagation velocities of the delay lines of the upper and lower blocks are faster and slower, respectively, than that of the main block. The propagation velocities of these delay lines are controlled in such a way that the duration of the feature on the delay line of the main block is the same as the duration of a similar feature of a training pattern. This velocity control is accomplished by comparing the outputs of the two subblocks. The propagation velocities of these three delay lines are variable but the ratio of velocities is kept constant. The computer-simulated system was trained using several Japanese words. After the training was completed, the system recognized each of the words correctly without being affected by their spoken speeds.
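The velocity control described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a feature of a given duration occupies a length of velocity × duration on a delay line, that each block's response is a Gaussian function of the log ratio between that length and the length learned in training, and that the main velocity is nudged in proportion to the difference between the two subblock outputs. The names `block_response`, `adapt_velocity`, `ratio`, `sigma`, and `eta` are hypothetical.

```python
import math


def block_response(extent, ref_extent, sigma):
    """Assumed stand-in for a trained block: Gaussian tuning on the log
    of the feature's extent relative to the trained extent."""
    return math.exp(-math.log(extent / ref_extent) ** 2 / (2 * sigma ** 2))


def adapt_velocity(duration, ref_extent=1.0, ratio=1.5, sigma=0.5,
                   eta=0.2, steps=200, v0=1.0):
    """Adjust the main delay-line velocity until a feature lasting
    `duration` occupies `ref_extent` on the main line.

    The fast and slow subblock velocities are always derived from the
    main velocity, so the ratio between the three stays constant.
    """
    log_v = math.log(v0)
    for _ in range(steps):
        v_main = math.exp(log_v)
        v_fast = v_main * ratio   # faster subblock
        v_slow = v_main / ratio   # slower subblock
        out_fast = block_response(v_fast * duration, ref_extent, sigma)
        out_slow = block_response(v_slow * duration, ref_extent, sigma)
        # If the fast subblock matches better, the main line is too slow
        # (and vice versa), so move the main velocity toward the winner.
        log_v += eta * (out_fast - out_slow)
    return math.exp(log_v)


if __name__ == "__main__":
    for duration in (0.5, 1.0, 2.0):
        v = adapt_velocity(duration)
        print(f"duration={duration}: velocity={v:.4f}, extent={v * duration:.4f}")
```

In this sketch the two subblock outputs balance only when the feature on the main line has the trained length, so a word spoken faster or slower drives the velocity up or down until the lengths match. Working in the log of the velocity keeps the update symmetric for speeding up and slowing down; that is a design choice of the sketch, not something stated in the abstract.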

Links

DOI: https://doi.org/10.1016/0893-6080(94)00069-X
Web of Science: https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:A1995QN84400001&DestApp=WOS_CPL

Identifiers

DOI: 10.1016/0893-6080(94)00069-X
ISSN: 0893-6080
Web of Science ID: WOS:A1995QN84400001

