Paper

Peer-reviewed
2015

Audio-visual scene understanding utilizing text information for a cooking support robot

2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)
  • Ryosuke Kojima
  • Osamu Sugiyama
  • Kazuhiro Nakadai

First page
4210
Last page
4215
Language
English
Publication type
Research paper (international conference proceedings)
Publisher
IEEE

This paper addresses multimodal "scene understanding" for a robot using audio-visual and text information. Scene understanding is defined as extracting six-W information (What, When, Where, Who, Why, and hoW) about the surrounding environment. Although scene understanding for robots has been studied in the fields of robot vision and audition, only the first four Ws were considered, leaving out why and how information. We thus focus on extracting how information, in particular in cooking scenes. In cooking scenes, we define how information as a cooking procedure, which is useful for a robot that gives appropriate cooking advice. To realize such cooking support, we propose a multimodal cooking procedure recognition framework consisting of a Convolutional Neural Network (CNN) and a Hierarchical Hidden Markov Model (HHMM). The CNN is known as one of the most advanced classifiers, and it is applied to recognize cooking events from audio and visual information. The HHMM models a cooking procedure as a sequence of cooking events, defined using relationships between cooking events drawn from text data obtained from the web and the cooking events classified with the CNN. Our proposed framework therefore integrates these three types of modalities. We constructed an interactive cooking support system based on the proposed framework, which advises the next step in the current cooking procedure through human-robot communication. Preliminary results with simulated and real recorded multimodal scenes showed the robustness of the proposed framework in noisy and/or occluded situations.
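
The idea of combining text-derived event relationships with per-frame classifier output can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a flat HMM with Viterbi decoding rather than the HHMM named in the abstract, treats classifier scores directly as emission scores (a simplification), and assumes a uniform initial prior. The event names, recipe sequences, and scores below are all hypothetical toy values chosen for illustration.

```python
import math

# Hypothetical cooking events; the abstract does not list the actual event set.
EVENTS = ["cut", "fry", "boil"]


def estimate_transitions(recipes, events, smoothing=1.0):
    """Estimate P(next event | current event) from recipe step sequences,
    using additive smoothing so unseen transitions keep a nonzero probability."""
    counts = {a: {b: smoothing for b in events} for a in events}
    for steps in recipes:
        for a, b in zip(steps, steps[1:]):
            counts[a][b] += 1.0
    trans = {}
    for a in events:
        total = sum(counts[a].values())
        trans[a] = {b: counts[a][b] / total for b in events}
    return trans


def viterbi(frames, trans, events):
    """Return the most likely event sequence.

    frames: list of dicts mapping event -> classifier score in (0, 1].
    Scores are used directly as emission scores (an assumption of this sketch).
    """
    score = {e: math.log(frames[0][e]) for e in events}  # uniform initial prior
    back = []
    for frame in frames[1:]:
        new_score, pointers = {}, {}
        for e in events:
            prev = max(events, key=lambda p: score[p] + math.log(trans[p][e]))
            new_score[e] = score[prev] + math.log(trans[prev][e]) + math.log(frame[e])
            pointers[e] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(events, key=lambda e: score[e])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy "text data": step sequences standing in for recipes collected from the web.
recipes = [
    ["cut", "fry", "boil"],
    ["cut", "cut", "fry"],
    ["cut", "fry", "fry", "boil"],
]
trans = estimate_transitions(recipes, EVENTS)

# Toy classifier scores; the middle frame is ambiguous, as under noise or occlusion.
frames = [
    {"cut": 0.80, "fry": 0.10, "boil": 0.10},
    {"cut": 0.40, "fry": 0.35, "boil": 0.25},
    {"cut": 0.10, "fry": 0.20, "boil": 0.70},
]
path = viterbi(frames, trans, EVENTS)
framewise = [max(EVENTS, key=lambda e: f[e]) for f in frames]
print(path, framewise)
```

With these toy values, picking the top-scoring event per frame yields cut, cut, boil, while the decoded sequence is cut, fry, boil: the transition statistics from the recipe text override the ambiguous middle frame. This only illustrates the general mechanism by which a sequence model can add robustness to noisy classifications; the paper's hierarchical model and features are not reproduced here.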

Links
Web of Science
https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000371885404059&DestApp=WOS_CPL
URL
http://honda-ri.jp/publications/1160
Identifiers
  • ISSN : 2153-0858
  • Web of Science ID : WOS:000371885404059
