Papers

2023

M3IL: Multi-Modal Meta-Imitation Learning

Transactions of the Japanese Society for Artificial Intelligence
  • Xin Zhang, Tatsuya Matsushima, Yutaka Matsuo, Yusuke Iwasawa

Volume 38, Issue 2
Language
Publication type
Research paper (academic journal)
DOI
10.1527/tjsai.38-2_A-LB3

Imitation Learning (IL) is expected to enable intelligent robots, since it allows users to teach robots a variety of tasks easily. In particular, Few-Shot Imitation Learning (FSIL) aims to infer and adapt quickly to unseen tasks from a small amount of data. Although FSIL needs only a few shots of data, the high cost of demonstrations remains a critical problem in IL: whenever we want to teach the robot a new task, we must execute the task ourselves to specify it. Inspired by the fact that humans specify tasks with language instructions, without executing them, we propose a multi-modal FSIL setting in this work. The model leverages image and language information in the training phase, and uses either both image and language or only language information in the testing phase. We also propose Multi-Modal Meta-Imitation Learning (M3IL), which can infer tasks from image or language information alone. M3IL outperforms the baseline in both the standard and the proposed settings. Our results show the effectiveness of M3IL and the importance of language instructions in the FSIL setting.
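The abstract describes a setting where a task embedding can be inferred from an image demonstration, a language instruction, or both. The paper's actual architecture is not reproduced here; the following is a toy Python sketch of that conditioning idea, where all names (`embed_image`, `embed_language`, `task_embedding`) and the average-fusion rule are illustrative assumptions, not the authors' method.

```python
from typing import List, Optional, Sequence

# Toy encoders: both map their modality into a shared 3-dimensional
# task-embedding space. A real model would learn these encoders.

def embed_image(demo_summary: Sequence[float]) -> List[float]:
    """Stand-in visual task encoder: normalizes a 3-element demo summary."""
    total = sum(demo_summary) or 1.0
    return [v / total for v in demo_summary]

def embed_language(instruction: str) -> List[float]:
    """Stand-in language task encoder: toy character-bucket frequencies."""
    feats = [0.0, 0.0, 0.0]
    for ch in instruction.lower():
        feats[ord(ch) % 3] += 1.0
    total = sum(feats) or 1.0
    return [f / total for f in feats]

def task_embedding(image: Optional[Sequence[float]],
                   language: Optional[str]) -> List[float]:
    """Fuse whichever modalities are available into one task embedding.

    At training time both modalities are present; at test time the image
    demonstration may be absent (language-only inference).
    """
    embeddings = []
    if image is not None:
        embeddings.append(embed_image(image))
    if language is not None:
        embeddings.append(embed_language(language))
    if not embeddings:
        raise ValueError("need at least one modality to infer the task")
    dim = len(embeddings[0])
    # Simple average fusion; the fusion rule is an assumption of this sketch.
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
```

Usage: `task_embedding([1.0, 1.0, 2.0], "pick up the block")` fuses both modalities, while `task_embedding(None, "pick up the block")` conditions the (hypothetical) policy on language alone, mirroring the language-only test-time setting in the abstract.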

Links
DOI
https://doi.org/10.1527/tjsai.38-2_A-LB3
Scopus
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85149497527&origin=inward
Scopus Citedby
https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85149497527&origin=inward
IDs
  • DOI : 10.1527/tjsai.38-2_A-LB3
  • ISSN : 1346-0714
  • eISSN : 1346-8030
  • SCOPUS ID : 85149497527
