Paper

Peer-reviewed
August 28, 2017

Key frame extraction from first-person video with multi-sensor integration

Proceedings - IEEE International Conference on Multimedia and Expo
  • Yujie Li
  • Atsunori Kanemura
  • Hideki Asoh
  • Taiki Miyanishi
  • Motoaki Kawanabe

Start page
1303
End page
1308
Language
English
Publication type
Research paper (international conference proceedings)
DOI
10.1109/ICME.2017.8019352
Publisher
IEEE Computer Society

First-person videos (FPVs) of daily living help us memorize our life experiences and enable information systems to process daily activities. Summarizing FPVs into key frames that represent the entire data would allow us to recall past experiences and allow computers to process the data efficiently. However, most video summarization approaches use only visual information, even though our daily activities involve multiple modalities such as movement and sound. FPVs are not as stable as movies or sports scenes, since a head-mounted camera shakes frequently, and key frame extraction methods that rely only on video frames do not always produce satisfactory results. In this paper, we introduce a novel key frame extraction method for FPVs that uses multiple wearable sensors. To efficiently integrate multimodal sensor signals, our formulation uses sparse dictionary selection, which minimizes the reconstruction error of the original data from a subset of frames (the key frames). We present experimental results on multimodal datasets captured by wearable sensors in a natural environment. The results suggest that multi-sensor information improves both the precision of the extracted key frames and their coverage of the entire video sequence.
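As a rough illustration of the sparse dictionary selection idea, the Python sketch below greedily picks the subset of frames whose least-squares reconstruction of the full multimodal feature matrix has the smallest error. This greedy forward selection is a simplified stand-in, not the paper's actual formulation, and the feature names, sizes, and per-modality normalization are illustrative assumptions.

    import numpy as np

    # Simplified, greedy stand-in for sparse dictionary selection: choose k
    # frames (dictionary atoms) that minimize the least-squares reconstruction
    # error of the full feature matrix. The paper's sparse formulation is not
    # reproduced here.
    def greedy_key_frames(X, k):
        """X: (n_frames, n_features) multimodal features; returns k indices."""
        n = X.shape[0]
        selected = []
        for _ in range(k):
            best_i, best_err = -1, np.inf
            for i in range(n):
                if i in selected:
                    continue
                D = X[selected + [i]]           # candidate key-frame subset
                # Codes C with X ~= C.T @ D, fit by least squares.
                C, *_ = np.linalg.lstsq(D.T, X.T, rcond=None)
                err = np.linalg.norm(X - C.T @ D)
                if err < best_err:
                    best_i, best_err = i, err
            selected.append(best_i)
        return selected

    # Illustrative usage with random stand-in features (all names and sizes
    # here are assumptions, not from the paper):
    rng = np.random.default_rng(0)
    visual = rng.standard_normal((200, 64))     # per-frame visual features
    motion = rng.standard_normal((200, 16))     # per-frame wearable-sensor features
    X = np.hstack([visual / np.linalg.norm(visual),
                   motion / np.linalg.norm(motion)])
    print(greedy_key_frames(X, k=5))

Normalizing each modality before concatenation keeps one sensor stream from dominating the reconstruction error, which is one simple way to integrate multimodal signals in this setting.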

Links
DOI
https://doi.org/10.1109/ICME.2017.8019352
URL
http://dblp.uni-trier.de/db/conf/icmcs/icme2017.html#conf/icmcs/LiKAMK17
IDs
  • DOI : 10.1109/ICME.2017.8019352
  • ISSN : 1945-788X
  • ISSN : 1945-7871
  • DBLP ID : conf/icmcs/LiKAMK17
  • SCOPUS ID : 85030240716
