MISC

2015年9月10日

An information criterion for model selection with missing data via complete-data divergence

  • Hidetoshi Shimodaira
  • Haruyoshi Maeda

Publication type
Institutional technical report / preprint

We derive an information criterion to select a parametric model of complete-data distribution when only incomplete or partially observed data is available. Compared with AIC, our new criterion has an additional penalty term for missing data, which is expressed by the Fisher information matrices of complete data and incomplete data. We prove that our criterion is an asymptotically unbiased estimator of complete-data divergence, namely, the expected Kullback-Leibler divergence between the true distribution and the estimated distribution for complete data, whereas AIC is that for the incomplete data. Information criteria PDIO (Shimodaira 1994) and AICcd (Cavanaugh and Shumway 1998) have been previously proposed to estimate complete-data divergence, and they have the same penalty term. The additional penalty term of our criterion for missing data turns out to be only half the value of that in PDIO and AICcd. The difference in the penalty term is attributed to the fact that our criterion is derived under a weaker assumption. A simulation study with the weaker assumption shows that our criterion is unbiased while the other two criteria are biased. In addition, we review the geometrical view of alternating minimizations of the EM algorithm. This geometrical view plays an important role in deriving our new criterion.
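The abstract says the missing-data penalty is "expressed by the Fisher information matrices of complete data and incomplete data" but does not reproduce the formula here. As a hedged illustration only (a toy model not taken from the paper), the sketch below computes both matrices for a bivariate normal mean model in which the second coordinate is missing completely at random, and evaluates a PDIO/AICcd-style trace tr(I_complete I_observed^-1), which reduces to the parameter count k = 2 when nothing is missing and grows as more data go missing.

```python
import numpy as np

def fisher_matrices(Sigma, p):
    """Per-unit Fisher information for the mean vector theta of a bivariate
    normal N(theta, Sigma) with known Sigma, when coordinate 2 is observed
    only with probability p (missing completely at random). Toy example;
    not the model used in the paper."""
    I_complete = np.linalg.inv(Sigma)
    # A unit with only X1 observed carries information 1/Sigma[0,0] on theta_1.
    I_marginal = np.array([[1.0 / Sigma[0, 0], 0.0],
                           [0.0, 0.0]])
    I_observed = p * I_complete + (1.0 - p) * I_marginal
    return I_complete, I_observed

Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

for p in (1.0, 0.5):
    I_c, I_o = fisher_matrices(Sigma, p)
    trace = np.trace(I_c @ np.linalg.inv(I_o))
    print(f"p = {p}: tr(I_c I_o^-1) = {trace:.3f}")
# p = 1.0 recovers the AIC-like count k = 2; p = 0.5 gives 3.000 here,
# so the missing data inflate the penalty-like trace above k.
```

The inflation tr(I_c I_o^-1) > k reflects that the observed-data information is dominated by the complete-data information; the abstract's point is that the *additional* part of this penalty enters the new criterion at half the weight used by PDIO and AICcd.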

Link information
arXiv
http://arxiv.org/abs/1509.02870
URL
http://arxiv.org/abs/1509.02870v4
ID information
  • arXiv ID : arXiv:1509.02870
