April 2017

# Representation of compounds for machine-learning prediction of physical properties

PHYSICAL REVIEW B
• Atsuto Seko
• Hiroyuki Hayashi
• Keita Nakayama
• Akira Takahashi
• Isao Tanaka

Volume 95, Issue 14, Article 144110

DOI
10.1103/PhysRevB.95.144110

AMER PHYSICAL SOC

The representations of a compound, called "descriptors" or "features", play an essential role in constructing a machine-learning model of its physical properties. In this study, we adopt a procedure for generating a set of descriptors from simple elemental and structural representations. First, it is applied to a large data set composed of the cohesive energy for about 18 000 compounds computed by density functional theory calculations. As a result, we obtain a kernel ridge prediction model with a prediction error of 0.041 eV/atom, which is close to the "chemical accuracy" of 1 kcal/mol (0.043 eV/atom). A prediction model with an error of 0.071 eV/atom of the cohesive energy is obtained for the normalized prototype structures, which can be used for the practical purpose of searching for as-yet-unknown structures. The procedure is also applied to two smaller data sets, i.e., a data set of the lattice thermal conductivity for 110 compounds computed by density functional theory calculations and a data set of the experimental melting temperature for 248 compounds. We examine the effect of the descriptor sets on the efficiency of Bayesian optimization in addition to the accuracy of the kernel ridge regression models. The descriptor sets exhibit good predictive performance.
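
As a rough illustration of the kind of model the abstract refers to, the sketch below fits a kernel ridge regression with a Gaussian kernel in plain NumPy and checks the quoted conversion of 1 kcal/mol to eV. It is not the authors' implementation: the descriptors here are random synthetic vectors, the target is a made-up smooth function, the hyperparameters `gamma` and `lam` are arbitrary, and using the root-mean-square error as the "prediction error" is an assumption. All function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def krr_fit(X, y, gamma, lam):
    """Solve (K + lam * I) alpha = y for the kernel ridge coefficients."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, gamma):
    """Predict targets for new descriptor vectors."""
    return gaussian_kernel(X_new, X_train, gamma) @ alpha


def rmse(y_true, y_pred):
    """Root-mean-square error (assumed here as the error measure)."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in for a descriptor matrix and a target property.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

gamma, lam = 1.0, 1e-6  # arbitrary illustrative hyperparameters
alpha = krr_fit(X_train, y_train, gamma, lam)
test_error = rmse(y_test, krr_predict(X_train, alpha, X_test, gamma))

# "Chemical accuracy": 1 kcal/mol expressed in eV per particle.
AVOGADRO = 6.02214076e23          # 1/mol
ELEMENTARY_CHARGE = 1.602176634e-19  # J per eV
kcal_per_mol_in_ev = 4184.0 / (AVOGADRO * ELEMENTARY_CHARGE)

print(f"test RMSE on synthetic data: {test_error:.4f}")
print(f"1 kcal/mol = {kcal_per_mol_in_ev:.3f} eV")
```

In practice the hyperparameters would be chosen by cross-validation, and the descriptor matrix would come from the elemental and structural representations described in the paper rather than from random numbers.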

Web of Science® times cited: 138

Links
DOI
https://doi.org/10.1103/PhysRevB.95.144110