AlphaZero for a Non-Deterministic Game

2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI)

Chu-Hsuan Hsueh
I-Chen Wu
Jr-Chang Chen
Tsan-sheng Hsu

Start page: 116
End page: 121
Language: English
Publication type: Research paper (international conference proceedings)
DOI: 10.1109/taai.2018.00034
Publisher: IEEE

The AlphaZero algorithm, developed by DeepMind, achieved superhuman levels of play in chess, shogi, and Go by learning without any domain-specific knowledge other than the game rules. This paper investigates whether the algorithm can also learn theoretical values and optimal plays for non-deterministic games. Since the theoretical values of such games are expected win rates rather than a simple win, loss, or draw, it is worth investigating how well the AlphaZero algorithm can approximate the expected win rates of positions. This paper also studies how the algorithm is influenced by a set of hyper-parameters. The tested non-deterministic game is a reduced and solved version of Chinese dark chess (CDC), called 2×4 CDC. The experiments show that the AlphaZero algorithm converges close to the theoretical values and the optimal plays under many of the hyper-parameter settings. To our knowledge, this is the first research paper that applies the AlphaZero algorithm to non-deterministic games.
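
To illustrate what "theoretical value as an expected win rate" means, the following is a minimal sketch of expectimax evaluation on a toy game tree. It is not the paper's implementation and does not model 2×4 CDC. The names `Leaf`, `Decision`, `Chance`, and `theoretical_value`, the toy tree, and the scoring convention (win = 1.0, draw = 0.5, loss = 0.0 from the first player's view) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    # Assumed convention: 1.0 = win, 0.5 = draw, 0.0 = loss for the first player.
    value: float


@dataclass
class Decision:
    # A position where a player chooses a move.
    maximizing: bool          # True if the first player is to move
    children: List["Node"]


@dataclass
class Chance:
    # A random event, e.g. revealing a hidden piece: (probability, resulting position).
    outcomes: List[Tuple[float, "Node"]]


Node = Union[Leaf, Decision, Chance]


def theoretical_value(node: Node) -> float:
    """Return the expected win rate of a position under optimal play."""
    if isinstance(node, Leaf):
        return node.value
    if isinstance(node, Chance):
        # Chance nodes average over outcomes, weighted by probability.
        return sum(p * theoretical_value(child) for p, child in node.outcomes)
    values = [theoretical_value(child) for child in node.children]
    return max(values) if node.maximizing else min(values)


# Hypothetical toy position: take a risky random move or settle for a draw.
risky = Chance([(0.75, Leaf(1.0)), (0.25, Leaf(0.0))])
safe = Leaf(0.5)
root = Decision(maximizing=True, children=[risky, safe])

value = theoretical_value(root)
print(value)
```

In this toy tree the risky move has an expected win rate of 0.75, which exceeds the certain draw, so the value of the root is a fraction rather than a win, loss, or draw. A learned value network for a non-deterministic game would need to approximate such fractional targets.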

Links

DOI: https://doi.org/10.1109/taai.2018.00034
URL: https://ieeexplore.ieee.org/document/8588490

