Reinforcement-learning agents with different temperature parameters explain the variety of human action-selection behavior in a Markov decision process task

NEUROCOMPUTING

Fumihiko Ishida
Takahiro Sasaki
Yutaka Sakaguchi
Hiroyuki Shimai

Volume: 72
Issue: 7-9
First page: 1979
Last page: 1984
Language: English
Publication type: Research paper (academic journal)
DOI: 10.1016/j.neucom.2008.04.009
Publisher: ELSEVIER SCIENCE BV

We investigated the characteristics of human action selection in a Markov decision process (MDP) task and compared them with those of reinforcement-learning (RL) agents. The behavior of the human participants fell roughly into two qualitatively different types. Surprisingly, however, this variety of human behavior could be explained by a single parameter, the degree of randomness (i.e., the temperature parameter) in the action-selection rules of the RL agents. This result implies that the variety of human action-selection behavior may be determined by a simple mechanism in the brain. (c) 2008 Elsevier B.V. All rights reserved.
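
The abstract does not state which action-selection rule the RL agents used. As a minimal sketch, assuming the common Boltzmann (softmax) rule in which a temperature parameter scales the action values, the effect of temperature on randomness can be illustrated as follows. The function names and the example action values are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax_probs(action_values, temperature):
    """Return Boltzmann (softmax) selection probabilities for each action.

    Assumption: P(a) is proportional to exp(Q(a) / temperature). The record
    above does not confirm that the paper uses exactly this rule.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum value for numerical stability; this does not
    # change the resulting probabilities.
    max_value = max(action_values)
    weights = [math.exp((v - max_value) / temperature) for v in action_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(action_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(action_values, temperature)
    return rng.choices(range(len(action_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    values = [1.0, 0.5, 0.0]  # hypothetical action values for one state
    for temperature in (0.1, 1.0, 10.0):
        probs = softmax_probs(values, temperature)
        print(temperature, [round(p, 3) for p in probs])
```

Under this assumed rule, a low temperature makes the agent choose the highest-valued action almost deterministically, while a high temperature makes the choice close to uniform. A single parameter can therefore produce qualitatively different behavior from the same learned values.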

Links

DOI: https://doi.org/10.1016/j.neucom.2008.04.009
J-GLOBAL: https://jglobal.jst.go.jp/detail?JGLOBAL_ID=201302259451895166
Web of Science: https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000264993200062&DestApp=WOS_CPL

Identifiers

DOI : 10.1016/j.neucom.2008.04.009
ISSN : 0925-2312
J-Global ID : 201302259451895166
Web of Science ID : WOS:000264993200062
