Abstract Archive

Neuroscience 2005 Abstract

Presentation Number:	400.10
Abstract Title:	A reinforcement learning model predicts monkey's choice and dorsal striatal activities.
Authors:	Samejima, K.*¹ ; Ueda, Y. ; Doya, K.¹ ; Kimura, M. ¹CNB, ATR-CNS, Kyoto, Japan
Primary Theme and Topics	Sensory and Motor Systems - Basal Ganglia -- Systems physiology and behavior
Secondary Theme and Topics	Cognition and Behavior - Motivation and Emotion -- Learning
Session:	400. Basal Ganglia Systems and Behavior I Poster
Presentation Time:	Monday, November 14, 2005 9:00 AM-10:00 AM
Location:	Washington Convention Center - Hall A-C, Board # Z12
Keywords:	REWARD, LEARNING, STRIATUM, BASAL GANGLIA

To make decisions in a dynamic environment, an animal must update its evaluation of candidate actions based on past experience of actions and rewards. Reinforcement learning (RL) explains reward-based decision-making and adaptive choice of actions in three steps: i) estimate the action value, i.e., how much reward an action will yield; ii) compare the action values of the alternatives to select an action; and iii) update the action value by the discrepancy between the expected and the acquired reward after the action. To test whether the striatum encodes action value, we recorded the activity of striatal projection neurons in two monkeys making free choices between left and right turns of a handle. The reward probability of each of the two actions was fixed at 10, 50, or 90% within a block of trials but varied between blocks. In a previous report (Samejima et al., 2003 SfN abstract), we showed that striatal activity was modulated by the reward probability for a particular action in block-by-block comparisons. To study the adaptive process of striatal activity on a trial-by-trial basis, we developed a novel RL model-based method that estimates action values on each trial using a Bayesian inference method, the ‘particle filter’. The RL model, with action values estimated from the past history of choices and rewards, successfully predicted the monkeys’ subsequent choices. Regression analysis of striatal neuronal discharge rates before the action choice against the estimated action values showed that about half of the neurons (66/142) were correlated with the estimated action values. Three-fourths of them (48/66) specifically encoded the action value for one of the two actions rather than the difference of action values (10/66) or an action-independent value (8/66). These results support the RL model of the basal ganglia, in which striatal projection neurons encode action values that are updated by the reward expectation error signal carried by midbrain dopamine neurons and are used for decision and action selection.
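
The three RL steps named in the abstract can be illustrated with a minimal Python sketch of a two-choice task. This is not the authors' model: the abstract does not give the choice rule, the update rule, or any parameter values, so the logistic (softmax) choice rule, the learning rate `alpha`, the inverse temperature `beta`, the initial values of 0.5, and all function names below are assumptions made for illustration. The particle-filter estimation of action values is not reproduced here; the sketch simply simulates an agent with fixed, known parameters.

```python
import math
import random


def choice_prob_left(q_left, q_right, beta):
    """Step ii: compare the two action values (assumed logistic/softmax rule)."""
    return 1.0 / (1.0 + math.exp(-beta * (q_left - q_right)))


def update_value(q, reward, alpha):
    """Step iii: shift the value by the discrepancy between acquired and expected reward."""
    return q + alpha * (reward - q)


def simulate_block(p_left, p_right, n_trials, alpha=0.1, beta=3.0, seed=0):
    """Simulate one block with fixed reward probabilities for left and right turns."""
    rng = random.Random(seed)
    # Step i: current action-value estimates (hypothetical starting point of 0.5).
    q = {"left": 0.5, "right": 0.5}
    reward_prob = {"left": p_left, "right": p_right}
    history = []
    for _ in range(n_trials):
        p = choice_prob_left(q["left"], q["right"], beta)
        action = "left" if rng.random() < p else "right"
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        # Only the chosen action's value is updated.
        q[action] = update_value(q[action], reward, alpha)
        history.append((action, reward, q["left"], q["right"]))
    return history


if __name__ == "__main__":
    # Example block: left rewarded with probability 0.9, right with 0.1.
    for trial in simulate_block(0.9, 0.1, 10):
        print(trial)
```

In this sketch the trial-by-trial values stored in `history` play the role of the estimated action values that the abstract regresses against neuronal discharge rates; in the study itself those values were inferred from the monkeys' observed choices and rewards rather than simulated.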

Supported by grants from MEXT, CREST/JST, and by NICT

Sample Citation:

[Authors]. [Abstract Title]. Program No. XXX.XX. 2005 Neuroscience Meeting Planner. Washington, DC: Society for Neuroscience, 2005. Online.

Copyright © 2005-2026 Society for Neuroscience; all rights reserved. Permission to republish any abstract or part of any abstract in any form must be obtained in writing by SfN office prior to publication.
