Reward rate optimization in two-alternative decision making: empirical tests of theoretical predictions

J Exp Psychol Hum Percept Perform. 2009 Dec;35(6):1865-97. doi: 10.1037/a0016926.

Abstract

The drift-diffusion model (DDM) implements an optimal decision procedure for stationary, 2-alternative forced-choice tasks. The height of a decision threshold applied to accumulating information on each trial determines a speed-accuracy tradeoff (SAT) for the DDM, thereby accounting for a ubiquitous feature of human performance in speeded response tasks. However, little is known about how participants settle on particular tradeoffs. One possibility is that they select SATs that maximize a subjective rate of reward earned for performance. For the DDM, there exist unique, reward-rate-maximizing values for its threshold and starting point parameters in free-response tasks that reward correct responses (R. Bogacz, E. Brown, J. Moehlis, P. Holmes, & J. D. Cohen, 2006). These optimal values vary as a function of response-stimulus interval, prior stimulus probability, and relative reward magnitude for correct responses. We tested the resulting quantitative predictions regarding response time, accuracy, and response bias under these task manipulations and found that grouped data conformed well to the predictions of an optimally parameterized DDM.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Decision Making*
  • Differential Threshold
  • Humans
  • Models, Psychological
  • Probability Learning
  • Psychomotor Performance*
  • Reaction Time
  • Reinforcement Schedule
  • Reward*