RT Journal Article
SR Electronic
T1 Mapping the unknown: The spatially correlated multi-armed bandit
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 106286
DO 10.1101/106286
A1 Charley M. Wu
A1 Eric Schulz
A1 Maarten Speekenbrink
A1 Jonathan D. Nelson
A1 Björn Meder
YR 2017
UL http://biorxiv.org/content/early/2017/02/06/106286.abstract
AB We introduce the spatially correlated multi-armed bandit as a task coupling function learning with exploration-exploitation. Participants interact with bivariate reward functions on a two-dimensional grid, with the goal of either gaining the largest average score or finding the largest payoff. By providing an opportunity to learn the underlying reward function through spatial correlations, we model to what extent people form beliefs about unexplored payoffs and how those beliefs guide search behavior. Participants adapted to assigned payoff conditions, performed better in smooth than in rough environments, and, surprisingly, sometimes seemed to perform equally well in short as in long search horizons. Our modeling results indicate a tendency towards local search options, which, when accounted for, suggests that participants were best described as forming only very local inferences about unexplored regions, combined with a search strategy that directly trades off between exploiting high expected rewards and exploring to reduce uncertainty.