Understanding the behavioral and psychological mechanisms underlying social behaviors is one of the major outstanding goals of social evolutionary theory. In particular, a persistent question about human--and animal--cooperation is to what extent it is supported by other-regarding preferences--the motivation to increase the welfare of others. In real-world situations, individuals have the opportunity to learn from past interactions, so we may ask how individuals evolve to learn to satisfy their social preferences during the course of an interaction. In this context, the rewards an individual receives from his social behaviors capture his preferences. In this paper, we develop a mathematical model to ask whether the mere act of cooperating with a social partner will evolve to be inherently rewarding. In our model, individuals interact repeatedly in pairs and adjust their behaviors through reinforcement learning. Individuals associate a subjective utility with each game outcome, which constitutes the reward for that particular outcome. We find that utilities that value mutual cooperation positively but the sucker's outcome negatively are evolutionarily stable. In a reduced model, other-regarding preferences can co-exist with preferences that match the sign of the material payoffs under narrow conditions. In simulations of the full model, we find that selfish preferences that always lead individuals to learn pure defection are also evolutionarily successful. These findings are consistent with empirical observations showing that humans tend to behave according to distinct behavioral types, and call for further integration of different levels of biological and social determinants of behavior.