Our brains allow us to reason about alternatives and to make choices that are likely to pay off. Often there is no one correct answer, but instead one that is favoured simply because it is more likely to lead to reward. A variety of probabilistic classification tasks probe the covert strategies that humans use to decide among alternatives based on evidence that bears only probabilistically on outcome. Here we show that rhesus monkeys can also achieve such reasoning. We have trained two monkeys to choose between a pair of coloured targets after viewing four shapes, shown sequentially, that governed the probability that one of the targets would furnish reward. Monkeys learned to combine probabilistic information from the shape combinations. Moreover, neurons in the parietal cortex reveal the addition and subtraction of probabilistic quantities that underlie decision-making on this task.
The authors argue that the brain reasons probabilistically, because they find that single-neuron responses (firing rates) correlate with a measure of probabilistic evidence derived from the probabilistic task setup. It is certainly true that the monkeys could learn the task (a variant of the weather prediction task), and I find the evidence presented in the paper generally compelling. However, the authors themselves note that similar correlations with firing rate may result from other quantitative measures with properties similar to those of the one considered here. Might firing rates, for example, correlate just as well with the expected value of a shape combination as derived from a reinforcement learning model?
What did they do in detail? They trained monkeys on a task in which they had to predict which of two targets would be rewarded, based on a set of four shapes presented on the screen. Each shape contributed a certain weight, defined by the experimenters, to the probability that a given target would be rewarded. The monkeys had to learn these weights. They also had to learn (implicitly) how the weights of the shapes combine to produce the probability of reward. After about 130,000 trials the monkeys were good enough to be tested. The trick in the experiment was that the four shapes were not presented simultaneously, but appeared one after the other. The question was whether neurons in the lateral intraparietal (LIP) area of the monkeys’ brains would represent the updated probability of reward after the addition of each new shape within a trial. This was the hypothesis, because results from previous experiments (see Gold & Shadlen, 2007 for review) suggested that neurons in LIP represent accumulated evidence in a perceptual decision-making paradigm.
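The sequential structure of a trial can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the shape names and weight values are hypothetical, and I assume that the weights add up in log-odds space (base 10 is an arbitrary choice here), so that the running sum after each shape maps onto a probability of reward through a logistic function.

```python
import math

# Hypothetical shape weights (log10 odds in favour of target A).
# These names and values are illustrative assumptions, not the paper's.
SHAPE_WEIGHTS = {"square": 0.9, "circle": 0.5, "triangle": -0.3, "star": -0.7}


def reward_probability(log_odds):
    """Convert summed log10 odds into P(target A is rewarded)."""
    return 10.0 ** log_odds / (1.0 + 10.0 ** log_odds)


def running_evidence(shapes):
    """Return (cumulative weight, probability) after each shape in the sequence."""
    total = 0.0
    trace = []
    for shape in shapes:
        total += SHAPE_WEIGHTS[shape]  # evidence is added or subtracted per shape
        trace.append((total, reward_probability(total)))
    return trace


# One example trial with four sequentially presented shapes.
trace = running_evidence(["square", "triangle", "star", "circle"])
for epoch, (log_odds, p) in enumerate(trace, start=1):
    print(f"epoch {epoch}: summed weight = {log_odds:+.1f}, P(A rewarded) = {p:.3f}")
```

Under these assumptions the quantity a neuron would have to track from epoch to epoch is the running sum, while the probability of reward is a nonlinear readout of that sum.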
Now Shadlen seems convinced that these neurons do not directly represent the relevant probabilities, but rather represent the log likelihood ratio (logLR) of one choice option over the other (see, e.g., Gold & Shadlen, 2001 and Shadlen et al., 2008). Hence, these ‘posterior’ probabilities play no role in the paper. Instead, all results are obtained for the logLR. Curiously, the task is defined solely in terms of the posterior probability of reward for a particular combination of four shapes, and the logLR needs to be computed from the posterior probabilities (Yang & Shadlen don’t lay out this detail in the paper or the supplementary information). I’m more open to the idea that posterior probabilities are represented directly, and I wondered what the correlation with logLR would look like if the firing rates represented posterior probabilities. This is easy to simulate in Matlab (see Yang2007.m). Such a simulation shows that, as a function of logLR, the firing rate (representing posterior probabilities) should follow a sigmoid function. Compare this prediction to Figures 2c and 3b for epoch 4. This sigmoidal relationship derives from the boundedness of the posterior probabilities, a boundedness that firing rates obviously share, as they cannot drop or rise indefinitely. So there could be simple reasons for the boundedness of firing rates other than that they represent probabilities, but in any case it appears unlikely that they represent unbounded log likelihood ratios.
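For readers without Matlab, the core of the argument can be reproduced with a short Python sketch. It is not a port of Yang2007.m. It assumes equal priors for the two targets (so that the posterior is a logistic function of the logLR), uses base-10 logarithms, and picks arbitrary baseline and gain values for the hypothetical firing rates.

```python
import math


def posterior_from_loglr(loglr):
    """Posterior P(A) implied by a log10 likelihood ratio, assuming equal priors."""
    return 1.0 / (1.0 + 10.0 ** (-loglr))


def rate_if_posterior(loglr, baseline=10.0, gain=40.0):
    """Hypothetical firing rate (spikes/s) that is linear in the posterior."""
    return baseline + gain * posterior_from_loglr(loglr)


def rate_if_loglr(loglr, baseline=30.0, gain=5.0):
    """Hypothetical firing rate (spikes/s) that is linear in the logLR itself."""
    return baseline + gain * loglr


# Sweep the logLR from -4 to 4 in steps of 0.5 and print both predictions.
loglrs = [step * 0.5 for step in range(-8, 9)]
print("logLR  rate(posterior code)  rate(logLR code)")
for loglr in loglrs:
    print(f"{loglr:+5.1f}  {rate_if_posterior(loglr):20.2f}  {rate_if_loglr(loglr):16.2f}")
```

With these made-up parameters the posterior code flattens out near 10 and 50 spikes/s at large absolute logLR, whereas the logLR code keeps changing by the same amount per unit of evidence. The flattening is the sigmoidal signature to look for in the data.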