A supramodal accumulation-to-bound signal that determines perceptual decisions in humans.

O’Connell, R. G., Dockree, P. M., and Kelly, S. P.
Nat Neurosci, 15:1729–1735, 2012
DOI, Google Scholar

Abstract

In theoretical accounts of perceptual decision-making, a decision variable integrates noisy sensory evidence and determines action through a boundary-crossing criterion. Signals bearing these very properties have been characterized in single neurons in monkeys, but have yet to be directly identified in humans. Using a gradual target detection task, we isolated a freely evolving decision variable signal in human subjects that exhibited every aspect of the dynamics observed in its single-neuron counterparts. This signal could be continuously tracked in parallel with fully dissociable sensory encoding and motor preparation signals, and could be systematically perturbed mid-flight during decision formation. Furthermore, we found that the signal was completely domain general: it exhibited the same decision-predictive dynamics regardless of sensory modality and stimulus features and tracked cumulative evidence even in the absence of overt action. These findings provide a uniquely clear view on the neural determinants of simple perceptual decisions in humans.

Review

The authors report EEG signals which may represent 1) the instantaneous evidence and 2) the accumulated evidence (decision variable) during perceptual decision making. The result promises a big leap for perceptual decision making experiments with humans, because it is the first time that we can directly observe the decision process as it accumulates evidence, with reasonable temporal resolution, without sticking needles into participants’ brains. Furthermore, one of the reported signals appears to be independent of sensory and response modality, i.e., it appears to reflect the decision process alone; this has not been clearly demonstrated in any other species. But let’s discuss the study in more detail.

The current view of the perceptual decision making process is formalised in accumulation-to-bound models: when presented with a stimulus, the decision maker estimates at each time point the current amount of evidence for each possible alternative. This estimate of “instantaneous evidence” is noisy, either because of noise within the stimulus itself, or because of internal processing noise. Therefore, the decision maker does not commit to an alternative immediately, but accumulates evidence over time until the accumulated evidence for one of the alternatives reaches a threshold, which is set internally by the decision maker and reflects a certain level of certainty, or response urgency. The alternative for which the threshold was crossed is the decision outcome, and the time at which it was crossed is the decision time (potentially plus an additional delay). The authors argue that they have found signals in the human EEG which can be associated with the instantaneous and accumulated evidence variables of these models.
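The dynamics described above can be made concrete with a minimal simulation of a two-alternative accumulation-to-bound trial (all parameter values here are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def accumulate_to_bound(drift=0.1, noise=1.0, bound=30.0, max_t=2000):
    """One simulated trial: integrate noisy instantaneous evidence to a bound.

    drift is the mean evidence per step (positive favours 'target present'),
    noise the standard deviation of the moment-to-moment evidence noise.
    Returns (choice, decision_time); choice 0 means no bound was reached.
    """
    dv = 0.0  # decision variable: accumulated evidence
    for t in range(1, max_t + 1):
        dv += drift + noise * rng.standard_normal()
        if dv >= bound:
            return +1, t   # upper bound crossed: 'target present'
        if dv <= -bound:
            return -1, t   # lower bound crossed: 'no target'
    return 0, max_t        # no decision within the deadline

choices, rts = zip(*(accumulate_to_bound() for _ in range(500)))
```

With a positive drift, most simulated trials end at the upper bound, and the spread of decision times arises purely from the moment-to-moment noise.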

The paradigm used in this study differs from the perceptual decision making paradigm popular in monkey studies (random dot stimuli). Here the stimuli did not move, but gradually changed their intensity or contrast: in the experiments with visual stimuli, participants continuously viewed a flickering disk which from time to time gradually changed its contrast against the background (the contrast gradually returned to the base level after 1.6s). So the participants had to decide whether the contrast currently differed from baseline. Note that this setup differs slightly from the usual trial-based perceptual decision making experiments, in which a formally new trial begins after the participant’s response. The disk also carried a pattern, but it is unclear why the pattern was necessary. The other stimulus properties, on the other hand, seem well chosen: the flickering induced something like continuous evoked potentials in the EEG, ensuring that something stimulus-related could be measured at all times, while the gradual change of contrast “successfully eliminated sensory-evoked deflections from the ERP trace”, such that the more subtle accumulated evidence signals were not masked by large deflections due solely to stimulus onsets. In the experiments with sounds, equivalent stimulus changes were implemented by gradually changing either the volume or the frequency of a continuously presented, envelope-modulated tone.

The authors report four EEG signals related to perceptual decision making. They argue that the occipital steady-state visual-evoked potential (SSVEP) indexed the estimated instantaneous evidence when visual stimuli were used, because its trajectory directly reflected the changes in contrast. For auditory stimuli, the authors found a corresponding steady-state auditory-evoked potential (SSAEP), located at more central EEG electrodes and at 40Hz instead of 20Hz (SSVEP). Further, the authors argue that left-hemisphere beta power (LHB, 22-30Hz) and a centro-parietal potential (CPP, a directly measured potential) can be interpreted as evidence accumulation signals, because the times of their peaks tightly predicted reaction times and their time courses were better predicted by the cumulative SSVEP than by the original SSVEP. LHB and CPP also (roughly) showed the expected dependency on whether the participant correctly identified the target or missed it (lower signals for misses). Furthermore, they behaved as expected when contrast varied in more complex ways than a simple linear decrease (a decrease followed by a short increase followed by a further decrease). The CPP differed from LHB in that it still showed the expected changes when the task did not require an overt response at target detection time, whereas LHB then showed no relation to the presented evidence; this suggests that LHB has something to do with motor preparation of the response while the CPP is a more abstract decision signal. Additionally, the CPP showed the same characteristic changes with auditory stimuli as with visual stimuli, and it depended on attentional focus: in one experimental condition the participants’ task was altered (‘detect a transient size change of a central fixation square’) while the original disk stimulus, including the gradual contrast changes, was still presented. In this ‘non-attend’ condition the SSVEP decreased with contrast as before, but the CPP showed no response, reinforcing the idea that the CPP is an abstract decision signal. On a final note, the authors speculate that the CPP could be identical to the standard P300 signal when transient stimuli, rather than gradual stimulus changes, need to be detected. This connection, if true, would provide a nice functional explanation of the P300.
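The key model-comparison step, testing whether the CPP follows the cumulative rather than the raw SSVEP, can be sketched on synthetic stand-in signals (nothing here is the authors’ data; the SSVEP is simulated as a noisy contrast ramp and the CPP as a noisy integral of it):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(400)

# Simulated stand-ins: the SSVEP tracks the ramping contrast plus sensory
# noise, and the CPP is modelled as a noisy integral of the SSVEP.
ssvep = 0.01 * t + 0.5 * rng.standard_normal(t.size)
cpp = 0.01 * np.cumsum(ssvep) + 0.5 * rng.standard_normal(t.size)

def r_squared(x, y):
    """R^2 of an ordinary least-squares line fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1.0 - residuals.var() / y.var()

r2_raw = r_squared(ssvep, cpp)              # instantaneous evidence as predictor
r2_cum = r_squared(np.cumsum(ssvep), cpp)   # cumulative evidence as predictor
```

Under these assumptions the cumulative SSVEP explains the CPP better than the raw SSVEP, which is the qualitative pattern the authors report for the real signals.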

Open Questions

Despite the generally intriguing results presented in the paper, a few questions remain; these predominantly concern details.

1) Omission of data

In Figs. 2 and 3 the SSVEP is not shown anymore, presumably because of space restrictions. Similarly, the LHB is not presented in Fig. 4. I can believe that the SSVEP behaved as expected in the different conditions of Figs. 2 and 3, such that the plots would not have added much information, but it would at least be interesting to know whether the accumulated SSVEP still predicted the LHB and CPP better than the original SSVEP in these conditions. Likewise, the authors do not report the equivalent analysis for the SSAEP in the auditory conditions. Regarding the omission of the LHB in Fig. 4, I am less sure about the behaviour of the LHB in the auditory conditions: it seems possible that the LHB behaves differently across modalities. There is no mention of this in the text, though.

2) Is there a common threshold level?

The authors argue that the LHB and CPP reached a common threshold level just before response initiation (a prediction of accumulation-to-bound models, Fig. 1c), but the test they used does not entirely convince me: they compared the variance just before response initiation with the variance of measurements across different time points (they randomly assigned the RT of one trial to another trial and computed the variance of the measurements at the shuffled time points). For a strongly varying function of time, it is no surprise that measurements at a consistent time point vary less than measurements made across many different time points, as long as the measurement noise is small enough. Given this argument, it is strange that they did not find a significant difference for the SSVEP, which also varies strongly across time (this fits their interpretation, though); but this lack of a difference could be explained by larger measurement noise in the SSVEP.
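The concern can be illustrated with a toy counterexample (entirely synthetic, with made-up parameters): a purely response-locked waveform with no accumulation and no threshold also passes the shuffled-RT variance test.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_samples = 200, 600
rts = rng.integers(200, 500, n_trials)  # response times, not produced by any bound

# Each trial is the same bump centred on that trial's RT (e.g. a motor
# potential) plus measurement noise; no evidence is accumulated anywhere.
t = np.arange(n_samples)
bump = lambda c: np.exp(-0.5 * ((t - c) / 30.0) ** 2)
signals = np.array([bump(rt) for rt in rts]) \
    + 0.05 * rng.standard_normal((n_trials, n_samples))

at_own_rt = signals[np.arange(n_trials), rts]
at_shuffled_rt = signals[np.arange(n_trials), rng.permutation(rts)]
# Variance at each trial's own RT is far lower than at shuffled RTs, even
# though no common threshold level generated these responses.
```

So a low variance at response time only shows that the signal is strongly time-locked to the response; by itself it does not establish a shared decision bound.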

Furthermore, the authors themselves report a significant difference between the sizes of the CPP peaks around decision time for varying contrast levels (Fig. 2c). In particular, the CPP peak for false alarms (no contrast change, but a participant response) was lower than the other peaks. If the CPP really were the decision variable predicted by the models, these differences should not have occurred. So where do they come from? The authors provide arguments that I cannot follow without further explanation.

3) Timing of peaks

It appears that the mean reaction time slightly precedes the peaks of the mean signals. The effect is particularly clear in Fig. 3b (CPP), Fig. 4d (CPP) and Fig. 5a, but is also faintly visible in the response-locked averages in Figs. 1c and 2c. Presuming a delay between the internal decision and the actual response, the peak of the decision variable should precede the reaction time, especially when reaction time is measured from button presses (here) rather than from saccade initiation (typical monkey experiments). So why does it appear to be the other way round here?

4) Variance of SSVEP baseline

The SSVEP in Fig. 4a is in a different range (1.0-1.3) than the SSVEP in Fig. 4d (1.7-2.5) even though the two plots should each contain a time course for the same experimental condition. Where does the difference come from?

5) Multiple alternatives

The CPP, as described by the authors, is a single, global decision variable signal. If the decision problem comprises only two alternatives, a single decision variable is indeed sufficient for decision making, but if more alternatives are considered, several evidence-accumulating variables are needed. What would the CPP then signal? One of the decision variables? The overall certainty of the upcoming decision?
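The multi-alternative case is usually formalised as a race between accumulators, one per alternative; a minimal sketch (illustrative parameters, not tied to the paper) makes clear that there is no longer a single decision variable to point at:

```python
import numpy as np

rng = np.random.default_rng(3)

def race(drifts, bound=30.0, noise=0.5, max_t=5000):
    """Race model: one noisy accumulator per alternative; the first to hit
    the shared bound determines the choice and the decision time."""
    dv = np.zeros(len(drifts))
    for t in range(1, max_t + 1):
        dv += np.asarray(drifts) + noise * rng.standard_normal(len(drifts))
        winners = np.flatnonzero(dv >= bound)
        if winners.size:
            return int(winners[np.argmax(dv[winners])]), t
    return -1, max_t  # no accumulator reached the bound in time

choice, decision_time = race([0.20, 0.05, 0.05])
```

With three accumulators, a single scalar EEG signal could reflect the winning accumulator, the maximum over accumulators, or some summary such as overall certainty; the paper's two-alternative data cannot distinguish these readings.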

Conclusion

I do like the results in the paper. If they hold up, the CPP may provide a high-temporal-resolution window into the decision processes of humans. It may then allow us to investigate decision processes in situations more complex than those animals can master, though perhaps it is only a signal for the simple perceptual decisions investigated here. Based on the open questions above, I also suspect that the reported signals were noisier than the plots make us believe, and the correspondence of the CPP with theoretical decision variables should be examined further.

Category-specific versus category-general semantic impairment induced by transcranial magnetic stimulation.

Pobric, G., Jefferies, E., and Ralph, M. A. L.
Curr Biol, 20:964–968, 2010
DOI, Google Scholar

Abstract

Semantic cognition permits us to bring meaning to our verbal and nonverbal experiences and to generate context- and time-appropriate behavior. It is core to language and nonverbal skilled behaviors and, when impaired after brain damage, it generates significant disability. A fundamental neuroscience question is, therefore, how does the brain code and generate semantic cognition? Historical and some contemporary theories emphasize that conceptualization stems from the joint action of modality-specific association cortices (the “distributed” theory) reflecting our accumulated verbal, motor, and sensory experiences. Parallel studies of semantic dementia, rTMS in normal participants, and neuroimaging indicate that the anterior temporal lobe (ATL) plays a crucial and necessary role in conceptualization by merging experience into an amodal semantic representation. Some contemporary computational models suggest that concepts reflect a hub-and-spoke combination of information–modality-specific association areas support sensory, verbal, and motor sources (the spokes) while anterior temporal lobes act as an amodal hub. We demonstrate novel and striking evidence in favor of this hypothesis by applying rTMS to normal participants: ATL stimulation generates a category-general impairment whereas IPL stimulation induces a category-specific deficit for man-made objects, reflecting the coding of praxis in this neural region.

Review

This is a short TMS experiment investigating the role of the left anterior temporal lobe (ATL) in the semantic processing of stimuli. Semantics is here practically defined as the association of an object with the high-level category that defines it. The task was simply to name the object shown in a picture. The involvement of the ATL in this task is suggested by patients with semantic dementia, who forget the meaning of categories/objects, i.e., they cannot associate a perceived object with its category/class (example: they see a sheep and don’t know what it is; do they still know what a sheep is if you tell them that it is a sheep?).

The experiment is supposed to differentiate between three hypotheses: 1) object meaning results from a distributed representation of the stimulus across all modalities, 2) object meaning is generated only in the ATL, with other areas providing merely sensory input, and 3) part of the object meaning is already generated in unimodal areas while the ATL acts as an amodal integration hub. These hypotheses are only described verbally, and indeed it seems difficult to differentiate between 2) and 3).

The experiment shows that 10 minutes of repetitive TMS can increase subjects’ response times in the picture-naming task, but not in a number-reading task, if the TMS is applied to the left ATL. In a post-hoc analysis the authors then divided the shown pictures into living vs. nonliving and low- vs. high-manipulability objects and again looked for interactions with TMS stimulation. They found that stimulation of the left IPL, an area associated with manipulable objects, had an effect on nonliving and highly manipulable objects while having no effect on the others. Stimulation of the ATL, however, had a (smaller) effect on all categories. Furthermore, stimulation of the occipital lobe had no effect for any task or stimulus. The authors conclude that this is evidence for hypothesis 3) above.

A major concern with the study is that the main result was obtained in a post-hoc analysis, and the authors did not specify precisely which pictures they used in this analysis, e.g., we do not know which objects were among them. Furthermore, the results do not really allow any conclusions about the connectivity of the different regions. Hypotheses 2) and 3) cannot be discerned from the given results. Even hypothesis 1) could still be true, if one assumes that the ATL is mainly a region for producing the verbal output of a category: something necessary for the task, but not necessarily involved in the association with a category. However, Katharina mentioned that the ATL has also been implicated in experiments with other output modalities (e.g. drawing). So what remains, if one believes the post-hoc analysis, is that TMS over the ATL disrupts picture naming in general while TMS over the IPL disrupts picture naming selectively for nonliving, highly manipulable objects. We cannot completely rule out any of the hypotheses above.

Bayesian estimation of dynamical systems: an application to fMRI.

Friston, K. J.
Neuroimage, 16:513–530, 2002
DOI, Google Scholar

Abstract

This paper presents a method for estimating the conditional or posterior distribution of the parameters of deterministic dynamical systems. The procedure conforms to an EM implementation of a Gauss-Newton search for the maximum of the conditional or posterior density. The inclusion of priors in the estimation procedure ensures robust and rapid convergence and the resulting conditional densities enable Bayesian inference about the model parameters. The method is demonstrated using an input-state-output model of the hemodynamic coupling between experimentally designed causes or factors in fMRI studies and the ensuing BOLD response. This example represents a generalization of current fMRI analysis models that accommodates nonlinearities and in which the parameters have an explicit physical interpretation. Second, the approach extends classical inference, based on the likelihood of the data given a null hypothesis about the parameters, to more plausible inferences about the parameters of the model given the data. This inference provides for confidence intervals based on the conditional density.

Review

I presented the algorithm which underlies various forms of dynamic causal modelling and which we use to estimate RNN parameters. At its core is an iterative computation of the posterior over the parameters of a dynamical model, based on a first-order Taylor approximation of a meta-function mapping parameter values to observations; the dynamical system is hidden inside this function, so the probabilistic model does not have to care about it. This is possible because the dynamics is assumed to be deterministic and noise only enters at the level of the observations. It can be shown that the resulting update equations for the posterior mode are equivalent to a Gauss-Newton optimisation of the log-joint probability of observations and parameters (i.e., MAP estimation of the parameters). Consequently, the rate of convergence may be up to quadratic, but the procedure is not guaranteed to increase the log-joint at every step, or to converge at all. It should work well close to an optimum (when observations are well fitted), or when the dynamics is close to linear in the parameters. Because the dynamical system has to be integrated numerically to obtain the predicted observations, and the Jacobian of the observations with respect to the parameters is also obtained numerically, the algorithm can be very slow.
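The update at the heart of the algorithm can be sketched as follows. This is a minimal stand-in, not the SPM implementation: the EM loop for noise hyperparameters is omitted, the observation covariance is assumed known, and the Jacobian is taken by finite differences, mirroring the numerical approach described above.

```python
import numpy as np

def map_gauss_newton(h, y, theta0, prior_mean, prior_cov, obs_cov,
                     n_iter=50, eps=1e-6):
    """MAP estimation for a deterministic model y ~ h(theta) + Gaussian noise.

    h maps parameters to the stacked observation vector; any dynamical system
    (e.g. a numerically integrated ODE) is hidden inside h. Returns the
    posterior mode and the Gaussian (Laplace) posterior covariance.
    """
    theta = np.asarray(theta0, float)
    P_inv = np.linalg.inv(prior_cov)   # prior precision
    R_inv = np.linalg.inv(obs_cov)     # observation precision
    for _ in range(n_iter):
        pred = h(theta)
        # Finite-difference Jacobian of the observations w.r.t. the parameters.
        J = np.empty((pred.size, theta.size))
        for j in range(theta.size):
            d = np.zeros_like(theta)
            d[j] = eps
            J[:, j] = (h(theta + d) - pred) / eps
        # Gauss-Newton step on the negative log-joint (likelihood + prior).
        A = J.T @ R_inv @ J + P_inv
        g = J.T @ R_inv @ (y - pred) - P_inv @ (theta - prior_mean)
        step = np.linalg.solve(A, g)
        theta = theta + step
        if np.linalg.norm(step) < 1e-10:
            break
    return theta, np.linalg.inv(A)
```

For example, fitting the decay rate a of x' = -a·x from sampled observations works well here, since the problem is one-dimensional and smooth; as noted above, convergence is not guaranteed in general.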

This algorithm is described in Friston2002 embedded in an application to fMRI. I did not present the specifics of this application and, in particular, ignored the influence of the inputs u defined there. The derivation of the parameter posterior described above is embedded in an EM algorithm for hyperparameters on the covariance of the observations. I will discuss this in a future session.

Expectation and surprise determine neural population responses in the ventral visual stream.

Egner, T., Monti, J. M., and Summerfield, C.
J Neurosci, 30:16601–16608, 2010
DOI, Google Scholar

Abstract

Visual cortex is traditionally viewed as a hierarchy of neural feature detectors, with neural population responses being driven by bottom-up stimulus features. Conversely, “predictive coding” models propose that each stage of the visual hierarchy harbors two computationally distinct classes of processing unit: representational units that encode the conditional probability of a stimulus and provide predictions to the next lower level; and error units that encode the mismatch between predictions and bottom-up evidence, and forward prediction error to the next higher level. Predictive coding therefore suggests that neural population responses in category-selective visual regions, like the fusiform face area (FFA), reflect a summation of activity related to prediction (“face expectation”) and prediction error (“face surprise”), rather than a homogenous feature detection response. We tested the rival hypotheses of the feature detection and predictive coding models by collecting functional magnetic resonance imaging data from the FFA while independently varying both stimulus features (faces vs houses) and subjects’ perceptual expectations regarding those features (low vs medium vs high face expectation). The effects of stimulus and expectation factors interacted, whereby FFA activity elicited by face and house stimuli was indistinguishable under high face expectation and maximally differentiated under low face expectation. Using computational modeling, we show that these data can be explained by predictive coding but not by feature detection models, even when the latter are augmented with attentional mechanisms. Thus, population responses in the ventral visual stream appear to be determined by feature expectation and surprise rather than by stimulus features per se.

Review

In general the design of the study is interesting: it is an fMRI study investigating the effects of a stimulus presented immediately before the actually analysed stimulus, i.e., temporal dependencies between sequentially presented stimuli, of which predictability is one special case (priming studies would also fall into this category; I do not know how well they have been studied with fMRI).

While the original predictive coding and feature detection models are convincing, the feature detection + attention models are confusing. All models seem to lack a baseline. The attention models are somehow defined on the “differential FFA response” and this is not further explained. The f b_1 part of the attention models can actually be reduced to b_1.
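The contrast between the model classes can be made concrete with a toy version of the predictive-coding account (made-up numbers and a rectified error term as a simplifying assumption, not the paper's fitted model): FFA activity sums a face-expectation term from representational units and a face-surprise term from error units.

```python
import numpy as np

p_face = np.array([0.25, 0.50, 0.75])  # low / medium / high face expectation

def ffa_response(stimulus_is_face, p):
    """Toy predictive-coding FFA response: prediction + rectified surprise."""
    prediction = p                                   # representational units
    error = (1.0 if stimulus_is_face else 0.0) - p   # face prediction error
    return prediction + max(error, 0.0)              # rectified 'face surprise'

face_resp = np.array([ffa_response(True, p) for p in p_face])
house_resp = np.array([ffa_response(False, p) for p in p_face])
diff = face_resp - house_resp
# diff shrinks as face expectation grows: face and house responses are
# maximally differentiated under low expectation and converge under high
# expectation, qualitatively matching the reported interaction.
```

A pure feature-detection model, in contrast, would predict a constant face/house difference regardless of expectation, which is why the interaction is the critical test.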

Katharina noted that you should apply a small-sample correction if you want to do the ROI analysis properly, which the authors did not do here.

The paper does not differentiate between prediction error and surprise: surprise is the precision-weighted prediction error.
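The distinction can be made concrete in a generic sketch (this is the usual predictive-coding convention, not notation from the paper):

```python
def precision_weighted_error(observation, prediction, noise_variance):
    """Surprise as precision-weighted prediction error: the raw mismatch
    scaled by the reliability (inverse variance) of the signal."""
    prediction_error = observation - prediction
    precision = 1.0 / noise_variance
    return precision * prediction_error

# The same mismatch counts as more surprising when the signal is reliable:
reliable = precision_weighted_error(2.0, 1.0, 0.5)    # precision 2.0 -> 2.0
unreliable = precision_weighted_error(2.0, 1.0, 4.0)  # precision 0.25 -> 0.25
```

So two conditions with identical raw prediction errors can still differ in surprise if the precision assigned to the input differs, which is exactly the distinction the paper glosses over.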