The Neural Costs of Optimal Control.

Gershman, S. J. and Wilson, R. C.
in: Advances in Neural Information Processing Systems 23, 2010


Optimal control entails combining probabilities and utilities. However, for most practical problems, probability densities can be represented only approximately. Choosing an approximation requires balancing the benefits of an accurate approximation against the costs of computing it. We propose a variational framework for achieving this balance and apply it to the problem of how a neural population code should optimally represent a distribution under resource constraints. The essence of our analysis is the conjecture that population codes are organized to maximize a lower bound on the log expected utility. This theory can account for a plethora of experimental data, including the reward-modulation of sensory receptive fields, GABAergic effects on saccadic movements, and risk aversion in decisions under uncertainty.


Within decision theory, the authors consider the problem of evaluating the expected utility of an action given a posterior. They propose a variational framework analogous to the one used for the EM algorithm, in which the utility replaces the likelihood and the posterior replaces the prior. Their main contribution is to include a cost penalising the complexity of the approximation of the posterior, and to use the resulting lower bound on the log expected utility to optimise the density that approximates the posterior. Because the utility contains not only the cost of the approximation but also the actual utility of an action in the considered states, the model predicts that the approximating density should also reflect what is behaviourally relevant, rather than only trying to represent the natural posterior distribution optimally.
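
For reference, the generic form of such a bound follows from Jensen's inequality. The notation here is my own assumption rather than the paper's: p(s) is the posterior over states s, U(s, a) > 0 is the utility of action a in state s, and q(s) is the approximating density. The paper's specific neural cost term is not reproduced; as described above, it enters through the utility.

```latex
\begin{align}
\log \mathbb{E}_{p}\left[U(s,a)\right]
  &= \log \int q(s)\, \frac{p(s)\, U(s,a)}{q(s)}\, ds \\
  &\ge \int q(s)\, \log \frac{p(s)\, U(s,a)}{q(s)}\, ds \\
  &= \mathbb{E}_{q}\left[\log U(s,a)\right] - \mathrm{KL}\left(q \,\|\, p\right)
\end{align}
```

The bound is tight when q(s) is proportional to p(s) U(s, a), which already hints at why the optimised density is pulled towards high-utility states.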

In the results section they show that under this model the approximated posterior can (and will) indeed put more probability mass on states with larger utility, something which has apparently been found in grasshoppers. Additionally, they show that increasing the cost of spikes results in lower firing rates, which, they argue, leads to response latencies as seen in experiments. Finally, they show that, under the assumption that states with very high utility or very high cost are rare, the model automatically accounts for the nonlinear weighting of probabilities in risky choice observed in humans. The model therefore explains this irrational behaviour by noting that “under neural resource constraints, the approximate density will be biased towards high reward regions of the state space.”
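
To make the first point concrete, here is a minimal numerical sketch on a discrete state space. It is not the authors' model: there is no neural cost term, and the three-state posterior `p` and utility `u` are made-up illustrative values. It only evaluates the Jensen-type bound E_q[log U] - KL(q||p) and shows that the density maximising it, q proportional to p·U, carries more mass on the high-utility state than the posterior itself.

```python
import numpy as np

def lower_bound(q, p, u):
    """Jensen lower bound on log E_p[U]: sum_s q(s) * log(p(s) * u(s) / q(s))."""
    q, p, u = (np.asarray(x, dtype=float) for x in (q, p, u))
    mask = q > 0  # states with q(s) = 0 contribute nothing to the sum
    return float(np.sum(q[mask] * (np.log(p[mask]) + np.log(u[mask]) - np.log(q[mask]))))

# Hypothetical posterior and (strictly positive) utilities over three states.
p = np.array([0.7, 0.2, 0.1])
u = np.array([1.0, 2.0, 10.0])

log_eu = float(np.log(np.sum(p * u)))   # exact log expected utility
q_star = p * u / np.sum(p * u)          # unconstrained maximiser of the bound

print("log E_p[U]          :", log_eu)
print("bound with q = p    :", lower_bound(p, p, u))
print("bound with q = q*   :", lower_bound(q_star, p, u))
print("posterior p         :", p)
print("utility-weighted q* :", q_star)
```

With q = p the bound is loose; with q = q* it equals the log expected utility. How a resource cost changes this picture is the subject of the paper and is not captured here.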

I don’t know enough to judge the correspondence of the model behaviour/predictions with the experimental results, or whether the model contradicts some other results. However, the paper is quite inspiring in the sense that it presents an intuitive idea which potentially has big implications for how populations of neurons code probability distributions, namely that the neuronal codes are influenced as much by expected rewards as by the natural distribution. Of course, the paper leaves many questions open. The authors only show results for when the approximate distributions are optimised alone, but what happens when actions and distribution are optimised simultaneously? What are the timescales of the distribution optimisation? Is it really instantaneous (on the same timescale as action selection) as the authors indicate, or is it rather a slower process? Their proposal also has the potential to explain how the dimensionality of the state space can be reduced by only considering states which are behaviourally relevant. However, it remains unclear to what extent this specialisation should be implemented. In other words, does the posterior depend, e.g., on the precise goal in the task, or only on the selected task? The cost-of-spiking example in particular suggests a connection between the proposed mechanism and attention. Can attention be explained by this low-level description of biased representations of the posterior distribution?

The paper is quite inspiring, and you kind of wonder why nobody has made these ideas explicit before, or maybe somebody has? Maneesh Sahani actually had a NIPS paper in 2004, which they cite but do not comment on, and which (judging from the abstract) looks very similar to what they do here.

The role of sensory network dynamics in generating a motor program.

Levi, R., Varona, P., Arshavsky, Y. I., Rabinovich, M. I., and Selverston, A. I.
J Neurosci, 25:9807–9815, 2005


Sensory input plays a major role in controlling motor responses during most behavioral tasks. The vestibular organs in the marine mollusk Clione, the statocysts, react to the external environment and continuously adjust the tail and wing motor neurons to keep the animal oriented vertically. However, we suggested previously that during hunting behavior, the intrinsic dynamics of the statocyst network produce a spatiotemporal pattern that may control the motor system independently of environmental cues. Once the response is triggered externally, the collective activation of the statocyst neurons produces a complex sequential signal. In the behavioral context of hunting, such network dynamics may be the main determinant of an intricate spatial behavior. Here, we show that (1) during fictive hunting, the population activity of the statocyst receptors is correlated positively with wing and tail motor output suggesting causality, (2) that fictive hunting can be evoked by electrical stimulation of the statocyst network, and (3) that removal of even a few individual statocyst receptors critically changes the fictive hunting motor pattern. These results indicate that the intrinsic dynamics of a sensory network, even without its normal cues, can organize a motor program vital for the survival of the animal.


The authors investigate the neural mechanisms of hunting behaviour in a mollusk. Its simplicity allows the nervous system to be completely separated from the rest of the body and investigated in isolation, but as a whole. In particular, the authors are interested in the causal influence of sensory neurons on motor activity.

The mollusk has two types of behaviour for positioning its body in the water: 1) it uses gravitational sensors (statocysts) to maintain a head-up position in the water under normal circumstances and 2) it swims in apparently chaotic, small loops when it suspects prey in its vicinity (searching). In this paper the authors present evidence that the searching behaviour 2) is still largely dependent on the (internal) dynamics of the statocysts.

The model is as follows (see Varona2002): without prey, inhibitory connections between sensory cells in the statocysts make sure that only a small proportion of cells are firing (those that are activated by mechanoreceptors according to gravity acting on a stone-like structure in the statocysts). When prey is in the vicinity of the mollusk (as indicated by, e.g., chemoreceptors), cerebral hunting neurons additionally excite the statocyst cells, inducing chaotic dynamics between them. The important thing to note is that the statocysts then still influence motor behaviour, as shown in the paper. So the argument is that the same mechanism for producing motor output dependent on statocyst signals can be used to generate searching, just by changing the activity of the sensory neurons.

Overall, the paper presents convincing evidence that statocyst activity influences the activity of the motor neurons during searching behaviour as well. It cannot be said conclusively, however, that the statocysts are necessary for producing the swimming, because the setup only allowed the activity of motor neurons to be observed, not the behaviour itself (Levi2004 actually show that the typical searching behaviour cannot be produced when the statocysts are removed). For the same reason, the experiments also neglected possible feedback between the mollusk’s body and the environment, e.g. changes in statocyst activity due to a changing gravitational state, i.e. orientation. The argument there, though not explicitly stated, is that the statocyst stops computing the actual orientation of the body and is driven purely by its own dynamics. Feedback from the peripheral motor system is not modelled (Varona2002 argue that this is not necessary for determining the origin of the apparently chaotic behaviour).

For us this is a nice example of how action can be a direct consequence of perception, and even more so of how internal sensory dynamics can produce differentiated motor behaviour. The connection between sensory states and motor activity is relatively fixed, but different motor behaviour may be generated by different processing in the sensory system. The autonomous dynamics of the statocysts in searching behaviour may also be interpreted as being induced by different, high-precision predictions on a higher level. It may be questioned how good a model the mollusk nervous system is for information processing in the human brain, but maybe they share these principles.

Winnerless competition between sensory neurons generates chaos: A possible mechanism for molluscan hunting behavior.

Varona, P., Rabinovich, M. I., Selverston, A. I., and Arshavsky, Y. I.
Chaos: An Interdisciplinary Journal of Nonlinear Science, 12:672–677, 2002


In the presence of prey, the marine mollusk Clione limacina exhibits search behavior, i.e., circular motions whose plane and radius change in a chaotic-like manner. We have formulated a dynamical model of the chaotic hunting behavior of Clione based on physiological in vivo and in vitro experiments. The model includes a description of the action of the cerebral hunting interneuron on the receptor neurons of the gravity sensory organ, the statocyst. A network of six receptor model neurons with Lotka-Volterra-type dynamics and nonsymmetric inhibitory interactions has no simple static attractors that correspond to winner take all phenomena. Instead, the winnerless competition induced by the hunting neuron displays hyperchaos with two positive Lyapunov exponents. The origin of the chaos is related to the interaction of two clusters of receptor neurons that are described with two heteroclinic loops in phase space. We hypothesize that the chaotic activity of the receptor neurons can drive the complex behavior of Clione observed during hunting.


See Levi2005 for a short summary in context.

My biggest concern with this paper is that the changes in direction of the mollusc may also result from feedback from the body, and especially from the statocysts, during its accelerated swimming. The question is: are these direction changes a result of chaotic but deterministic dynamics in the sensory network, as suggested by the model, or are they a result of essentially random processes which may be influenced by feedback from other networks? The authors note that in their model “The neurons keep the sequence of activation but the interval in which they are active is continuously changing in time”. After a day of searching for papers which have investigated the swimming behaviour of Clione limacina (the mollusc in question), I came to the conclusion that the data shown in Fig. 1 is likely the only published data set of swimming behaviour. This small data set suggests random changes in direction, in contrast to the model, but it does not allow any definite conclusions to be drawn about the repetitiveness of direction changes.
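
To get a feeling for what winnerless competition looks like, here is a minimal sketch of Lotka-Volterra-type dynamics with nonsymmetric inhibition. It is deliberately not the paper's model: it uses three units instead of six, a textbook cyclic inhibition matrix (the values of `alpha` and `beta` are my own choice), forward Euler integration, and a small activity floor that I add only to keep the simulation numerically away from zero. It shows sequential switching of the winning unit, not the hyperchaos reported in the paper.

```python
import numpy as np

def simulate_wlc(rho, sigma, a0, dt=0.01, steps=20000, floor=1e-6):
    """Integrate da_i/dt = a_i * (sigma_i - sum_j rho_ij * a_j) with forward Euler.

    `floor` is a numerical device (an assumption of this sketch, not part of
    the model) that stops activities from collapsing to exactly zero.
    """
    a = np.array(a0, dtype=float)
    traj = np.empty((steps, a.size))
    for t in range(steps):
        da = a * (sigma - rho @ a)
        a = np.maximum(a + dt * da, floor)
        traj[t] = a
    return traj

# Nonsymmetric inhibition: each unit is strongly inhibited by one neighbour
# (beta > 1) and weakly by the other (alpha < 1), so no unit can win for good.
alpha, beta = 0.5, 2.0
RHO = np.array([[1.0, alpha, beta],
                [beta, 1.0, alpha],
                [alpha, beta, 1.0]])
SIGMA = np.ones(3)

traj = simulate_wlc(RHO, SIGMA, a0=[0.6, 0.3, 0.1])
winners = traj.argmax(axis=1)                      # index of the most active unit
switches = int(np.count_nonzero(np.diff(winners)))  # how often the winner changes

print("units that were winner at some point:", sorted(set(winners.tolist())))
print("number of winner switches:", switches)
```

Because of the floor, this toy version settles into fairly regular switching. The irregular activation intervals the authors describe would have to come from the richer six-neuron network, which is exactly the part that cannot be checked against the sparse behavioural data.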

Action generation and action perception in imitation: an instance of the ideomotor principle.

Wohlschläger, A., Gattis, M., and Bekkering, H.
Philos Trans R Soc Lond B Biol Sci, 358:501–515, 2003


We review a series of behavioural experiments on imitation in children and adults that test the predictions of a new theory of imitation. Most of the recent theories of imitation assume a direct visual-to-motor mapping between perceived and imitated movements. Based on our findings of systematic errors in imitation, the new theory of goal-directed imitation (GOADI) instead assumes that imitation is guided by cognitively specified goals. According to GOADI, the imitator does not imitate the observed movement as a whole, but rather decomposes it into its separate aspects. These aspects are hierarchically ordered, and the highest aspect becomes the imitator’s main goal. Other aspects become sub-goals. In accordance with the ideomotor principle, the main goal activates the motor programme that is most strongly associated with the achievement of that goal. When executed, this motor programme sometimes matches, and sometimes does not, the model’s movement. However, the main goal extracted from the model movement is almost always imitated correctly.


The authors report on a series of experiments which led them to propose a theory of imitation that gives the goal of a demonstrated movement a central role: GOADI (goal-directed imitation). In particular, they looked at the errors made by children when imitating hand movements of a model. These movements were: touching your ear with your hand, touching spots on a table, and pointing at or picking up an object. These experiments made it possible to dissociate the goal of a movement from the way it is executed. In the ear-touching movements, for example, the model touches her right ear with her left hand (a contralateral movement), but the child might imitate by touching its left ear with its left hand (an ipsilateral movement). Throughout the paper they assume that people naturally imitate in a mirror fashion, i.e. you would touch your left ear when the model sitting opposite you touches her right ear. This is backed up by the data in the sense that this is what people do the vast majority of the time.

Their theory is motivated by frequent CI errors of the children, in which a contralateral movement is imitated with an ipsilateral movement but the target of the movement is chosen correctly, i.e. the correct ear is touched. The authors conclude that the children determine the goal correctly, but don’t have enough working memory or attention to process all aspects of the demonstrated movement, and simply execute the movement that achieves that goal and is most natural to them. In adults these kinds of errors are greatly reduced, which is perhaps a result of greater attentional capacity, but when the imitation task is made slightly more complicated, similar errors can be observed.

Another important part of the theory suggests that demonstrated movements are decomposed into separate aspects and that these are ordered in a hierarchy (a goal and subgoals), such that aspects higher in the hierarchy are imitated with greater care. They report on experiments in which such a hierarchy seems to be observed for the aspects object identity, object treatment, use of effector and movement (in this order). While there is a certain difference between object-specific aspects and movement-specific aspects, I’m not so certain about the strict hierarchy.

Anyway, the experiments are pretty convincing and strongly support a goal directed theory of imitation in contrast to theories which propose a direct mapping from sensory input to motor output.

Real-time Motion Retargeting to Highly Varied User-Created Morphologies.

Hecker, C., Raabe, B., Enslow, R. W., DeWeese, J., Maynard, J., and van Prooijen, K.
in: Proceedings of ACM SIGGRAPH ’08, 2008


Character animation in video games, whether manually key-framed or motion captured, has traditionally relied on codifying skeletons early in a game’s development, and creating animations rigidly tied to these fixed skeleton morphologies. This paper introduces a novel system for animating characters whose morphologies are unknown at the time the animation is created. Our authoring tool allows animators to describe motion using familiar posing and key-framing methods. The system records the data in a morphology-independent form, preserving both the animation’s structural relationships and its stylistic information. At runtime, the generalized data are applied to specific characters to yield pose goals that are supplied to a robust and efficient inverse kinematics solver. This system allows us to animate characters with highly varying skeleton morphologies that did not exist when the animation was authored, and, indeed, may be radically different than anything the original animator envisioned.


The paper explains how motion retargeting to wildly varying creatures is done in Electronic Arts’ game Spore. The crucial point is that they devised an animation system (Spasm) in which animators do not work on a fixed body, but on meta-level descriptions of body parts, which are showcased on a small set of example bodies in the program window. Animators first select body parts by choosing descriptors like “grasper in front”. Then they can define movements in different modes, e.g. relative to rest position, relative to an external target, relative to limb length, and similar. The authors say in the discussion: “However, it takes weeks to build up an intuition about which kinds of motions generalize across a wide range of characters and which don’t.”

In principle, they have devised an animation system in which motions are described in task space instead of in joint space (the standard in animation with, e.g., Maya). Well, it’s some kind of hybrid. In general, everything in the paper is very ad hoc, as the main objective is to make it work in real time for the game. Anyway, the paper does not really address the problem of motion retargeting, where you observe motion on one body and try to transfer it to another. Rather, they are concerned with representing motion from the beginning in such a way that it is easily transferred to a wide range of very different bodies.
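
As a toy illustration of what a morphology-independent description could look like, here is a sketch of one such mode: an end-effector goal stored as an offset from the rest position in units of limb length. This is my own simplification, not the format used by Spasm; the function names `record_goal` and `specialize_goal` and all numbers are hypothetical.

```python
import numpy as np

def record_goal(rest_pos, pose_pos, limb_length):
    """Store a posed end-effector position as an offset from rest, in limb lengths."""
    offset = np.asarray(pose_pos, dtype=float) - np.asarray(rest_pos, dtype=float)
    return offset / limb_length

def specialize_goal(rel_offset, rest_pos, limb_length):
    """Turn the stored relative offset back into a position goal for a given body."""
    return np.asarray(rest_pos, dtype=float) + np.asarray(rel_offset, dtype=float) * limb_length

# Authoring: the animator poses a hand on an example body with a limb of length 2.
rel = record_goal(rest_pos=[0.0, -2.0, 0.0], pose_pos=[1.0, -1.0, 0.0], limb_length=2.0)

# Runtime: the same stored goal is applied to a creature whose limb is twice as long.
goal_long = specialize_goal(rel, rest_pos=[0.0, -4.0, 0.0], limb_length=4.0)

print("stored relative offset:", rel)
print("goal on the long-limbed body:", goal_long)
```

The resulting position would then be handed to an IK solver as a pose goal. Which motions survive this kind of rescaling across very different bodies is presumably the intuition that, according to the authors, takes weeks to build.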

A large part of the paper is about playing the motion stored in this way on a particular body (“specialization”). For this they use their own ad hoc IK solver. I didn’t find any interesting principles here (they sort out the position of the spine first and only then solve for constraint satisfaction of the limbs), but I also didn’t put a lot of effort into understanding what’s going on.