robotics – Sebastian Bitzer's Homepage

Sandamirskaya, Y. and Schöner, G.
Neural Networks, 23:1164–1179, 2010
DOI, Google Scholar

Abstract

Learning and generating serially ordered sequences of actions is a core component of cognition both in organisms and in artificial cognitive systems. When these systems are embodied and situated in partially unknown environments, specific constraints arise for any neural mechanism of sequence generation. In particular, sequential action must resist fluctuating sensory information and be capable of generating sequences in which the individual actions may vary unpredictably in duration. We provide a solution to this problem within the framework of Dynamic Field Theory by proposing an architecture in which dynamic neural networks create stable states at each stage of a sequence. These neural attractors are destabilized in a cascade of bifurcations triggered by a neural representation of a condition of satisfaction for each action. We implement the architecture on a robotic vehicle in a color search task, demonstrating both sequence learning and sequence generation on the basis of low-level sensory information.

Review

The paper presents a dynamical model of sequential action execution driven by sensory feedback. Individual actions may vary in duration because the end of each action is signalled by external cues of subtask fulfillment. This makes it one of the first functioning models with continuous dynamics which truly integrates action and perception. The core technique is dynamic field theory (DFT), which implements winner-take-all dynamics in the continuous domain: the field stays at a uniform baseline until a sufficiently large input at some position drives activity over a threshold and produces a single stable peak of activity around that position. All components of the model run on dynamics of this kind and are connected such that a stable peak can be destabilised, which allows the peak to move to a new position (signalling something different).
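To make the thresholding behaviour concrete, here is a minimal sketch of a 1D field with local excitation and global inhibition, integrated with the Euler method. It is not the authors' implementation: the kernel shape, the sigmoid output function and all parameter values (`h`, `beta`, `c_exc`, `sigma_w`, `c_inh`, the input width) are assumptions chosen for illustration only.

```python
import math


def simulate_field(amplitude, n=41, center=20, steps=400, dt=0.05):
    """Euler-integrate a 1D dynamic neural field driven by one Gaussian input.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    h = -2.0                               # resting level (uniform baseline)
    beta = 4.0                             # steepness of the sigmoid output
    c_exc, sigma_w, c_inh = 1.0, 3.0, 0.3  # local excitation, global inhibition

    # Interaction kernel, indexed by the distance between two field positions
    kernel = [c_exc * math.exp(-d * d / (2 * sigma_w ** 2)) - c_inh
              for d in range(n)]
    # External input: a Gaussian bump of the given amplitude around `center`
    inp = [amplitude * math.exp(-(x - center) ** 2 / (2 * 3.0 ** 2))
           for x in range(n)]

    u = [h] * n                            # start at the resting level
    for _ in range(steps):
        out = [1.0 / (1.0 + math.exp(-beta * ui)) for ui in u]
        new_u = []
        for x in range(n):
            # Sum the kernel-weighted output over the whole field
            interaction = sum(kernel[abs(x - y)] * out[y] for y in range(n))
            du = -u[x] + h + inp[x] + interaction   # time constant tau = 1
            new_u.append(u[x] + dt * du)
        u = new_u
    return u


weak = simulate_field(amplitude=0.5)    # stays below threshold everywhere
strong = simulate_field(amplitude=3.0)  # forms a localised peak at the input
```

With the weak input the activity stays below zero across the whole field, whereas the strong input pushes the field over threshold and leaves a single peak centred on the input position, with the rest of the field suppressed by the global inhibition.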

The aim of the exercise is to show that sequential actions of varying length can be produced by a model of continuous neuronal population dynamics. Sequential structure is induced in the model by a set of ordinal nodes which are coupled via additional memory nodes such that they are active one after the other. However, the switch to the next ordinal node in the sequence needs to be triggered by sensory input which indicates that the aim of an action has been achieved. Activity of an ordinal node then directly induces a peak in the action field at a location determined by a set of learnt weights. In the robot example the action space is defined over the hue value, i.e. each action selects a certain colour. The actual action of the robot (turning and accelerating) is controlled by an additional color-space field and some motor dynamics which are not part of the sequence model. Hence, their sequence model as such only prescribes discrete actions. To decide whether an action has been successfully completed, the action field increases activity at a particular spot in a condition of satisfaction field, which only peaks at that spot if suitable sensory input drives the activity there over the threshold. Which spot the action field selects is determined by hand here (in the example it’s an identity function). A peak in the condition of satisfaction field then triggers a switch to the next ordinal node in the sequence. We don’t really see an evaluation of system performance (by what criterion?), but their system seems to work ok, at least in producing the sequences in the order demonstrated during learning.

The paper is quite close to what we are envisaging. The free energy principle could add a Bayesian perspective (we would have to find a way to implement the conditional progression of a sequence, but I don’t see a reason why this shouldn’t be possible). Apart from that, the function implemented by the dynamics is extremely simple. In fact, the whole sequential system could be replaced with simple, discrete if-then logic without having to change the continuous dynamics of the robot implementation layer (color-space field and motor dynamics). I don’t see how continuous dynamics helps here, except that it is more biologically plausible. This is also a point on which the authors focus in the introduction and discussion. Something else that I noticed: all dynamic variables are only 1D (except for the colour-space field, which is 2D). This is probably because the DFT formalism requires that, in every simulation step, the activity over the field is integrated for each position in the field to compute the changes in activity (cf. computation of expectations in Bayesian inference), which is probably infeasible when the representations contain several variables.
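As a rough illustration of the if-then argument, the sequencing function could be sketched as follows. This is not the authors' model: the function name `run_sequence`, the hue scale in degrees and the `tolerance` parameter are hypothetical, and the "condition of satisfaction" is reduced to a simple match between the sensed hue and the current target hue.

```python
def run_sequence(target_hues, sensed_hues, tolerance=10.0):
    """Step through target_hues, advancing whenever the sensed hue matches.

    Hues are assumed to be in degrees on a circle (0..360); this scale and
    the tolerance are illustrative assumptions, not taken from the paper.
    Returns a list of (target_hue, time_index_of_completion) pairs.
    """
    step = 0            # index of the currently active "ordinal node"
    completed = []
    for t, hue in enumerate(sensed_hues):
        if step >= len(target_hues):
            break       # whole sequence finished
        diff = abs(hue - target_hues[step]) % 360.0
        # Condition of satisfaction: sensed hue is close to the target hue
        if min(diff, 360.0 - diff) <= tolerance:
            completed.append((target_hues[step], t))
            step += 1   # switch to the next element of the sequence
    return completed


result = run_sequence([0.0, 120.0, 240.0],
                      [200.0, 5.0, 5.0, 118.0, 300.0, 355.0, 241.0])
print(result)  # [(0.0, 1), (120.0, 3), (240.0, 6)]
```

Each action lasts for as long as it takes the matching hue to show up in the sensor stream, so durations vary between actions, which is the property the continuous model is designed to provide.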

Khansari-Zadeh, S. M. and Billard, A.
in: Proc. IEEE Int Robotics and Automation (ICRA) Conf, pp. 2381–2388, 2010
DOI, Google Scholar

Abstract

We model the dynamics of non-linear point-to-point robot motions as a time-independent system described by an autonomous dynamical system (DS). We propose an iterative algorithm to estimate the form of the DS through a mixture of Gaussian distributions. We prove that the resulting model is asymptotically stable at the target. We validate the accuracy of the model on a library of 2D human motions and to learn a control policy through human demonstrations for two multi-degrees of freedom robots. We show the real-time adaptation to perturbations of the learned model when controlling the two kinematically-driven robots.

Review

The authors describe a system for learning nonlinear, multivariate dynamical systems based on Gaussian mixture regression (GMR). The difference to previous approaches using GMR (e.g. Gribovskaya2010) is that the model is obtained by pruning a Gaussian mixture model which initially has a Gaussian at each time point, such that accuracy and stability criteria are adhered to. Pruning here actually means that two neighbouring Gaussians are merged. Consequently, the main contribution of the paper is the derivation and proof of the corresponding stability criteria, something that I haven’t checked properly.
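For readers unfamiliar with GMR, the following sketch shows the standard regression step for a one-dimensional state: the velocity is the conditional expectation of the velocity given the position under the joint Gaussian mixture. It does not include the paper's merging procedure or stability constraints, and the function name and the example components are made up for illustration.

```python
import math


def gmr_velocity(x, components):
    """Gaussian mixture regression for a 1D position x.

    components: list of (prior, mu_x, mu_v, var_x, cov_xv) tuples describing
    a joint Gaussian mixture over position x and velocity v.
    Returns E[v | x]. Far away from all components the weights can underflow
    to zero; this sketch does not guard against that case.
    """
    weights, local_predictions = [], []
    for prior, mu_x, mu_v, var_x, cov_xv in components:
        # Responsibility of this component for x (unnormalised)
        density = (math.exp(-0.5 * (x - mu_x) ** 2 / var_x)
                   / math.sqrt(2 * math.pi * var_x))
        weights.append(prior * density)
        # Linear regression of v on x within this component
        local_predictions.append(mu_v + cov_xv / var_x * (x - mu_x))
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, local_predictions)) / total


# Hypothetical two-component model that pushes the state towards x = 0
toy_model = [(0.5, -1.0, 1.0, 1.0, 0.0),
             (0.5, 1.0, -1.0, 1.0, 0.0)]
print(gmr_velocity(-1.0, toy_model) > 0)  # True: moves right, towards 0
print(gmr_velocity(1.0, toy_model) < 0)   # True: moves left, towards 0
```

With a single component the prediction reduces to ordinary linear regression, which is why several Gaussians are needed to represent nonlinear dynamics.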

They make a quantitative comparison between their binary merging approach, original EM learning of GMR, using LWPR to learn the dynamics and using DMPs. However, they do not describe the precise procedures. I am particularly surprised about the very low accuracy of the DMPs compared to the other approaches. Unless they have done something special (such as introducing large temporal deviations, as done for Fig. 2), I don’t see why the accuracy for DMPs should be so low.

They argue that the main advantages of their approach are that a minimal number of Gaussians is determined automatically while the resulting dynamics is stable at all times, that the multivariate Gaussians can capture correlations between dimensions (in contrast to DMPs), and that the computations are less costly than with Gaussian process regression. One disadvantage is that the number of parameters increases quadratically with the dimensionality (curse of dimensionality; not so crucial for their 2, 4 or 6D examples, but beyond that?). The more serious one is that the pruning procedure is highly susceptible to local minima and that results depend on the order in which Gaussians are merged. In the extreme case, imagine that noise prevents any of the initial Gaussians from being merged without violating the accuracy constraint. Again, this might not be a problem for their very smooth data, but it will become problematic for noisier data. Similar problems lead to the dependency on the order of merges (which are selected randomly). To overcome the order dependency they suggest restarting the algorithm several times and then selecting the result with the smallest number of Gaussians. Note that this compromises their computational advantage over GPs. Computing a GP mapping is cubic in the number of data points, whereas merging the Gaussians is quadratic in the number of time points. However, if different merge orders need to be checked, there are 2 to the power of the number of time points possible merge sequences, so the computational cost can grow exponentially in the worst case, when the best solution really has to be found (if you optimise the hyperparameters in GPs you’re in a similar situation in a continuous space, though).
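The quadratic growth in parameters can be checked with a small count. The sketch assumes that each Gaussian has a full covariance over the joint position-velocity vector and that one prior is stored per Gaussian; the paper may parameterise the model differently, and `gmm_parameter_count` is a hypothetical helper.

```python
def gmm_parameter_count(num_gaussians, state_dim):
    """Count free parameters of a GMM over joint (position, velocity) vectors.

    Assumes full covariance matrices and one prior per Gaussian; these are
    assumptions for illustration, not the paper's stated parameterisation.
    """
    joint_dim = 2 * state_dim                      # position and velocity
    mean_params = joint_dim
    cov_params = joint_dim * (joint_dim + 1) // 2  # symmetric covariance
    return num_gaussians * (1 + mean_params + cov_params)


for dim in (2, 4, 6):
    print(dim, gmm_parameter_count(1, dim))
# 2 15
# 4 45
# 6 91
```

Under these assumptions the count per Gaussian is dominated by the covariance term, which grows with the square of the state dimension.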

Tag: robotics

An embodied account of serial order: How instabilities drive sequence generation.

BM: An iterative algorithm to learn stable non-linear dynamical systems with Gaussian mixture models.
