Khansari-Zadeh, S. M. and Billard, A.
In: Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), pp. 2381–2388, 2010
Abstract
We model the dynamics of non-linear point-to-point robot motions as a time-independent system described by an autonomous dynamical system (DS). We propose an iterative algorithm to estimate the form of the DS through a mixture of Gaussian distributions. We prove that the resulting model is asymptotically stable at the target. We validate the accuracy of the model on a library of 2D human motions and to learn a control policy through human demonstrations for two multi-degrees of freedom robots. We show the real-time adaptation to perturbations of the learned model when controlling the two kinematically-driven robots.
Review
The authors describe a system for learning nonlinear, multivariate dynamical systems based on Gaussian mixture regression (GMR). The difference from previous GMR-based approaches (e.g. Gribovskaya2010) is that the mixture is obtained by pruning a Gaussian mixture model that starts with one Gaussian per time point, subject to accuracy and stability criteria. Pruning here actually means that two neighbouring Gaussians are merged. Consequently, the main contribution of the paper is the derivation and proof of the corresponding stability criteria (something I haven't checked properly).
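To make the binary merging concrete, here is a minimal sketch of how such a pruning loop could look. It assumes a standard moment-matching merge of two weighted Gaussians and a placeholder `model_ok` callback standing in for the paper's accuracy and stability tests; the names and the exact control flow are my own, not taken from the paper.

```python
import numpy as np

def merge_pair(w1, mu1, S1, w2, mu2, S2):
    # Moment-matching merge of two weighted Gaussians into one
    # (a standard identity; whether the paper uses exactly this
    # update is an assumption on my part).
    w = w1 + w2
    mu = (w1 * mu1 + w2 * mu2) / w
    d1, d2 = mu1 - mu, mu2 - mu
    S = (w1 * (S1 + np.outer(d1, d1)) + w2 * (S2 + np.outer(d2, d2))) / w
    return w, mu, S

def binary_merging(weights, means, covs, model_ok, rng=None):
    # Repeatedly try to merge a randomly chosen pair of neighbouring
    # Gaussians; keep the merge only if the reduced model still passes
    # `model_ok` (a hypothetical stand-in for the accuracy and stability
    # criteria). Stop when no neighbouring pair can be merged.
    rng = rng or np.random.default_rng()
    weights, means, covs = list(weights), list(means), list(covs)
    progress = True
    while progress and len(weights) > 1:
        progress = False
        for i in rng.permutation(len(weights) - 1):
            w, mu, S = merge_pair(weights[i], means[i], covs[i],
                                  weights[i + 1], means[i + 1], covs[i + 1])
            trial = (weights[:i] + [w] + weights[i + 2:],
                     means[:i] + [mu] + means[i + 2:],
                     covs[:i] + [S] + covs[i + 2:])
            if model_ok(*trial):
                weights, means, covs = trial
                progress = True
                break  # the neighbourhood changed, so reshuffle and retry
    return weights, means, covs
```

Because the pair order is randomised, two runs can terminate with different numbers of Gaussians, which is exactly the order dependency criticised below.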
They make a quantitative comparison between their binary merging approach, standard EM learning of GMR, learning the dynamics with LWPR, and DMPs. However, they do not describe the precise procedures. I am particularly surprised by the very low accuracy reported for DMPs compared to the other approaches. Unless they did something special (such as introducing large temporal deviations, as done for Fig. 2), I don't see why the accuracy of DMPs should be so low.
They argue that the main advantages of their approach are that the minimal number of Gaussians is determined automatically while the resulting dynamics is stable at all times, that the multivariate Gaussians can capture correlations between dimensions (in contrast to DMPs), and that the computations are less costly than with Gaussian process regression.

The disadvantages are that the number of parameters increases quadratically with the dimensionality (curse of dimensionality; not critical for their 2, 4, and 6D examples, but beyond that?) and, in particular, that the pruning procedure is highly susceptible to local minima and its result depends on the order in which Gaussians are merged. In the extreme case, imagine that, through the presence of noise, none of the initial Gaussians can be merged without violating the accuracy constraint. This might not be a problem for their very smooth data, but it will become problematic for noisier data. Similar issues cause the dependency on the order of merges (which are selected randomly). To overcome the order dependency, they suggest restarting the algorithm several times and selecting the result with the smallest number of Gaussians.

Note that this compromises their computational advantage over GPs. Computing a GP mapping is cubic in the number of data points, and merging the Gaussians is quadratic in the number of time points; but once you consider that different merge orders need to be checked, there are on the order of two to the power of the number of time points possible merge sequences, so the computational cost can grow exponentially in the worst case if the truly best solution is to be found (if you optimise the hyperparameters in GPs, you are in a similar situation, though in a continuous space).
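To put rough numbers on this worst case, here is a back-of-the-envelope illustration (mine, not from the paper). It reads "two to the power of the number of time points" as the 2**(T - 1) contiguous partitions reachable by neighbour merging, and compares one greedy merging pass, a single GP fit, and an exhaustive certification of the best merge result:

```python
# T: number of initial Gaussians (one per time point). Each merged model
# corresponds to a partition of the T components into contiguous blocks,
# and there are 2**(T - 1) such partitions (keep or drop each of the
# T - 1 boundaries). All counts are rough orders of magnitude only.
for T in (10, 20, 50, 100):
    one_pass = T ** 2          # one greedy merging pass, roughly quadratic
    gp_fit = T ** 3            # a single GP fit, cubic in the data points
    exhaustive = 2 ** (T - 1)  # models to check to certify the optimum
    print(f"T={T:4d}  merge pass ~{one_pass:.0e}  GP ~{gp_fit:.0e}  "
          f"exhaustive ~{exhaustive:.0e}")
```

Already at T = 50 the exhaustive count dwarfs any polynomial cost; the quadratic advantage over GPs holds only for a single greedy pass.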