Chia-Man Hung (Oxford A2I, DRS), Li Sun (Oxford A2I), Yizhe Wu (Oxford A2I), Ioannis Havoutis (Oxford DRS), Ingmar Posner (Oxford A2I)
Abstract
End-to-end visuomotor control is emerging as a compelling solution for robot manipulation tasks. However, imitation learning-based visuomotor control approaches suffer from a common limitation: they lack the ability to recover from out-of-distribution states caused by compounding errors. In this paper, instead of relying on tactile feedback or explicitly detecting failures through vision, we investigate using the uncertainty of a policy neural network. We propose a novel uncertainty-based approach to detect and recover from failure cases. Our hypothesis is that policy uncertainty can implicitly indicate potential failures in a visuomotor control task, and that robot states with minimal uncertainty are more likely to lead to task success. To recover from high-uncertainty cases, the robot monitors its uncertainty along a trajectory and explores possible actions in the state-action space to bring itself to a more certain state. Our experiments verify this hypothesis and show a significant improvement in task success rate: 12% in pushing, 15% in pick-and-reach and 22% in pick-and-place.
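To make the mechanism concrete, below is a minimal Python/PyTorch sketch of the uncertainty-gated recovery loop. It assumes the policy's predictive uncertainty is estimated with Monte-Carlo dropout, and the environment helpers (observe, sample_action, peek, step) are hypothetical placeholders for probing candidate actions; it illustrates the idea rather than the exact implementation used in the paper.

import torch

@torch.no_grad()
def policy_uncertainty(policy, image, n_samples=20):
    # Monte-Carlo dropout: keep dropout active at test time and
    # measure the variance of the sampled action predictions.
    policy.train()  # re-enables dropout layers
    samples = torch.stack([policy(image) for _ in range(n_samples)])
    policy.eval()
    return samples.var(dim=0).mean().item()  # scalar uncertainty score

@torch.no_grad()
def step_with_recovery(policy, env, threshold, n_candidates=16):
    # Monitor uncertainty along the trajectory; when it exceeds a
    # threshold, explore candidate actions and commit to the one whose
    # resulting observation minimises the policy's uncertainty.
    obs = env.observe()
    if policy_uncertainty(policy, obs) < threshold:
        env.step(policy(obs))  # in distribution: follow the learned policy
        return
    candidates = [env.sample_action() for _ in range(n_candidates)]
    scored = [(policy_uncertainty(policy, env.peek(a)), a) for a in candidates]
    _, best_action = min(scored, key=lambda s: s[0])
    env.step(best_action)

The MIN UNC label in the qualitative results below refers to this minimum-uncertainty action selection.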
Motivation
Deep visuomotor control (VMC) is an emerging research area for closed-loop robot manipulation. Compared to conventional vision-based manipulation approaches, deep VMC aims to learn an end-to-end policy that bridges the gap between robot perception and control, as an alternative to explicitly modelling object positions and planning trajectories in Cartesian space. In existing work, imitation learning is used to train a policy network to predict motor commands or end-effector actions from raw image observations. Consequently, continuous motor commands can be generated, closing the loop between perception and manipulation. However, with imitation learning, the robot may drift into an unknown region of the state space to which the policy does not generalise and where it is likely to fail, as seen in the examples below. Early diagnosis of failure cases is thus important for policy generalisation, yet remains an open question in deep VMC research.
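For readers unfamiliar with this setup, the sketch below shows what such an end-to-end policy and its imitation-learning (behaviour cloning) objective might look like in PyTorch; the architecture and the names VMCPolicy, behaviour_clone and demos are illustrative assumptions, not the network used in the paper.

import torch
import torch.nn as nn

class VMCPolicy(nn.Module):
    # Toy visuomotor policy: raw RGB image -> end-effector action.
    def __init__(self, action_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Dropout(0.2),  # dropout also permits MC-dropout uncertainty estimates
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, image):
        return self.head(self.encoder(image))

def behaviour_clone(policy, demos, epochs=10, lr=1e-4):
    # Behaviour cloning: regress expert actions from image observations.
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for images, expert_actions in demos:  # batches of (image, action) pairs
            opt.zero_grad()
            loss = nn.functional.mse_loss(policy(images), expert_actions)
            loss.backward()
            opt.step()

A policy trained this way closes the perception-action loop, but compounding prediction errors can carry it into states unseen in the demonstrations, which is exactly the failure mode the uncertainty-based recovery above is designed to catch.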
Qualitative results
Recovery comparison (videos): BVMC (no recovery) vs. BVMC + MIN UNC (ours).
Recovery with BVMC + MIN UNC (ours), shown on the pushing and pick-and-place tasks.
Citation
@inproceedings{hung2021introspective,
title={Introspective visuomotor control: exploiting uncertainty in deep visuomotor control for failure recovery},
author={Hung, Chia-Man and Sun, Li and Wu, Yizhe and Havoutis, Ioannis and Posner, Ingmar},
booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
pages={6293--6299},
year={2021},
organization={IEEE}
}
Acknowledgement
Chia-Man Hung is funded by the Clarendon Fund and receives a Keble College Sloane Robinson Scholarship at the University of Oxford. Yizhe Wu is funded by the China Scholarship Council. This research is also supported by an EPSRC Programme Grant [EP/M019918/1] and a gift from Amazon Web Services (AWS). The authors acknowledge the use of the University of Oxford Advanced Research Computing (ARC) facility in carrying out this work (http://dx.doi.org/10.5281/zenodo.22558). We also thank Ruoqi He, Hala Lamdouar, Walter Goodwin and Oliver Groth for proofreading and useful discussions, and the reviewers for valuable feedback.