This paper proposes UnDeepVO, a novel monocular visual odometry (VO) system. UnDeepVO estimates the 6-DoF pose of a monocular camera and the depth of its view using deep neural networks. The system has two salient features: an unsupervised deep learning scheme and absolute scale recovery. Specifically, we train UnDeepVO on stereo image pairs to recover the scale but test it on consecutive monocular images, so UnDeepVO is a monocular system at test time. The loss function used to train the networks is based on spatial and temporal dense information. A system overview is shown in Fig. 1. Experiments on the KITTI dataset show that UnDeepVO outperforms other monocular VO methods in terms of pose accuracy.
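As a rough illustration of why stereo training pairs can anchor absolute scale, the sketch below uses the standard rectified-stereo relation depth = focal length × baseline / disparity. It is not the UnDeepVO loss or network. The function name `disparity_to_depth` and the numeric values are assumptions chosen for illustration and are not taken from the paper.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Metric depth (metres) from disparity (pixels) for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Illustrative values only (assumed, not from the paper).
FOCAL_PX = 720.0    # focal length in pixels
BASELINE_M = 0.54   # distance between the two cameras in metres

depth_m = disparity_to_depth(36.0, FOCAL_PX, BASELINE_M)

# With the same disparity, doubling the baseline doubles the depth:
# the known baseline is what ties predictions to metric units.
depth_wide_m = disparity_to_depth(36.0, FOCAL_PX, 2 * BASELINE_M)

print(depth_m, depth_wide_m)
```

Because the baseline is known in metres, a depth prediction that is consistent across the left and right training images is forced into metric units. A purely monocular training signal lacks this constraint and leaves scale ambiguous.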

Demo Video


Download paper

Ruihao Li, Sen Wang, Zhiqiang Long, and Dongbing Gu. UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning. In arXiv. [arXiv]

  title = { {UnDeepVO}: Monocular Visual Odometry through Unsupervised Deep Learning},
  author = {Li, Ruihao and Wang, Sen and Long, Zhiqiang and Gu, Dongbing},
  journal = {ArXiv e-prints},

Also check out our DeepVO!