Learning Your Way Without Map or Compass: Panoramic Target
Driven Visual Navigation
David Watkins-Valls∗,1, Jingxi Xu∗,1, Nicholas Waytowich2 and Peter Allen1
Abstract— We present a robot navigation system that uses
an imitation learning framework to successfully navigate in
complex environments. Our framework takes a pre-built 3D
scan of a real environment and trains an agent from pre-
generated expert trajectories to navigate to any position given a
panoramic view of the goal and the current visual input without
relying on map, compass, odometry, or relative position of the
target at runtime. Our end-to-end trained agent uses RGB and
depth (RGBD) information and can handle large environments
(up to 1031 m²) across multiple rooms (up to 40) and generalizes
to unseen targets. We show that when compared to several
baselines our method (1) requires fewer training examples and
less training time, (2) reaches the goal location with higher
accuracy, and (3) produces better solutions with shorter paths
for long-range navigation tasks.
I. INTRODUCTION
The ability to navigate efficiently and accurately within
an environment is fundamental to intelligent behavior and
has been a focus of research in robotics for many years.
Traditionally, robotic navigation is solved using model-based
methods with an explicit focus on position inference and
mapping, such as Simultaneous Localization and Mapping
(SLAM) [1]. These models use path planning algorithms,
such as Probabilistic Roadmaps (PRM) [2] and Rapidly
Exploring Random Trees (RRT) [3], [4], to plan a collision-
free path. These methods ignore the rich information from
visual input and are highly sensitive to robot odometry and
noise in sensor data. For example, a robot navigating through
a room may lose track of its position when the navigation
software fails to model wheel friction, allowing odometry
error to accumulate.
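As a point of reference for the sampling-based planners cited above, a minimal 2D RRT can be sketched as follows. This is an illustrative sketch, not the planners used in the cited work: the point-robot model, workspace bounds, and `is_free` collision callback are all assumptions.

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, iters=2000, seed=0):
    """Minimal 2D RRT: grow a tree from start toward random samples
    (with 10% goal bias) until a node lands within goal_tol of goal."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}  # index of each node's parent in the tree
    for _ in range(iters):
        # sample the goal occasionally to bias growth toward it
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10),
                                                 rng.uniform(0, 10))
        # find the tree node nearest to the sample
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        nx, ny = nodes[i]
        d = math.dist((nx, ny), sample)
        if d == 0:
            continue
        # step from the nearest node toward the sample
        new = (nx + step * (sample[0] - nx) / d,
               ny + step * (sample[1] - ny) / d)
        if not is_free(new):
            continue  # discard nodes that collide with obstacles
        parent[len(nodes)] = i
        nodes.append(new)
        if math.dist(new, goal) <= goal_tol:
            # backtrack through parents to recover the path
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None  # no path found within the iteration budget
```

Note that this planner, like those in the text, operates purely on geometry: the `is_free` check consumes no visual input, which is precisely the limitation the paper's learned approach addresses.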
Model-free reinforcement learning (RL) agents have per-
formed well on many robotic tasks [5], [6], [7], [8], leading
researchers to rely on RL for robotic navigation tasks [9],
[10], [11], [12]. Recent work in robotic visual navigation
uses reinforcement learning to train an agent to navigate
to a goal using only the current and goal RGB images [9].
∗Authors contributed equally; names are in alphabetical order.
1Department of Computer Science, Columbia University, New York,
NY, USA. {davidwatkins, allen}@cs.columbia.edu,
jingxi.xu@columbia.edu
2U.S. Army Research Laboratory, Baltimore, MD, USA.
nicholas.r.waytowich.civ@mail.mil
This work is supported by NSF Grant CMMI 1734557. This research
was sponsored by the Army Research Laboratory and was accomplished
under Cooperative Agreement Number W911NF-18-2-0244. The views and
conclusions contained in this document are those of the authors and should
not be interpreted as representing the official policies, either expressed or
implied, of the Army Research Laboratory or the U.S. Government. The
U.S. Government is authorized to reproduce and distribute reprints for
Government purposes notwithstanding any copyright notation herein.
The title is an homage to Harold Gatty’s book Finding Your Way Without
Map or Compass.
Fig. 1: A successful trajectory executed in house17 from
the Matterport3D dataset. The history buffer and current view
are the state of the pipeline. The panoramic goal is 8 RGBD
images each taken at a 45° turn. The top-down view is the
agent moving through the trajectory with the blue sphere as
the start position and the green sphere as the goal position.
Smaller blue spheres mark positions the agent has already
visited, and the orange spheres mark the remaining positions. The
images are taken at the current position of the robotic agent.
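The panoramic goal described in the caption is built from 8 RGBD images captured at 45° increments around the goal position. A minimal sketch of assembling such a goal representation follows; the `render_rgbd` callback and the array shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def make_panoramic_goal(render_rgbd, num_views=8):
    """Stack RGBD views taken every 360/num_views degrees (45 degrees
    for 8 views) into a single panoramic goal array.

    render_rgbd(heading_deg) -> (H, W, 4) array is a hypothetical
    simulator call returning an RGB image with a depth channel.
    """
    views = [render_rgbd(i * 360.0 / num_views) for i in range(num_views)]
    return np.stack(views, axis=0)  # shape: (num_views, H, W, 4)
```

Stacking along a leading view axis keeps each heading's image intact, so a downstream network can process the views jointly or per-heading.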
While reinforcement learning has the convenience of being
weakly supervised, it generally suffers from sparse rewards
in navigation, requires a huge number of training episodes
to converge, and struggles to generalize to unseen targets.
The problem is further exacerbated when the navigation
environment becomes large and complex (across multiple
rooms and scenes with various obstacles), leading to difficult
long-range path solutions.
New advancements in annotated 3D maps of real-world
data, such as Stanford2D3DS [13] and Matterport3D [14],
enable the collection of large amounts of trajectory data.
Simulators capable of collecting this data have arisen in the
past few years in the form of MINOS [15], Gibson [16],
Habitat [17], and THOR [9]. These systems enable simulta-
neous use of real and simulated environments for training,
2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
October 25-29, 2020, Las Vegas, NV, USA (Virtual)
978-1-7281-6211-9/20/$31.00 ©2020 IEEE