Multi Object Exploration

Abstract

Recent advances in vision-based navigation and exploration have shown impressive capabilities in photorealistic indoor environments. However, these methods still struggle with long-horizon tasks and require large amounts of data to generalize to unseen environments. In this work, we present a novel reinforcement learning approach for multi-object search that combines short-term and long-term reasoning in a single model while avoiding the complexities arising from hierarchical structures. In contrast to existing multi-object search methods that act in granular discrete action spaces, our approach achieves exceptional performance in continuous action spaces. We perform extensive experiments and show that it generalizes to unseen apartment environments with limited data. Furthermore, we demonstrate zero-shot transfer of the learned policies to an office environment in real-world experiments.

How Does It Work?

Figure: During training, the agent receives a state vector containing either the ground-truth direction to the closest object or its own prediction of that direction. At test time, it always receives its prediction. The state vector additionally contains the agent's 16 previous predictions, the variances of its x- and y-position, the circular variance of its predictions, a collision flag, the sum over the last 16 collisions, its previous action, and a binary vector indicating the objects the agent has to find.
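As a rough illustration of the caption above, the state vector could be assembled as follows. This is a minimal sketch and not the authors' implementation: the function names (`build_state`, `circular_variance`, `variance`), the encoding of directions as angles in radians, the zero-padding of a short history, the window used for the position variances, and the ordering of the entries are all assumptions made for the example.

```python
import math
from typing import List, Sequence

HISTORY = 16  # number of past predictions and collision flags kept, as in the caption


def circular_variance(angles: Sequence[float]) -> float:
    """One minus the mean resultant length: 0 if all angles agree, close to 1 if they are spread out."""
    if not angles:
        return 0.0
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    return 1.0 - math.hypot(c, s)


def variance(values: Sequence[float]) -> float:
    """Population variance of a sequence (0 for an empty sequence)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def build_state(
    direction: float,                   # ground truth (training) or prediction, in radians (assumed encoding)
    past_predictions: Sequence[float],  # earlier predicted directions, oldest first
    xs: Sequence[float],                # recent x-positions (window length is an assumption)
    ys: Sequence[float],                # recent y-positions
    collisions: Sequence[int],          # 0/1 collision flags, most recent last
    prev_action: Sequence[float],       # previous continuous action
    targets: Sequence[int],             # binary vector of objects still to find
) -> List[float]:
    preds = list(past_predictions)[-HISTORY:]
    padded = [0.0] * (HISTORY - len(preds)) + preds  # zero-pad a short history (assumption)
    recent = list(collisions)[-HISTORY:]
    collided = float(recent[-1]) if recent else 0.0
    return (
        [direction]
        + padded
        + [variance(xs), variance(ys)]
        + [circular_variance(preds)]
        + [collided, float(sum(recent))]
        + [float(a) for a in prev_action]
        + [float(t) for t in targets]
    )
```

With a two-dimensional action and six candidate object classes (both hypothetical sizes), the resulting vector has 1 + 16 + 2 + 1 + 2 + 2 + 6 = 30 entries. A low circular variance indicates that the recent predictions point in a consistent direction.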

Starting in an unexplored map and given a set of target objects, the robot must decide how to find these objects most efficiently. Our approach continuously builds a semantic map of the environment and learns to combine long-term reasoning with short-term decision making in a single policy by predicting the direction of the path towards the closest target object. The mapping module aggregates depth and semantic information into a global map. The predictive module learns long-horizon relationships, which are then interpreted by a reinforcement learning policy.
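To make the mapping step more concrete, the sketch below shows one simple way depth and semantic information could be aggregated into a global top-down map. It is an illustration under strong assumptions rather than the method used in the paper: it treats a single horizontal row of depth readings as ranges along each ray, assumes a known robot pose, and stores the map as a dictionary from grid cells to class labels. The function name `update_semantic_map` and all parameters (`hfov`, `resolution`, `max_range`) are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Cell = Tuple[int, int]


def update_semantic_map(
    grid: Dict[Cell, int],
    depths: Sequence[float],          # range along each ray in metres, leftmost column first
    labels: Sequence[int],            # semantic class id for each ray
    pose: Tuple[float, float, float], # robot (x, y, yaw) in the global frame
    hfov: float = math.radians(90),   # horizontal field of view (assumed value)
    resolution: float = 0.1,          # cell size in metres (assumed value)
    max_range: float = 5.0,           # readings beyond this are ignored (assumed value)
) -> Dict[Cell, int]:
    x, y, yaw = pose
    n = len(depths)
    for i, (d, label) in enumerate(zip(depths, labels)):
        if not (0.0 < d <= max_range):
            continue  # skip invalid or too distant readings
        # Bearing of column i relative to the optical axis; left columns get positive angles.
        frac = (i + 0.5) / n - 0.5
        bearing = -frac * hfov
        # Project the reading into the global frame and write its label into the grid.
        wx = x + d * math.cos(yaw + bearing)
        wy = y + d * math.sin(yaw + bearing)
        cell = (math.floor(wx / resolution), math.floor(wy / resolution))
        grid[cell] = label
    return grid
```

Calling the function once per observation with the same `grid` accumulates labels over time, which is the sense in which the map is global. A real system would additionally handle full depth images, occupancy, and pose uncertainty.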

Publications

Fabian Schmalstieg, Daniel Honerkamp, Tim Welschehold and Abhinav Valada,
Learning Long-Horizon Robot Exploration Strategies for Multi-Object Search in Continuous Action Spaces
Proceedings of the International Symposium on Robotics Research (ISRR), 2022.


People

Fabian Schmalstieg

Daniel Honerkamp

Tim Welschehold

Abhinav Valada