Abstract
Object navigation is a fundamental task for autonomous robot control, self-driving cars, and other applications. It requires an agent to take effective actions to navigate to a specified semantic target object in a previously unseen environment. We tackle this problem with reinforcement learning and a proposed spatial-visual joint memory stored in a recursive quadtree representation. Compared to existing work, our recursive quadtree architecture leverages both visual and occupancy (spatial) information. Maintaining an occupancy map also makes it possible to use deterministic path-planning techniques, which lead to better training-sample efficiency and shorter navigation trajectories. Additionally, the quadtree representation further improves efficiency by not processing empty quadrants. We evaluate the proposed method on object navigation tasks in two publicly available simulators, Habitat and AI2-THOR. Experimental results show that our method achieves state-of-the-art performance in both success rate and SPL (success weighted by path length).
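
For readers unfamiliar with the SPL metric named above, the following is a minimal sketch of how it is commonly computed: each episode contributes its success indicator multiplied by the ratio of the shortest-path length to the longer of the agent's path length and the shortest-path length, and the contributions are averaged over all episodes. The function name `spl` and its parameter names are illustrative assumptions, not taken from this work's code, and the requirement that shortest-path lengths be strictly positive is a simplification made here.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    agent_lengths: Sequence[float],
) -> float:
    """Success weighted by path length, averaged over episodes.

    Names and input layout are hypothetical; this is an illustrative
    sketch of the standard definition, not the authors' evaluation code.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(agent_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, agent_lengths):
        # Assumption: the start is never already at the goal.
        if shortest <= 0:
            raise ValueError("shortest path lengths must be positive")
        if success:
            # The ratio is 1.0 for an optimal path and shrinks as the
            # agent's path grows longer than the shortest path.
            total += shortest / max(taken, shortest)
        # Failed episodes contribute 0 regardless of path length.
    return total / n


if __name__ == "__main__":
    # Three episodes: a success with a path twice the optimum,
    # a failure, and a success along the optimal path.
    print(spl([True, False, True], [5.0, 4.0, 3.0], [10.0, 4.0, 3.0]))
```

Because failed episodes score zero and inefficient successes are discounted, SPL is never greater than the plain success rate, which is why the two metrics are typically reported together.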