Abstract:
Robot self-localization is essential for autonomous operation in open environments. When cameras are the main source of information for retrieving the pose, dynamic objects pose numerous challenges through occlusion and continuous changes in appearance. Recent research on global localization has focused on using one or more Convolutional Neural Networks (CNNs) to estimate the 6 Degrees of Freedom (6-DoF) pose directly from a monocular camera image. In contrast with classical approaches based on engineered feature detectors, CNNs are usually more robust to lighting changes and to occlusions in outdoor scenarios. This paper attempts to empirically demonstrate the ability of CNNs to learn to ignore dynamic elements, such as pedestrians or cars. For this purpose, we pre-process a pose-localization dataset with an object segmentation network, masking potentially moving objects. We then compare a pose regression CNN trained and/or tested on the masked images against one using the original images. Experimental results show that the two training approaches perform similarly, with a slight reduction in error when occluding objects are hidden from the views.
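The masking pre-processing described above can be illustrated with a minimal sketch. The class IDs, fill value, and function name below are assumptions for illustration, not details from the paper; in practice the per-pixel labels would come from an object segmentation network.

```python
import numpy as np

# Hypothetical class IDs for potentially moving objects
# (assumed labels, not from the paper): 1 = pedestrian, 2 = car.
DYNAMIC_CLASSES = [1, 2]

def mask_dynamic_objects(image: np.ndarray, seg_map: np.ndarray,
                         fill_value: int = 0) -> np.ndarray:
    """Hide potentially moving objects by overwriting their pixels.

    image:   H x W x 3 RGB image.
    seg_map: H x W per-pixel class IDs from a segmentation network.
    """
    masked = image.copy()
    dynamic = np.isin(seg_map, DYNAMIC_CLASSES)  # boolean H x W mask
    masked[dynamic] = fill_value                 # broadcast over channels
    return masked

# Toy usage: a 2x2 image whose top row is labeled "car" (class 2).
img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
seg = np.array([[2, 2], [0, 0]])
out = mask_dynamic_objects(img, seg)
```

The resulting masked images can then be fed to the pose regression CNN at training and/or test time, exactly as the original images would be.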