Abstract:
Robot navigation in unstructured environments requires multimodal perception
systems that can support safe traversal. Multimodality enables the integration
of complementary information collected by different sensors. However, this
information must be processed by machine learning algorithms specifically
designed to leverage heterogeneous data. Furthermore, it is necessary to
identify which sensor modalities are most informative for navigation in the
target environment. In Martian exploration, thermal imagery has proven valuable
for assessing terrain safety due to differences in thermal behavior between
soil types. This work presents OmniUnet, a transformer-based neural network
architecture for semantic segmentation using RGB, depth, and thermal (RGB-D-T)
imagery. A custom multimodal sensor housing was developed using 3D printing and
mounted on the Martian Rover Testbed for Autonomy (MaRTA) to collect a
multimodal dataset in the Bardenas semi-desert in northern Spain. This location
serves as an environment representative of the Martian surface, featuring
terrain types such as sand, bedrock, and compact soil. A subset of this dataset
was manually labeled to support supervised training of the network. The model
was evaluated both quantitatively and qualitatively, achieving a pixel accuracy
of 80.37% and demonstrating strong performance in segmenting complex
unstructured terrain. Inference tests yielded an average prediction time of 673
ms on a resource-constrained computer (Jetson Orin Nano), confirming its
suitability for on-robot deployment. The software implementation of the network
and the labeled dataset have been made publicly available to support future
research in multimodal terrain perception for planetary robotics.
Title:
OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery
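The pixel accuracy reported in the abstract is the standard segmentation metric: the fraction of labeled pixels whose predicted class matches the ground truth. The short sketch below illustrates how such a score can be computed; the function name, the IGNORE_INDEX marker for unlabeled pixels, and the toy arrays are illustrative assumptions and are not taken from the OmniUnet code or dataset.

# Minimal sketch of the pixel accuracy metric (illustrative, not OmniUnet code).
import numpy as np

IGNORE_INDEX = 255  # assumed marker for unlabeled pixels in the ground truth

def pixel_accuracy(pred: np.ndarray, target: np.ndarray) -> float:
    """Fraction of labeled pixels whose predicted class equals the ground truth.

    pred and target are integer class maps of identical shape, e.g. (H, W).
    """
    valid = target != IGNORE_INDEX       # exclude unlabeled pixels
    correct = (pred == target) & valid   # correctly classified labeled pixels
    return float(correct.sum()) / max(int(valid.sum()), 1)

# Toy example: 4 of the 5 labeled pixels are correct -> 0.80 (80%)
gt = np.array([[0, 1, 1],
               [2, 2, IGNORE_INDEX]])
pr = np.array([[0, 1, 2],
               [2, 2, 0]])
print(f"pixel accuracy: {pixel_accuracy(pr, gt):.2%}")  # pixel accuracy: 80.00%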