Abstract:
Robot navigation in unstructured environments requires multimodal perception
systems that can support safe traversal. Multimodality enables the integration
of complementary information collected by different sensors. However, this
information must be processed by machine learning algorithms specifically
designed to leverage heterogeneous data. Furthermore, it is necessary to
identify which sensor modalities are most informative for navigation in the
target environment. In Martian exploration, thermal imagery has proven valuable
for assessing terrain safety due to differences in thermal behavior between
soil types. This work presents OmniUnet, a transformer-based neural network
architecture for semantic segmentation using RGB, depth, and thermal (RGB-D-T)
imagery. A custom multimodal sensor housing was developed using 3D printing and
mounted on the Martian Rover Testbed for Autonomy (MaRTA) to collect a
multimodal dataset in the Bardenas semi-desert in northern Spain. This location
serves as an environment representative of the Martian surface, featuring
terrain types such as sand, bedrock, and compact soil. A subset of this dataset
was manually labeled to support supervised training of the network. The model
was evaluated both quantitatively and qualitatively, achieving a pixel accuracy
of 80.37% and demonstrating strong performance in segmenting complex
unstructured terrain. Inference tests yielded an average prediction time of 673
ms on a resource-constrained computer (Jetson Orin Nano), confirming its
suitability for on-robot deployment. The software implementation of the network
and the labeled dataset have been made publicly available to support future
research in multimodal terrain perception for planetary robotics.
Title:
OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery
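The pixel accuracy reported in the abstract is the standard segmentation metric: the fraction of labeled pixels whose predicted class matches the ground truth. The short sketch below illustrates how such a score can be computed; the function name, the IGNORE_INDEX marker for unlabeled pixels, and the toy arrays are illustrative assumptions and are not taken from the OmniUnet code or dataset.

# Minimal sketch of the pixel accuracy metric (illustrative, not OmniUnet code).
import numpy as np

IGNORE_INDEX = 255  # assumed marker for unlabeled pixels in the ground truth

def pixel_accuracy(pred: np.ndarray, target: np.ndarray) -> float:
    """Fraction of labeled pixels whose predicted class equals the ground truth.

    pred and target are integer class maps of identical shape, e.g. (H, W).
    """
    valid = target != IGNORE_INDEX       # exclude unlabeled pixels
    correct = (pred == target) & valid   # correctly classified labeled pixels
    return float(correct.sum()) / max(int(valid.sum()), 1)

# Toy example: 4 of the 5 labeled pixels are correct -> 0.80 (80%)
gt = np.array([[0, 1, 1],
               [2, 2, IGNORE_INDEX]])
pr = np.array([[0, 1, 2],
               [2, 2, 0]])
print(f"pixel accuracy: {pixel_accuracy(pr, gt):.2%}")  # pixel accuracy: 80.00%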