Abstract:
[en] Autonomous vehicles face critical perception challenges when objects such as pedestrians or vehicles are hidden behind obstacles. Transformer-based models such as BEVDet4D achieve robust 360-degree 4D object detection by leveraging multi-camera inputs, but their performance deteriorates significantly for occluded objects. To overcome this limitation, we propose V2X-BEVDet4D, a cooperative perception framework built on BEVDet4D. It enhances the base model by fusing its outputs with object detections from roadside infrastructure, transmitted via ETSI-compliant Cooperative Perception Messages (CPMs) over ITS-G5. Results show an improvement of up to 183% in nuScenes Detection Score (NDS) for objects located 100 m from the vehicle, precisely where standalone BEVDet4D detection is most limited. Preliminary results also show a CPM transmission latency of 3.44 ms (standard deviation ±1.3 ms), supporting the real-time feasibility of the approach. To our knowledge, this is the first framework to fuse ETSI-compliant V2X messages into a BEV-based 360-degree 4D object detection pipeline, enabling temporal consistency across frames.
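
The abstract does not specify how the onboard and roadside detections are combined. As a rough illustration only, the sketch below shows one simple way an output-level (late) fusion step could look: roadside objects are transformed into the ego vehicle frame, and any object not already matched by an onboard detection is appended. The `Detection` class, the `to_ego_frame` and `fuse` functions, and the 2 m matching radius are all hypothetical and are not taken from V2X-BEVDet4D or from the ETSI CPM specification.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    x: float       # metres, ego frame (x forward)
    y: float       # metres, ego frame (y left)
    label: str     # e.g. "car", "pedestrian"
    score: float   # detection confidence in [0, 1]
    source: str    # "onboard" or "cpm" (hypothetical tag)


def to_ego_frame(x_world, y_world, ego_x, ego_y, ego_yaw):
    """Map a world-frame 2D point into the ego frame (yaw in radians)."""
    dx, dy = x_world - ego_x, y_world - ego_y
    c, s = math.cos(-ego_yaw), math.sin(-ego_yaw)
    return c * dx - s * dy, s * dx + c * dy


def fuse(onboard, cpm, match_radius=2.0):
    """Keep all onboard detections; add CPM objects that match none of them.

    Assumption: two detections are the same object if they share a label
    and lie within match_radius metres of each other.
    """
    fused = list(onboard)
    for obj in cpm:
        duplicate = any(
            det.label == obj.label
            and math.hypot(det.x - obj.x, det.y - obj.y) <= match_radius
            for det in onboard
        )
        if not duplicate:
            fused.append(obj)
    return fused


# Illustrative values only: one visible car, one occluded pedestrian far away.
onboard = [Detection(10.0, 0.0, "car", 0.9, "onboard")]
cpm = [
    Detection(10.5, 0.2, "car", 0.8, "cpm"),          # same car, seen by roadside unit
    Detection(100.0, 3.0, "pedestrian", 0.7, "cpm"),  # missed by onboard cameras
]
fused = fuse(onboard, cpm)
```

In this toy case the roadside car is treated as a duplicate of the onboard one, while the distant pedestrian is added from the CPM side. A real pipeline would also have to handle timestamps, object velocities, and confidence weighting, none of which is modelled here.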