3D-Aware Object Localization using Gaussian Implicit Occupancy Function

Automatically localizing a target object in an image is crucial for many computer vision applications. To represent the 2D object, ellipse labels have recently been identified as a promising alternative to axis-aligned bounding boxes. This paper goes further and considers 3D-aware ellipse labels, i.e., ellipses that are projections of a 3D ellipsoidal approximation of the object, for 2D target localization. Projected ellipses carry more information about the object's geometry and pose (3D awareness) than traditional 3D-agnostic bounding box labels. Moreover, such a generic 3D ellipsoidal model can approximate targets whose shape is known precisely or only coarsely. We then revisit ellipse regression, replacing the discontinuous geometric ellipse parameters with the parameters of an implicit Gaussian distribution that encodes object occupancy in the image. The models are trained to regress the values of this bivariate Gaussian distribution over the image pixels using a statistical loss function. We introduce a novel non-trainable differentiable layer, E-DSNT, to extract the distribution parameters. We also describe how to readily generate consistent 3D-aware Gaussian occupancy parameters using only coarse dimensions of the target and relative pose labels. To validate our hypothesis, we extend three existing spacecraft pose estimation datasets with 3D-aware Gaussian occupancy labels.
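As a rough illustration of the Gaussian occupancy idea in the abstract, the sketch below renders a bivariate Gaussian over the pixel grid and then recovers its mean and covariance as weighted image moments. This is not the authors' E-DSNT layer, whose internals are not given here; it is plain moment extraction under the assumption that the heatmap is non-negative and sums to one. The function names `gaussian_occupancy` and `extract_moments` and the numeric values are hypothetical.

```python
import math


def gaussian_occupancy(mean, cov, height, width):
    """Evaluate a bivariate Gaussian on the pixel grid, normalized to sum to 1.

    mean is (x, y) in pixels; cov is a symmetric 2x2 matrix [[a, b], [b, d]].
    """
    (a, b), (_, d) = cov
    det = a * d - b * b
    # Entries of the inverse covariance matrix.
    inv_xx, inv_xy, inv_yy = d / det, -b / det, a / det
    heat = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - mean[0], y - mean[1]
            q = inv_xx * dx * dx + 2.0 * inv_xy * dx * dy + inv_yy * dy * dy
            row.append(math.exp(-0.5 * q))
        heat.append(row)
    total = sum(map(sum, heat))
    return [[v / total for v in row] for row in heat]


def extract_moments(heat):
    """Recover (mean, covariance) from a normalized heatmap.

    Only sums and products are used, so the same computation would be
    differentiable inside an autodiff framework (assumption: illustrative only).
    """
    mx = my = 0.0
    for y, row in enumerate(heat):
        for x, p in enumerate(row):
            mx += p * x
            my += p * y
    sxx = sxy = syy = 0.0
    for y, row in enumerate(heat):
        for x, p in enumerate(row):
            dx, dy = x - mx, y - my
            sxx += p * dx * dx
            sxy += p * dx * dy
            syy += p * dy * dy
    return (mx, my), [[sxx, sxy], [sxy, syy]]


# Hypothetical label: centre (40, 30) px, covariance in px^2, on a 64x80 image.
heat = gaussian_occupancy((40.0, 30.0), [[36.0, 10.0], [10.0, 16.0]], 64, 80)
mean, cov = extract_moments(heat)
```

The recovered covariance describes an ellipse whose axes lie along its eigenvectors, with semi-axis lengths proportional to the square roots of its eigenvalues. The Gaussian parameters vary continuously with the ellipse, whereas an explicit orientation angle wraps around.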

Disciplines :

Computer science

Author, co-author :

GAUDILLIERE, Vincent ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

PAULY, Leo ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

RATHINAM, Arunkumar ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

Garcia Sanchez, Albert

MOHAMED ALI, Mohamed Adel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

Language :

English

Publication date :

October 2023

Event name :

2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

Event date :

October 1 – 5, 2023

Audience :

International

Main work title :

2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Publisher :

IEEE

Peer reviewed :

Peer reviewed

Additional URL :

https://arxiv.org/abs/2303.02058

FnR Project :

FNR14755859 - Multi-modal Fusion Of Electro-optical Sensors For Spacecraft Pose Estimation Towards Autonomous In-orbit Operations, 2020 (01/01/2021-31/12/2023) - Djamila Aouada

Funding text :

This work was partly funded by the Luxembourg National Research Fund (FNR) under the project reference BRIDGES2020/IS/14755859/MEET-A/Aouada.

Available on ORBilu :

since 14 November 2023
