Computer Science - Computer Vision and Pattern Recognition
Abstract:
Automatically localizing a target object in an image is crucial for many
computer vision applications. To represent the 2D object, ellipse labels have
recently been identified as a promising alternative to axis-aligned bounding
boxes. This paper further considers 3D-aware ellipse labels, i.e.,
ellipses which are projections of a 3D ellipsoidal approximation of the object,
for 2D target localization. Indeed, projected ellipses carry more
information about the object geometry and pose (3D awareness) than traditional
3D-agnostic bounding box labels. Moreover, such a generic 3D ellipsoidal model
can approximate targets whose geometry is known precisely or only coarsely. We
then revisit ellipse regression and replace the discontinuous geometric
ellipse parameters with the parameters of an implicit Gaussian distribution
encoding object occupancy in the image. The models are trained to regress the
values of this bivariate Gaussian distribution over the image pixels using a
statistical loss function. We introduce a novel non-trainable differentiable
layer, E-DSNT, to extract the distribution parameters. Also, we describe how to
readily generate consistent 3D-aware Gaussian occupancy parameters using only
coarse dimensions of the target and relative pose labels. We extend three
existing spacecraft pose estimation datasets with 3D-aware Gaussian occupancy
labels to validate our hypothesis.
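
The abstract does not spell out how the Gaussian labels are generated or how E-DSNT is defined, so the following NumPy sketch is only an illustration under stated assumptions. It assumes a pinhole camera, takes the projected ellipse as the 1-sigma contour of the Gaussian (the paper's actual scaling may differ), and recovers the parameters from a heatmap with plain first- and second-order moments in pixel coordinates, in the spirit of DSNT. The function names `gaussian_from_ellipsoid`, `render_gaussian`, and `moments_from_heatmap` are hypothetical and not taken from the paper.

```python
import numpy as np

def gaussian_from_ellipsoid(semi_axes, R, t, K):
    """Project an ellipsoid (camera-frame pose R, t) with intrinsics K.

    Returns the center and shape matrix of the projected ellipse, used here
    as the mean and covariance of the occupancy Gaussian (an assumption).
    The ellipsoid must lie entirely in front of the camera.
    """
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    # Dual quadric of the ellipsoid, then dual conic of its image outline.
    Q_dual = T @ np.diag([*np.square(semi_axes), -1.0]) @ T.T
    P = K @ np.eye(3, 4)
    C_dual = P @ Q_dual @ P.T
    C_dual = C_dual / -C_dual[2, 2]          # normalize so C_dual[2, 2] == -1
    mean = -C_dual[:2, 2]
    cov = C_dual[:2, :2] + np.outer(mean, mean)
    return mean, cov

def render_gaussian(mean, cov, height, width):
    """Unnormalized bivariate Gaussian sampled at integer pixel centers."""
    ys, xs = np.mgrid[0:height, 0:width]
    d = np.stack([xs - mean[0], ys - mean[1]], axis=-1)        # (H, W, 2)
    maha = np.einsum("hwi,ij,hwj->hw", d, np.linalg.inv(cov), d)
    return np.exp(-0.5 * maha)

def moments_from_heatmap(heatmap):
    """Mean and covariance of a non-negative heatmap, using only sums and
    products (hence differentiable with respect to the heatmap values)."""
    h, w = heatmap.shape
    p = heatmap / heatmap.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs, ys], axis=-1).astype(float)         # (H, W, 2)
    mean = np.einsum("hw,hwi->i", p, coords)
    d = coords - mean
    cov = np.einsum("hw,hwi,hwj->ij", p, d, d)
    return mean, cov

# Usage with made-up values: a coarse 1.0 x 0.5 x 0.3 ellipsoid, 10 units away.
K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]])
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t = np.array([0.2, -0.1, 10.0])
mean, cov = gaussian_from_ellipsoid((1.0, 0.5, 0.3), R, t, K)
heatmap = render_gaussian(mean, cov, 128, 128)
mean_hat, cov_hat = moments_from_heatmap(heatmap)
```

In this sketch the label needs only the coarse semi-axes of the target and its relative pose, and the recovered `mean_hat` and `cov_hat` vary continuously with the heatmap, unlike an explicit orientation angle, which is ambiguous for near-circular ellipses.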
Disciplines:
Computer science
Author, co-author:
GAUDILLIERE, Vincent ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
PAULY, Leo ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
RATHINAM, Arunkumar ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
Garcia Sanchez, Albert
MOHAMED ALI, Mohamed Adel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
External co-authors:
No
Document language:
English
Title:
3D-Aware Object Localization using Gaussian Implicit Occupancy Function
Publication/release date:
October 2023
Event name:
2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)
Event date:
October 1-5, 2023
Event scope:
International
Title of the main work:
2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)