TOWARDS COMPACT YET EFFECTIVE MULTI-LABEL IMAGE CLASSIFICATION: FROM A SINGLE DOMAIN TO MULTIPLE DOMAINS

machine learning; deep learning; multi-label image classification; graph convolutional networks; unsupervised domain adaptation; image processing; deepfakes; pattern recognition

Abstract :

Multi-label Image Classification (MLIC) is an active research topic within the Computer Vision community. Its objective is to simultaneously identify the presence or absence of multiple objects within an image. Thanks to its practical usefulness, MLIC is applied in numerous fields such as human action recognition, multi-attribute prediction, and semantic segmentation. With the widespread availability of large-scale labeled datasets and recent advances in deep learning, numerous methods have achieved remarkable performance on MLIC. Despite their proven success, these methods usually employ very deep neural networks, leading to cumbersome architectures, which limits their applicability in memory-constrained scenarios. Additionally, existing MLIC methods usually assume that the test data comes from the same domain as the training data. This assumption does not always hold, and when it fails these methods generalize poorly to images from unseen domains. To overcome this challenge, commonly known as domain shift, only a few methods have been proposed that extend the concept of Unsupervised Domain Adaptation (UDA) from single-label classification to multi-label classification. However, due to the inherent differences between the two problems, this extension might yield sub-optimal results. These methods also rely on cumbersome network architectures. In this thesis, we propose to tackle the aforementioned challenges in MLIC, going from the simple scenario where a single domain is considered to a more challenging cross-domain setting. In the first part of this thesis, we aim to efficiently and effectively model the relationship between multiple objects while keeping a moderate-size architecture for the general task of MLIC.
Specifically, we use Graph Convolutional Networks (GCN) to model the label relationships through an adaptive graph learning strategy. In the second part of this work, we focus on tackling the domain shift that degrades the performance of MLIC methods. For that purpose, we propose novel UDA approaches specifically tailored to the task of MLIC. The proposed solutions are evaluated on several benchmarks to demonstrate their effectiveness compared with state-of-the-art methods.

Disciplines :

Computer science

Author, co-author :

SINGH, Inder Pal ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

Language :

English

Title :

TOWARDS COMPACT YET EFFECTIVE MULTI-LABEL IMAGE CLASSIFICATION: FROM A SINGLE DOMAIN TO MULTIPLE DOMAINS

Defense date :

12 July 2024

Institution :

Unilu - University of Luxembourg [The Faculty of Science, Technology and Medicine], Kirchberg, Luxembourg

Degree :

Docteur en Informatique (DIP_DOC_0006_B)

Jury member :

AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2

HEIN, Andreas ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SPASYS

GHORBEL, Enjie ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust > CVI2 > Team Djamila AOUADA ; University of Manouba

HARTMANN, Thomas ; DataThings, Luxembourg

AINOUZ, Samia ; INSA Rouen Normandie

Focus Area :

Computational Sciences

Available on ORBilu :

since 24 July 2024

Statistics

Number of views

182 (25 by Unilu)

Number of downloads

117 (11 by Unilu)