Weaknesses in model robustness significantly hinder the adoption of ML-enhanced systems in critical real-world contexts. Adversarial examples and the natural decay of model performance over time (known as distribution drift) are major challenges. Current evaluation methods do not sufficiently account for the real deployment context and can therefore be misleading. While methods to evaluate robustness against adversarial examples are well studied in Computer Vision (CV) and Natural Language Processing (NLP), their application to tabular data remains largely unexplored: these methods often produce unrealistic feature vectors that do not represent feasible domain objects.
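To illustrate what "feasible domain object" means in a tabular setting, the minimal sketch below checks whether a perturbed feature vector still satisfies domain constraints. The features and constraints (a hypothetical credit-application record) are purely illustrative and not taken from the thesis.

```python
# Hypothetical example: a perturbed credit-application record is only a
# realistic adversarial example if it still satisfies domain constraints.
# Feature names and constraints below are illustrative assumptions.

def satisfies_domain_constraints(x):
    """Return True if the feature vector could occur in reality."""
    age, income, loan, duration = x
    checks = [
        age >= 18,            # boundary constraint
        income >= 0,          # non-negativity constraint
        loan <= 10 * income,  # relational constraint between features
        duration > 0,
    ]
    return all(checks)

original = (35.0, 4_000.0, 20_000.0, 24.0)
perturbed = (35.0, -500.0, 20_000.0, 24.0)  # perturbation made income negative

print(satisfies_domain_constraints(original))   # True
print(satisfies_domain_constraints(perturbed))  # False: infeasible record
```

An unconstrained attack may flip the model's prediction with vectors like `perturbed`, but such inputs cannot occur in practice, which is why constraint satisfaction is treated as a necessary condition for realistic adversarial examples.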
Additionally, existing approaches to mitigate performance decay often overlook practical constraints such as labeling and deployment delays, leading to an overestimation of their effectiveness. In collaboration with BGL BNP Paribas, we address these challenges by demonstrating the importance of realistic robustness evaluation for ML models in critical contexts. The contributions of this thesis are as follows.

First, we improve the evaluation of critical industrial systems by proposing attacks that generate more realistic adversarial examples. Specifically, we propose three algorithms that produce adversarial examples satisfying domain constraints, a necessary condition for an example to occur in reality.

Second, we develop new defenses that robustify models against such adversarial examples by combining synthetic data generators with adversarial training, a process in which adversarial examples are integrated into the training data. These defenses, together with our realistic evaluation of model robustness, form TabularBench, a benchmark providing more than 200 models across 5 datasets, trained with 14 different training methods. We are confident that this benchmark will accelerate research on adversarial defenses for tabular ML. We apply our contributions to a real-world use case from BGL BNP Paribas and show how our defenses can robustify critical ML systems.

Finally, we consider the robustness of critical systems to distribution drifts that cause performance decay. Inspired by a real-world use case from BGL BNP Paribas, we reassess model retraining strategies under industrial constraints such as labeling and deployment delays. Using a realistic protocol, we benchmark 17 retraining strategies and find that ignoring delays overestimates their effectiveness and alters method rankings: the drift detection method that is optimal without delays is no longer optimal once delays occur, highlighting the need for realistic evaluation protocols.
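The effect of labeling and deployment delays on retraining can be sketched as a simple stream-replay loop. This is an illustrative toy, not the thesis protocol: the periodic retraining strategy, the scalar stream, and the mean-predictor "model" are all assumptions made for the example.

```python
# Minimal sketch of delay-aware retraining evaluation (illustrative only).
# Labels arrive `label_delay` steps late, and a retrained model starts
# serving traffic only `deploy_delay` steps after retraining.

def evaluate_retraining(stream, retrain_every, label_delay, deploy_delay, fit, score):
    model = fit(stream[:1])        # bootstrap on the first observation
    pending = []                   # (ready_at, model): retrained, not yet deployed
    scores = []
    for t in range(1, len(stream)):
        # deploy any model whose deployment delay has elapsed
        while pending and pending[0][0] <= t:
            model = pending.pop(0)[1]
        scores.append(score(model, stream[t]))
        if t % retrain_every == 0:
            labeled = stream[: max(1, t - label_delay)]  # only labels that arrived
            pending.append((t + deploy_delay, fit(labeled)))
    return sum(scores) / len(scores)

stream = list(range(20))                  # a drifting scalar stream
fit = lambda hist: sum(hist) / len(hist)  # "model" = running mean
score = lambda m, x: -abs(m - x)          # negative absolute error

no_delay = evaluate_retraining(stream, 2, label_delay=0, deploy_delay=0, fit=fit, score=score)
with_delay = evaluate_retraining(stream, 2, label_delay=4, deploy_delay=3, fit=fit, score=score)
```

Under this toy setup, `with_delay < no_delay`: the same retraining strategy measures as less effective once delays are modeled, which is the kind of gap a delay-free evaluation protocol hides.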
Overall, the objective of this dissertation is to bridge gaps between the current assumption of evaluating and robustifying ML models in the literature and the reality of ML models deployed in critical industrial systems.
Disciplines :
Computer science
Author, co-author :
SIMONETTO, Thibault Jean Angel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Language :
English
Title :
Enhancing Machine Learning Robustness for Critical Industrial Systems: Constrained Adversarial Attacks and Distribution Drift Solutions
Defense date :
04 October 2024
Institution :
Unilu - University of Luxembourg [Faculty of Science, Technology and Medicine], Luxembourg, Luxembourg
Degree :
Docteur de l'Université du Luxembourg en Informatique
Jury member :
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)