Doctoral thesis (Dissertations and theses)
Enhancing Machine Learning Security: The Significance of Realistic Adversarial Examples
DYRMISHI, Salijona
2024
 

Files


Full Text
PhD_Thesis_Salijona_Dyrmishi_Final.pdf
Author postprint (5.14 MB) Creative Commons License - Attribution


Details



Abstract :
[en] Adversarial attacks pose a significant security threat in Machine Learning (ML): they apply subtle, invisible perturbations to original examples to craft instances that deceive model decisions. While extensively studied in computer vision and in diverse domains such as credit scoring, cybersecurity, cyber-physical systems, and natural language processing, recent findings reveal limitations in traditional adversarial attacks. These approaches often yield examples that lack realism, failing to map to real-world objects or to adhere to imperceptibility requirements. Realistic adversarial attacks and their implications for the robustness of real-world systems remain under-explored in the literature. Through this thesis, we demonstrate the importance of realism in adversarial attacks, identify the conditions under which these attacks are realistic, and propose new strategies to upgrade current attacks from unrealistic to realistic.
As a first contribution, we shed light on the importance of producing realistic adversarial examples when hardening models against realistic attacks. We use three real-world use cases (text classification, botnet detection, malware detection) and seven datasets to evaluate whether unrealistic adversarial examples can be used to protect models against realistic ones. Our results reveal discrepancies across the use cases: unrealistic examples can be as effective as realistic ones or may offer only limited improvement. To explain these results, we analyze the latent representations of adversarial examples generated with realistic and unrealistic attacks, and we investigate the patterns that discriminate which unrealistic examples can be used for effective hardening.
As a second contribution, we evaluate the realism of the adversarial examples generated by textual attacks. Current attacks ignore the property of imperceptibility or study it only in limited settings. As a result, adversarial perturbations would not pass any human quality gate and do not represent real threats to human-checked NLP systems. To bypass this limitation and enable proper assessment (and, later, improvement) of NLP model robustness, we surveyed 378 human participants about the perceptibility of text adversarial examples produced by state-of-the-art methods. Our results underline that existing text attacks are impractical in real-world scenarios where humans are involved. This contrasts with previous smaller-scale human studies, which reported overly optimistic conclusions regarding attack success. Through our work, we hope to position human perceptibility as a first-class success criterion for text attacks and to provide guidance for research to build effective attack algorithms and, in turn, design appropriate defence mechanisms.
As a final contribution of this thesis, we enhance adversarial attacks based on Deep Generative Models (DGMs) by introducing a constraint layer that yields more realistic examples for tabular datasets. DGMs are widely used to synthesize data that mirrors the distribution of the original data, serving purposes such as dataset augmentation, fairness promotion, and sensitive-information protection. DGMs have also been leveraged for adversarial generation processes (AdvDGMs). Despite their proficiency in modeling complex distributions, DGMs often produce outputs that violate background knowledge expressed through numerical constraints, resulting in lower success rates for AdvDGMs.
To address this limitation, we transform DGMs into Constrained Deep Generative Models (C-DGMs) using a Constraint Layer (CL) that repairs violated constraints, and we extend these C-DGMs to C-AdvDGMs for generating realistic adversarial examples. Our experiments demonstrate that integrating this constraint layer, which incorporates background knowledge, not only enhances the quality of sampled data for machine learning model training but also improves the success rate of AdvDGMs. Notably, these enhancements are achieved without compromising sample generation speed.
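For intuition, the following is a minimal, hypothetical sketch (in PyTorch, not the thesis implementation) of a constraint-repair layer for tabular data; the feature bounds and the single relational constraint shown are illustrative assumptions rather than constraints taken from the thesis.

    # Illustrative sketch of a constraint-repair layer for tabular data.
    # After a generator produces a sample, the layer projects each feature
    # back into its valid range and enforces one example relational
    # constraint (feature 0 <= feature 1). Bounds and constraint are
    # hypothetical assumptions for illustration only.
    import torch
    import torch.nn as nn

    class ConstraintRepairLayer(nn.Module):
        def __init__(self, lower: torch.Tensor, upper: torch.Tensor):
            super().__init__()
            # Per-feature lower/upper bounds taken from domain knowledge.
            self.register_buffer("lower", lower)
            self.register_buffer("upper", upper)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Repair box constraints: clip every feature into its valid range.
            x = torch.clamp(x, self.lower, self.upper)
            # Repair the relational constraint x[:, 0] <= x[:, 1] by raising
            # the second feature wherever the constraint is violated.
            second = torch.maximum(x[:, 1], x[:, 0])
            return torch.cat([x[:, :1], second.unsqueeze(1), x[:, 2:]], dim=1)

    # Usage: append the repair layer after the generator's output head so
    # that every generated (or adversarial) sample satisfies the constraints.
    generator = nn.Sequential(
        nn.Linear(8, 4),
        ConstraintRepairLayer(lower=torch.zeros(4), upper=100.0 * torch.ones(4)),
    )
    samples = generator(torch.randn(16, 8))
    assert torch.all(samples[:, 0] <= samples[:, 1])

Because the repair runs as a single vectorized layer inside the model, it adds little overhead, which is consistent with the abstract's remark that the enhancement does not compromise sample generation speed.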
Disciplines :
Computer science
Author, co-author :
DYRMISHI, Salijona ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Language :
English
Title :
Enhancing Machine Learning Security: The Significance of Realistic Adversarial Examples
Defense date :
28 June 2024
Institution :
Unilu - University of Luxembourg [Faculty of Science, Technology and Medicine], Esch-sur-Alzette, Luxembourg
Degree :
Docteur de l'Université du Luxembourg en Informatique
Jury member :
CORDY, Maxime  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
BISSYANDE, Tegawendé François d'Assise  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
PAPADAKIS, Mike ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Cavallaro, Lorenzo;  UCL - University College London [GB]
Zhang, Yang;  CISPA Helmholtz Center for Information Security
FnR Project :
FNR14585105 - Search-based Adversarial Testing Under Domain-specific Constraints, 2020 (01/10/2020-30/09/2024) - Salijona Dyrmishi
Available on ORBilu :
since 05 July 2024

Statistics


Number of views
236 (34 by Unilu)
Number of downloads
311 (59 by Unilu)
