Doctoral thesis (Dissertations and theses)
Enhancing Federated Learning for Financial Sector via Graph Learning and Language Models
DAMOUN, Farouk
2024
 

Documents


Full text
Damoun_PhD_Thesis_Final.pdf
Author postprint (13.9 MB)

All documents in ORBilu are protected by a user license.




Details



Keywords:
Finance; Privacy-Preserving Machine Learning (ML); Federated Learning (FL); Graph Neural Networks (GNN); Language Model Tokenizer (LLM)
Abstract:
[en] In the modern financial sector, the need for robust machine learning models is increasingly critical, yet privacy regulations and competitive concerns often make centralized data inaccessible. To overcome these challenges, this dissertation proposes several novel Federated Learning (FL) methodologies that enable institutions to train models collaboratively. These methodologies address the trade-off between privacy and data utility by integrating privacy-preserving mechanisms designed to prevent input recovery with minimal loss of data utility. A key contribution of this research is a federated learning framework for privacy-preserving behavioral anomaly detection and fraud detection in financial transactions. By applying Graph Neural Networks (GNNs) to dynamic ego-centric graphs, the framework captures evolving transactional patterns and detects anomalies effectively while preserving privacy. A novel domain-specific negative sampling technique enables model training without labeled data from the federation participants, which makes the framework highly applicable in real-world scenarios. The results demonstrate that deep learning-based methods, particularly graph-level embedding, outperform traditional approaches in anomaly detection and improve fraud detection, while anonymization and noise-based mechanisms protect the data even when the shared model gradients are exposed. Additionally, we propose G-HIN2Vec, a graph-level embedding technique for heterogeneous information networks, which models individuals, such as cardholders, using static and dynamic ego-centric graphs. This method serves as an anonymization mechanism that eliminates the need for personally identifiable information (PII) in federated models. Integrating Personalized Local Differential Privacy (PLDP) provides an additional layer of protection, so that sensitive data remains secure even in the event of a model breach.
Finally, the dissertation introduces the Federated Byte-Level Byte Pair Encoding (BPE) Tokenizer, a novel privacy-preserving tokenization approach designed for distributed textual datasets. This tokenizer outperforms existing models in vocabulary coverage and efficiency while maintaining rigorous data privacy. It not only competes with centralized models but also improves both text compression and privacy preservation, for both general and domain-specific tokenizers. The methodologies presented in this dissertation, validated on real-world transaction and textual financial datasets, highlight the potential of federated learning to enhance fraud detection and language model performance while preserving the privacy of individuals and institutions through anonymization and noise-based privacy mechanisms.
Disciplines:
Computer science
Author, co-author:
DAMOUN, Farouk ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust > SEDAN > Team Radu STATE
Document language:
English
Title:
Enhancing Federated Learning for Financial Sector via Graph Learning and Language Models
Defense date:
16 December 2024
Institution:
Unilu - University of Luxembourg [Interdisciplinary Centre for Security, Reliability and Trust], Luxembourg, Luxembourg
UCBL - Université Claude Bernard. Lyon 1, Lyon, France
Degree title:
Docteur en Informatique (DIP_DOC_0006_B)
Supervisor:
STATE, Radu ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN
SEBA, Hamida ; UCBL - Université Claude Bernard. Lyon 1
Jury president:
BISSYANDE, Tegawendé François d Assise ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
BADONNEL, Rémi
Secretary:
GREGOIRE, Valérie ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SNT Office > Assistance
Jury member:
FACI, Noura
CHERKAOUI, Omar
BRORSSON, Mats
HILGER, Jean
FnR project:
FNR15829274 - Federated Learning And Graph Neural Networks For Retail Banking, 2021 (01/04/2021-31/10/2023) - Farouk Damoun
Research project title:
ANR-20-CE39-0008
Funding (details):
The work is supported by the Fonds National de la Recherche (FNR) of Luxembourg through the Industrial Fellowship program (grant number 15829274), with additional funding from ANR-20-CE39-0008.
Available on ORBilu:
since 03 March 2025

Statistics


Number of views
159 (including 13 Unilu)
Number of downloads
89 (including 3 Unilu)
