Machine Learning Techniques for Suspicious Transaction Detection and Analysis

CAMINO, Ramiro Daniel

Doctoral thesis (Dissertations and theses)

2020

Permalink
https://hdl.handle.net/10993/44939

Files

Full Text

PhD_Thesis.pdf

Author postprint (5.27 MB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

machine learning; fraud detection; deep generative models; anti-money laundering; ripple; ethereum

Abstract :

[en] Financial services must monitor their transactions to avoid being used for money laundering and to combat the financing of terrorism. Initially, the organizations in charge of fraud regulation were concerned only with financial institutions such as banks. Nowadays, however, the Fintech industry, online businesses, and platforms involving virtual assets can also be affected by similar criminal schemes. Regardless of the differences between these entities, the malicious activities affecting them share many common patterns. This dissertation's first goal is to compile and compare existing studies that use machine learning to detect and analyze suspicious transactions. The second goal is to synthesize the methodologies gathered under the first goal so that different use cases can be tackled in an organized manner. Finally, the third goal is to assess the applicability of deep generative models for enhancing existing solutions. In the first part of the thesis, we propose an unsupervised methodology for detecting suspicious transactions and apply it to two case studies. One involves transactions from a money remittance network, and the other involves a novel payment network based on distributed ledger technologies. Anomaly detection algorithms are applied to rank user accounts based on recency, frequency, and monetary features. Domain experts manually validate the results, confirming known scenarios and finding unexpected new cases. In the second part, we carry out an analogous analysis employing supervised methods, along with a case study where we classify Ethereum smart contracts into honeypots and non-honeypots. We take features from the source code, the transaction data, and the characterization of the flow of funds. The proposed classification models generalize well to unseen honeypot instances and techniques, and allowed us to characterize previously unknown techniques.
In the third part, we analyze the challenges that tabular data, the type of data used to represent financial transactions in the previous two parts, brings to the domain of deep generative models. We propose a new model architecture by adapting state-of-the-art methods to output multiple variables from mixed-type distributions. Additionally, we extend the evaluation metrics used in the literature to the multi-output setting, and we show empirically that our approach outperforms the existing methods. Finally, in the last part, we extend the work from the third part by applying the presented models to enhance the classification tasks from the second part, which commonly contain a severe class imbalance. We introduce a multi-input architecture to expand models alongside our previously proposed multi-output architecture. We compare three techniques for sampling from deep generative models, defining a transparent and fair large-scale experimental protocol and visual analysis tools. We show that general machine learning detection and visualization techniques can help address the many challenges of the fraud detection domain. In particular, deep generative models can add value to the classification task given the imbalanced nature of the fraudulent class, at the cost of added implementation and time complexity. Promising future applications for deep generative models include missing data imputation and the sharing of synthetic data or data generators that preserve privacy constraints.

Disciplines :

Computer science

Author, co-author :

CAMINO, Ramiro Daniel ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC)

Language :

English

Title :

Machine Learning Techniques for Suspicious Transaction Detection and Analysis

Defense date :

08 October 2020

Number of pages :

136

Institution :

Unilu - University of Luxembourg, Luxembourg, Luxembourg

Degree :

Docteur de l'Université du Luxembourg en Informatique

Promotor :

STATE, Radu

President :

FRANK, Raphaël

Jury member :

AOUADA, Djamila

Fernández Slezak, Diego

Hammerschmidt, Christian

Focus Area :

Computational Sciences

FnR Project :

FNR11614300 - Advanced Market Abuse Detection With Big Data, 2017 (01/03/2017-14/10/2020) - Ramiro Daniel Camino

Available on ORBilu :

since 07 December 2020

Statistics

Number of views

447 (18 by Unilu)

Number of downloads

1522 (38 by Unilu)
