Article (Scientific journals)
A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
Ilyas Azeem, Muhammad; ABUALHAIJA, Sallam
2024, in Empirical Software Engineering, 29 (96)
Peer Reviewed verified by ORBi
 

Files


Full Text
2024-EMSE-AA.pdf
Author postprint (6.14 MB)

Details



Keywords :
Requirements Engineering (RE); The General Data Protection Regulation (GDPR); Regulatory Compliance; Data Processing Agreements (DPAs); Artificial Intelligence (AI); Natural Language Processing (NLP); Classification; Large Language Models (LLMs); Few-shot Learning (FSL); Data Augmentation
Abstract :
[en] Specifying legal requirements for software systems to ensure their compliance with the applicable regulations is a major concern of requirements engineering. Personal data collected by an organization is often shared with other organizations to perform certain processing activities. In such cases, the General Data Protection Regulation (GDPR) requires issuing a data processing agreement (DPA), which regulates the processing and further ensures that personal data remains protected. Violating GDPR can lead to huge fines reaching billions of euros. Software systems involving personal data processing must adhere to the legal obligations stipulated at a general level in GDPR as well as the business-specific obligations outlined in DPAs. In other words, a DPA is yet another source from which requirements engineers can elicit legal requirements. However, the DPA must be complete according to GDPR to ensure that the elicited requirements cover the complete set of obligations. Therefore, checking the completeness of DPAs is a prerequisite step towards developing a compliant system. Analyzing DPAs with respect to GDPR entirely manually is time-consuming and requires adequate legal expertise. In this paper, we propose an automation strategy that addresses the completeness checking of DPAs against GDPR provisions as a text classification problem. Specifically, we pursue ten alternative solutions enabled by different technologies, namely traditional machine learning, deep learning, language modeling, and few-shot learning. The goal of our work is to empirically examine how these different technologies fare in the legal domain. We computed the F2 score on a set of 30 real DPAs. Our evaluation shows that the best-performing solutions, which yield F2 scores of 86.7% and 89.7%, are based on pre-trained BERT and RoBERTa language models. Our analysis further shows that other alternative solutions based on deep learning (e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable accuracy, yet are more efficient to develop.
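For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how an F2 score can be computed from classification counts. It uses the standard F-beta definition with beta = 2, which weights recall more heavily than precision. The function name `f_beta` and the example counts are hypothetical illustrations; they are not taken from the paper or its dataset.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from true positives, false positives, false negatives.

    With beta=2 (the F2 score), recall counts more than precision,
    so a missed item (false negative) costs more than a false alarm.
    """
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts for illustration only (not results from the paper).
f2 = f_beta(tp=8, fp=4, fn=2)
f1 = f_beta(tp=8, fp=4, fn=2, beta=1.0)
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

In this illustrative case recall (0.80) is higher than precision (about 0.67), so the F2 score comes out above the F1 score. A recall-weighted metric of this kind is a natural fit for completeness checking if overlooking a missing provision is considered costlier than raising a false alarm.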
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV - Software Verification and Validation
NCER-FT - FinTech National Centre of Excellence in Research
Disciplines :
Computer science
Author, co-author :
Ilyas Azeem, Muhammad; Unilu - University of Luxembourg [LU] > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
ABUALHAIJA, Sallam; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
External co-authors :
yes
Language :
English
Title :
A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs
Publication date :
14 June 2024
Journal title :
Empirical Software Engineering
ISSN :
1382-3256
eISSN :
1573-7616
Publisher :
Kluwer Academic Publishers, Netherlands
Volume :
29
Issue :
96
Peer reviewed :
Peer Reviewed verified by ORBi
FnR Project :
FNR16570468 - 2021 (01/07/2022-30/06/2030) - Yves Le Traon
Name of the research project :
U-AGR-7501 - NCER22/IS/16570468/NCER-FT_AFRICA_UL - BIANCULLI Domenico
Funders :
FNR - Fonds National de la Recherche
Funding number :
NCER22/IS/16570468/NCERFT; BRIDGES/19/IS/13759068/ARTAGO
Available on ORBilu :
since 27 November 2023
