Keywords :
Requirements Traceability; Sentence Transformers (ST); Natural Language Processing (NLP); Machine Learning (ML); The General Data Protection Regulation (GDPR); Regulatory Compliance; Large Language Models (LLMs); RICE; Prompting Framework
Abstract :
[en] New regulations are continually introduced to ensure that software development complies with ethical concerns and prioritizes public safety. A prerequisite for demonstrating compliance is tracing software requirements to legal provisions. Requirements traceability is a fundamental task in which requirements engineers must analyze technical requirements against target artifacts, often under a limited time budget. Doing this analysis manually for complex systems with hundreds of requirements is infeasible, and the legal dimension introduces additional challenges that further increase manual effort. In this paper, we investigate two automated solutions based on language models, including large language models (LLMs). The first solution, Kashif, is a classifier that leverages sentence transformers and semantic similarity. The second solution, Rice_LRT, prompts a recent LLM based on Rice, a prompt engineering framework. Using a publicly available benchmark dataset, we empirically evaluate Kashif and compare it against seven baseline classifiers from the literature (including LSI, LDA, GloVe, TraceBERT, RoBERTa, and LLaMa). Kashif can identify trace links with an F2 score of ~63%, outperforming the best baseline by a substantial margin of 21 percentage points (pp) in F2 score. On a newly created and more complex requirements document traced to the European General Data Protection Regulation (GDPR), Rice_LRT outperforms Kashif and baseline prompts from the literature, achieving an average recall of 84% and an F2 score of 61%, which improves the F2 score by 34 pp compared to the best baseline prompt. Our results indicate that requirements traceability in legal contexts cannot be adequately addressed by techniques from the literature that are not specifically designed for legal artifacts. Furthermore, we demonstrate that our engineered prompt outperforms both classifier-based approaches and baseline prompts.
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV - Software Verification and Validation
Disciplines :
Computer science
Author, co-author :
Etezadi, Romina
ABUALHAIJA, Sallam ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV
Arora, Chetan
Briand, Lionel
External co-authors :
yes
Language :
English
Title :
Classifier or prompt: A case study on legal requirements traceability