References of "Abualhaija, Sallam 50029496"
Peer Reviewed
Automated Question Answering for Improved Understanding of Compliance Requirements: A Multi-Document Study
Abualhaija, Sallam UL; Arora, Chetan; Sleimi, Amin et al

in Proceedings of the 30th IEEE International Requirements Engineering Conference (RE'22), Melbourne, Australia, 15-19 August 2022 (in press)

Software systems are increasingly subject to regulatory compliance. Extracting compliance requirements from regulations is challenging. Ideally, locating compliance-related information in a regulation requires a joint effort from requirements engineers and legal experts, whose availability is limited. However, regulations are typically long documents spanning hundreds of pages, containing legal jargon, applying complicated natural language structures, and including cross-references, thus making their analysis effort-intensive. In this paper, we propose an automated question-answering (QA) approach that assists requirements engineers in finding the legal text passages relevant to compliance requirements. Our approach utilizes large-scale language models fine-tuned for QA, including BERT and three variants. We evaluate our approach on 107 question-answer pairs, manually curated by subject-matter experts, for four different European regulatory documents. Among these documents is the General Data Protection Regulation (GDPR) – a major source for privacy-related requirements. Our empirical results show that, in ~94% of the cases, our approach finds the text passage containing the answer to a given question among the top five passages that our approach marks as most relevant. Further, our approach successfully demarcates, in the selected passage, the right answer with an average accuracy of ~91%.

Peer Reviewed
Automated Handling of Anaphoric Ambiguity in Requirements: A Multi-solution Study
Ezzini, Saad UL; Abualhaija, Sallam UL; Arora, Chetan et al

in Proceedings of the 44th International Conference on Software Engineering (ICSE'22), Pittsburgh, PA, USA, 22-27 May 2022 (in press)

Ambiguity is a pervasive issue in natural-language requirements. A common source of ambiguity in requirements is when a pronoun is anaphoric. In requirements engineering, anaphoric ambiguity occurs when a pronoun can plausibly refer to different entities and thus be interpreted differently by different readers. In this paper, we develop an accurate and practical automated approach for handling anaphoric ambiguity in requirements, addressing both ambiguity detection and anaphora interpretation. In view of the multiple competing natural language processing (NLP) and machine learning (ML) technologies that one can utilize, we simultaneously pursue six alternative solutions, empirically assessing each using a collection of ~1,350 industrial requirements. The alternative solution strategies that we consider are natural choices induced by the existing technologies; these choices frequently arise in other automation tasks involving natural-language requirements. A side-by-side empirical examination of these choices helps develop insights about the usefulness of different state-of-the-art NLP and ML technologies for addressing requirements engineering problems. For the ambiguity detection task, we observe that supervised ML outperforms both a large-scale language model, SpanBERT (a variant of BERT), and a solution assembled from off-the-shelf NLP coreference resolvers. In contrast, for anaphora interpretation, SpanBERT yields the most accurate solution. In our evaluation, (1) the best solution for anaphoric ambiguity detection has an average precision of ~60% and a recall of 100%, and (2) the best solution for anaphora interpretation (resolution) has an average success rate of ~98%.

Peer Reviewed
AI-enabled Automation for Completeness Checking of Privacy Policies
Amaral Cejas, Orlando UL; Abualhaija, Sallam UL; Torre, Damiano et al

in IEEE Transactions on Software Engineering (2021)

Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of GDPR. Incomplete privacy policies might result in large fines on the violating organization as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check them against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach detected 300 of the total of 334 violations of some completeness criteria correctly, while producing 23 false positives. The approach thus has a precision of 92.9% and recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.
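
The precision and recall reported in this abstract follow directly from the stated counts (300 correctly detected violations, 23 false positives, 334 violations in total). The sketch below only re-derives those two figures using the standard definitions; the function name `precision_recall` is an illustrative assumption and is not taken from the paper or its tooling.

```python
def precision_recall(true_positives, false_positives, total_actual):
    """Standard definitions: precision = TP / (TP + FP), recall = TP / (all actual positives)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / total_actual
    return precision, recall


# Counts quoted in the abstract: 300 of 334 violations detected, 23 false positives.
precision, recall = precision_recall(300, 23, 334)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

Rounded to one decimal place, the two values match the 92.9% and 89.8% stated in the abstract.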

Peer Reviewed
A Model-based Conceptualization of Requirements for Compliance Checking of Data Processing against GDPR
Amaral Cejas, Orlando UL; Abualhaija, Sallam UL; Sabetzadeh, Mehrdad UL et al

in 2020 IEEE Eleventh International Model-Driven Requirements Engineering (MoDRE) (2021, September)

The General Data Protection Regulation (GDPR) has been recently introduced to harmonize the different data privacy laws across Europe. Whether inside the EU or outside, organizations have to comply with the GDPR as long as they handle personal data of EU residents. The organizations with whom personal data is shared are referred to as data controllers. When controllers subcontract certain services that involve processing personal data to service providers (also known as data processors), then a data processing agreement (DPA) has to be issued. This agreement regulates the relationship between the controllers and processors and also ensures the protection of individuals’ personal data. Compliance with the GDPR is challenging for organizations since it is large and relies on complex legal concepts. In this paper, we draw on model-driven engineering to build a machine-analyzable conceptual model that characterizes DPA-related requirements in the GDPR. Further, we create a set of criteria for checking the compliance of a given DPA against the GDPR and discuss how our work in this paper can be adapted to develop an automated compliance checking solution.

Peer Reviewed
Using Domain-specific Corpora for Improved Handling of Ambiguity in Requirements
Ezzini, Saad UL; Abualhaija, Sallam UL; Arora, Chetan et al

in Proceedings of the 43rd International Conference on Software Engineering (ICSE'21), Madrid, 25-28 May 2021 (2021, May)

Ambiguity in natural-language requirements is a pervasive issue that has been studied by the requirements engineering community for more than two decades. A fully manual approach for addressing ambiguity in requirements is tedious and time-consuming, and may further overlook unacknowledged ambiguity – the situation where different stakeholders perceive a requirement as unambiguous but, in reality, interpret the requirement differently. In this paper, we propose an automated approach that uses natural language processing for handling ambiguity in requirements. Our approach is based on the automatic generation of a domain-specific corpus from Wikipedia. Integrating domain knowledge, as we show in our evaluation, leads to a significant positive improvement in the accuracy of ambiguity detection and interpretation. We scope our work to coordination ambiguity (CA) and prepositional-phrase attachment ambiguity (PAA) because of the prevalence of these types of ambiguity in natural-language requirements [1]. We evaluate our approach on 20 industrial requirements documents. These documents collectively contain more than 5000 requirements from seven distinct application domains. Over this dataset, our approach detects CA and PAA with an average precision of 80% and an average recall of 89% (90% for cases of unacknowledged ambiguity). The automatic interpretations that our approach yields have an average accuracy of 85%. Compared to baselines that use generic corpora, our approach, which uses domain-specific corpora, has 33% better accuracy in ambiguity detection and 16% better accuracy in interpretation.

Peer Reviewed
MAANA: An Automated Tool for DoMAin-specific HANdling of Ambiguity
Ezzini, Saad UL; Abualhaija, Sallam UL; Arora, Chetan et al

in Companion Proceedings of the 43rd International Conference on Software Engineering (2021, May)

MAANA (in Arabic: “meaning”) is a tool for performing domain-specific handling of ambiguity in requirements. Given a requirements document as input, MAANA detects the requirements that are potentially ambiguous. The focus of MAANA is on coordination ambiguity and prepositional-phrase attachment ambiguity; these are two common ambiguity types that have been studied in the requirements engineering literature. To detect ambiguity, MAANA utilizes structural patterns and a set of heuristics derived from a domain-specific corpus. The analysis file generated by running the tool can be reviewed by requirements analysts. By combining different knowledge sources, MAANA also highlights the requirements that might contain unacknowledged ambiguity, that is, cases where analysts hold different interpretations of the same requirement without explicitly discussing them with the other analysts due to time constraints. This artifact paper presents the details of MAANA. MAANA is associated with the ICSE 2021 technical paper titled “Using Domain-specific Corpora for Improved Handling of Ambiguity in Requirements”. The tool is publicly available on GitHub and Zenodo.

Peer Reviewed
An AI-assisted Approach for Checking the Completeness of Privacy Policies Against GDPR
Torre, Damiano UL; Abualhaija, Sallam UL; Sabetzadeh, Mehrdad UL et al

in Proceedings of the 28th IEEE International Requirements Engineering Conference (RE’20) (2020, September)

Peer Reviewed
Automated Demarcation of Requirements in Textual Specifications: A Machine Learning-Based Approach
Abualhaija, Sallam UL; Arora, Chetan; Sabetzadeh, Mehrdad UL et al

in Empirical Software Engineering (2020)

A simple but important task during the analysis of a textual requirements specification is to determine which statements in the specification represent requirements. In principle, by following suitable writing and markup conventions, one can provide an immediate and unequivocal demarcation of requirements at the time a specification is being developed. However, neither the presence nor a fully accurate enforcement of such conventions is guaranteed. The result is that, in many practical situations, analysts end up resorting to after-the-fact reviews for sifting requirements from other material in a requirements specification. This is both tedious and time-consuming. We propose an automated approach for demarcating requirements in free-form requirements specifications. The approach, which is based on machine learning, can be applied to a wide variety of specifications in different domains and with different writing styles. We train and evaluate our approach over an independently labeled dataset comprised of 33 industrial requirements specifications. Over this dataset, our approach yields an average precision of 81.2% and an average recall of 95.7%. Compared to simple baselines that demarcate requirements based on the presence of modal verbs and identifiers, our approach leads to an average gain of 16.4% in precision and 25.5% in recall. We collect and analyze expert feedback on the demarcations produced by our approach for industrial requirements specifications. The results indicate that experts find our approach useful and efficient in practice. We developed a prototype tool, named DemaRQ, in support of our approach. To facilitate replication, we make available to the research community this prototype tool alongside the non-proprietary portion of our training data.

Peer Reviewed
A Machine Learning-Based Approach for Demarcating Requirements in Textual Specifications
Abualhaija, Sallam UL; Arora, Chetan UL; Sabetzadeh, Mehrdad UL et al

in 27th IEEE International Requirements Engineering Conference (RE'19) (2019)

A simple but important task during the analysis of a textual requirements specification is to determine which statements in the specification represent requirements. In principle, by following suitable writing and markup conventions, one can provide an immediate and unequivocal demarcation of requirements at the time a specification is being developed. However, neither the presence nor a fully accurate enforcement of such conventions is guaranteed. The result is that, in many practical situations, analysts end up resorting to after-the-fact reviews for sifting requirements from other material in a requirements specification. This is both tedious and time-consuming. We propose an automated approach for demarcating requirements in free-form requirements specifications. The approach, which is based on machine learning, can be applied to a wide variety of specifications in different domains and with different writing styles. We train and evaluate our approach over an independently labeled dataset comprised of 30 industrial requirements specifications. Over this dataset, our approach yields an average precision of 81.2% and an average recall of 95.7%. Compared to simple baselines that demarcate requirements based on the presence of modal verbs and identifiers, our approach leads to an average gain of 16.4% in precision and 25.5% in recall.
