Torre, Damiano. In Software and Systems Modeling (in press)

In Europe and indeed worldwide, the General Data Protection Regulation (GDPR) provides protection to individuals regarding their personal data in the face of new technological developments. The GDPR is widely viewed as the benchmark for data protection and privacy regulations that harmonizes data privacy laws across Europe. Although the GDPR is highly beneficial to individuals, it presents significant challenges for organizations monitoring or storing personal information. Since there is currently no automated solution with broad industrial applicability, organizations have no choice but to carry out expensive manual audits to ensure GDPR compliance. In this paper, we present a complete GDPR UML model as a first step towards designing automated methods for checking GDPR compliance. Given that the practical application of the GDPR is influenced by national laws of the EU Member States, we suggest a two-tiered description of the GDPR: generic and specialized. In this paper, we provide (1) the GDPR conceptual model we developed with complete traceability from its classes to the GDPR, (2) a glossary to help understand the model, (3) the plain-English description of 35 compliance rules derived from the GDPR along with their encoding in OCL, and (4) the set of 20 variation points derived from the GDPR to specialize the generic model. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research.

Ezzini, Saad. In Proceedings of the 45th International Conference on Software Engineering (ICSE'23), Melbourne, 14-20 May 2023 (in press)

By virtue of being prevalently written in natural language (NL), requirements are prone to various defects, e.g., inconsistency and incompleteness. As such, requirements are frequently subject to quality assurance processes. These processes, when carried out entirely manually, are tedious and may further overlook important quality issues due to time and budget pressures. In this paper, we propose QAssist, a question-answering (QA) approach that provides automated assistance to stakeholders, including requirements engineers, during the analysis of NL requirements. Posing a question and getting an instant answer is beneficial in various quality-assurance scenarios, e.g., incompleteness detection. Answering requirements-related questions automatically is challenging since the scope of the search for answers can go beyond the given requirements specification. To that end, QAssist provides support for mining external domain-knowledge resources. Our work is one of the first initiatives to bring together QA and external domain knowledge for addressing requirements engineering challenges. We evaluate QAssist on a dataset covering three application domains and containing a total of 387 question-answer pairs. We experiment with state-of-the-art QA methods, based primarily on recent large-scale language models. In our empirical study, QAssist localizes the answer to a question to three passages within the requirements specification and within the external domain-knowledge resource with an average recall of 90.1% and 96.5%, respectively. QAssist extracts the actual answer to the posed question with an average accuracy of 84.2%. Index Terms: Natural-language Requirements, Question Answering (QA), Language Models, Natural Language Processing (NLP), Natural Language Generation (NLG), BERT, T5.
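As a rough illustration of the retrieval step the QAssist abstract describes (localizing candidate passages before answer extraction), the sketch below ranks specification passages against a question by TF-IDF similarity and keeps the top three. It is a minimal stand-in, not the authors' pipeline, which relies on large-scale language models; all passages and the question are invented.

```python
# Minimal passage-retrieval sketch in the spirit of QAssist's first step:
# rank specification passages against a question, keep the top three.
# Illustrative only; the paper's approach uses large-scale language models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def top_passages(question: str, passages: list[str], k: int = 3) -> list[str]:
    matrix = TfidfVectorizer(stop_words="english").fit_transform(passages + [question])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [passages[i] for i in scores.argsort()[::-1][:k]]

passages = [
    "The system shall encrypt personal data at rest.",
    "Operators may export monthly usage reports.",
    "Encryption keys shall be rotated every 90 days.",
]
print(top_passages("How often are encryption keys rotated?", passages))
```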
Amaral Cejas, Orlando. In IEEE Transactions on Software Engineering (2021)

Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of the GDPR. Incomplete privacy policies might result in large fines on violating organizations as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of the GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check it against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach correctly detected 300 of the total of 334 violations of the completeness criteria, while producing 23 false positives. The approach thus has a precision of 92.9% and a recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.
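The reported precision and recall follow directly from the counts given in the abstract (300 correctly detected violations, 23 false positives, 334 violations in total), using the standard definitions:

```latex
\text{precision} = \frac{TP}{TP + FP} = \frac{300}{300 + 23} \approx 92.9\%,
\qquad
\text{recall} = \frac{TP}{TP + FN} = \frac{300}{334} \approx 89.8\%
```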
Amaral Cejas, Orlando. In 2020 IEEE Eleventh International Model-Driven Requirements Engineering (MoDRE) (2021, September)

The General Data Protection Regulation (GDPR) has been recently introduced to harmonize the different data privacy laws across Europe. Whether inside the EU or outside, organizations have to comply with the GDPR as long as they handle the personal data of EU residents. The organizations with whom personal data is shared are referred to as data controllers. When controllers subcontract certain services that involve processing personal data to service providers (also known as data processors), a data processing agreement (DPA) has to be issued. This agreement regulates the relationship between the controllers and processors and also ensures the protection of individuals' personal data. Compliance with the GDPR is challenging for organizations since it is large and relies on complex legal concepts. In this paper, we draw on model-driven engineering to build a machine-analyzable conceptual model that characterizes DPA-related requirements in the GDPR. Further, we create a set of criteria for checking the compliance of a given DPA against the GDPR and discuss how our work in this paper can be adapted to develop an automated compliance checking solution.

Veizaga Campero, Alvaro Mario. In Empirical Software Engineering (2021), 26(4), 79

[Context] Natural language (NL) is pervasive in software requirements specifications (SRSs). However, despite its popularity and widespread use, NL is highly prone to quality issues such as vagueness, ambiguity, and incompleteness. Controlled natural languages (CNLs) have been proposed as a way to prevent quality problems in requirements documents, while maintaining the flexibility to write and communicate requirements in an intuitive and universally understood manner. [Objective] In collaboration with an industrial partner from the financial domain, we systematically develop and evaluate a CNL, named Rimay, intended to help analysts write functional requirements. [Method] We rely on Grounded Theory for building Rimay and follow well-known guidelines for conducting and reporting industrial case study research. [Results] Our main contributions are: (1) a qualitative methodology to systematically define a CNL for functional requirements; this methodology is intended to be general for use across information-system domains; (2) a CNL grammar to represent functional requirements; this grammar is derived from our experience in the financial domain, but should be applicable, possibly with adaptations, to other information-system domains; and (3) an empirical evaluation of our CNL (Rimay) through an industrial case study. Our contributions draw on 15 representative SRSs, collectively containing 3215 NL requirements statements from the financial domain. [Conclusion] Our evaluation shows that Rimay is expressive enough to capture, on average, 88% (405 out of 460) of the NL requirements statements in four previously unseen SRSs from the financial domain.
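A controlled natural language such as Rimay constrains requirements to a fixed grammar so that conformance can be checked mechanically. The toy pattern below conveys the idea with a single, hypothetical "actor shall action" template; the actual Rimay grammar is far richer and is not reproduced here.

```python
# Toy CNL conformance check: one simplified "<actor> shall <action>." template.
# The pattern is hypothetical and much narrower than the real Rimay grammar.
import re

TEMPLATE = re.compile(r"^The \w+(?: \w+)? shall \w+.*\.$")

def conforms(requirement: str) -> bool:
    """Return True if the statement matches the simplified template."""
    return TEMPLATE.match(requirement) is not None

print(conforms("The trading system shall validate incoming orders."))  # True
print(conforms("Orders get validated somehow."))                       # False
```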
Ezzini, Saad. In Companion Proceedings of the 43rd International Conference on Software Engineering (2021, May)

MAANA (Arabic for "meaning") is a tool for performing domain-specific handling of ambiguity in requirements. Given a requirements document as input, MAANA detects the requirements that are potentially ambiguous. The focus of MAANA is on coordination ambiguity and prepositional-phrase attachment ambiguity; these are two common ambiguity types that have been studied in the requirements engineering literature. To detect ambiguity, MAANA utilizes structural patterns and a set of heuristics derived from a domain-specific corpus. The analysis file generated after running the tool can be reviewed by requirements analysts. By combining different knowledge sources, MAANA also highlights the requirements that might contain unacknowledged ambiguity, that is, cases where analysts arrive at different interpretations of the same requirement without explicitly discussing it with one another due to time constraints. This artifact paper presents the details of MAANA. MAANA is associated with the ICSE 2021 technical paper titled "Using Domain-specific Corpora for Improved Handling of Ambiguity in Requirements". The tool is publicly available on GitHub and Zenodo.

Ezzini, Saad. In Proceedings of the 43rd International Conference on Software Engineering (ICSE'21), Madrid, 25-28 May 2021 (2021, May)

Ambiguity in natural-language requirements is a pervasive issue that has been studied by the requirements engineering community for more than two decades. A fully manual approach for addressing ambiguity in requirements is tedious and time-consuming, and may further overlook unacknowledged ambiguity – the situation where different stakeholders perceive a requirement as unambiguous but, in reality, interpret the requirement differently. In this paper, we propose an automated approach that uses natural language processing for handling ambiguity in requirements. Our approach is based on the automatic generation of a domain-specific corpus from Wikipedia. Integrating domain knowledge, as we show in our evaluation, leads to a significant positive improvement in the accuracy of ambiguity detection and interpretation. We scope our work to coordination ambiguity (CA) and prepositional-phrase attachment ambiguity (PAA) because of the prevalence of these types of ambiguity in natural-language requirements [1]. We evaluate our approach on 20 industrial requirements documents. These documents collectively contain more than 5000 requirements from seven distinct application domains. Over this dataset, our approach detects CA and PAA with an average precision of 80% and an average recall of 89% (90% for cases of unacknowledged ambiguity). The automatic interpretations that our approach yields have an average accuracy of 85%. Compared to baselines that use generic corpora, our approach, which uses domain-specific corpora, has 33% better accuracy in ambiguity detection and 16% better accuracy in interpretation.
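The corpus-based heuristics mentioned in the two abstracts above can be conveyed with a small example: for a coordination such as "safety valves and sensors", evidence that the modifier also occurs with the second conjunct in a domain corpus supports the distributed reading. The function and the corpus below are invented for illustration and are not MAANA's actual rules.

```python
# Hypothetical corpus-evidence check for coordination ambiguity
# ("safety valves and sensors": does "safety" also modify "sensors"?).
def distributed_reading_support(corpus: str, modifier: str, conjunct: str) -> int:
    """Count attestations of '<modifier> <conjunct>' in the domain corpus."""
    return corpus.lower().count(f"{modifier.lower()} {conjunct.lower()}")

domain_corpus = (
    "Each safety valve is inspected monthly. "
    "A safety sensor raises an alarm on overpressure. "
    "The pressure sensor feeds the controller."
)
support = distributed_reading_support(domain_corpus, "safety", "sensor")
print("distributed reading supported" if support else "flag as potentially ambiguous")
```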
Sleimi, Amin. In Empirical Software Engineering (2021), 26(3), 43

Semantic legal metadata provides information that helps with understanding and interpreting legal provisions. Such metadata is therefore important for the systematic analysis of legal requirements. However, manually enhancing a large legal corpus with semantic metadata is prohibitively expensive. Our work is motivated by two observations: (1) the existing requirements engineering (RE) literature does not provide a harmonized view on the semantic metadata types that are useful for legal requirements analysis; (2) automated support for the extraction of semantic legal metadata is scarce, and it does not exploit the full potential of artificial intelligence technologies, notably natural language processing (NLP) and machine learning (ML). Our objective is to take steps toward overcoming these limitations. To do so, we review and reconcile the semantic legal metadata types proposed in the RE literature. Subsequently, we devise an automated extraction approach for the identified metadata types using NLP and ML. We evaluate our approach through two case studies over the Luxembourgish legislation. Our results indicate a high accuracy in the generation of metadata annotations. In particular, in the two case studies, we were able to obtain precision scores of 97.2% and 82.4%, and recall scores of 94.9% and 92.4%.

Shin, Seung Yeob. In Journal of Systems and Software (2021)

Hardware-in-the-loop (HiL) testing is important for developing cyber-physical systems (CPS). HiL test cases manipulate hardware, are time-consuming, and their behaviors are impacted by the uncertainties in the CPS environment. To mitigate the risks associated with HiL testing, engineers have to ensure that (1) test cases are well-behaved, e.g., they do not damage hardware, and (2) test cases can execute within a time budget. Leveraging the UML profile mechanism, we develop a domain-specific language, HITECS, for HiL test case specification. Using HITECS, we provide uncertainty-aware analysis methods to check the well-behavedness of HiL test cases. In addition, we provide a method to estimate the execution times of HiL test cases before the actual HiL testing. We apply HITECS to an industrial case study from the satellite domain. Our results show that: (1) HITECS helps engineers define more effective assertions to check HiL test cases, compared to assertions defined without any systematic guidance; (2) HITECS verifies in practical time that HiL test cases are well-behaved; (3) HITECS is able to resolve uncertain parameters of HiL test cases by synthesizing conditions under which test cases are guaranteed to be well-behaved; and (4) HITECS accurately estimates HiL test case execution times.
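One concern HITECS addresses, checking that a HiL test case fits its time budget before touching hardware, can be pictured with a deliberately simple worst-case sum. Step names, durations, and the budget below are all invented; the paper's estimation is model-based and handles uncertainty explicitly.

```python
# Toy budget check for a HiL test schedule: sum assumed worst-case step
# durations and compare against the budget. Purely illustrative.
from dataclasses import dataclass

@dataclass
class TestStep:
    name: str
    worst_case_seconds: float  # assumed upper bound, e.g. from past runs

def within_budget(steps: list[TestStep], budget_seconds: float) -> bool:
    return sum(s.worst_case_seconds for s in steps) <= budget_seconds

steps = [
    TestStep("power_on_simulator", 12.0),
    TestStep("slew_antenna", 45.5),
    TestStep("verify_telemetry", 8.0),
]
print(within_budget(steps, budget_seconds=120.0))  # True
```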
Veizaga Campero, Alvaro Mario. In Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS) (2020, October)

In many software and systems development projects, analysts specify requirements using a combination of modeling and natural language (NL). In such situations, systematic acceptance testing poses a challenge because defining the acceptance criteria (AC) to be met by the system under test has to account not only for the information in the (requirements) model but also for that in the NL requirements. In other words, neither models nor NL requirements per se provide a complete picture of the information content relevant to AC. Our work in this paper is prompted by the observation that a reconciliation of the information content in NL requirements and models is necessary for obtaining precise AC. We perform such reconciliation by devising an approach that automatically extracts AC-related information from NL requirements and helps modelers enrich their model with the extracted information. An existing AC derivation technique is then applied to the model that has been enriched with the information extracted from the NL requirements. Using a real case study from the financial domain, we evaluate the usefulness of the AC-related model enrichments recommended by our approach. Our evaluation results are very promising: over our case study system, a group of five domain experts found 89% of the recommended enrichments relevant to AC and yet absent from the original model (precision of 89%). Furthermore, the experts could not pinpoint any additional information in the NL requirements that was relevant to AC but had not already been brought to their attention by our approach (recall of 100%).
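The extraction-and-enrichment idea in the abstract above can be hinted at by flagging requirement terms that match no element of the current model. The tokenization below is deliberately naive and every name is invented; the paper's extraction is considerably more sophisticated.

```python
# Naive sketch: surface terms from NL requirements that match no model element,
# as candidate (AC-related) enrichments for the analyst to review.
import re

def candidate_enrichments(requirements: list[str], model_elements: set[str]) -> set[str]:
    known = {e.lower() for e in model_elements}
    terms = set()
    for req in requirements:
        terms.update(t.lower() for t in re.findall(r"[A-Za-z]+", req))
    return {t for t in terms if t not in known and len(t) > 3}

reqs = ["The custodian bank shall reconcile settlement instructions daily."]
model = {"Bank", "Instruction"}
print(sorted(candidate_enrichments(reqs, model)))
# Candidates include domain terms like 'custodian' and 'settlement' (plus noise).
```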
Torre, Damiano. In Proceedings of the 28th IEEE International Requirements Engineering Conference (RE'20) (2020, September)

Shin, Seung Yeob. In Proceedings of the 15th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS'20) (2020, May)

The concept of the Internet of Things (IoT) has led to the development of many complex and critical systems such as smart emergency management systems. IoT-enabled applications typically depend on a communication network for transmitting large volumes of data in unpredictable and changing environments. These networks are prone to congestion when there is a burst in demand, e.g., as an emergency situation is unfolding, and therefore rely on configurable software-defined networks (SDNs). In this paper, we propose a dynamic adaptive SDN configuration approach for IoT systems. The approach enables resolving congestion in real time while minimizing network utilization, data transmission delays, and adaptation costs. Our approach builds on existing work in dynamic adaptive search-based software engineering (SBSE) to reconfigure an SDN while simultaneously ensuring multiple quality-of-service criteria. We evaluate our approach on an industrial national emergency management system, which is aimed at detecting disasters and emergencies, and facilitating recovery and rescue operations by providing first responders with a reliable communication infrastructure. Our results indicate that (1) our approach is able to efficiently and effectively adapt an SDN to dynamically resolve congestion, and (2) compared to two baseline data-forwarding algorithms that are static and non-adaptive, our approach increases the data transmission rate by a factor of at least 3 and decreases data loss by at least 70%.

Sabetzadeh, Mehrdad. In ACM Transactions on Software Engineering and Methodology (2020), 29(2), 11:1-11:48

The ability to generate test data is often a necessary prerequisite for automated software testing. For the generated data to be fit for its intended purpose, the data usually has to satisfy various logical constraints. When testing is performed at a system level, these constraints tend to be complex and are typically captured in expressive formalisms based on first-order logic. Motivated by improving the feasibility and scalability of data generation for system testing, we present a novel approach whereby we employ a combination of metaheuristic search and Satisfiability Modulo Theories (SMT) for constraint solving. Our approach delegates constraint solving tasks to metaheuristic search and SMT in such a way as to take advantage of the complementary strengths of the two techniques. We ground our work on test data models specified in UML, with OCL used as the constraint language. We present tool support and an evaluation of our approach over three industrial case studies. The results indicate that, for complex system test data generation problems, our approach presents substantial benefits over the state of the art in terms of applicability and scalability.
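The division of labour in the test-data generation paper, metaheuristic search working alongside an SMT solver, can be loosely illustrated with Z3's Python bindings. Random restarts stand in for the metaheuristic and the constraint is made up; this is not the paper's algorithm or its UML/OCL setting.

```python
# Loose illustration of hybrid constraint solving: a trivial "metaheuristic"
# (random restarts) picks a region, Z3 discharges the arithmetic constraint.
# Requires the z3-solver package; the constraint itself is invented.
import random
from z3 import Int, Solver, sat

def solve_hybrid(iterations=100):
    x = Int("x")
    for _ in range(iterations):
        seed = random.randint(0, 1000)  # search move: pick a lower bound
        s = Solver()
        s.add(x > seed, x * x < 500_000, x % 7 == 0)
        if s.check() == sat:
            return seed, s.model()[x].as_long()
    return None

print(solve_hybrid())
```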
Sleimi, Amin. In Proceedings of the 28th IEEE International Requirements Engineering Conference (RE'20) (2020)

[Context] In legal requirements elicitation, requirements analysts need to extract obligations from legal texts. However, legal texts often express obligations only indirectly, for example, by attributing a right to the counterpart. This phenomenon has already been described in the requirements engineering (RE) literature. [Objectives] We investigate the use of requirements templates for the systematic elicitation of legal requirements. Our work is motivated by two observations: (1) the existing literature does not provide a harmonized view on the requirements templates that are useful for legal RE; (2) despite the promising recent advancements in natural language processing (NLP), automated support for legal RE through the suggestion of requirements templates has not yet been achieved. Our objective is to take steps toward addressing these limitations. [Methods] We review and reconcile the legal requirement templates proposed in RE. Subsequently, we conduct a qualitative study to define NLP rules for template recommendation. [Results and Conclusions] Our contributions consist of (a) a harmonized list of requirements templates pertinent to legal RE, and (b) rules for the automatic recommendation of such templates. We evaluate our rules through a case study on 400 statements from two legal domains. The results indicate a recall and precision of 82.3% and 79.8%, respectively. We show that introducing some limited interaction with the analyst considerably improves accuracy. Specifically, our human-feedback strategy increases recall by 12% and precision by 10.8%, thus yielding an overall recall of 94.3% and an overall precision of 90.6%.

Abualhaija, Sallam. In Empirical Software Engineering (2020)

A simple but important task during the analysis of a textual requirements specification is to determine which statements in the specification represent requirements. In principle, by following suitable writing and markup conventions, one can provide an immediate and unequivocal demarcation of requirements at the time a specification is being developed. However, neither the presence nor a fully accurate enforcement of such conventions is guaranteed. The result is that, in many practical situations, analysts end up resorting to after-the-fact reviews for sifting requirements from other material in a requirements specification. This is both tedious and time-consuming. We propose an automated approach for demarcating requirements in free-form requirements specifications. The approach, which is based on machine learning, can be applied to a wide variety of specifications in different domains and with different writing styles. We train and evaluate our approach over an independently labeled dataset comprised of 33 industrial requirements specifications. Over this dataset, our approach yields an average precision of 81.2% and an average recall of 95.7%. Compared to simple baselines that demarcate requirements based on the presence of modal verbs and identifiers, our approach leads to an average gain of 16.4% in precision and 25.5% in recall. We collect and analyze expert feedback on the demarcations produced by our approach for industrial requirements specifications. The results indicate that experts find our approach useful and efficient in practice. We developed a prototype tool, named DemaRQ, in support of our approach. To facilitate replication, we make available to the research community this prototype tool alongside the non-proprietary portion of our training data.
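The baseline the demarcation paper compares against, spotting requirements via modal verbs and identifiers, takes only a few lines; the identifier scheme and keyword list below are our own guesses, not the paper's exact baseline.

```python
# Keyword/identifier baseline for requirements demarcation (illustrative).
import re

MODALS = re.compile(r"\b(shall|must|will|should)\b", re.IGNORECASE)
REQ_ID = re.compile(r"^\s*(REQ|SRS)-\d+", re.IGNORECASE)  # hypothetical scheme

def looks_like_requirement(statement: str) -> bool:
    return bool(MODALS.search(statement) or REQ_ID.match(statement))

print(looks_like_requirement("REQ-104 The pump shall start within 2 s."))  # True
print(looks_like_requirement("This chapter gives background."))            # False
```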
Bettaieb, Seifeddine. In Empirical Software Engineering (2020), 25(4), 2550-2582

In many domains such as healthcare and banking, IT systems need to fulfill various requirements related to security. The elaboration of security requirements for a given system is in part guided by the controls envisaged by the applicable security standards and best practices. An important difficulty that analysts have to contend with during security requirements elaboration is sifting through a large number of security controls and determining which ones have a bearing on the security requirements for a given system. This challenge is often exacerbated by the scarce security expertise available in most organizations. [Objective] In this article, we develop automated decision support for the identification of security controls that are relevant to a specific system in a particular context. [Method and Results] Our approach, which is based on machine learning, leverages historical data from security assessments performed over past systems in order to recommend security controls for a new system. We operationalize and empirically evaluate our approach using real historical data from the banking domain. Our results show that, when one excludes security controls that are rare in the historical data, our approach has an average recall of ≈ 94% and an average precision of ≈ 63%. We further examine through a survey the perceptions of security analysts about the usefulness of the classification models derived from historical data. [Conclusions] The high recall – indicating that only a few relevant security controls are missed – combined with the reasonable level of precision – indicating that the effort required to confirm recommendations is not excessive – suggests that our approach is a useful aid to analysts for more efficiently identifying the relevant security controls, and also for decreasing the likelihood that important controls would be overlooked. Further, our survey results suggest that the generated classification models help provide a documented and explicit rationale for choosing the applicable security controls.
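A scikit-learn miniature of the paper's idea, one classifier per security control trained on features of past systems, is sketched below. The three features, the single control, and all data points are invented; the paper's historical assessment data is far richer.

```python
# Per-control recommendation sketch: train on past assessments, score a new
# system. Features, labels, and the control name are invented for illustration.
from sklearn.ensemble import RandomForestClassifier

# Columns: [internet_facing, stores_pii, external_vendor]
X = [[1, 1, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 1]]
y = [1, 0, 1, 1, 0]  # whether "encryption at rest" was applied in the past

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

new_system = [[1, 1, 0]]  # internet-facing and stores PII
confidence = model.predict_proba(new_system)[0][1]
if confidence > 0.5:
    print(f"recommend control 'encryption at rest' (confidence {confidence:.2f})")
```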
Torre, Damiano. In Proceedings of the IEEE/ACM 22nd International Conference on Model Driven Engineering Languages and Systems (MODELS'19) (2019, September)

The General Data Protection Regulation (GDPR) harmonizes data privacy laws and regulations across Europe. Through the GDPR, individuals are able to better control their personal data in the face of new technological developments. While the GDPR is highly advantageous to citizens, complying with it poses major challenges for organizations that control or process personal data. Since no automated solution with broad industrial applicability currently exists for GDPR compliance checking, organizations have no choice but to perform costly manual audits to ensure compliance. In this paper, we share our experience building a UML representation of the GDPR as a first step towards the development of future automated methods for assessing compliance with the GDPR. Given that a concrete implementation of the GDPR is affected by the national laws of the EU member states, the GDPR's expanding body of case law, and other contextual information, we propose a two-tiered representation of the GDPR: a generic tier and a specialized tier. The generic tier captures the concepts and principles of the GDPR that apply to all contexts, whereas the specialized tier describes a specific tailoring of the generic tier to a given context, including the contextual variations that may impact the interpretation and application of the GDPR. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research.

Alferez, Mauricio. In Proceedings of the 22nd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS) (2019, September)

Acceptance criteria (AC) are implementation-agnostic conditions that a system must meet to be consistent with its requirements and be accepted by its stakeholders. Each acceptance criterion is typically expressed as a natural-language statement with a clear pass or fail outcome. Writing AC is a tedious and error-prone activity, especially when the requirements specifications evolve and there are different analysts and testing teams involved. Analysts and testers must iterate multiple times to ensure that AC are understandable and feasible, and accurately address the most important requirements and workflows of the system being developed. In many cases, analysts express requirements through models, along with natural language, typically in some variant of the UML. AC must then be derived by developers and testers from such models. In this paper, we bridge the gap between requirements models and AC by providing a UML-based modeling methodology and an automated solution to generate AC. We target AC in the form of behavioral specifications in the context of Behavioral-Driven Development (BDD), a widely used agile practice in many application domains. More specifically, we target the well-known Gherkin language to express AC, which can then be used to generate executable test cases. We evaluate our modeling methodology and AC generation solution through an industrial case study in the financial domain. Our results suggest that (1) our methodology is feasible to apply in practice, and (2) the additional modeling effort required by our methodology is outweighed by the benefits the methodology brings in terms of automated and systematic AC generation and improved model precision.
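Gherkin acceptance criteria like those the paper generates follow a fixed Given/When/Then shape, so a minimal renderer makes the target format concrete. The dictionary schema and the banking scenario are invented; the paper derives this information from enriched UML models rather than hand-written dictionaries.

```python
# Minimal Gherkin renderer: turn a small scenario description into a
# Given/When/Then block. The input schema is invented for illustration.
def to_gherkin(scenario: dict) -> str:
    lines = [f"Scenario: {scenario['name']}"]
    lines += [f"  Given {g}" for g in scenario["preconditions"]]
    lines += [f"  When {scenario['trigger']}"]
    lines += [f"  Then {t}" for t in scenario["postconditions"]]
    return "\n".join(lines)

print(to_gherkin({
    "name": "Reject transfer above daily limit",
    "preconditions": ["the customer's daily limit is 10000 EUR"],
    "trigger": "the customer submits a transfer of 12000 EUR",
    "postconditions": ["the transfer is rejected",
                       "the customer is notified of the limit"],
}))
```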
Arora, Chetan. In Empirical Software Engineering (2019), 24(4), 2509-2539

[Context] Domain modeling is a common strategy for mitigating incompleteness in requirements. While the benefits of domain models for checking the completeness of requirements are anecdotally known, these benefits have never been evaluated systematically. [Objective] We empirically examine the potential usefulness of domain models for detecting incompleteness in natural-language requirements. We focus on requirements written as "shall"-style statements and domain models captured using UML class diagrams. [Methods] Through a randomized simulation process, we analyze the sensitivity of domain models to omissions in requirements. Sensitivity is a measure of whether a domain model contains information that can lead to the discovery of requirements omissions. Our empirical research method is case study research in an industrial setting. [Results and Conclusions] We have experts construct domain models in three distinct industry domains. We then report on how sensitive the resulting models are to simulated omissions in requirements. We observe that domain models exhibit near-linear sensitivity to both unspecified (i.e., missing) and under-specified requirements (i.e., requirements whose details are incomplete). The level of sensitivity is more than four times higher for unspecified requirements than for under-specified ones. These results provide empirical evidence that domain models provide useful cues for checking the completeness of natural-language requirements. Further studies remain necessary to ascertain whether analysts are able to effectively exploit these cues for incompleteness detection.

Arora, Chetan. In ACM Transactions on Software Engineering and Methodology (2019), 28(1)

Domain models are a useful vehicle for making the interpretation and elaboration of natural-language requirements more precise. Advances in natural language processing (NLP) have made it possible to automatically extract from requirements most of the information that is relevant to domain model construction. However, alongside the relevant information, NLP extracts from requirements a significant amount of information that is superfluous, i.e., not relevant to the domain model. Our objective in this article is to develop automated assistance for filtering the superfluous information extracted by NLP during domain model extraction. To this end, we devise an active-learning-based approach that iteratively learns from analysts' feedback over the relevance and superfluousness of the extracted domain model elements, and uses this feedback to provide recommendations for filtering superfluous elements. We empirically evaluate our approach over three industrial case studies. Our results indicate that, once trained, our approach automatically detects an average of ≈ 45% of the superfluous elements with a precision of ≈ 96%. Since precision is very high, the automatic recommendations made by our approach are trustworthy. Consequently, analysts can dispose of a considerable fraction – nearly half – of the superfluous elements with minimal manual work. The results are particularly promising, as they should be considered in light of the non-negligible subjectivity that is inherently tied to the notion of relevance.
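The active-learning loop in the last abstract, iteratively asking the analyst about the elements the classifier is least sure of, can be sketched with uncertainty sampling. Features, labels, and the round count below are synthetic; the paper's feature set for extracted model elements is domain-specific.

```python
# Uncertainty-sampling sketch for filtering superfluous extracted elements.
# Synthetic data stands in for features of NLP-extracted model elements.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 5))                     # element features (synthetic)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)    # 1 = relevant, 0 = superfluous

# Seed the training set with a few labeled examples of each class.
labeled = list(np.flatnonzero(y == 0)[:5]) + list(np.flatnonzero(y == 1)[:5])
pool = [i for i in range(200) if i not in set(labeled)]

for _ in range(5):                           # five rounds of analyst feedback
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    query = int(np.argmin(np.abs(proba - 0.5)))  # element nearest the boundary
    labeled.append(pool.pop(query))          # "ask the analyst"; y is the oracle

flagged = int((clf.predict(X[pool]) == 0).sum())
print(f"trained on {len(labeled)} labels; {flagged} elements flagged as superfluous")
```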