References of "Briand, Lionel"
Full Text
Peer Reviewed
Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques
Khan, Zanis Ali UL; Shin, Donghwan UL; Bianculli, Domenico UL et al

in Proceedings of the 44th International Conference on Software Engineering (ICSE ’22) (in press)

Log message template identification aims to convert raw logs containing free-formed log messages into structured logs to be processed by automated log-based analysis, such as anomaly detection and model inference. While many techniques have been proposed in the literature, only two recent studies provide a comprehensive evaluation and comparison of the techniques using an established benchmark composed of real-world logs. Nevertheless, we argue that both studies have the following issues: (1) they used different accuracy metrics without comparison between them, (2) some ground-truth (oracle) templates are incorrect, and (3) the accuracy evaluation results do not provide any information regarding incorrectly identified templates. In this paper, we address the above issues by providing three guidelines for assessing the accuracy of log template identification techniques: (1) use appropriate accuracy metrics, (2) perform oracle template correction, and (3) perform analysis of incorrect templates. We then assess the application of such guidelines through a comprehensive evaluation of 14 existing template identification techniques on the established benchmark logs. Results show very different insights than existing studies and in particular a much less optimistic outlook on existing techniques.
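To make the task concrete, here is a minimal, hypothetical sketch of template identification (a simple masking heuristic, not any of the techniques evaluated in the paper): raw messages are converted into templates by replacing parameter-like tokens with a placeholder.

```python
import re

def identify_template(message):
    """Mask parameter-like tokens (dotted numbers such as IPs, then plain
    integers) with the placeholder <*> to recover the message template."""
    masked = re.sub(r'\b\d+(?:\.\d+)+\b', '<*>', message)
    return re.sub(r'\b\d+\b', '<*>', masked)

raw_logs = [
    "Connected to 10.251.42.84",
    "Connected to 10.251.203.80",
    "Took 3 seconds to scan 12 files",
]
templates = {identify_template(m) for m in raw_logs}   # 2 distinct templates
```

Accuracy metrics for this task then compare such recovered templates against the oracle templates, which is why incorrect oracles (issue 2 above) directly distort the results.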

Full Text
Peer Reviewed
Reinforcement Learning for Test Case Prioritization
Bagherzadeh, Mojtaba; Kahani, Nafiseh; Briand, Lionel UL

in IEEE Transactions on Software Engineering (in press)

The Continuous Integration (CI) context significantly reduces integration problems, speeds up development time, and shortens release time. However, it also introduces new challenges for quality assurance activities, including regression testing, which is the focus of this work. Though various approaches for test case prioritization have shown to be very promising in the context of regression testing, specific techniques must be designed to deal with the dynamic nature and timing constraints of CI. Recently, Reinforcement Learning (RL) has shown great potential in various challenging scenarios that require continuous adaptation, such as game playing, real-time ads bidding, and recommender systems. Inspired by this line of work and building on initial efforts in supporting test case prioritization with RL techniques, we perform here a comprehensive investigation of RL-based test case prioritization in a CI context. To this end, taking test case prioritization as a ranking problem, we model the sequential interactions between the CI environment and a test case prioritization agent as an RL problem, using three alternative ranking models. We then rely on carefully selected and tailored state-of-the-art RL techniques to automatically and continuously learn a test case prioritization strategy, whose objective is to be as close as possible to the optimal one. Our extensive experimental analysis shows that the best RL solutions provide a significant accuracy improvement over previous RL-based work, with prioritization strategies getting close to being optimal, thus paving the way for using RL to prioritize test cases in a CI context.
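As a toy illustration of value-based prioritization (a drastic simplification of the RL formulations studied in the paper), each test case can keep a running estimate of its failure probability, updated from CI verdicts, with the next cycle running tests in descending order of estimate:

```python
def prioritize(estimates):
    """Order test IDs by current failure-probability estimate, highest first."""
    return sorted(estimates, key=estimates.get, reverse=True)

def update(estimates, verdicts, alpha=0.5):
    """Move each estimate toward the latest verdict (1 = failed, 0 = passed)."""
    for test_id, failed in verdicts.items():
        estimates[test_id] += alpha * (failed - estimates[test_id])

estimates = {"t1": 0.5, "t2": 0.5, "t3": 0.5}
update(estimates, {"t1": 0, "t2": 1, "t3": 0})   # t2 failed in this CI cycle
order = prioritize(estimates)                    # t2 now scheduled first
```

The test names, learning rate, and update rule are invented for illustration; the paper's agents learn from much richer state (e.g., execution history and timing constraints).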

Full Text
Peer Reviewed
Modeling Data Protection and Privacy: Application and Experience with GDPR
Torre, Damiano UL; Alferez, Mauricio UL; Soltana, Ghanem UL et al

in Software and Systems Modeling (in press)

In Europe and indeed worldwide, the General Data Protection Regulation (GDPR) provides protection to individuals regarding their personal data in the face of new technological developments. GDPR is widely viewed as the benchmark for data protection and privacy regulations that harmonizes data privacy laws across Europe. Although the GDPR is highly beneficial to individuals, it presents significant challenges for organizations monitoring or storing personal information. Since there is currently no automated solution with broad industrial applicability, organizations have no choice but to carry out expensive manual audits to ensure GDPR compliance. In this paper, we present a complete GDPR UML model as a first step towards designing automated methods for checking GDPR compliance. Given that the practical application of the GDPR is influenced by national laws of the EU Member States, we suggest a two-tiered description of the GDPR, generic and specialized. In this paper, we provide (1) the GDPR conceptual model we developed with complete traceability from its classes to the GDPR, (2) a glossary to help understand the model, (3) the plain-English description of 35 compliance rules derived from GDPR along with their encoding in OCL, and (4) the set of 20 variation points derived from GDPR to specialize the generic model. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research.
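As a flavor of what a machine-checkable compliance rule looks like, here is a made-up rule (expressed in Python rather than OCL, and not one of the paper's 35 rules): every processing activity that handles personal data must declare a lawful basis.

```python
def check_lawful_basis(activities):
    """Return names of activities that process personal data without a
    declared lawful basis (a violation of this made-up rule)."""
    return [a["name"] for a in activities
            if a["personal_data"] and not a.get("lawful_basis")]

# Invented example records describing processing activities.
activities = [
    {"name": "newsletter", "personal_data": True, "lawful_basis": "consent"},
    {"name": "analytics", "personal_data": True},           # violation
    {"name": "backup", "personal_data": False},
]
violations = check_lawful_basis(activities)
```

Encoding such rules against a conceptual model is what makes automated, traceable compliance checking possible, instead of manual audits.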

Full Text
Peer Reviewed
Combining Genetic Programming and Model Checking to Generate Environment Assumptions
Gaaloul, Khouloud UL; Menghi, Claudio UL; Nejati, Shiva UL et al

in IEEE Transactions on Software Engineering (in press)

Software verification may yield spurious failures when environment assumptions are not accounted for. Environment assumptions are the expectations that a system or a component makes about its operational environment and are often specified in terms of conditions over the inputs of that system or component. In this article, we propose an approach to automatically infer environment assumptions for Cyber-Physical Systems (CPS). Our approach improves the state-of-the-art in three different ways: First, we learn assumptions for complex CPS models involving signal and numeric variables; second, the learned assumptions include arithmetic expressions defined over multiple variables; third, we identify the trade-off between soundness and coverage of environment assumptions and demonstrate the flexibility of our approach in prioritizing either of these criteria. We evaluate our approach using a public domain benchmark of CPS models from Lockheed Martin and a component of a satellite control system from LuxSpace, a satellite system provider. The results show that our approach outperforms state-of-the-art techniques on learning assumptions for CPS models, and further, when applied to our industrial CPS model, our approach is able to learn assumptions that are sufficiently close to the assumptions manually developed by engineers to be of practical value.
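The soundness/coverage trade-off mentioned above can be illustrated with a made-up assumption, an arithmetic condition over two hypothetical inputs: soundness is the fraction of assumption-satisfying inputs that truly pass verification, while coverage is the fraction of all passing inputs that the assumption retains.

```python
def assumption(speed, load):
    """A made-up learned assumption: an arithmetic condition over two inputs
    delimiting the component's operational envelope."""
    return speed + 2 * load <= 100

# (input values, verification passed?) observations; all values are invented.
observations = [(10, 20, True), (50, 10, True), (80, 20, True),
                (30, 10, False), (30, 40, False)]
kept = [passed for s, l, passed in observations if assumption(s, l)]
coverage = len([p for p in kept if p]) / sum(p for _, _, p in observations)
soundness = sum(kept) / len(kept)   # fraction of retained inputs that truly pass
```

A stricter assumption raises soundness at the cost of coverage, and vice versa, which is why the approach lets engineers prioritize either criterion.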

Full Text
Peer Reviewed
Optimal Priority Assignment for Real-Time Systems: A Coevolution-Based Approach
Lee, Jaekwon UL; Shin, Seung Yeob UL; Nejati, Shiva et al

in Empirical Software Engineering (in press)

In real-time systems, priorities assigned to real-time tasks determine the order of task executions, by relying on an underlying task scheduling policy. Assigning optimal priority values to tasks is critical to allow the tasks to complete their executions while maximizing safety margins from their specified deadlines. This enables real-time systems to tolerate unexpected overheads in task executions and still meet their deadlines. In practice, priority assignments result from an interactive process between the development and testing teams. In this article, we propose an automated method that aims to identify the best possible priority assignments in real-time systems, accounting for multiple objectives regarding safety margins and engineering constraints. Our approach is based on a multi-objective, competitive coevolutionary algorithm mimicking the interactive priority assignment process between the development and testing teams. We evaluate our approach by applying it to six industrial systems from different domains and several synthetic systems. The results indicate that our approach significantly outperforms both our baselines, i.e., random search and sequential search, and solutions defined by practitioners. Our approach scales to complex industrial systems as an offline analysis method that attempts to find near-optimal solutions within acceptable time, i.e., less than 16 hours.
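A toy sketch of the optimization objective (ignoring preemption and response-time analysis): for a given priority order, the safety margin is the smallest slack between each task's completion time and its deadline, and the search looks for orders that maximize it. The task parameters below are invented.

```python
def safety_margin(tasks, priority):
    """tasks: {name: (exec_time, deadline)}; priority: task names, highest first.
    Tasks run to completion in priority order; the margin of the assignment is
    the minimum (deadline - completion_time) over all tasks (negative = miss)."""
    elapsed, margin = 0, float("inf")
    for name in priority:
        exec_time, deadline = tasks[name]
        elapsed += exec_time
        margin = min(margin, deadline - elapsed)
    return margin

tasks = {"a": (2, 3), "b": (4, 10), "c": (1, 4)}
good = safety_margin(tasks, ["a", "c", "b"])   # every deadline met
bad = safety_margin(tasks, ["a", "b", "c"])    # "c" misses its deadline
```

Even in this tiny example the priority order changes whether a deadline is met, hinting at why the real, multi-objective search space is hard.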

Full Text
Peer Reviewed
Automated, Cost-effective, and Update-driven App Testing
Ngo, Chanh Duc UL; Pastore, Fabrizio UL; Briand, Lionel UL

in ACM Transactions on Software Engineering and Methodology (in press)

Apps’ pervasive role in our society has led to the definition of test automation approaches to ensure their dependability. However, state-of-the-art approaches tend to generate large numbers of test inputs and are unlikely to achieve more than 50% method coverage. In this paper, we propose a strategy to achieve significantly higher coverage of the code affected by updates with a much smaller number of test inputs, thus alleviating the test oracle problem. More specifically, we present ATUA, a model-based approach that synthesizes App models with static analysis, integrates a dynamically-refined state abstraction function, and combines complementary testing strategies, including (1) coverage of the model structure, (2) coverage of the App code, (3) random exploration, and (4) coverage of dependencies identified through information retrieval. Its model-based strategy enables ATUA to generate a small set of inputs that exercise only the code affected by the updates. In turn, this makes common test oracle solutions more cost-effective, as they tend to involve human effort. A large empirical evaluation, conducted with 72 App versions belonging to nine popular Android Apps, has shown that ATUA is more effective and less effort-intensive than state-of-the-art approaches when testing App updates.

Full Text
Peer Reviewed
A Machine Learning Approach for Automated Filling of Categorical Fields in Data Entry Forms
Belgacem, Hichem UL; Li, Xiaochen; Bianculli, Domenico UL et al

in ACM Transactions on Software Engineering and Methodology (in press)

Users frequently interact with software systems through data entry forms. However, form filling is time-consuming and error-prone. Although several techniques have been proposed to auto-complete or pre-fill fields in the forms, they provide limited support to help users fill categorical fields, i.e., fields that require users to choose the right value among a large set of options. In this paper, we propose LAFF, a learning-based automated approach for filling categorical fields in data entry forms. LAFF first builds Bayesian Network models by learning field dependencies from a set of historical input instances, representing the values of the fields that have been filled in the past. To improve its learning ability, LAFF uses local modeling to effectively mine the local dependencies of fields in a cluster of input instances. During the form filling phase, LAFF uses such models to predict possible values of a target field, based on the values in the already-filled fields of the form and their dependencies; the predicted values (endorsed based on field dependencies and prediction confidence) are then provided to the end-user as a list of suggestions. We evaluated LAFF by assessing its effectiveness and efficiency in form filling on two datasets, one of them proprietary from the banking domain. Experimental results show that LAFF is able to provide accurate suggestions with a Mean Reciprocal Rank value above 0.73. Furthermore, LAFF is efficient, requiring at most 317 ms per suggestion.
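A much-simplified stand-in for the learning step (LAFF itself builds Bayesian Networks over all fields): suggest values for a target categorical field from their conditional frequency given one already-filled field. The form data below are invented.

```python
from collections import Counter, defaultdict

def learn(history, given_field, target_field):
    """Count target-field values conditioned on one already-filled field."""
    table = defaultdict(Counter)
    for record in history:
        table[record[given_field]][record[target_field]] += 1
    return table

def suggest(table, given_value, k=2):
    """Return up to k suggested values, most frequent first."""
    return [value for value, _ in table[given_value].most_common(k)]

history = [
    {"country": "LU", "currency": "EUR"},
    {"country": "LU", "currency": "EUR"},
    {"country": "LU", "currency": "USD"},
    {"country": "US", "currency": "USD"},
]
table = learn(history, "country", "currency")
suggestions = suggest(table, "LU")   # EUR ranked above USD
```

Metrics such as Mean Reciprocal Rank then score how high the correct value appears in such ranked suggestion lists.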

Full Text
Peer Reviewed
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results in the Space Domain
Cornejo Olivares, Oscar Eduardo UL; Pastore, Fabrizio UL; Briand, Lionel UL

in IEEE Transactions on Software Engineering (in press)

On-board embedded software developed for spaceflight systems (space software) must adhere to stringent software quality assurance procedures. For example, verification and validation activities are typically performed and assessed by third party organizations. To further minimize the risk of human mistakes, space agencies, such as the European Space Agency (ESA), are looking for automated solutions for the assessment of software testing activities, which play a crucial role in this context. Though space software is our focus here, it should be noted that such software shares the above considerations, to a large extent, with embedded software in many other types of cyber-physical systems. Over the years, mutation analysis has shown to be a promising solution for the automated assessment of test suites; it consists of measuring the quality of a test suite in terms of the percentage of injected faults leading to a test failure. A number of optimization techniques, addressing scalability and accuracy problems, have been proposed to facilitate the industrial adoption of mutation analysis. However, to date, two major problems prevent space agencies from enforcing mutation analysis in space software development. First, there is uncertainty regarding the feasibility of applying mutation analysis optimization techniques in their context. Second, most of the existing techniques either can break the real-time requirements common in embedded software or cannot be applied when the software is tested in Software Validation Facilities, including CPU emulators and sensor simulators. In this paper, we enhance mutation analysis optimization techniques to enable their applicability to embedded software and propose a pipeline that successfully integrates them to address scalability and accuracy issues in this context, as described above. Further, we report on the largest study involving embedded software systems in the mutation analysis literature. 
Our research is part of a research project funded by ESA ESTEC involving private companies (GomSpace Luxembourg and LuxSpace) in the space sector. These industry partners provided the case studies reported in this paper; they include an on-board software system managing a microsatellite currently on-orbit, a set of libraries used in deployed cubesats, and a mathematical library certified by ESA.
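The core metric of mutation analysis can be illustrated in a few lines: the mutation score is the fraction of injected faults (mutants) killed by the test suite, i.e., causing at least one test failure. The function, mutants, and tests below are invented; note that the third mutant is equivalent to the original and thus can never be killed.

```python
def clamp(x, lo, hi):                            # original implementation
    return max(lo, min(x, hi))

mutants = [
    lambda x, lo, hi: max(lo, min(x, lo)),       # mutant 1: hi -> lo
    lambda x, lo, hi: min(lo, min(x, hi)),       # mutant 2: max -> min
    lambda x, lo, hi: max(lo, min(hi, x)),       # mutant 3: equivalent (swap)
]
tests = [((5, 0, 10), 5), ((-3, 0, 10), 0), ((42, 0, 10), 10)]

def killed(mutant):
    """A mutant is killed if any test observes a deviating output."""
    return any(mutant(*args) != expected for args, expected in tests)

score = sum(killed(m) for m in mutants) / len(mutants)   # 2 of 3 killed
```

At industrial scale, compiling and re-running suites against thousands of such mutants is what makes the scalability and real-time concerns above so pressing.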

Full Text
Peer Reviewed
Test Case Selection and Prioritization Using Machine Learning: A Systematic Literature Review
Pan, Rongqi; Bagherzadeh, Mojtaba; Ghaleb, Taher et al

in Empirical Software Engineering (in press)

Regression testing is an essential activity to assure that software code changes do not adversely affect existing functionalities. With the wide adoption of Continuous Integration (CI) in software projects, which increases the frequency of running software builds, running all tests can be time-consuming and resource-intensive. To alleviate that problem, Test case Selection and Prioritization (TSP) techniques have been proposed to improve regression testing by selecting and prioritizing test cases in order to provide early feedback to developers. In recent years, researchers have relied on Machine Learning (ML) techniques to achieve effective TSP (ML-based TSP). Such techniques help combine information about test cases, from partial and imperfect sources, into accurate prediction models. This work conducts a systematic literature review focused on ML-based TSP techniques, aiming to perform an in-depth analysis of the state of the art, thus gaining insights regarding future avenues of research. To that end, we analyze 29 primary studies published from 2006 to 2020, which have been identified through a systematic and documented process. This paper addresses five research questions covering variations in ML-based TSP techniques and feature sets for training and testing ML models, alternative metrics used for evaluating the techniques, the performance of techniques, and the reproducibility of the published studies. We summarize the results related to our research questions in a high-level summary that can be used as a taxonomy for classifying future TSP studies.

Full Text
Peer Reviewed
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and Many-Objective Optimization
Ul Haq, Fitash UL; Shin, Donghwan UL; Briand, Lionel UL

in Proceedings of the 44th International Conference on Software Engineering (ICSE ’22) (in press)

With the recent advances of Deep Neural Networks (DNNs) in real-world applications, such as Automated Driving Systems (ADS) for self-driving cars, ensuring the reliability and safety of such DNN-enabled systems emerges as a fundamental topic in software testing. One of the essential testing phases of such DNN-enabled systems is online testing, where the system under test is embedded into a specific and often simulated application environment (e.g., a driving environment) and tested in a closed-loop mode in interaction with the environment. However, despite the importance of online testing for detecting safety violations, automatically generating new and diverse test data that lead to safety violations presents the following challenges: (1) there can be many safety requirements to be considered at the same time, (2) running a high-fidelity simulator is often very computationally intensive, and (3) the space of all possible test data that may trigger safety violations is too large to be exhaustively explored. In this paper, we address these challenges by proposing a novel approach, called SAMOTA (Surrogate-Assisted Many-Objective Testing Approach), extending existing many-objective search algorithms for test suite generation to efficiently utilize surrogate models that mimic the simulator but are much less expensive to run. Empirical evaluation results on Pylot, an advanced ADS composed of multiple DNNs, using CARLA, a high-fidelity driving simulator, show that SAMOTA is significantly more effective and efficient at detecting unknown safety requirement violations than state-of-the-art many-objective test suite generation algorithms and random search. In other words, SAMOTA appears to be a key enabling technology for online testing in practice.
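The essence of surrogate-assisted search, in miniature (the fitness functions below are invented, and SAMOTA's actual surrogate modeling is far more sophisticated): screen many candidates with a cheap surrogate and spend expensive simulator runs only on the most promising ones.

```python
def expensive_simulation(x):
    """Stand-in for a costly high-fidelity simulator run (invented fitness)."""
    return (x - 3.0) ** 2

def surrogate(x):
    """Cheap approximation of the simulator, deliberately slightly off."""
    return (x - 2.8) ** 2

candidates = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
promising = sorted(candidates, key=surrogate)[:2]   # cheap pre-screening
best = min(promising, key=expensive_simulation)     # only 2 expensive runs
```

Because the surrogate is imperfect, the expensive evaluations of the short-listed candidates are still needed to confirm which ones truly minimize the objective.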

Full Text
Peer Reviewed
MASS: A tool for Mutation Analysis of Space CPS
Cornejo Olivares, Oscar Eduardo UL; Pastore, Fabrizio UL; Briand, Lionel UL

in 2022 IEEE/ACM 44th International Conference on Software Engineering (2022, May)

We present MASS, a mutation analysis tool for embedded software in cyber-physical systems (CPS). We target space CPS (e.g., satellites) and other CPS with similar characteristics (e.g., UAV). Mutation analysis measures the quality of test suites in terms of the percentage of detected artificial faults. There are many mutation analysis tools available, but they are inapplicable to CPS because of scalability and accuracy challenges. To overcome such limitations, MASS implements a set of optimization techniques that enable the applicability of mutation analysis and address scalability and accuracy in the CPS context. MASS has been successfully evaluated on a large study involving embedded software systems provided by industry partners; the study includes an on-board software system managing a microsatellite currently on-orbit, a set of libraries used in deployed cubesats, and a mathematical library provided by the European Space Agency. A demo video of MASS is available at https://www.youtube.com/watch?v=gC1x9cU0-tU.

Full Text
Peer Reviewed
HUDD: A tool to debug DNNs for safety analysis
Fahmy, Hazem UL; Pastore, Fabrizio UL; Briand, Lionel UL

in 2022 IEEE/ACM 44th International Conference on Software Engineering (2022, May)

We present HUDD, a tool that supports safety analysis practices for systems enabled by Deep Neural Networks (DNNs) by automatically identifying the root causes of DNN errors and retraining the DNN. HUDD stands for Heatmap-based Unsupervised Debugging of DNNs; it automatically clusters error-inducing images whose results are due to common subsets of DNN neurons. The intent is for the generated clusters to group error-inducing images having common characteristics, that is, a common root cause. HUDD identifies root causes by applying a clustering algorithm to matrices (i.e., heatmaps) capturing the relevance of every DNN neuron to the DNN outcome. Also, HUDD retrains DNNs with images that are automatically selected based on their relatedness to the identified image clusters. Our empirical evaluation with DNNs from the automotive domain has shown that HUDD automatically identifies all the distinct root causes of DNN errors, thus supporting safety analysis. Also, our retraining approach has shown to be more effective at improving DNN accuracy than existing approaches. A demo video of HUDD is available at https://youtu.be/drjVakP7jdU.

Full Text
Peer Reviewed
PRINS: Scalable Model Inference for Component-based System Logs
Shin, Donghwan UL; Bianculli, Domenico UL; Briand, Lionel UL

in Empirical Software Engineering (2022)

Behavioral software models play a key role in many software engineering tasks; unfortunately, these models either are not available during software development or, if available, quickly become outdated as implementations evolve. Model inference techniques have been proposed as a viable solution to extract finite state models from execution logs. However, existing techniques do not scale well when processing very large logs that can be commonly found in practice. In this paper, we address the scalability problem of inferring the model of a component-based system from large system logs, without requiring any extra information. Our model inference technique, called PRINS, follows a divide-and-conquer approach. The idea is to first infer a model of each system component from the corresponding logs; then, the individual component models are merged together taking into account the flow of events across components, as reflected in the logs. We evaluated PRINS in terms of scalability and accuracy, using nine datasets composed of logs extracted from publicly available benchmarks and a personal computer running desktop business applications. The results show that PRINS can process large logs much faster than a publicly available and well-known state-of-the-art tool, without significantly compromising the accuracy of inferred models.
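A minimal flavor of the per-component inference step (not PRINS itself): a finite state model can be read off a component's logs by recording which event follows which; PRINS then merges such per-component models using the cross-component event flow. The traces below are invented.

```python
def infer_transitions(traces):
    """Collect the (event, next_event) pairs observed in execution traces,
    i.e., the transition relation of a simple finite state model."""
    transitions = set()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            transitions.add((a, b))
    return transitions

component_logs = [
    ["init", "connect", "send", "close"],
    ["init", "connect", "send", "send", "close"],
]
model = infer_transitions(component_logs)
# The second trace generalizes the model with a self-loop on "send".
```

Inferring such models per component and merging them afterwards is what lets the divide-and-conquer strategy avoid processing one enormous system-level log at once.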

Full Text
Peer Reviewed
Automatic Generation of Acceptance Test Cases from Use Case Specifications: an NLP-based Approach
Wang, Chunhui UL; Pastore, Fabrizio UL; Göknil, Arda UL et al

in IEEE Transactions on Software Engineering (2022), 48(2), 585-616

Acceptance testing is a validation activity performed to ensure the conformance of software systems with respect to their functional requirements. In safety critical systems, it plays a crucial role since it is enforced by software standards, which mandate that each requirement be validated by such testing in a clearly traceable manner. Test engineers need to identify all the representative test execution scenarios from requirements, determine the runtime conditions that trigger these scenarios, and finally provide the input data that satisfy these conditions. Given that requirements specifications are typically large and often provided in natural language (e.g., use case specifications), the generation of acceptance test cases tends to be expensive and error-prone. In this paper, we present Use Case Modeling for System-level, Acceptance Tests Generation (UMTG), an approach that supports the generation of executable, system-level, acceptance test cases from requirements specifications in natural language, with the goal of reducing the manual effort required to generate test cases and ensuring requirements coverage. More specifically, UMTG automates the generation of acceptance test cases based on use case specifications and a domain model for the system under test, which are commonly produced in many development environments. Unlike existing approaches, it does not impose strong restrictions on the expressiveness of use case specifications. We rely on recent advances in natural language processing to automatically identify test scenarios and to generate formal constraints that capture conditions triggering the execution of the scenarios, thus enabling the generation of test data. 
In two industrial case studies, UMTG automatically and correctly translated 95% of the use case specification steps into formal constraints required for test data generation; furthermore, it generated test cases that exercise not only all the test scenarios manually implemented by experts, but also some critical scenarios not previously considered.

Full Text
Peer Reviewed
Automated Reverse Engineering of Role-based Access Control Policies of Web Applications
Le, Ha Thanh UL; Shar, Lwin Khin UL; Bianculli, Domenico UL et al

in Journal of Systems and Software (2022), 184

Access control (AC) is an important security mechanism used in software systems to restrict access to sensitive resources. Therefore, it is essential to validate the correctness of AC implementations with respect to policy specifications or intended access rights. However, in practice, AC policy specifications are often missing or poorly documented; in some cases, AC policies are hard-coded in business logic implementations. This leads to difficulties in validating the correctness of policy implementations and detecting AC defects. In this paper, we present a semi-automated framework for reverse-engineering of AC policies from Web applications. Our goal is to learn and recover role-based access control (RBAC) policies from implementations, which are then used to validate implemented policies and detect AC issues. Our framework, built on top of a suite of security tools, automatically explores a given Web application, mines domain input specifications from access logs, and systematically generates and executes more access requests using combinatorial test generation. To learn policies, we apply machine learning on the obtained data to characterize relevant attributes that influence AC. Finally, the inferred policies are presented to the security engineer, for validation with respect to intended access rights and for detecting AC issues. Inconsistent and insufficient policies are highlighted as potential AC issues, being either vulnerabilities or implementation errors. We evaluated our approach on four Web applications (three open-source and a proprietary one built by our industry partner) in terms of the correctness of inferred policies. We also evaluated the usefulness of our approach by investigating whether it facilitates the detection of AC issues. The results show that 97.8% of the inferred policies are correct with respect to the actual AC implementation; the analysis of these policies led to the discovery of 64 AC issues that were reported to the developers.
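A drastically simplified version of the learning step (the paper applies machine learning to richer request attributes): infer, from observed (role, resource, decision) triples, which roles are granted access to each resource, and flag resources with contradictory observations as potential AC issues. The observations below are invented.

```python
def infer_rbac(observations):
    """observations: (role, resource, access_granted) triples.
    Returns the inferred allow-policy and resources with contradictions."""
    allowed, denied = {}, {}
    for role, resource, granted in observations:
        bucket = allowed if granted else denied
        bucket.setdefault(resource, set()).add(role)
    issues = {r: allowed[r] & denied.get(r, set()) for r in allowed
              if allowed[r] & denied.get(r, set())}
    return allowed, issues

obs = [
    ("admin", "/settings", True),
    ("user", "/settings", False),
    ("user", "/profile", True),
    ("user", "/profile", False),   # contradictory: a potential AC issue
]
policy, issues = infer_rbac(obs)
```

A contradiction like the one on `/profile` is exactly the kind of inconsistency the framework surfaces to the security engineer as either a vulnerability or an implementation error.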

Full Text
Peer Reviewed
AI-enabled Automation for Completeness Checking of Privacy Policies
Amaral Cejas, Orlando UL; Abualhaija, Sallam UL; Torre, Damiano et al

in IEEE Transactions on Software Engineering (2021)

Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of GDPR. Incomplete privacy policies might result in large fines on violating organizations as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check it against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach correctly detected 300 of the total of 334 violations of some completeness criteria, while producing 23 false positives. The approach thus has a precision of 92.9% and recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.

A Theoretical Framework for Understanding the Relationship Between Log Parsing and Anomaly Detection
Shin, Donghwan UL; Khan, Zanis Ali UL; Bianculli, Domenico UL et al

in Proceedings of the 21st International Conference on Runtime Verification (2021, October)

Log-based anomaly detection identifies systems' anomalous behaviors by analyzing system runtime information recorded in logs. While many approaches have been proposed, all of them have in common an essential pre-processing step called log parsing. This step is needed because automated log analysis requires structured input logs, whereas original logs contain semi-structured text printed by logging statements. Log parsing bridges this gap by converting the original logs into structured input logs fit for anomaly detection. Despite the intrinsic dependency between log parsing and anomaly detection, no existing work has investigated the impact of the "quality" of log parsing results on anomaly detection. In particular, the concept of "ideal" log parsing results with respect to anomaly detection has not been formalized yet. This makes it difficult to determine, upon obtaining inaccurate results from anomaly detection, if (and why) the root cause for such results lies in the log parsing step. In this short paper, we lay the theoretical foundations for defining the concept of "ideal" log parsing results for anomaly detection. Based on these foundations, we discuss practical implications regarding the identification and localization of root causes, when dealing with inaccurate anomaly detection, and the identification of irrelevant log messages.
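To make the log parsing step concrete, here is a minimal, hypothetical sketch of template identification by masking variable tokens; the regular expressions and log messages are illustrative, not taken from the paper:

```python
import re

# Toy log parser: map free-form messages to templates by masking
# variable parts (IP-like tokens, then numbers) -- a simplified
# stand-in for the parsing step that precedes anomaly detection.
def to_template(message: str) -> str:
    masked = re.sub(r"\d+\.\d+\.\d+\.\d+", "<IP>", message)
    masked = re.sub(r"\d+", "<NUM>", masked)
    return masked

logs = [
    "Connection from 10.0.0.1 closed after 120 ms",
    "Connection from 10.0.0.7 closed after 95 ms",
    "Disk usage at 91 percent",
]
templates = [to_template(m) for m in logs]
print(templates[0])  # Connection from <IP> closed after <NUM> ms
```

The first two messages collapse into the same template, which is exactly what downstream anomaly detection relies on; a parser that splits or merges templates incorrectly degrades that input.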

A Model-based Conceptualization of Requirements for Compliance Checking of Data Processing against GDPR
Amaral Cejas, Orlando UL; Abualhaija, Sallam UL; Sabetzadeh, Mehrdad UL et al

in 2020 IEEE Eleventh International Model-Driven Requirements Engineering (MoDRE) (2021, September)

The General Data Protection Regulation (GDPR) has been recently introduced to harmonize the different data privacy laws across Europe. Whether inside the EU or outside, organizations have to comply with the GDPR as long as they handle personal data of EU residents. The organizations with whom personal data is shared are referred to as data controllers. When controllers subcontract certain services that involve processing personal data to service providers (also known as data processors), then a data processing agreement (DPA) has to be issued. This agreement regulates the relationship between the controllers and processors and also ensures the protection of individuals' personal data. Compliance with the GDPR is challenging for organizations since the regulation is large and relies on complex legal concepts. In this paper, we draw on model-driven engineering to build a machine-analyzable conceptual model that characterizes DPA-related requirements in the GDPR. Further, we create a set of criteria for checking the compliance of a given DPA against the GDPR and discuss how our work in this paper can be adapted to develop an automated compliance checking solution.
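As an illustration of criteria-based checking, the sketch below tests a DPA against a handful of GDPR-derived criteria; the criterion identifiers and phrasings are our own stand-ins, not the paper's actual criteria set:

```python
# Hypothetical compliance check of a DPA against GDPR-derived criteria.
# The criteria below are illustrative paraphrases, not the paper's set.
criteria = {
    "processor_obligations": "process personal data only on documented instructions",
    "confidentiality": "persons authorised to process the data are bound to confidentiality",
    "sub_processor": "no sub-processor is engaged without prior authorisation",
}

def check_dpa(satisfied_ids):
    """Return the criteria that a given DPA fails to satisfy."""
    return sorted(set(criteria) - set(satisfied_ids))

violations = check_dpa(["processor_obligations", "confidentiality"])
print(violations)  # ['sub_processor']
```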

Can Offline Testing of Deep Neural Networks Replace Their Online Testing?
Ul Haq, Fitash UL; Shin, Donghwan UL; Nejati, Shiva UL et al

in Empirical Software Engineering (2021), 26(5),

We distinguish two general modes of testing for Deep Neural Networks (DNNs): Offline testing, where DNNs are tested as individual units based on test datasets obtained without involving the DNNs under test, and online testing, where DNNs are embedded into a specific application environment and tested in a closed-loop mode in interaction with the application environment. Typically, DNNs are subjected to both types of testing during their development life cycle, where offline testing is applied immediately after DNN training and online testing follows after offline testing and once a DNN is deployed within a specific application environment. In this paper, we study the relationship between offline and online testing. Our goal is to determine how offline testing and online testing differ or complement one another and whether offline testing results can be used to help reduce the cost of online testing. Though these questions are generally relevant to all autonomous systems, we study them in the context of automated driving systems where, as study subjects, we use DNNs automating end-to-end control of the steering functions of self-driving vehicles. Our results show that offline testing is less effective than online testing as many safety violations identified by online testing could not be identified by offline testing, while large prediction errors generated by offline testing always led to severe safety violations detectable by online testing. Further, we cannot exploit offline testing results to reduce the cost of online testing in practice since we are not able to identify specific situations where offline testing could be as accurate as online testing in identifying safety requirement violations.
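A minimal sketch of the offline mode described above, under the assumption that the DNN is evaluated as a unit on recorded steering angles; the model outputs and data are invented for illustration:

```python
# Offline testing in the abstract's sense: evaluate a steering DNN as a
# unit on a labeled dataset, with no simulator in the loop. The values
# here are stand-ins, not the paper's subjects or measurements.
def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

ground_truth = [0.10, -0.05, 0.00, 0.20]  # recorded steering angles
predictions  = [0.12, -0.02, 0.01, 0.15]  # DNN outputs on the same frames

mae = mean_absolute_error(predictions, ground_truth)
print(f"offline MAE: {mae:.3f}")
# Online testing, by contrast, would embed the DNN in a driving
# simulator and count safety violations (e.g., lane departures).
```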

Log-based Slicing for System-level Test Cases
Messaoudi, Salma UL; Shin, Donghwan UL; Panichella, Annibale et al

in Proceedings of ISSTA '21: 30th ACM SIGSOFT International Symposium on Software Testing and Analysis (2021, July)

Regression testing is arguably one of the most important activities in software testing. However, its cost-effectiveness and usefulness can be largely impaired by complex system test cases that are poorly designed (e.g., test cases containing multiple test scenarios combined into a single test case) and that require a large amount of time and resources to run. One way to mitigate this issue is decomposing such system test cases into smaller, separate test cases---each of them with only one test scenario and with its corresponding assertions---so that the execution time of the decomposed test cases is lower than that of the original test cases, while the test effectiveness of the original test cases is preserved. This decomposition can be achieved with program slicing techniques, since test cases are software programs too. However, existing static and dynamic slicing techniques exhibit limitations when (1) the test cases use external resources, (2) code instrumentation is not a viable option, and (3) test execution is expensive. In this paper, we propose a novel approach, called DS3 (Decomposing System teSt caSe), which automatically decomposes a complex system test case into separate test case slices. The idea is to use test case execution logs, obtained from past regression testing sessions, to identify "hidden" dependencies in the slices generated by static slicing. Since logs include run-time information about the system under test, we can use them to extract access and usage of global resources and refine the slices generated by static slicing. We evaluated DS3 in terms of slicing effectiveness and compared it with a vanilla static slicing tool. We also compared the slices obtained by DS3 with the corresponding original system test cases, in terms of test efficiency and effectiveness. The evaluation results on one proprietary system and one open-source system show that DS3 is able to accurately identify the dependencies related to the usage of global resources, which vanilla static slicing misses. Moreover, the generated test case slices are, on average, 3.56 times faster than the original system test cases, and they exhibit no significant loss in terms of fault detection effectiveness.
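The log-based refinement idea can be sketched as follows; the slice contents and resource names are hypothetical, and the merge logic is a deliberate simplification of what DS3 actually does:

```python
# Sketch of log-based slice refinement: static slicing yields candidate
# slices; execution logs reveal which slices touch the same global
# resource, so slices with a shared resource (a "hidden" dependency
# invisible to static slicing) are merged into one refined slice.
def refine_slices(slices, log_resource_usage):
    # slices: {slice_id: [statements]}
    # log_resource_usage: {slice_id: set of resources seen in logs}
    merged = {sid: {sid} for sid in slices}
    ids = list(slices)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if log_resource_usage[a] & log_resource_usage[b]:
                group = merged[a] | merged[b]
                for sid in group:
                    merged[sid] = group
    return {frozenset(g) for g in merged.values()}

slices = {"s1": ["open DB"], "s2": ["query DB"], "s3": ["ping service"]}
usage = {"s1": {"db"}, "s2": {"db"}, "s3": {"net"}}
refined = refine_slices(slices, usage)
print(len(refined))  # 2 refined slices: {s1, s2} merged, {s3} alone
```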
