Lee, Jaekwon
in ACM Transactions on Software Engineering and Methodology (2023)

Weakly hard real-time systems can, to some degree, tolerate deadline misses, but their schedulability still needs to be analyzed to ensure their quality of service. Such analysis usually occurs at early design stages to provide implementation guidelines to engineers so that they can make better design decisions. Estimating worst-case execution times (WCET) is a key input to schedulability analysis. However, early on during system design, estimating WCET values is challenging and engineers usually determine them as plausible ranges based on their domain knowledge. Our approach aims at finding restricted, safe WCET sub-ranges given a set of ranges initially estimated by experts in the context of weakly hard real-time systems. To this end, we leverage (1) multi-objective search aiming at maximizing the violation of weakly hard constraints in order to find worst-case scheduling scenarios and (2) polynomial logistic regression to infer safe WCET ranges with a probabilistic interpretation. We evaluated our approach by applying it to an industrial system in the satellite domain and several realistic synthetic systems. The results indicate that our approach significantly outperforms a baseline relying on random search without learning, and estimates safe WCET ranges with a high degree of confidence in practical time (< 23h).
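
To make the search-then-learn idea above concrete, here is a minimal, self-contained sketch of the learning step: fitting a polynomial logistic regression over sampled WCET values and keeping the sub-range whose predicted violation probability stays below a threshold. The task model, the numbers and the threshold are all invented for illustration; this is not the authors' implementation.

```python
# Toy sketch of the learning step (not the authors' implementation):
# fit a polynomial logistic regression mapping a candidate WCET value
# to the probability of violating a weakly hard constraint, then keep
# the sub-range whose predicted violation probability stays low.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
LO, HI = 1.0, 10.0  # expert-estimated WCET range for one task (invented)

def violates(wcet):
    """Stand-in for a scheduling simulation: violations get likelier
    as the WCET approaches the top of the estimated range."""
    return rng.random() < (wcet - LO) / (HI - LO)

# 1) Sample scheduling scenarios. (The paper drives this sampling with
#    multi-objective search toward worst cases; plain uniform sampling
#    keeps the sketch short.)
X = rng.uniform(LO, HI, size=(500, 1))
y = np.array([violates(w) for w in X[:, 0]])

# 2) Polynomial logistic regression, as named in the abstract.
model = make_pipeline(PolynomialFeatures(degree=3), LogisticRegression())
model.fit(X, y)

# 3) Keep the values whose predicted violation probability is under 5%.
#    For this toy data the probability grows with the WCET, so the kept
#    values form one contiguous "safe" sub-range.
grid = np.linspace(LO, HI, 1000).reshape(-1, 1)
safe = grid[model.predict_proba(grid)[:, 1] < 0.05]
if safe.size:
    print(f"safe WCET sub-range: [{safe.min():.2f}, {safe.max():.2f}]")
```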

Gaaloul, Khouloud
in IEEE Transactions on Software Engineering (2022)

Software verification may yield spurious failures when environment assumptions are not accounted for. Environment assumptions are the expectations that a system or a component makes about its operational environment and are often specified in terms of conditions over the inputs of that system or component. In this article, we propose an approach to automatically infer environment assumptions for Cyber-Physical Systems (CPS). Our approach improves the state of the art in three different ways: first, we learn assumptions for complex CPS models involving signal and numeric variables; second, the learned assumptions include arithmetic expressions defined over multiple variables; third, we identify the trade-off between soundness and coverage of environment assumptions and demonstrate the flexibility of our approach in prioritizing either of these criteria. We evaluate our approach using a public-domain benchmark of CPS models from Lockheed Martin and a component of a satellite control system from LuxSpace, a satellite system provider. The results show that our approach outperforms state-of-the-art techniques on learning assumptions for CPS models and, further, when applied to our industrial CPS model, is able to learn assumptions that are sufficiently close to the assumptions manually developed by engineers to be of practical value.
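
A minimal sketch of the general assumption-inference recipe, assuming a made-up component and verdicts; it is not the article's algorithm. Sampled inputs are labeled pass/fail and a shallow decision tree is fit over them, whose "pass" paths read as a candidate assumption.

```python
# Minimal sketch of assumption inference from test verdicts (not the
# article's algorithm): label sampled inputs with a pass/fail verdict,
# fit a shallow decision tree, and read its "pass" paths as a candidate
# environment assumption. The component below is a made-up stand-in.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)

def requirement_holds(u1, u2):
    # Hypothetical component: its requirement is satisfied exactly when
    # the (unknown) environment assumption u1 + u2 <= 1 holds.
    return u1 + u2 <= 1.0

X = rng.uniform(0.0, 1.0, size=(1000, 2))
y = np.array([requirement_holds(u1, u2) for u1, u2 in X])

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# Every root-to-leaf path ending in "True" is a conjunction of input
# conditions, i.e., a candidate assumption to be validated afterwards
# (EPIcuRus, further down this list, validates it by model checking).
print(export_text(tree, feature_names=["u1", "u2"]))
```

Note that an axis-aligned tree can only approximate the sloped boundary u1 + u2 <= 1, which is precisely why learning assumptions containing arithmetic expressions over multiple variables, as this article does, matters.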

Ul Haq, Fitash
in Empirical Software Engineering (2021), 26(5)

We distinguish two general modes of testing for Deep Neural Networks (DNNs): offline testing, where DNNs are tested as individual units based on test datasets obtained without involving the DNNs under test, and online testing, where DNNs are embedded into a specific application environment and tested in a closed-loop mode in interaction with that environment. Typically, DNNs are subjected to both types of testing during their development life cycle, where offline testing is applied immediately after DNN training and online testing follows after offline testing and once a DNN is deployed within a specific application environment. In this paper, we study the relationship between offline and online testing. Our goal is to determine how offline testing and online testing differ or complement one another and whether offline testing results can be used to help reduce the cost of online testing. Though these questions are generally relevant to all autonomous systems, we study them in the context of automated driving systems where, as study subjects, we use DNNs automating end-to-end control of steering functions of self-driving vehicles. Our results show that offline testing is less effective than online testing as many safety violations identified by online testing could not be identified by offline testing, while large prediction errors generated by offline testing always led to severe safety violations detectable by online testing. Further, we cannot exploit offline testing results to reduce the cost of online testing in practice since we are not able to identify specific situations where offline testing could be as accurate as online testing in identifying safety requirement violations.

…; Ben Abdessalem (Helali), Raja
in 2021 IEEE 14th International Conference on Software Testing, Validation and Verification (ICST) (2021, May 25)

The increasing levels of software- and data-intensive driving automation call for an evolution of automotive software testing. As a recommended practice of the Verification and Validation (V&V) process of ISO/PAS 21448, a candidate standard for the safety of the intended functionality for road vehicles, simulation-based testing has the potential to reduce both risks and costs. There is a growing body of research on devising test automation techniques using simulators for Advanced Driver-Assistance Systems (ADAS). However, how similar are the results if the same test scenarios are executed in different simulators? We conduct a replication study of applying a Search-Based Software Testing (SBST) solution to a real-world ADAS (PeVi, a pedestrian vision detection system) using two different commercial simulators, namely TASS/Siemens PreScan and ESI Pro-SiVIC. Based on a minimalistic scene, we compare critical test scenarios generated using our SBST solution in these two simulators. We show that SBST can be used to effectively and efficiently generate critical test scenarios in both simulators, and the test results obtained from the two simulators can reveal several weaknesses of the ADAS under test. However, executing the same test scenarios in the two simulators leads to notable differences in the details of the test outputs, in particular related to (1) safety violations revealed by tests and (2) the dynamics of cars and pedestrians. Based on our findings, we recommend future V&V plans to include multiple simulators to support robust simulation-based testing and to base test objectives on measures that are less dependent on the internals of the simulators.

Shin, Seung Yeob
in Journal of Systems and Software (2021)

Hardware-in-the-loop (HiL) testing is important for developing cyber-physical systems (CPS). HiL test cases manipulate hardware, are time-consuming, and their behaviors are impacted by the uncertainties in the CPS environment. To mitigate the risks associated with HiL testing, engineers have to ensure that (1) test cases are well-behaved, e.g., they do not damage hardware, and (2) test cases can execute within a time budget. Leveraging the UML profile mechanism, we develop a domain-specific language, HITECS, for HiL test case specification. Using HITECS, we provide uncertainty-aware analysis methods to check the well-behavedness of HiL test cases. In addition, we provide a method to estimate the execution times of HiL test cases before the actual HiL testing. We apply HITECS to an industrial case study from the satellite domain. Our results show that: (1) HITECS helps engineers define more effective assertions to check HiL test cases, compared to the assertions defined without any systematic guidance; (2) HITECS verifies in practical time that HiL test cases are well-behaved; (3) HITECS is able to resolve uncertain parameters of HiL test cases by synthesizing conditions under which test cases are guaranteed to be well-behaved; and (4) HITECS accurately estimates HiL test case execution times.
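
HITECS's own analyses operate over Alf test specifications; purely as a rough illustration of the uncertainty-aware execution-time estimation mentioned above, the sketch below propagates uncertain operation durations through a test case by Monte Carlo sampling. The operations, duration ranges and percentile are invented.

```python
# Rough illustration (not HITECS itself): estimate how long a HiL test
# case will take when individual operations have uncertain durations,
# by Monte Carlo sampling. All operations and ranges are invented.
import random

# (name, min seconds, max seconds) for each test operation; uniform
# ranges stand in for whatever uncertainty model the engineer has.
OPERATIONS = [
    ("power_on", 5.0, 8.0),
    ("warm_up", 30.0, 90.0),
    ("run_scenario", 120.0, 200.0),
    ("power_off", 5.0, 10.0),
]

def sample_duration():
    return sum(random.uniform(lo, hi) for _, lo, hi in OPERATIONS)

runs = sorted(sample_duration() for _ in range(10_000))
p95 = runs[int(0.95 * len(runs))]
print(f"estimated 95th-percentile duration: {p95:.0f}s")
# A test-budget check then becomes: does p95 fit the allotted slot?
```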

Ul Haq, Fitash
in 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST) (2020, August 05)

There is a growing body of research on developing testing techniques for Deep Neural Networks (DNNs). We distinguish two general modes of testing for DNNs: offline testing, where DNNs are tested as individual units based on test datasets obtained independently from the DNNs under test, and online testing, where DNNs are embedded into a specific application and tested in a closed-loop mode in interaction with the application environment. In addition, we identify two sources for generating test datasets for DNNs: datasets obtained from real life and datasets generated by simulators. While offline testing can be used with datasets obtained from either source, online testing is largely confined to using simulators since online testing within real-life applications can be time-consuming, expensive and dangerous. In this paper, we study the following two important questions aiming to compare test datasets and testing modes for DNNs: first, can we use simulator-generated data as a reliable substitute for real-world data for the purpose of DNN testing? Second, how do online and offline testing results differ and complement each other? Though these questions are generally relevant to all autonomous systems, we study them in the context of automated driving systems where, as study subjects, we use DNNs automating end-to-end control of cars' steering actuators. Our results show that simulator-generated datasets are able to yield DNN prediction errors that are similar to those obtained by testing DNNs with real-life datasets. Further, offline testing is more optimistic than online testing as many safety violations identified by online testing could not be identified by offline testing, while large prediction errors generated by offline testing always led to severe safety violations detectable by online testing.

…; …; Nejati, Shiva
in Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2020) (2020, July)

In the past years, several automated repair strategies have been proposed to fix bugs in individual software programs without any human intervention. There has been, however, little work on how automated repair techniques can resolve failures that arise at the system level and are caused by undesired interactions among different system components or functions. Feature interaction failures are common in complex systems such as autonomous cars that are typically built as a composition of independent features (i.e., units of functionality). In this paper, we propose a repair technique to automatically resolve undesired feature interaction failures in automated driving systems (ADS) that lead to the violation of system safety requirements. Our repair strategy achieves its goal by (1) localizing faults spanning several lines of code, (2) simultaneously resolving multiple interaction failures caused by independent faults, (3) scaling repair strategies from the unit level to the system level, and (4) resolving failures based on their order of severity. We have evaluated our approach using two industrial ADS containing four features. Our results show that our repair strategy resolves the undesired interaction failures in these two systems in less than 16h and outperforms existing automated repair techniques.
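
The localization step of repair pipelines like the one above builds on statistical (spectrum-based) debugging. Below is the classic Tarantula suspiciousness score as a generic illustration of that building block; it is not this paper's exact system-level localization, which ranks faults spanning several lines.

```python
# Standard Tarantula suspiciousness (a common building block for the
# fault-localization step of repair pipelines; shown generically).
def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """failed_cov/passed_cov: how many failing/passing tests cover a
    statement; returns a score in [0, 1], higher = more suspicious."""
    if total_failed == 0 or (failed_cov == 0 and passed_cov == 0):
        return 0.0
    fail_ratio = failed_cov / total_failed
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    return fail_ratio / (fail_ratio + pass_ratio)

# Example: statement covered by 4 of 5 failing tests, 1 of 20 passing.
print(f"{tarantula(4, 1, 5, 20):.2f}")  # ~0.94: strongly suspicious
```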

Shin, Seung Yeob
in Proceedings of the 15th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS'20) (2020, May)

The concept of Internet of Things (IoT) has led to the development of many complex and critical systems such as smart emergency management systems. IoT-enabled applications typically depend on a communication network for transmitting large volumes of data in unpredictable and changing environments. These networks are prone to congestion when there is a burst in demand, e.g., as an emergency situation is unfolding, and therefore rely on configurable software-defined networks (SDN). In this paper, we propose a dynamic adaptive SDN configuration approach for IoT systems. The approach enables resolving congestion in real time while minimizing network utilization, data transmission delays and adaptation costs. Our approach builds on existing work in dynamic adaptive search-based software engineering (SBSE) to reconfigure an SDN while simultaneously ensuring multiple quality of service criteria. We evaluate our approach on an industrial national emergency management system, which is aimed at detecting disasters and emergencies, and facilitating recovery and rescue operations by providing first responders with a reliable communication infrastructure. Our results indicate that (1) our approach is able to efficiently and effectively adapt an SDN to dynamically resolve congestion, and (2) compared to two baseline data forwarding algorithms that are static and non-adaptive, our approach increases data transmission rate by a factor of at least 3 and decreases data loss by at least 70%.
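
At the heart of such dynamic adaptive search is the comparison of candidate configurations on several quality-of-service objectives at once. A minimal sketch of that comparison follows; the configuration names and objective values are invented placeholders, not the paper's encoding.

```python
# Pareto comparison at the core of multi-objective search (illustrative;
# the SDN configuration names and objective values are invented).
def dominates(a, b):
    """True if objective vector a Pareto-dominates b, with every
    objective to be minimized: no worse anywhere, strictly better once."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

candidates = {
    "reroute_via_n3": (0.61, 12.0, 2.0),  # (utilization, delay, cost)
    "reroute_via_n5": (0.65, 13.0, 2.5),  # dominated by reroute_via_n3
    "split_flows":    (0.58, 11.5, 5.0),
    "do_nothing":     (0.90, 40.0, 0.0),
}
pareto = [name for name, score in candidates.items()
          if not any(dominates(other, score)
                     for o, other in candidates.items() if o != name)]
print(pareto)  # the non-dominated choices the search keeps and evolves
```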

Gaaloul, Khouloud
in Proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (2020)

Software verification approaches aim to check a software component under analysis for all possible environments. In reality, however, components are expected to operate within a larger system and are required to satisfy their requirements only when their inputs are constrained by environment assumptions. In this paper, we propose EPIcuRus, an approach to automatically synthesize environment assumptions for a component under analysis (i.e., conditions on the component inputs under which the component is guaranteed to satisfy its requirements). EPIcuRus combines search-based testing, machine learning and model checking. The core of EPIcuRus is a decision tree algorithm that infers environment assumptions from a set of test results including test cases and their verdicts. The test cases are generated using search-based testing, and the assumptions inferred by decision trees are validated through model checking. In order to improve the efficiency and effectiveness of the assumption generation process, we propose a novel test case generation technique, namely Important Features Boundary Test (IFBT), that guides the test generation based on the feedback produced by machine learning. We evaluated EPIcuRus by assessing its effectiveness in computing assumptions on a set of study subjects that include 18 requirements of four industrial models. We show that, for each of the 18 requirements, EPIcuRus was able to compute an assumption to ensure the satisfaction of that requirement, and further, ≈78% of these assumptions were computed in one hour.
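
The IFBT heuristic is only described at a high level in the abstract; one plausible reading, sketched below with invented details, is to concentrate new tests around the split threshold of the feature the learned tree considers most important, where the inferred assumption is least certain.

```python
# A loose reading of the IFBT idea (details invented for illustration):
# after fitting the decision tree, generate new test inputs near the
# split threshold of the most important feature.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = (X[:, 0] > 0.4).astype(int)  # stand-in test verdicts

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
feat = int(np.argmax(tree.feature_importances_))
# First split threshold the fitted tree uses for that feature:
mask = tree.tree_.feature == feat
threshold = float(tree.tree_.threshold[mask][0])

new_tests = rng.uniform(0.0, 1.0, size=(10, 2))
new_tests[:, feat] = rng.normal(threshold, 0.05, size=10).clip(0.0, 1.0)
print(f"sampling tests around x{feat} ~ {threshold:.2f}")
print(new_tests[:2].round(2))
```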

Menghi, Claudio
in Proceedings of the 42nd International Conference on Software Engineering (2020)

Black-box testing has been extensively applied to test models of Cyber-Physical Systems (CPS) since these models are not often amenable to static and symbolic testing and verification. Black-box testing, however, requires executing the model under test for a large number of candidate test inputs. This poses a challenge for a large and practically important category of CPS models, known as compute-intensive CPS (CI-CPS) models, where a single simulation may take hours to complete. We propose a novel approach, namely ARIsTEO, to enable effective and efficient testing of CI-CPS models. Our approach embeds black-box testing into an iterative approximation-refinement loop. At the start, some sampled inputs and outputs of the CI-CPS model under test are used to generate a surrogate model that is faster to execute and can be subjected to black-box testing. Any failure-revealing test identified for the surrogate model is checked on the original model. If spurious, the test results are used to refine the surrogate model, which is then tested again. Otherwise, the test reveals a valid failure. We evaluated ARIsTEO by comparing it with S-Taliro, an open-source and industry-strength tool for testing CPS models. Our results, obtained based on five publicly available CPS models, show that, on average, ARIsTEO is able to find 24% more requirements violations than S-Taliro and is 31% faster than S-Taliro in finding those violations. We further assessed the effectiveness and efficiency of ARIsTEO on a large industrial case study from the satellite domain. In contrast to S-Taliro, ARIsTEO successfully tested two different versions of this model and could identify three requirements violations, requiring four hours, on average, for each violation.
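
The approximation-refinement loop itself is compact enough to sketch. Below, an ordinary polynomial fit and random search stand in for ARIsTEO's system-identification and falsification machinery, and the "expensive" model is a one-line function; only the shape of the loop is the point.

```python
# Shape of the approximation-refinement loop (illustrative stand-ins
# throughout; not ARIsTEO's actual surrogate models or falsifier).
import numpy as np

rng = np.random.default_rng(3)

def expensive_model(x):  # stand-in for an hours-long simulation
    return np.sin(3 * x) + 0.5 * x  # requirement: output stays < 1.4

def violated(y):
    return y >= 1.4

# Initial samples to fit the first surrogate.
X = list(rng.uniform(0, 3, size=8))
Y = [expensive_model(x) for x in X]

for _ in range(20):
    surrogate = np.poly1d(np.polyfit(X, Y, deg=3))  # refit cheap model
    # "Falsify" the surrogate: cheap search for a violating input.
    cand = rng.uniform(0, 3, size=1000)
    worst = cand[np.argmax(surrogate(cand))]
    y = expensive_model(worst)  # one expensive check on the real model
    X.append(worst)             # refine the surrogate with the result
    Y.append(y)
    if violated(y):
        print(f"real violation at x={worst:.3f} (output {y:.2f})")
        break
else:
    print("no violation found within budget")
```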

…; Nejati, Shiva
in IEEE Transactions on Software Engineering (2019), 45(9), 919-944

All engineering disciplines are founded and rely on models, although they may differ on purposes and usages of modeling. Among the different disciplines, the engineering of Cyber-Physical Systems (CPSs) particularly relies on models with dynamic behaviors (i.e., models that exhibit time-varying changes). The Simulink modeling platform greatly appeals to CPS engineers since it captures dynamic behavior models. It further provides seamless support for two indispensable engineering activities: (1) automated verification of abstract system models via model simulation, and (2) automated generation of system implementation via code generation. We identify three main challenges in the verification and testing of Simulink models with dynamic behavior, namely incompatibility, oracle and scalability challenges. We propose a Simulink testing approach that attempts to address these challenges. Specifically, we propose a black-box test generation approach, implemented based on meta-heuristic search, that aims to maximize diversity in test output signals generated by Simulink models. We argue that in the CPS domain test oracles are likely to be manual and therefore the main cost driver of testing. In order to lower the cost of manual test oracles, we propose a test prioritization algorithm to automatically rank test cases generated by our test generation algorithm according to their likelihood to reveal a fault. Engineers can then select, according to their test budget, a subset of the most highly ranked test cases. To demonstrate scalability, we evaluate our testing approach using industrial Simulink models. Our evaluation shows that our test generation and test prioritization approaches outperform baseline techniques that rely on random testing and structural coverage.
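
The output-diversity idea admits a one-function summary: prefer the candidate test whose output signal is farthest from every output already in the suite. Plain Euclidean distance on sampled signals stands in here for the paper's actual diversity objectives.

```python
# Compact statement of an output-diversity fitness (vector distance
# stands in for the paper's actual signal-based diversity objectives).
import numpy as np

def min_distance_to_suite(candidate_output, suite_outputs):
    """Fitness of a candidate test: distance from its output signal to
    the closest output signal already in the suite (maximize this)."""
    return min(np.linalg.norm(candidate_output - s) for s in suite_outputs)

# Output signals sampled at 5 time steps (illustrative numbers).
suite = [np.array([0, 0, 1, 1, 1]), np.array([0, 1, 2, 3, 4])]
cand_a = np.array([0, 0, 1, 1, 2])  # close to an existing output
cand_b = np.array([5, 5, 5, 5, 5])  # far from both: more diverse
for name, c in [("a", cand_a), ("b", cand_b)]:
    print(name, round(min_distance_to_suite(c, suite), 2))
```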

Menghi, Claudio
in Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '19) (2019, August)

Test automation requires automated oracles to assess test outputs. For cyber-physical systems (CPS), oracles, in addition to being automated, should ensure some key objectives: (i) they should check test outputs in an online manner to stop expensive test executions as soon as a failure is detected; (ii) they should handle time- and magnitude-continuous CPS behaviors; (iii) they should provide a quantitative degree of satisfaction or failure measure instead of binary pass/fail outputs; and (iv) they should be able to handle uncertainties due to CPS interactions with the environment. We propose an automated approach to translate CPS requirements specified in a logic-based language into test oracles specified in Simulink, a widely used development and simulation language for CPS. Our approach achieves the objectives noted above through the identification of a fragment of Signal First-Order Logic (SFOL) to specify requirements, the definition of a quantitative semantics for this fragment and a sound translation of the fragment into Simulink. The results from applying our approach to 11 industrial case studies show that: (i) our requirements language can express all the 98 requirements of our case studies; (ii) the time and effort required by our approach are acceptable, showing potential for the adoption of our work in practice; and (iii) for large models, our approach can dramatically reduce the test execution time compared to when test outputs are checked in an offline manner.
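
Objective (iii), a quantitative degree of satisfaction, can be illustrated with the usual robustness semantics for a simple invariant; the paper's SFOL fragment and its translation to Simulink blocks are far richer than this.

```python
# The usual quantitative semantics for "signal stays below a bound"
# (illustrative only; the paper's SFOL semantics is far richer).
# Positive = satisfied with margin, negative = violated.
def robustness_always_below(signal, bound):
    return min(bound - v for v in signal)

trace = [0.2, 0.5, 0.9, 0.7]
print(f"{robustness_always_below(trace, 1.0):+.2f}")  # +0.10: barely ok
print(f"{robustness_always_below(trace, 0.8):+.2f}")  # -0.10: violated
# An online oracle evaluates this incrementally and aborts the expensive
# simulation as soon as the running minimum goes negative.
```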

Arora, Chetan
in ACM Transactions on Software Engineering and Methodology (2019), 28(1)

Domain models are a useful vehicle for making the interpretation and elaboration of natural-language requirements more precise. Advances in natural language processing (NLP) have made it possible to automatically extract from requirements most of the information that is relevant to domain model construction. However, alongside the relevant information, NLP extracts from requirements a significant amount of information that is superfluous, i.e., not relevant to the domain model. Our objective in this article is to develop automated assistance for filtering the superfluous information extracted by NLP during domain model extraction. To this end, we devise an active-learning-based approach that iteratively learns from analysts' feedback over the relevance and superfluousness of the extracted domain model elements, and uses this feedback to provide recommendations for filtering superfluous elements. We empirically evaluate our approach over three industrial case studies. Our results indicate that, once trained, our approach automatically detects an average of ≈45% of the superfluous elements with a precision of ≈96%. Since precision is very high, the automatic recommendations made by our approach are trustworthy. Consequently, analysts can dispose of a considerable fraction, nearly half, of the superfluous elements with minimal manual work. The results are particularly promising, as they should be considered in light of the non-negligible subjectivity that is inherently tied to the notion of relevance.
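
The active-learning loop can be sketched with uncertainty sampling: label a few elements, train, ask the analyst about the prediction the model is least sure of, retrain. The features, the hidden labeling rule and the model below are synthetic stand-ins, not the article's feature set.

```python
# Sketch of an active-learning loop with uncertainty sampling
# (synthetic features and labels; not the article's feature set).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
# Three numeric features per extracted element; an element is deemed
# superfluous by the (hidden) analyst rule below, invented for the demo.
X = rng.uniform(size=(300, 3))
superfluous = X[:, 0] < 0.35

# Seed the loop with a few labels of each kind, as an analyst would.
labeled = (list(np.where(superfluous)[0][:5])
           + list(np.where(~superfluous)[0][:5]))
for _ in range(30):  # thirty rounds of analyst feedback
    clf = LogisticRegression().fit(X[labeled], superfluous[labeled])
    proba = clf.predict_proba(X)[:, 1]
    unlabeled = [i for i in range(len(X)) if i not in labeled]
    # Ask about the element the model is least certain of (proba ~ 0.5).
    query = min(unlabeled, key=lambda i: abs(proba[i] - 0.5))
    labeled.append(query)  # the analyst's answer joins the training set

caught = int((clf.predict(X) & superfluous).sum())
print(f"superfluous elements auto-flagged: {caught}/{int(superfluous.sum())}")
```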

Nejati, Shiva
in Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (2019)

Matlab/Simulink is a development and simulation language that is widely used by the Cyber-Physical System (CPS) industry to model dynamical systems. There are two mainstream approaches to verify CPS Simulink models: model testing, which attempts to identify failures in models by executing them for a number of sampled test inputs, and model checking, which attempts to exhaustively check the correctness of models against some given formal properties. In this paper, we present an industrial Simulink model benchmark, provide a categorization of different model types in the benchmark, describe the recurring logical patterns in the model requirements, and discuss the results of applying model checking and model testing approaches to identify requirements violations in the benchmarked models. Based on the results, we discuss the strengths and weaknesses of model testing and model checking. Our results further suggest that model checking and model testing are complementary and that, by combining them, we can significantly enhance the capabilities of each of these approaches individually. We conclude by providing guidelines as to how the two approaches can be best applied together.

Gonzalez Perez, Carlos Alberto
in Proceedings of ACM/IEEE 21st International Conference on Model Driven Engineering Languages and Systems (MODELS'18) (2018, October)

Applying traditional testing techniques to Cyber-Physical Systems (CPS) is challenging due to the deep intertwining of software and hardware, and the complex, continuous interactions between the system and its environment. To alleviate these challenges, we propose to conduct testing at early stages and over executable models of the system and its environment. Model testing of CPSs is, however, not without difficulties. The complexity and heterogeneity of CPSs render necessary the combination of different modeling formalisms to build faithful models of their different components. The execution of CPS models thus requires an execution framework supporting the co-simulation of different types of models, including models of the software (e.g., SysML), hardware (e.g., SysML or Simulink), and physical environment (e.g., Simulink). Furthermore, to enable testing in realistic conditions, the co-simulation process must be (1) fast, so that thousands of simulations can be conducted in practical time, (2) controllable, to precisely emulate the expected runtime behavior of the system, and (3) observable, by producing simulation data enabling the detection of failures. To tackle these challenges, we propose a SysML-based modeling methodology for model testing of CPSs, and an efficient SysML-Simulink co-simulation framework. Our approach was validated on a case study from the satellite domain.

Shin, Seung Yeob
in Proceedings of ACM/IEEE 21st International Conference on Model Driven Engineering Languages and Systems (MODELS'18) (2018, October)

Hardware-in-the-loop (HiL) testing is an important step in the development of cyber-physical systems (CPS). CPS HiL test cases manipulate hardware components, are time-consuming, and their behaviors are impacted by the uncertainties in the CPS environment. To mitigate the risks associated with HiL testing, engineers have to ensure that (1) HiL test cases are well-behaved, i.e., they implement valid test scenarios and do not accidentally damage hardware, and (2) HiL test cases can execute within the time budget allotted to HiL testing. This paper proposes an approach to help engineers systematically specify and analyze CPS HiL test cases. Leveraging the UML profile mechanism, we develop an executable domain-specific language, HITECS, for HiL test case specification. HITECS builds on the UML Testing Profile (UTP) and the UML action language (Alf). Using HITECS, we provide analysis methods to check whether HiL test cases are well-behaved, and to estimate the execution times of these test cases before the actual HiL testing stage. We apply HITECS to an industrial case study from the satellite domain. Our results show that: (1) HITECS is feasible to use in practice; (2) HITECS helps engineers define more complete and effective well-behavedness assertions for HiL test cases, compared to when these assertions are defined without systematic guidance; (3) HITECS verifies in practical time that HiL test cases are well-behaved; and (4) HITECS accurately estimates HiL test case execution times.

…; Briand, Lionel
in IEEE Software (2018), 35(5), 44-49

Software engineering is not only an increasingly challenging endeavor that goes beyond the intellectual capabilities of any single individual engineer, but is also an intensely human one. Tools and methods to develop software are employed by engineers of varied backgrounds within a large variety of organizations and application domains. As a result, the variation in challenges and practices in system requirements, architecture, and quality assurance is staggering. Human, domain and organizational factors define the context within which software engineering methodologies and technologies are to be applied and therefore the context that research needs to account for, if it is to be impactful. This paper provides an assessment of the current challenges faced by software engineering research in achieving its potential, a description of the root causes of such challenges, and a proposal for the field to move forward and become more impactful through collaborative research and innovation between public research and industry.

Shin, Seung Yeob
in Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'18) (2018, July)

Acceptance testing validates that a system meets its requirements and determines whether it can be sufficiently trusted and put into operation. For cyber-physical systems (CPS), acceptance testing is a hardware-in-the-loop process conducted in a (near-)operational environment. Acceptance testing of a CPS often necessitates that the test cases be prioritized, as there are usually too many scenarios to consider given time constraints. CPS acceptance testing is further complicated by the uncertainty in the environment and the impact of testing on hardware. We propose an automated test case prioritization approach for CPS acceptance testing, accounting for time budget constraints, uncertainty, and hardware damage risks. Our approach is based on multi-objective search, combined with a test case minimization algorithm that eliminates redundant operations from an ordered sequence of test cases. We evaluate our approach on a representative case study from the satellite domain. The results indicate that, compared to test cases that are prioritized manually by satellite engineers, our automated approach more than doubles the number of test cases that fit into a given time frame, while reducing to less than one third the number of operations that entail the risk of damage to key hardware components.
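
The minimization step in the last entry lends itself to a small illustration: once the prioritized order is fixed, operations that merely undo and redo hardware state between consecutive test cases can be dropped. The operation vocabulary below is invented, and the paper's algorithm is more general.

```python
# Illustration of the minimization idea (invented operation vocabulary;
# the paper's redundancy-elimination algorithm is more general).
def minimize(ordered_tests):
    ops = []
    for test in ordered_tests:
        for op in test:
            # Skip a power-off immediately followed by a power-on of the
            # same unit: the hardware can simply stay on between tests.
            if ops and ops[-1] == ("off", op[1]) and op[0] == "on":
                ops.pop()
                continue
            ops.append(op)
    return ops

t1 = [("on", "antenna"), ("measure", "antenna"), ("off", "antenna")]
t2 = [("on", "antenna"), ("rotate", "antenna"), ("off", "antenna")]
print(minimize([t1, t2]))
# -> on, measure, rotate, off: one power cycle saved between the tests
```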

Gonzalez Perez, Carlos Alberto
Report (2018)

…; Nejati, Shiva
in Empirical Software Engineering (2018), 24(1), 444-490

One promising way to improve the accuracy of fault localization based on statistical debugging is to increase diversity among test cases in the underlying test suite. In many practical situations, adding test cases is not a cost-free option because test oracles are developed manually or running test cases is expensive. Hence, we need test suites that are both diverse and small to improve debugging. In this paper, we focus on improving fault localization of Simulink models by generating test cases. We identify four test objectives that aim to increase test suite diversity. We use these four objectives in a search-based algorithm to generate diversified but small test suites. To further minimize test suite sizes, we develop a prediction model to stop test generation when adding test cases is unlikely to improve fault localization. We evaluate our approach using three industrial subjects. Our results show that (1) expanding the test suites used for fault localization using any of our four test objectives, even when the expansion is small, can significantly improve the accuracy of fault localization; (2) varying the test objectives used to generate the initial test suites for fault localization does not have a significant impact on the fault localization results obtained based on those test suites; and (3) we identify an optimal configuration for prediction models to help stop test generation when it is unlikely to be beneficial. We further show that our optimal prediction model is able to maintain almost the same fault localization accuracy while reducing the average number of newly generated test cases by more than half.
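
The stopping idea in the last entry can be shown in miniature. The paper learns a prediction model over the generated tests; the sketch below substitutes a much simpler no-improvement rule, so it captures only the shape of the mechanism, not the paper's predictor.

```python
# Shape of the stopping decision (the paper learns a prediction model;
# a plain no-improvement rule stands in to keep the sketch short).
def should_stop(accuracy_history, window=5, epsilon=0.01):
    """Stop test generation once the fault-localization accuracy proxy
    improved by less than epsilon over the last `window` rounds."""
    if len(accuracy_history) <= window:
        return False
    return accuracy_history[-1] - accuracy_history[-1 - window] < epsilon

history = [0.40, 0.55, 0.63, 0.665, 0.668, 0.670, 0.671, 0.672, 0.672]
print(should_stop(history))  # True: the last five rounds added little
```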