Test Data Generation; Usage-based Statistical Testing; Model-Driven Engineering; UML; OCL
Abstract:
Usage-based statistical testing employs knowledge about the actual or anticipated usage profile of the system under test for estimating system reliability. For many systems, usage-based statistical testing involves generating synthetic test data. Such data must possess the same statistical characteristics as the actual data that the system will process during operation. Synthetic test data must further satisfy any logical validity constraints that the actual data is subject to. Targeting data-intensive systems, we propose an approach for generating synthetic test data that is both statistically representative and logically valid. The approach works by first generating a data sample that meets the desired statistical characteristics, without taking into account the logical constraints. Subsequently, the approach tweaks the generated sample to fix any logical constraint violations. The tweaking process is iterative and continuously guided toward achieving the desired statistical characteristics. We report on a realistic evaluation of the approach, where we generate a synthetic population of citizens' records for testing a public administration IT system. Results suggest that our approach is scalable and capable o
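The two-phase process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the paper targets UML models with OCL constraints); the attributes, target frequencies, validity rules, the L1 distance, and the greedy repair strategy below are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

# Hypothetical target usage profile (illustrative numbers, not from the paper).
TARGET = {
    "age_band": {"child": 0.20, "adult": 0.60, "senior": 0.20},
    "status": {"student": 0.25, "employed": 0.55, "retired": 0.20},
}


def is_valid(rec):
    """Hypothetical validity constraints: only seniors retire; children are students."""
    if rec["status"] == "retired" and rec["age_band"] != "senior":
        return False
    if rec["age_band"] == "child" and rec["status"] != "student":
        return False
    return True


# Every attribute combination that satisfies the constraints.
VALID_COMBOS = [
    {"age_band": a, "status": s}
    for a in TARGET["age_band"]
    for s in TARGET["status"]
    if is_valid({"age_band": a, "status": s})
]


def distance(sample):
    """L1 distance between the sample's relative frequencies and TARGET (assumed metric)."""
    n = len(sample)
    total = 0.0
    for attr, target in TARGET.items():
        counts = Counter(rec[attr] for rec in sample)
        total += sum(abs(counts[value] / n - p) for value, p in target.items())
    return total


def generate_sample(n, rng):
    """Phase 1: draw each attribute independently, ignoring the constraints."""
    return [
        {
            attr: rng.choices(list(dist), weights=list(dist.values()))[0]
            for attr, dist in TARGET.items()
        }
        for _ in range(n)
    ]


def repair(sample):
    """Phase 2: replace each invalid record with the valid combination that
    keeps the whole sample closest to the target distributions."""
    for i, rec in enumerate(sample):
        if is_valid(rec):
            continue
        best = min(
            VALID_COMBOS,
            key=lambda cand: distance(sample[:i] + [cand] + sample[i + 1:]),
        )
        sample[i] = dict(best)
    return sample


if __name__ == "__main__":
    rng = random.Random(42)
    population = generate_sample(300, rng)
    print("invalid before:", sum(not is_valid(r) for r in population))
    print("distance before:", round(distance(population), 3))
    repair(population)
    print("invalid after:", sum(not is_valid(r) for r in population))
    print("distance after:", round(distance(population), 3))
```

The sketch fixes one record at a time and picks the replacement greedily, which keeps it short. A realistic generator would need a constraint solver to produce valid candidates for richer data models.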
Research centre:
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Software Verification and Validation Lab (SVV Lab)
Disciplines:
Computer science
Author, co-author:
SOLTANA, Ghanem ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
SABETZADEH, Mehrdad ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
BRIAND, Lionel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
External co-authors:
No
Document language:
English
Title:
Synthetic Data Generation for Statistical Testing
Publication/release date:
2017
Event name:
32nd IEEE/ACM International Conference on Automated Software Engineering (ASE'17)
Event location:
Illinois, United States
Event date:
from 30-10-2017 to 03-11-2017
Event scope:
International
Main work title:
32nd IEEE/ACM International Conference on Automated Software Engineering (ASE'17)
Publisher:
IEEE
ISBN/EAN :
978-1-5386-2684-9
Pages:
872-882
Peer reviewed :
Peer reviewed
Focus Area :
Security, Reliability and Trust
European project:
H2020 - 694277 - TUNE - Testing the Untestable: Model Testing of Complex Software-Intensive Systems