Test Input Prioritization for Machine Learning Classifiers

DANG, Xueqi; LI, Yinghua; PAPADAKIS, Mike; KLEIN, Jacques; BISSYANDE, Tegawendé François d Assise; LE TRAON, Yves

doi:10.1109/TSE.2024.3350019

Article (Scientific journals)

2024 • In IEEE Transactions on Software Engineering, 50 (3), p. 413 - 442

Peer Reviewed verified by ORBi

Permalink
https://hdl.handle.net/10993/62232

Files

Full Text

2024_TSE_MLPrior.pdf

Publisher postprint (1.54 MB)

Details

Disciplines :

Computer science

Author, co-author :

DANG, Xueqi ^✱; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal

LI, Yinghua ^✱; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX

PAPADAKIS, Mike ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal

KLEIN, Jacques ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX

BISSYANDE, Tegawendé François d Assise ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX

LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

^✱ These authors have contributed equally to this work.

External co-authors :

Language :

English

Title :

Test Input Prioritization for Machine Learning Classifiers

Publication date :

05 January 2024

Journal title :

IEEE Transactions on Software Engineering

ISSN :

0098-5589

eISSN :

1939-3520

Publisher :

Institute of Electrical and Electronics Engineers Inc.

Volume :

50

Issue :

3

Pages :

413 - 442

Peer reviewed :

Peer Reviewed verified by ORBi

Focus Area :

Computational Sciences

Additional URL :

http://xplorestaging.ieee.org/ielx7/32/10473597/10382258.pdf?arnumber=10382258

European Projects :

H2020 - 949014 - NATURAL - Natural Program Repair

FnR Project :

FNR17036341 - Towards Improving The Robustness Of Graph Neural Network Models: An Empirical Study, 2022 (01/08/2022-31/07/2025) - Xueqi Dang

Funders :

Luxembourg National Research Fund AFR PhD
European Research Council
Luxembourg National Research Funds
European Union

Funding number :

AFR PhD 17036341; C20/IS/14761415/TestFlakes; H2020 - 949014 - NATURAL

Available on ORBilu :

since 18 October 2024
