Shin Yoo and Mark Harman. Regression testing minimization, selection and prioritization: A survey. Software Testing, Verification and Reliability, 22 (2): 67-120, 2012.
Saif Ur Rehman Khan, Sai Peck Lee, Nadeem Javaid, and Wadood Abdul. A systematic review on test suite reduction: Approaches, experiment's quality evaluation, and guidelines. IEEE Access, 6: 11816-11841, 2018.
Sebastian Elbaum, Gregg Rothermel, and John Penix. Techniques for improving regression testing in continuous integration development environments. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 235-245, 2014.
Kim Herzig. Testing and continuous integration at scale: Limits, costs, and expectations. In Proceedings of the 11th International Workshop on Search-Based Software Testing, pages 38-38, 2018.
Emilio Cruciani, Breno Miranda, Roberto Verdecchia, and Antonia Bertolino. Scalable approaches for test suite reduction. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pages 419-429. IEEE, 2019.
Raphael Noemmer and Roman Haas. An evaluation of test suite minimization techniques. In International Conference on Software Quality, pages 51-66. Springer, 2020.
Marko Vasic, Zuhair Parvez, Aleksandar Milicevic, and Milos Gligoric. File-level vs. module-level regression test selection for .NET. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pages 848-853, 2017.
Milos Gligoric, Lamyaa Eloussi, and Darko Marinov. Practical regression test selection with dynamic file dependencies. In Proceedings of the 2015 International Symposium on Software Testing and Analysis, pages 211-222, 2015.
Lingming Zhang, Dan Hao, Lu Zhang, Gregg Rothermel, and Hong Mei. Bridging the gap between the total and additional test-case prioritization strategies. In 2013 35th International Conference on Software Engineering (ICSE), pages 192-201. IEEE, 2013.
Hong Mei, Dan Hao, Lingming Zhang, Lu Zhang, Ji Zhou, and Gregg Rothermel. A static approach to prioritizing JUnit test cases. IEEE Transactions on Software Engineering, 38 (6): 1258-1275, 2012.
Heleno de S. Campos Junior, Marco Antônio P Araújo, José Maria N David, Regina Braga, Fernanda Campos, and Victor Ströele. Test case prioritization: A systematic review and mapping of the literature. In Proceedings of the 31st Brazilian Symposium on Software Engineering, pages 34-43, 2017.
Lucas Pereira da Silva and Patrícia Vilain. LCCSS: A similarity metric for identifying similar test code. In Proceedings of the 14th Brazilian Symposium on Software Components, Architectures, and Reuse, pages 91-100, 2020.
Jian Zhang, Xu Wang, Hongyu Zhang, Hailong Sun, Kaixuan Wang, and Xudong Liu. A novel neural source code representation based on abstract syntax tree. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pages 783-794. IEEE, 2019.
Robert E Noonan. An algorithm for generating abstract syntax trees. Computer Languages, 10 (3-4): 225-236, 1985.
Gabriel Valiente. Algorithms on trees and graphs. Springer Science & Business Media, 2002.
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search-Replication Package. https://doi.org/10.5281/zenodo.7455766.
Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and TAMT Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6 (2): 182-197, 2002.
J. Blank and K. Deb. Pymoo: Multi-objective optimization in Python. IEEE Access, 8: 89497-89509, 2020.
Sean Luke. Essentials of Metaheuristics. Lulu, second edition, 2013. Available at http://cs.gmu.edu/~sean/book/metaheuristics/.
Shaukat Ali, Lionel C Briand, Hadi Hemmati, and Rajwinder Kaur Panesar-Walawege. A systematic review of the application and empirical investigation of search-based test case generation. IEEE Transactions on Software Engineering, 36 (6): 742-762, 2009.
Peter D Turney and Patrick Pantel. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37: 141-188, 2010.
David Arthur and Sergei Vassilvitskii. k-means++: The advantages of careful seeding. Technical report, Stanford, 2006.
Olivier Bachem, Mario Lucic, and Andreas Krause. Scalable k-means clustering via lightweight coresets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1119-1127, 2018.
Anand Rajaraman and Jeffrey David Ullman. Mining of massive datasets. Cambridge University Press, 2011.
Michel Raymond and François Rousset. An exact test for population differentiation. Evolution, pages 1280-1283, 1995.
Andrea Arcuri and Lionel Briand. A hitchhiker's guide to statistical tests for assessing randomized algorithms in software engineering. Software Testing, Verification and Reliability, 24 (3): 219-250, 2014.
Erick Cantú-Paz et al. A survey of parallel genetic algorithms. Calculateurs parallèles, réseaux et systèmes répartis, 10 (2): 141-171, 1998.
Sourabh Katoch, Sumit Singh Chauhan, and Vijay Kumar. A review on genetic algorithm: past, present, and future. Multimedia Tools and Applications, 80 (5): 8091-8126, 2021.
Breno Miranda and Antonia Bertolino. Scope-aided test prioritization, selection and minimization for software reuse. Journal of Systems and Software, 131: 528-549, 2017.
Tsong Yueh Chen and Man Fai Lau. Heuristics towards the optimization of the size of a test suite. WIT Transactions on Information and Communication Technologies, 14, 1995.
Carmen Coviello, Simone Romano, Giuseppe Scanniello, Alessandro Marchetto, Giuliano Antoniol, and Anna Corazza. Clustering support for inadequate test suite reduction. In 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pages 95-105. IEEE, 2018.
Markos Viggiato, Dale Paas, Chris Buzon, and Cor-Paul Bezemer. Identifying similar test cases that are specified in natural language. IEEE Transactions on Software Engineering, 2022.
Hadi Hemmati, Andrea Arcuri, and Lionel Briand. Achieving scalable model-based testing through test case diversity. ACM Transactions on Software Engineering and Methodology (TOSEM), 22 (1): 1-42, 2013.
Man Zhang, Shaukat Ali, and Tao Yue. Uncertainty-wise test case generation and minimization for cyber-physical systems. Journal of Systems and Software, 153: 1-21, 2019.
Shuai Wang, Shaukat Ali, and Arnaud Gotlieb. Cost-effective test suite minimization in product lines using search techniques. Journal of Systems and Software, 103: 370-391, 2015.
Antonio J Nebro, Juan J Durillo, Francisco Luna, Bernabé Dorronsoro, and Enrique Alba. MOCell: A cellular genetic algorithm for multiobjective optimization. International Journal of Intelligent Systems, 24 (7): 726-746, 2009.
Eckart Zitzler, Marco Laumanns, and Lothar Thiele. SPEA2: Improving the strength Pareto evolutionary algorithm. TIK-report, 103, 2001.
Robert Feldt, Simon Poulding, David Clark, and Shin Yoo. Test set diameter: Quantifying the diversity of sets of test cases. In 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST), pages 223-233. IEEE, 2016.
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, et al. CodeBERT: A pre-trained model for programming and natural languages. arXiv preprint arXiv:2002.08155, 2020.
Xue Jiang, Zhuoran Zheng, Chen Lyu, Liang Li, and Lei Lyu. TreeBERT: A tree-based pre-trained model for programming language. In Uncertainty in Artificial Intelligence, pages 54-63. PMLR, 2021.