D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al., "Mastering the game of go with deep neural networks and tree search, " nature, vol. 529, no. 7587, pp. 484-489, 2016.
C. Badue, R. Guidolini, R. V. Carneiro, P. Azevedo, V. B. Cardoso, A. Forechi, L. Jesus, R. Berriel, T. M. Paixao, F. Mutz et al., "Selfdriving cars: A survey, " Expert Systems with Applications, vol. 165, p. 113816, 2021.
G. Hu, Y. Yang, D. Yi, J. Kittler, W. Christmas, S. Z. Li, and T. Hospedales, "When face recognition meets with deep learning: An evaluation of convolutional neural networks for face recognition, " in Proceedings of the IEEE international conference on computer vision workshops, 2015, pp. 142-150.
U. Alon, M. Zilberstein, O. Levy, and E. Yahav, "code2vec: Learning distributed representations of code, " Proceedings of the ACM on Programming Languages, vol. 3, no. POPL, pp. 1-29, 2019.
R. Puri, D. S. Kung, G. Janssen, W. Zhang, G. Domeniconi, V. Zolotov, J. Dolby, J. Chen, M. Choudhury, L. Decker et al., "Project codenet: A large-scale ai for code dataset for learning a diversity of coding tasks, " arXiv preprint arXiv:2105.12655, 2021.
A. Masood and A. Hashmi, "Aiops: Predictive analytics & machine learning in operations, " in Cognitive Computing Recipes. Springer, 2019, pp. 359-382.
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., "Language models are few-shot learners, " arXiv preprint arXiv:2005.14165, 2020.
D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu et al., "Graphcodebert: Pre-training code representations with data flow, " arXiv preprint arXiv:2009.08366, 2020.
Q. Guo, S. Chen, X. Xie, L. Ma, Q. Hu, H. Liu, Y. Liu, J. Zhao, and X. Li, "An empirical study towards characterizing deep learning development and deployment across different frameworks and platforms, " in 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2019, pp. 810-822.
X. Xie, L. Ma, H. Wang, Y. Li, Y. Liu, and X. Li, "Diffchaser: Detecting disagreements for deep neural networks." in IJCAI, 2019, pp. 5772-5778.
Y. Tian, W. Zhang, M. Wen, S.-C. Cheung, C. Sun, S. Ma, and Y. Jiang, "Fast test input generation for finding deviated behaviors in compressed deep neural network, " arXiv preprint arXiv:2112.02819, 2021.
Z. Chen, Y. Cao, Y. Liu, H. Wang, T. Xie, and X. Liu, "A comprehensive study on challenges in deploying deep learning based software, " in Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020, pp. 750-762.
M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., "Tensorflow: A system for largescale machine learning, " in 12th {USENIX} symposium on operating systems design and implementation ({OSDI} 16), 2016, pp. 265-283.
M. Thakkar, Beginning machine learning in ios: CoreML framework, 1st ed. APress, 2019.
P. W. Koh, S. Sagawa, H. Marklund, S. M. Xie, M. Zhang, A. Balsubramani, W. Hu, M. Yasunaga, R. L. Phillips, I. Gao et al., "Wilds: A benchmark of in-the-wild distribution shifts, " in International Conference on Machine Learning. PMLR, 2021, pp. 5637-5664.
D. Berend, X. Xie, L. Ma, L. Zhou, Y. Liu, C. Xu, and J. Zhao, "Cats are not fish: Deep learning testing calls for out-of-distribution awareness, " in Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020, pp. 1041-1052.
R. Hu, J. Sang, J. Wang, and C. Jiang, "Understanding and testing generalization of deep networks on out-of-distribution data, " arXiv preprint arXiv:2111.09190, 2021.
S. Dola, M. B. Dwyer, and M. L. Soffa, "Distribution-aware testing of neural networks using generative models, " in 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 2021, pp. 226-237.
B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko, "Quantization and training of neural networks for efficient integer-arithmetic-only inference, " in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 2704-2713.
I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples, " arXiv preprint arXiv:1412.6572, 2014.
H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, "mixup: Beyond empirical risk minimization, " arXiv preprint arXiv:1710.09412, 2017.
"Project website, " 2023. [Online]. Available: Https://github.com/Anony4paper/quan study
I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT press, 2016.
G. Shomron, F. Gabbay, S. Kurzum, and U. Weiser, "Post-training sparsity-aware quantization, " arXiv preprint arXiv:2105.11010, 2021.
I. Hubara, Y. Nahshan, Y. Hanani, R. Banner, and D. Soudry, "Accurate post training quantization with small calibration sets, " in Proceedings of the 38th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, M. Meila and T. Zhang, Eds., vol. 139. PMLR, 18-24 Jul 2021, pp. 4466-4475. [Online]. Available: Https://proceedings.mlr.press/v139/hubara21a.html
Y. Li, R. Gong, X. Tan, Y. Yang, P. Hu, Q. Zhang, F. Yu, W. Wang, and S. Gu, "Brecq: Pushing the limit of post-training quantization by block reconstruction, " arXiv preprint arXiv:2102.05426, 2021.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition, " Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
A. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, "Learning word vectors for sentiment analysis, " in Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies, 2011, pp. 142-150.
N. Mu and J. Gilmer, "Mnist-c: A robustness benchmark for computer vision, " arXiv preprint arXiv:1906.02337, 2019.
D. Hendrycks and T. Dietterich, "Benchmarking neural network robustness to common corruptions and perturbations, " arXiv preprint arXiv:1903.12261, 2019.
Q. Hu, Y. Guo, M. Cordy, X. Xie, L. Ma, M. Papadakis, and Y. Le Traon, "An empirical study on data distribution-aware test selection for deep learning enhancement (in press), " ACM Transactions on Software Engineering and Methodology (TOSEM), 2022. [Online]. Available: Https://orbilu.uni.lu/handle/10993/50265
A. Krizhevsky, "Learning multiple layers of features from tiny images, " University of Toronto, Toronto, Tech. Rep., 2009.
M. Lin, Q. Chen, and S. Yan, "Network in network, " arXiv preprint arXiv:1312.4400, 2013.
K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition, " in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778.
G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks, " in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 4700-4708.
S. Hochreiter and J. Schmidhuber, "Long short-term memory, " Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling, " arXiv preprint arXiv:1412.3555, 2014.
D. Hendrycks, N. Mu, E. D. Cubuk, B. Zoph, J. Gilmer, and B. Lakshminarayanan, "Augmix: A simple data processing method to improve robustness and uncertainty, " arXiv preprint arXiv:1912.02781, 2019.
T. Fawcett, "An introduction to roc analysis, " Pattern recognition letters, vol. 27, no. 8, pp. 861-874, 2006.
Q. Hu, Y. Guo, M. Cordy, X. Xie, W. Ma, M. Papadakis, and Y. Le Traon, "Towards exploring the limitations of active learning: An empirical study, " in 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2021, pp. 917-929.
W. Ma, M. Papadakis, A. Tsakmalis, M. Cordy, and Y. L. Traon, "Test selection for deep learning systems, " ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 30, no. 2, pp. 1-22, 2021.
C. E. Shannon, "A mathematical theory of communication, " The Bell system technical journal, vol. 27, no. 3, pp. 379-423, 1948.
Y. Feng, Q. Shi, X. Gao, J. Wan, C. Fang, and Z. Chen, "Deepgini: Prioritizing massive tests to enhance the robustness of deep neural networks, " in Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020, pp. 177-188.
D. Wang and Y. Shang, "A new active labeling method for deep learning, " in 2014 International joint conference on neural networks (IJCNN). IEEE, 2014, pp. 112-119.
B. Settles, "Active learning literature survey, " 2009.
A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks, " arXiv preprint arXiv:1706.06083, 2017.
S. Ren, Y. Deng, K. He, and W. Che, "Generating natural language adversarial examples through probability weighted word saliency, " in Proceedings of the 57th annual meeting of the association for computational linguistics, 2019, pp. 1085-1097.
Y. Fu, Q. Yu, M. Li, V. Chandra, and Y. Lin, "Double-win quant: Aggressively winning robustness of quantized deep neural networks via random precision training and inference, " in Proceedings of the 38th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, M. Meila and T. Zhang, Eds., vol. 139. PMLR, 18-24 Jul 2021, pp. 3492-3504. [Online]. Available: Https://proceedings.mlr.press/v139/fu21c.html
P. Bielik and M. Vechev, "Adversarial robustness for code, " in Proceedings of the 37th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, H. D. III and A. Singh, Eds., vol. 119. PMLR, 13-18 Jul 2020, pp. 896-907. [Online]. Available: Https://proceedings.mlr.press/v119/bielik20a.html
O. Sagi and L. Rokach, "Ensemble learning: A survey, " WIREs Data Mining and Knowledge Discovery, vol. 8, no. 4, p. e1249, 2018. [Online]. Available: Https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1249
L. Li, Q. Hu, X. Wu, and D. Yu, "Exploration of classification confidence in ensemble learning, " Pattern Recognition, vol. 47, no. 9, pp. 3120-3131, 2014. [Online]. Available: Https://www.sciencedirect.com/science/article/pii/S0031320314001198
Y. Gal and Z. Ghahramani, "Dropout as a bayesian approximation: Representing model uncertainty in deep learning, " in international conference on machine learning. PMLR, 2016, pp. 1050-1059.
L. Ma, F. Juefei-Xu, M. Xue, Q. Hu, S. Chen, B. Li, Y. Liu, J. Zhao, J. Yin, and S. See, "Secure deep learning engineering: A software quality assurance perspective, " arXiv preprint arXiv:1810.04538, 2018.
J. M. Zhang, M. Harman, L. Ma, and Y. Liu, "Machine learning testing: Survey, landscapes and horizons, " IEEE Transactions on Software Engineering, 2020.
J. Kim, R. Feldt, and S. Yoo, "Guiding deep learning system testing using surprise adequacy, " in 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE, 2019, pp. 1039-1049.
Y. Tian, K. Pei, S. Jana, and B. Ray, "Deeptest: Automated testing of deep-neural-network-driven autonomous cars, " in Proceedings of the 40th international conference on software engineering, 2018, pp. 303-314.
X. Gao, R. K. Saha, M. R. Prasad, and A. Roychoudhury, "Fuzz testing based data augmentation to improve robustness of deep neural networks, " in 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). IEEE, 2020, pp. 1147-1158.
Q. Hu, L. Ma, X. Xie, B. Yu, Y. Liu, and J. Zhao, "Deepmutation++: A mutation testing framework for deep learning systems, " in 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2019, pp. 1158-1161.
L. Ma, F. Juefei-Xu, F. Zhang, J. Sun, M. Xue, B. Li, C. Chen, T. Su, L. Li, Y. Liu et al., "Deepgauge: Multi-granularity testing criteria for deep learning systems, " in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 120-131.
X. Xie, L. Ma, F. Juefei-Xu, M. Xue, H. Chen, Y. Liu, J. Zhao, B. Li, J. Yin, and S. See, "Deephunter: A coverage-guided fuzz testing framework for deep neural networks, " in Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2019, pp. 146-157.
J. Guo, Y. Jiang, Y. Zhao, Q. Chen, and J. Sun, "Dlfuzz: Differential fuzzing testing of deep learning systems, " in Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018, pp. 739-743.
V. Riccio, N. Humbatova, G. Jahangirova, and P. Tonella, "Deepmetis: Augmenting a deep learning test set to increase its mutation score, " arXiv preprint arXiv:2109.07514, 2021.
J. Chen, Z. Wu, Z. Wang, H. You, L. Zhang, and M. Yan, "Practical accuracy estimation for efficient deep neural network testing, " ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 29, no. 4, pp. 1-35, 2020.
Z. Li, X. Ma, C. Xu, C. Cao, J. Xu, and J. Lü, "Boosting operational dnn testing efficiency through conditioning, " in Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019, pp. 499-509.
T. Zhang, C. Gao, L. Ma, M. R. Lyu, and M. Kim, "An empirical study of common challenges in developing deep learning applications, " 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE), pp. 104-115, 2019.
Z. Chen, H. Yao, Y. Lou, Y. Cao, Y. Liu, H. Wang, and X. Liu, "An empirical study on deployment faults of deep learning based mobile applications, " 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pp. 674-685, 2021.