Scientific journals: Article
Engineering, computing & technology: Computer science
http://hdl.handle.net/10993/50265
An Empirical Study on Data Distribution-Aware Test Selection for Deep Learning Enhancement
English
Hu, Qiang [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal]
Guo, Yuejun [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal]
Cordy, Maxime [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal]
Xie, Xiaofei [Singapore Management University]
Ma, Lei [University of Alberta]
Papadakis, Mike [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)]
Le Traon, Yves [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal]
2022
ACM Transactions on Software Engineering and Methodology
Peer reviewed: Yes
[en] deep learning testing; test selection; data distribution
[en] Similar to traditional software that is constantly under evolution, deep neural networks (DNNs) need to evolve with the rapid growth of test data for continuous enhancement, e.g., to adapt to distribution shift in a new deployment environment. However, manually labeling all the collected test data is labor-intensive. Test selection solves this problem by strategically choosing a small set to label; by retraining with the selected set, DNNs can achieve competitive accuracy. Unfortunately, existing selection metrics suffer from three main limitations: 1) they use different retraining processes; 2) they ignore data distribution shifts; 3) they are insufficiently evaluated. To address these limitations, we first conduct a systematic empirical study to reveal the impact of the retraining process and data distribution on model enhancement. Based on our findings, we then propose a novel distribution-aware test (DAT) selection metric. Experimental results reveal that retraining with both the training and selected data outperforms retraining with the selected data alone, and that no existing selection metric performs best across all data distributions. By contrast, DAT effectively alleviates the impact of distribution shifts and outperforms the compared metrics by up to 5 times on simulated distribution-shift scenarios and by up to a 30.09% accuracy improvement on in-the-wild distribution-shift scenarios.
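
As a rough illustration of the selection-then-retraining loop described in the abstract, the sketch below scores unlabeled inputs with a generic least-confidence measure and retrains on the original training data plus the newly labeled selection, the setup the study found to work better than retraining on the selected data alone. The least-confidence score is a common stand-in, not the paper's DAT metric, and the model, data shapes, and labeling budget are hypothetical placeholders.

```python
# Minimal sketch of test selection + retraining for DNN enhancement.
# The least-confidence scoring is a generic proxy, NOT the paper's DAT
# metric; all names, shapes, and hyperparameters are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

def select_for_labeling(model, unlabeled_x, budget):
    """Rank unlabeled inputs by model uncertainty and return the indices
    of the `budget` most uncertain ones (DAT additionally accounts for
    the data distribution, which this proxy does not)."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
    confidence = probs.max(dim=1).values       # top-1 probability per input
    return torch.argsort(confidence)[:budget]  # least confident first

def retrain(model, train_x, train_y, sel_x, sel_y, epochs=5, lr=1e-3):
    """Retrain on the ORIGINAL training data plus the newly labeled
    selection -- the retraining process the study found to outperform
    using the selected data alone."""
    x = torch.cat([train_x, sel_x])
    y = torch.cat([train_y, sel_y])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    return model

# Toy usage with random tensors standing in for real data.
if __name__ == "__main__":
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
    train_x, train_y = torch.randn(200, 20), torch.randint(0, 3, (200,))
    pool_x = torch.randn(500, 20)              # collected, unlabeled test data
    idx = select_for_labeling(model, pool_x, budget=50)
    sel_y = torch.randint(0, 3, (50,))         # labels obtained from annotators
    retrain(model, train_x, train_y, pool_x[idx], sel_y)
```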
Fonds National de la Recherche - FnR
CORE project C18/IS/12669767/STELLAR/LeTraon
FnR ; FNR12669767 > Yves Le Traon > STELLAR > Testing Self-learning Systems > 01/09/2019 > 31/08/2022 > 2018

File(s) associated with this reference

Fulltext file(s):

TOSEM_DAT.pdf (Author preprint, 1.77 MB, Open access)

All documents in ORBilu are protected by a user license.