Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
http://hdl.handle.net/10993/20299
Are Your Training Datasets Yet Relevant? - An Investigation into the Importance of Timeline in Machine Learning-Based Malware Detection
English
Allix, Kevin [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Bissyande, Tegawendé François D Assise [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Klein, Jacques [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Le Traon, Yves [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC) ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
2015
Engineering Secure Software and Systems - 7th International Symposium ESSoS 2015, Milan, Italy, March 4-6, 2015. Proceedings
Springer International Publishing
51-67
Yes
978-3-319-15617-0
7th International Symposium on Engineering Secure Software and Systems, ESSoS'15
from 04-03-2015 to 06-03-2015
Milano
Italy
[en] Machine Learning ; Malware Detection ; Time ; Android
[en] In this paper, we consider the relevance of timeline in the construction of datasets,
to highlight its impact on the performance of a machine learning-based malware
detection scheme. Typically, we show that simply picking a random set of known
malware to train a malware detector, as it is done in many assessment scenarios
from the literature, yields significantly biased results. In the process of assessing
the extent of this impact through various experiments, we were also able to
confirm a number of intuitive assumptions about Android malware. For instance,
we discuss the existence of Android malware lineages and how they could impact
the performance of malware detection in the wild.
University of Luxembourg: High Performance Computing - ULHPC
10.1007/978-3-319-15618-7_5
http://dx.doi.org/10.1007/978-3-319-15618-7_5

File(s) associated with this reference

Fulltext file(s):

File: essos15_preprint.pdf
Version: Author preprint
Size: 626.85 kB
Access: Open access


All documents in ORBilu are protected by a user license.