Reference : Machine Learning-Based Malware Detection for Android Applications: History Matters!
Reports : Other
Engineering, computing & technology : Computer science
http://hdl.handle.net/10993/17251
English
Allix, Kevin [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT); University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)]
Bissyande, Tegawendé François D Assise [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Klein, Jacques [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)]
Le Traon, Yves [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)]
26-May-2014
University of Luxembourg, SnT
17
978-2-87971-132-4
Luxembourg
Luxembourg
[en] Machine Learning-based malware detection is a promising, scalable method for identifying suspicious applications. In particular, in today’s mobile computing realm, where thousands of applications are poured into markets daily, such a technique could be valuable for guaranteeing strong filtering of malicious apps. The success of machine-learning approaches, however, is highly dependent on (1) the quality of the datasets that are used for training and (2) the appropriateness of the tested datasets with regard to the built classifiers. Unfortunately, there is scarce mention of these aspects in the evaluation of existing state-of-the-art approaches in the literature.
In this paper, we consider the relevance of history in the construction of datasets, to highlight its impact on the performance of the malware detection scheme. Typically, we show that simply picking a random set of known malware to train a malware detector, as is done in most assessment scenarios from the literature, yields significantly biased results. In the process of assessing the extent of this impact through various experiments, we were also able to confirm a number of intuitive assumptions about Android malware. For instance, we discuss the existence of Android malware lineages and how they could impact the performance of malware detection in the wild.
University of Luxembourg: High Performance Computing - ULHPC
Researchers

File(s) associated to this reference

Fulltext file(s):

File: history_matters.pdf
Version: Publisher postprint
Size: 482.19 kB
Access: Open access

