Paper published in a book (Scientific congresses, symposiums and conference proceedings)
Large-scale Machine Learning-based Malware Detection: Confronting the "10-fold Cross Validation" Scheme with Reality
ALLIX, Kevin; BISSYANDE, Tegawendé François D Assise; JEROME, Quentin et al.
2014, in Proceedings of the 4th ACM Conference on Data and Application Security and Privacy
Peer reviewed
 

Files


Full Text
p163-allix.pdf
Publisher postprint (578.88 kB)
Details



Keywords :
android; machine learning; malware; ten-fold
Abstract :
[en] To address the issue of malware detection, researchers have recently started to investigate the capabilities of machine-learning techniques for proposing effective approaches. Several promising results were recorded in the literature, many approaches being assessed with the common “10-Fold cross validation” scheme. This paper revisits the purpose of malware detection to discuss the adequacy of the “10-Fold” scheme for validating techniques that may not perform well in reality. To this end, we have devised several Machine Learning classifiers that rely on a novel set of features built from applications’ CFGs. We use a sizeable dataset of over 50,000 Android applications collected from sources where state-of-the-art approaches have selected their data. We show that our approach outperforms existing machine learning-based approaches. However, this high performance on usual-size datasets does not translate into high performance in the wild.
Disciplines :
Computer science
Author, co-author :
ALLIX, Kevin ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
BISSYANDE, Tegawendé François D Assise  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
JEROME, Quentin ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
KLEIN, Jacques ;  University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
STATE, Radu  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
LE TRAON, Yves ;  University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
Language :
English
Title :
Large-scale Machine Learning-based Malware Detection: Confronting the "10-fold Cross Validation" Scheme with Reality
Publication date :
March 2014
Event name :
4th ACM Conference on Data and Application Security and Privacy
Event place :
San Antonio, Texas, United States
Event date :
from 3 March 2014 to 5 March 2014
Main work title :
Proceedings of the 4th ACM Conference on Data and Application Security and Privacy
Publisher :
ACM, New York, NY, USA
ISBN/EAN :
978-1-4503-2278-2
Collection name :
CODASPY '14
Pages :
163-166
Peer reviewed :
Peer reviewed
Available on ORBilu :
since 21 September 2014

Statistics


Number of views
356 (34 by Unilu)
Number of downloads
2170 (23 by Unilu)

Scopus citations®
22
Scopus citations® without self-citations
19
OpenCitations
12
