We present a growing collection of Android applications gathered from several sources, including the official Google Play app market. Our dataset, AndroZoo, currently contains more than three million apps, each of which has been analysed by tens of different antivirus products to determine which applications are detected as malware. We provide this dataset to contribute to ongoing research efforts and to enable new research topics on Android apps. By releasing our dataset to the research community, we also aim to encourage our fellow researchers to engage in reproducible experiments.
Disciplines:
Computer science
Author, co-author:
ALLIX, Kevin ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
KLEIN, Jacques ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
LE TRAON, Yves ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
External co-authors:
no
Document language:
English
Title:
AndroZoo: Collecting Millions of Android Apps for the Research Community
Publication/release date:
May 2016
Event name:
Mining Software Repositories 2016 (MSR)
Event location:
Austin, Texas, United States
Event date:
from 14-05-2016 to 15-05-2016
Event scope:
International
Title of the main work:
Proceedings of the 13th International Workshop on Mining Software Repositories