Results 1-2 of 2.
((uid:50009230))

Bookmark and Share    
Full Text
Peer Reviewed
See detailEuphony: Harmonious Unification of Cacophonous Anti-Virus Vendor Labels for Android Malware
Hurier, Médéric UL; Suarez-Tangil, Guillermo; Dash, Santanu Kumar et al

in MSR 2017 (2017, May 21)

Android malware is now pervasive and evolving rapidly. Thousands of malware samples are discovered every day with new models of attacks. The growth of these threats has come hand in hand with the ... [more ▼]

Android malware is now pervasive and evolving rapidly. Thousands of malware samples are discovered every day with new models of attacks. The growth of these threats has come hand in hand with the proliferation of collective repositories sharing the latest specimens. Having access to a large number of samples opens new research directions aiming at efficiently vetting apps. However, automatically inferring a reference ground-truth from those repositories is not straightforward and can inadvertently lead to unforeseen misconceptions. On the one hand, samples are often mis-labeled as different parties use distinct naming schemes for the same sample. On the other hand, samples are frequently mis-classified due to conceptual errors made during labeling processes. In this paper, we analyze the associations between all labels given by different vendors and we propose a system called EUPHONY to systematically unify common samples into family groups. The key novelty of our approach is that no a-priori knowledge on malware families is needed. We evaluate our approach using reference datasets and more than 0.4 million additional samples outside of these datasets. Results show that EUPHONY provides competitive performance against the state-of-the-art. [less ▲]

Detailed reference viewed: 175 (14 UL)
Full Text
Peer Reviewed
See detailOn the Lack of Consensus in Anti-Virus Decisions: Metrics and Insights on Building Ground Truths of Android Malware
Hurier, Médéric UL; Allix, Kevin UL; Bissyande, Tegawendé François D Assise UL et al

in Detection of Intrusions and Malware, and Vulnerability Assessment - 13th International Conference (2016)

There is generally a lack of consensus in Antivirus (AV) engines' decisions on a given sample. This challenges the building of authoritative ground-truth datasets. Instead, researchers and practitioners ... [more ▼]

There is generally a lack of consensus in Antivirus (AV) engines' decisions on a given sample. This challenges the building of authoritative ground-truth datasets. Instead, researchers and practitioners may rely on unvalidated approaches to build their ground truth, e.g., by considering decisions from a selected set of Antivirus vendors or by setting up a threshold number of positive detections before classifying a sample. Both approaches are biased as they implicitly either decide on ranking AV products, or they consider that all AV decisions have equal weights. In this paper, we extensively investigate the lack of agreement among AV engines. To that end, we propose a set of metrics that quantitatively describe the different dimensions of this lack of consensus. We show how our metrics can bring important insights by using the detection results of 66 AV products on 2 million Android apps as a case study. Our analysis focuses not only on AV binary decision but also on the notoriously hard problem of labels that AVs associate with suspicious files, and allows to highlight biases hidden in the collection of a malware ground truth---a foundation stone of any machine learning-based malware detection approach. [less ▲]

Detailed reference viewed: 283 (19 UL)