Android malware detection and family classification have been extensively studied, yet localizing the exact malicious payloads within a detected sample remains a challenging and labor-intensive task. We propose RAML, a novel Retrieval-Augmented Malicious payload Localization pipeline inspired by retrieval-augmented generation (RAG), which leverages large language models (LLMs) to bridge high-level behavior descriptions and low-level Smali code. RAML generates class-level descriptions from Smali code, embeds them into a vector database, and performs semantic retrieval via similarity search. The matched candidates are re-ranked with LLM assistance, and a method-level LLM analysis then identifies the malicious methods and explains the role each one plays. Preliminary results show that RAML effectively localizes the malicious payloads corresponding to a behavioral description, narrows the analysis scope, and reduces manual effort, offering a promising direction for automated malware forensics.
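The retrieval step described in the abstract (embed class-level descriptions, then rank them by similarity to a behavior description) can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for an LLM embedding model, the in-memory dictionary stands in for a vector database, and the class names and descriptions are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, class_descriptions, top_k=2):
    """Return the names of the top_k classes most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), name)
              for name, desc in class_descriptions.items()]
    # Highest similarity first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical class-level descriptions (assumed, not taken from the paper).
CLASS_DESCRIPTIONS = {
    "Lcom/example/SmsSender;":
        "sends sms messages to premium numbers without user consent",
    "Lcom/example/MainActivity;":
        "displays the main screen and handles button clicks",
    "Lcom/example/ContactReader;":
        "reads the contact list and uploads contacts to a remote server",
}

candidates = retrieve("app sends premium sms messages", CLASS_DESCRIPTIONS, top_k=1)
print(candidates)
```

In a pipeline of the kind the abstract outlines, the returned candidates would then be passed on to the re-ranking and method-level analysis stages, which are not sketched here.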
Disciplines :
Computer science
Author, co-author :
SUN, Tiezhu ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
ALECCI, Marco ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
SONG, Yewei ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
TANG, Xunzhu ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
KLEIN, Jacques ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
External co-authors :
yes
Language :
English
Title :
RAML: Toward Retrieval-Augmented Localization of Malicious Payloads in Android Apps
Publication date :
16 November 2025
Event name :
The 40th IEEE/ACM International Conference on Automated Software Engineering, ASE 2025
Event date :
16 - 20 November 2025
Audience :
International
Main work title :
The 40th IEEE/ACM International Conference on Automated Software Engineering, ASE 2025
Publisher :
IEEE/ACM
Peer reviewed :
Peer reviewed
FnR Project :
FNR16344458 - REPROCESS - Pre And Post Processing For Comprehensive And Practical Android App Static Analysis, 2021 (01/07/2022-30/06/2025) - Jacques Klein
FNR18154263 - UNLOCK - Breaking The Barriers Of Android Dynamic Analysis With Static Analysis, 2023 (01/01/2024-31/12/2026) - Jacques Klein