Paper published in a book (Scientific congresses, symposiums and conference proceedings)
TATA: Benchmark NIDS Test Sets Assessment and Targeted Augmentation
ANSER, Omar; FRANCOIS, Jérôme; Chrisment, Isabelle et al.
2026. In Nicomette, Vincent (Ed.), Computer Security – ESORICS 2025 - 30th European Symposium on Research in Computer Security, Proceedings
Peer reviewed
 

Files


Full Text
TATA.pdf
Author postprint (1.28 MB)

Details



Keywords :
Data augmentation; Data quality; ML; NIDS; Benchmark networks; High detection rate; Machine-learning; Network intrusion detection systems; Set assessment; System test; Test sets; Training sets; Theoretical Computer Science; Computer Science (all)
Abstract :
[en] Research on Network Intrusion Detection Systems (NIDSs) using Machine Learning (ML) usually reports very high detection rates, often well above 90%. However, these results typically originate from overly simplistic NIDS datasets in which the test set, often just a subset of the overall dataset, mirrors the training set distribution and therefore fails to rigorously assess the NIDS’s robustness under more varied conditions. To address this shortcoming, we propose a method for Test sets Assessment and Targeted Augmentation (TATA). TATA is a model-agnostic approach that assesses and augments the quality of benchmark ML-based NIDS test sets. First, TATA encodes both training and test sets in a structured latent space via a contrastive autoencoder and defines three quality metrics (diversity, proximity, and scarcity) to identify test set gaps where ML-based classification is harder. Next, TATA employs a reinforcement learning (RL) approach guided by these metrics to configure a testbed that produces realistic data specifically targeting these gaps, creating a more robust test set. Using CIC-IDS2017 and CSE-CIC-IDS2018, we observe a positive correlation between higher metric values and increased detection difficulty, confirming their utility as meaningful indicators of test set robustness. With the same datasets, TATA’s RL-based augmentation significantly raises detection difficulty for multiple NIDS models, revealing previously overlooked weaknesses.
Disciplines :
Computer science
Author, co-author :
ANSER, Omar; Inria, Université de Lorraine, CNRS, LORIA, Nancy, France
FRANCOIS, Jérôme; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN; Inria, Université de Lorraine, CNRS, LORIA, Nancy, France
Chrisment, Isabelle; Inria, Université de Lorraine, CNRS, LORIA, Nancy, France
Kondo, Daishi; Information Technology Center, The University of Tokyo, Bunkyo, Japan
External co-authors :
yes
Language :
English
Title :
TATA: Benchmark NIDS Test Sets Assessment and Targeted Augmentation
Publication date :
2026
Event name :
ESORICS 2025
Event place :
Toulouse, France
Event date :
22-09-2025
Audience :
International
Main work title :
Computer Security – ESORICS 2025 - 30th European Symposium on Research in Computer Security, Proceedings
Editor :
Nicomette, Vincent
Publisher :
Springer Science and Business Media Deutschland GmbH
ISBN/EAN :
978-3-03-207883-4
Peer reviewed :
Peer reviewed
Available on ORBilu :
since 16 December 2025
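
Illustrative sketch :
The abstract names three latent-space quality metrics (diversity, proximity, and scarcity) but this record does not give their formulas. The Python sketch below is only a rough illustration of how such metrics might be computed on already-encoded points. Every definition in it is an assumption made for illustration and should not be read as the paper's method: diversity is taken as the mean pairwise distance among test embeddings, proximity as the closeness of each test point to its nearest differently labeled training point, and scarcity as the share of test points with no training point within a hypothetical `radius`. The function names, the toy 2-D coordinates, and the labels are likewise hypothetical. Each stand-in is oriented so that a higher value suggests a harder test set, matching the direction reported in the abstract.

```python
import math
from itertools import combinations


def diversity(test_points):
    """Assumed stand-in: mean pairwise Euclidean distance among test embeddings."""
    pairs = list(combinations(test_points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def proximity(test_points, test_labels, train_points, train_labels):
    """Assumed stand-in: mean closeness 1 / (1 + d), where d is the distance
    from a test point to its nearest training point with a different label."""
    scores = []
    for point, label in zip(test_points, test_labels):
        others = [q for q, t in zip(train_points, train_labels) if t != label]
        if not others:
            continue  # no differently labeled training point to compare with
        nearest = min(math.dist(point, q) for q in others)
        scores.append(1.0 / (1.0 + nearest))
    return sum(scores) / len(scores) if scores else 0.0


def scarcity(test_points, train_points, radius=1.0):
    """Assumed stand-in: fraction of test points with no training point
    within `radius` (a hypothetical threshold)."""
    if not test_points:
        return 0.0
    isolated = sum(
        1
        for p in test_points
        if all(math.dist(p, q) > radius for q in train_points)
    )
    return isolated / len(test_points)


# Toy 2-D "latent" coordinates; real embeddings would come from an encoder.
train_points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0)]
train_labels = ["benign", "benign", "attack"]
test_points = [(0.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
test_labels = ["benign", "benign", "attack"]

div = diversity(test_points)
prox = proximity(test_points, test_labels, train_points, train_labels)
scar = scarcity(test_points, train_points, radius=1.0)
print(f"diversity={div:.3f} proximity={prox:.3f} scarcity={scar:.3f}")
```

In this toy setup the test point at (10.0, 0.0) has no training point within the radius, so it is the only one counted as scarce. A real pipeline would replace the hand-written coordinates with encoder outputs and would likely need a vectorized distance computation for datasets the size of CIC-IDS2017.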
