Reference : IntJect: Vulnerability Intent Bug Seeding
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Security, Reliability and Trust
http://hdl.handle.net/10993/53858
IntJect: Vulnerability Intent Bug Seeding
English
Petit, Benjamin [University of Namur, Namur, Belgium]
Khanfir, Ahmed [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal]
Soremekun, Ezekiel [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal]
Perrouin, Gilles [University of Namur, Namur, Belgium]
Papadakis, Michail [University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Life Sciences and Medicine (DLSM)]
2022
22nd IEEE International Conference on Software Quality, Reliability, and Security
Yes
[en] Software Vulnerabilities ; Vulnerability injection ; Software Security
[en] Studying and exposing software vulnerabilities is important to ensure software security, safety, and reliability. Software engineers often inject vulnerabilities into their programs to test the reliability of their test suites, vulnerability detectors, and security measures. However, state-of-the-art vulnerability injection methods capture only code syntax and patterns: they do not learn the intent of the vulnerability and are limited to the syntax of the original dataset. To address this challenge, we propose the first intent-based vulnerability injection method that learns both the program syntax and the vulnerability intent. Our approach combines NLP methods with semantic-preserving program mutations (at the bytecode level) to inject code vulnerabilities. Given a dataset of known vulnerabilities (containing pairs of benign and vulnerable code), our approach first applies semantic-preserving program mutations to transform the existing dataset into semantically similar code. It then learns the intent of the vulnerability via neural machine translation (Seq2Seq) models. The key insight is to employ Seq2Seq to learn the intent (context) of the vulnerable code in a manner that is agnostic to the specific program instance. We evaluate the performance of our approach using 1275 vulnerabilities belonging to five CWEs from the Juliet test suite. We examine the effectiveness of our approach in producing compilable and vulnerable code. Our results show that IntJect is effective: almost all (99%) of the code produced by our approach is vulnerable and compilable. We also demonstrate that the vulnerable programs generated by IntJect are semantically similar to the withheld original vulnerable code. Finally, we show that our mutation-based data transformation approach outperforms its alternatives, namely data obfuscation and using the original data.
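The data transformation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper mutates bytecode, whereas this toy example works on source text and uses a single mutation, consistent identifier renaming, chosen here only because it is easy to show. The function names, the keyword set, and the sample pair are all hypothetical. The point it illustrates is that applying the same mutation to both halves of a (benign, vulnerable) pair changes the surface syntax while leaving the difference between the two halves intact.

```python
import re

# Hypothetical, deliberately tiny keyword set for a Java-like language.
KEYWORDS = {"int", "char", "if", "else", "for", "while", "return", "new", "void"}
IDENT = re.compile(r"\b[A-Za-z_]\w*\b")


def rename_identifiers(code, mapping):
    """Replace each identifier found in `mapping`, leaving keywords untouched."""
    def swap(match):
        name = match.group(0)
        return name if name in KEYWORDS else mapping.get(name, name)
    return IDENT.sub(swap, code)


def mutate_pair(benign, vulnerable, suffix):
    """Apply one shared renaming to both halves of a (benign, vulnerable) pair."""
    names = set(IDENT.findall(benign)) | set(IDENT.findall(vulnerable))
    mapping = {name: f"{name}_{suffix}" for name in names - KEYWORDS}
    return (rename_identifiers(benign, mapping),
            rename_identifiers(vulnerable, mapping))


def augment(pairs, variants=2):
    """Return the original pairs followed by `variants` renamed copies of each."""
    out = list(pairs)
    for benign, vulnerable in pairs:
        for i in range(variants):
            out.append(mutate_pair(benign, vulnerable, f"v{i}"))
    return out


# Toy pair: the vulnerable version drops the bounds check.
pairs = [("if (i < len) buf[i] = c;", "buf[i] = c;")]
augmented = augment(pairs)
for benign, vulnerable in augmented:
    print(benign, "=>", vulnerable)
```

In this sketch every non-keyword name is renamed, which is only safe for locally declared names; a real mutation tool would need to leave library and API names alone. The augmented pairs would then serve as training data for a translation model from benign to vulnerable code.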

File(s) associated to this reference

Fulltext file(s):

File: Vulnerability_Injection_QRS_2022_3.pdf
Version: Publisher postprint
Size: 594.57 kB
Access: Open access

