Doctoral thesis (Dissertations and theses)
Evaluating Vulnerability Prediction Models
Jimenez, Matthieu


Full Text
Author postprint (8.51 MB)

All documents in ORBilu are protected by a user license.

Send to


Keywords :
Security; Vulnerability; Prediction Modelling
Abstract :
[en] Today almost every device depends on a piece of software. As a result, our life increasingly depends on some software form such as smartphone apps, laundry machines, web applications, computers, transportation and many others, all of which rely on software. Inevitably, this dependence raises the issue of software vulnerabilities and their possible impact on our lifestyle. Over the years, researchers and industrialists suggested several approaches to detect such issues and vulnerabilities. A particular popular branch of such approaches, usually called Vulnerability Prediction Modelling (VPM) techniques, leverage prediction modelling techniques that flag suspicious (likely vulnerable) code components. These techniques rely on source code features as indicators of vulnerabilities to build the prediction models. However, the emerging question is how effective such methods are and how they can be used in practice. The present dissertation studies vulnerability prediction models and evaluates them on real and reliable playground. To this end, it suggests a toolset that automatically collects real vulnerable code instances, from major open source systems, suitable for applying VPM. These code instances are then used to analyze, replicate, compare and develop new VPMs. Specifically, the dissertation has 3 main axes: The first regards the analysis of vulnerabilities. Indeed, to build VPMs accurately, numerous data are required. However, by their nature, vulnerabilities are scarce and the information about them is spread over different sources (NVD, Git, Bug Trackers). Thus, the suggested toolset (develops an automatic way to build a large dataset) enables the reliable and relevant analysis of VPMs. The second axis focuses on the empirical comparison and analysis of existing Vulnerability Prediction Models. It thus develops and replicates existing VPMs. To this end, the thesis introduces a framework that builds, analyse and compares existing prediction models (using the already proposed sets of features) using the dataset developed on the first axis. The third axis explores the use of cross-entropy (metric used by natural language processing) as a potential feature for developing new VPMs. Cross-entropy, usually referred to as the naturalness of code, is a recent approach that measures the repetitiveness of code (relying on statistical models). Using cross-entropy, the thesis investigates different ways of building and using VPMs. Overall, this thesis provides a fully-fledge study on Vulnerability Prediction Models aiming at assessing and improving their performance.
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Security Design and Validation Research Group (SerVal)
Disciplines :
Computer science
Author, co-author :
Jimenez, Matthieu  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
Language :
Title :
Evaluating Vulnerability Prediction Models
Defense date :
02 October 2018
Number of pages :
Institution :
Unilu - University of Luxembourg, Luxembourg
Degree :
Docteur en Informatique
Promotor :
President :
Jury member :
Papadakis, Mike 
Sarro, Federica
Blanc, Xavier
Focus Area :
Security, Reliability and Trust
Available on ORBilu :
since 08 October 2018


Number of views
765 (59 by Unilu)
Number of downloads
585 (36 by Unilu)


Similar publications

Contact ORBilu