Abstract :
[en] Programming has become central in the development of human activities while not
being immune to defaults, or bugs. Developers have developed specific methods and
sequences of tests that they implement to prevent these bugs from being deployed in
releases. Nonetheless, not all cases can be thought through beforehand, and automation
presents limits the community attempts to overcome. As a consequence, not all bugs
can be caught.
These defaults are causing particular concerns in case bugs can be exploited to
breach the program’s security policy. They are then called vulnerabilities and provide
specific actors with undesired access to the resources a program manages. It damages
the trust in the program and in its developers, and may eventually impact the adoption
of the program. Hence, to attribute a specific attention to vulnerabilities appears as a
natural outcome. In this regard, this PhD work targets the following three challenges:
(1) The research community references those vulnerabilities, categorises them, reports
and ranks their impact. As a result, analysts can learn from past vulnerabilities in
specific programs and figure out new ideas to counter them. Nonetheless, the resulting
quality of the lessons and the usefulness of ensuing solutions depend on the quality and
the consistency of the information provided in the reports.
(2) New methods to detect vulnerabilities can emerge among the teachings this
monitoring provides. With responsible reporting, these detection methods can provide
hardening of the programs we rely on. Additionally, in a context of computer perfor-
mance gain, machine learning algorithms are increasingly adopted, providing engaging
promises.
(3) If some of these promises can be fulfilled, not all are not reachable today.
Therefore a complementary strategy needs to be adopted while vulnerabilities evade
detection up to public releases. Instead of preventing their introduction, programs can
be hardened to scale down their exploitability. Increasing the complexity to exploit
or lowering the impact below specific thresholds makes the presence of vulnerabilities
an affordable risk for the feature provided. The history of programming development
encloses the experimentation and the adoption of so-called defence mechanisms. Their
goals and performances can be diverse, but their implementation in worldwide adopted
programs and systems (such as the Android Open Source Project) acknowledges their
pivotal position.
To face these challenges, we provide the following contributions:
• We provide a manual categorisation of the vulnerabilities of the worldwide adopted
Android Open Source Project up to June 2020. Clarifying to adopt a vulnera-
bility analysis provides consistency in the resulting data set. It facilitates the
explainability of the analyses and sets up for the updatability of the resulting
set of vulnerabilities. Based on this analysis, we study the evolution of AOSP’s
vulnerabilities. We explore the different temporal evolutions of the vulnerabilities affecting the system for their severity, the type of vulnerability, and we provide a
focus on memory corruption-related vulnerabilities.
• We undertake the replication of a machine-learning based detection algorithms
that, besides being part of the state-of-the-art and referenced to by ensuing works,
was not available. Named VCCFinder, this algorithm implements a Support-
Vector Machine and bases its training on Vulnerability-Contributing Commits
and related patches for C and C++ code. Not in capacity to achieve analogous
performances to the original article, we explore parameters and algorithms, and
attempt to overcome the challenge provided by the over-population of unlabeled
entries in the data set. We provide the community with our code and results as a
replicable baseline for further improvement.
• We eventually list the defence mechanisms that the Android Open Source Project
incrementally implements, and we discuss how it sometimes answers comments
the community addressed to the project’s developers. We further verify the extent
to which specific memory corruption defence mechanisms were implemented in the
binaries of different versions of Android (from API-level 10 to 28). We eventually
confront the evolution of memory corruption-related vulnerabilities with the
implementation timeline of related defence mechanisms.