Reference : Learning Finite Automata via Flexible State-Merging and Applications in Networking
Dissertations and theses : Doctoral thesis
Engineering, computing & technology : Computer science
Learning Finite Automata via Flexible State-Merging and Applications in Networking
Hammerschmidt, Christian mailto [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > >]
University of Luxembourg, ​​Luxembourg
Docteur en Informatique
State, Radu mailto
Verwer, Sicco mailto
Sorger, Ulrich mailto
Engel, Thomas mailto
Francois, Jérome mailto
[en] machine learning ; grammatical inference ; data science ; computer networks ; netflow
[en] Being able to model behavior described by a linear sequence of observations (such as log files) goes a long way towards better understanding the underlying processes. This improved understanding can be very helpful in a number of activities, ranging from software (reverse) engineering to network traffic analysis.
The developments in this thesis were driven by specific goals in predicting (human) behaviors captured by a software appliance observing network traffic and user requests to specific resources. Its final contributions have exceeded the original goals of the project in two important ways:
I present (1) a flexible learning algorithm for finite automata accompanied by theoretical underpinning and its implementation, a contribution towards better learning algorithms, and (2) applications of the algorithm to use-cases in computer networking and beyond.

The central algorithm considered in the thesis is a blue-fringe state-merging automaton learning algorithm, conducting a greedy search over feasible solutions. Its key components are a heuristic to search for consistent merges and an evaluation metric to assess the quality of a merge by assigning scores to merges.
I generalize this framework by making the heuristic components explicitly parametric. While state-merging algorithms were originally defined for probabilistic and non-probabilistic finite state machines and later used to derive algorithms for more extended models such as real-time automata, the work presented here extends the scope of the algorithms to a wide range of ad-hoc defined models as well as enables the user to implement modifications to the heuristic search process. These modifications help to account for domain knowledge and richer semantics of models with a regular language core.

I provide an implementation and a Python interface of the flexible state-merging framework, including stream/online and interactive variants of the algorithm based on a C++ implementation of the blue-fringe greedy search algorithm called DFASAT. The algorithm and the framework encompass and improve upon state-of-the-art approaches.

The application problems considered in this thesis can be seen as classical classification and anomaly detection tasks in machine learning. The application domain is network traffic analysis with a focus on network security.
I discuss the problematic properties of data from computer networks and address how using automaton models can help mitigate them. I then use the flexible state-merging approach for host profiling. I show how to efficiently learn finite state automata as behavioral profiles. These profiles can serve as digital fingerprints and help to identify malicious traffic such as botnet traffic.
Moreover, I show how communication profiles can be used for sequence clustering on NetFlow data to distinguish different behaviors over time.
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Services and Data management research group (SEDAN)
Fonds National de la Recherche - FnR
Researchers ; Professionals ; Students
FnR ; FNR10053360 > Christian Hammerschmidt > PAULINE > Stream Mining For Predictive Authentication Under Adversarial Influence > 01/03/2015 > 14/11/2017 > 2015

There is no file associated with this reference.

Bookmark and Share SFX Query

All documents in ORBilu are protected by a user license.