References of "State, Radu 50003137"
Full Text
Peer Reviewed
Your Moves, Your Device: Establishing Behavior Profiles Using Tensors
Falk, Eric UL; Charlier, Jérémy Henri J. UL; State, Radu UL

in Advanced Data Mining and Applications - 13th International Conference, ADMA 2017 (2017, November)

Smartphones have become a person's constant companion. As the strictly personal devices they are, they gradually enable the replacement of well-established activities such as payments, two-factor authentication or personal assistants. In addition, Internet of Things (IoT) gadgets extend the capabilities of the latter even further. Devices such as body-worn fitness trackers allow users to keep track of daily activities by periodically synchronizing data with the smartphone and ultimately with the vendor's computational centers in the cloud. These fitness trackers are equipped with an array of sensors to measure the movements of the device, to derive information such as step counts or to make assessments about sleep quality. We capture the raw sensor data from wrist-worn activity trackers to model a biometric behavior profile of the carrier. We establish and present techniques to determine whether the original person, who trained the model, is currently wearing the bracelet or another individual. Our contribution is based on CANDECOMP/PARAFAC (CP) tensor decomposition so that computational complexity facilitates either execution on light computational devices at low precision settings, or migration to stronger CPUs or to the cloud for high to very high granularity. This precision parameter allows the security layer to be adaptable, in order to be compliant with the requirements set by the use cases. We show that our approach identifies users with high confidence.

Full Text
Peer Reviewed
Optimising Packet Forwarding in Multi-Tenant Networks using Rule Compilation
Hommes, Stefan UL; Valtchev, Petko; Blaiech, Khalil et al

in Optimising Packet Forwarding in Multi-Tenant Networks using Rule Compilation (2017, November)

Packet forwarding in Software-Defined Networks (SDN) relies on a centralised network controller which enforces network policies expressed as forwarding rules. Rules are deployed as sets of entries into network device tables. With heterogeneous devices, deployment is strongly bounded by the respective table constraints (size, lookup time, etc.) and forwarding pipelines. Hence, minimising the overall number of entries is paramount in reducing resource consumption and speeding up the search. Moreover, since multiple control plane applications can deploy their own rules, conflicts may occur. To avoid those and ensure overall correctness, a rule validation mechanism is required. Here, we present a compilation mechanism for rules of diverging origins that minimises the number of entries. Since it exploits the semantics of rules and entries, our compiler fits a heterogeneous landscape of network devices. We evaluated compiler implementations on both software and hardware switches using a realistic testbed. Experimental results show a reduction in both produced table entries and forwarding delay.

Full Text
Peer Reviewed
Advanced Interest Flooding Attacks in Named-Data Networking
Signorello, Salvatore UL; Marchal, Samuel; François, Jérôme et al

Scientific Conference (2017, October 30)

Named-Data Networking (NDN) has emerged as a clean-slate Internet proposal on the wave of Information-Centric Networking. Although the NDN data plane seems to offer many advantages, e.g., native support for multicast communications and flow balance, it also makes the network infrastructure vulnerable to a specific DDoS attack, the Interest Flooding Attack (IFA). In IFAs, a botnet issuing unsatisfiable content requests can be set up effortlessly to exhaust routers’ resources and cause a severe performance drop for legitimate users. So far, several countermeasures have addressed this security threat; however, their efficacy was proved by means of simplistic assumptions on the attack model. Therefore, we propose a more complete attack model and design an advanced IFA. We show the efficiency of our novel attack scheme by extensively assessing some of the state-of-the-art countermeasures. Further, we release the software to perform this attack as an open-source tool to help design more robust defense mechanisms in the future.

Reliable Machine Learning for Networking: Key Concerns and Approaches
Hammerschmidt, Christian UL; Garcia, Sebastian; Verwer, Sicco et al

Poster (2017, October)

Machine learning has become one of the go-to methods for solving problems in the field of networking. This development is driven by data availability in large-scale networks and the commodification of machine learning frameworks. While this makes it easier for researchers to implement and deploy machine learning solutions on networks quickly, there are a number of vital factors to account for when using machine learning as an approach to a problem in networking and when translating test performance to real network deployments successfully. This paper, rather than presenting a particular technical result, discusses the necessary considerations to obtain good results when using machine learning to analyze network-related data.

Full Text
Peer Reviewed
Introduction to Detection of Non-Technical Losses using Data Analytics
Glauner, Patrick UL; Meira, Jorge Augusto UL; State, Radu UL et al

Scientific Conference (2017, September)

Electricity losses are a frequently appearing problem in power grids. Non-technical losses (NTL) appear during distribution and include, but are not limited to, the following causes: meter tampering in order to record lower consumption, bypassing meters by rigging lines from the power source, arranged false meter readings by bribing meter readers, faulty or broken meters, un-metered supply, and technical and human errors in meter readings, data processing and billing. NTLs are also reported to range up to 40% of the total electricity distributed in countries such as Brazil, India, Malaysia or Lebanon. This is an introductory-level course that discusses how to predict whether a customer causes an NTL. In recent years, employing data analytics methods such as data mining and machine learning has evolved as the primary direction to solve this problem. This course will compare and contrast different approaches reported in the literature. Practical case studies on real data sets will be included. Therefore, attendees will not only understand, but also experience the challenges of NTL detection and learn how these challenges could be solved in the coming years.

Full Text
Peer Reviewed
Profiling Smart Contracts Interactions Tensor Decomposition and Graph Mining.
Charlier, Jérémy Henri J. UL; Lagraa, Sofiane UL; State, Radu UL et al

in Proceedings of the Second Workshop on MIning DAta for financial applicationS (MIDAS 2017) co-located with the 2017 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2017), Skopje, Macedonia, September 18, 2017. (2017, September)

Smart contracts, computer protocols designed for autonomous execution on predefined conditions, arise from the evolution of Bitcoin’s crypto-currency. They provide higher transaction security and allow economies of scale through the automated process. Smart contracts provide inherent benefits for financial institutions such as investment banking, retail banking, and insurance. This technology is widely used within Ethereum, an open-source block-chain platform, from which the data has been extracted to conduct the experiments. In this work, we propose a multi-dimensional approach to find and predict smart contract interactions based only on their crypto-currency exchanges. This approach relies on tensor modeling combined with stochastic processes. It underlines actual exchanges between smart contracts and targets the prediction of future interactions among the community. The tensor analysis is also challenged with the latest graph algorithms to assess its strengths and weaknesses in comparison to a more standard approach.

Full Text
Peer Reviewed
Is Big Data Sufficient for a Reliable Detection of Non-Technical Losses?
Glauner, Patrick UL; Migliosi, Angelo UL; Meira, Jorge Augusto UL et al

in Proceedings of the 19th International Conference on Intelligent System Applications to Power Systems (ISAP 2017) (2017, September)

Non-technical losses (NTL) occur during the distribution of electricity in power grids and include, but are not limited to, electricity theft and faulty meters. In emerging countries, they may range up to 40% of the total electricity distributed. In order to detect NTLs, machine learning methods are used that learn irregular consumption patterns from customer data and inspection results. The Big Data paradigm followed in modern machine learning reflects the desire to derive better conclusions from simply analyzing more data, without the necessity of looking at theory and models. However, the sample of inspected customers may be biased, i.e. it does not represent the population of all customers. As a consequence, machine learning models trained on these inspection results are biased as well and therefore lead to unreliable predictions of whether customers cause NTL or not. In machine learning, this issue is called covariate shift and has not been addressed in the literature on NTL detection yet. In this work, we present a novel framework for quantifying and visualizing covariate shift. We apply it to a commercial data set from Brazil that consists of 3.6M customers and 820K inspection results. We show that some features have a stronger covariate shift than others, making predictions less reliable. In particular, previous inspections were focused on certain neighborhoods or customer classes and were not sufficiently spread among the population of customers. This framework is about to be deployed in a commercial product for NTL detection.
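As a rough illustration of the sampling bias this abstract describes, the sketch below compares how one categorical feature is distributed among all customers and among inspected customers, using the total variation distance. It is not the framework from the paper, whose details the abstract does not give: the `neighborhood` labels, the toy data and the choice of distance are all assumptions made for illustration.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each category in `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    0 means the distributions are identical, 1 means they are disjoint.
    """
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical toy data: the neighborhood of every customer, and of the
# subset of customers that were actually inspected.
population = ["north"] * 50 + ["south"] * 50
inspected = ["north"] * 18 + ["south"] * 2

shift = total_variation(distribution(population), distribution(inspected))
print(f"total variation distance for 'neighborhood': {shift:.2f}")
```

A value near 0 would suggest that inspections were spread like the customer base for that feature; larger values indicate that the inspected sample over-represents some categories.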

Full Text
Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms
Hammerschmidt, Christian UL; State, Radu UL; Verwer, Sicco

Poster (2017, August)

We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model generating the data despite noisy, incomplete, or imperfectly sampled data sources, rather than optimizing a purely numeric target function. Domain expertise and human knowledge about the target domain can guide this process, and are typically captured in parameter settings. Often, domain expertise is subconscious and not expressed explicitly. Directly interacting with the learning algorithm makes it easier to utilize this knowledge effectively.

Full Text
Peer Reviewed
Query-able Kafka: An agile data analytics pipeline for mobile wireless networks
Falk, Eric UL; Gurbani, Vijay K.; State, Radu UL

in Proceedings of the VLDB Endowment (2017, August), 10

Due to their promise of delivering real-time network insights, today's streaming analytics platforms are increasingly being used in communications networks, where the impact of the insights goes beyond sentiment and trend analysis to include real-time detection of security attacks and prediction of network state (i.e., is the network transitioning towards an outage). Current streaming analytics platforms operate under the assumption that arriving traffic is on the order of kilobytes produced at very high frequencies. However, communications networks, especially telecommunication networks, challenge this assumption because some of the arriving traffic in these networks is on the order of gigabytes, but produced at medium to low velocities. Furthermore, these large datasets may need to be ingested in their entirety to render network insights in real time. Our interest is to subject today's streaming analytics platforms, constructed from state-of-the-art software components (Kafka, Spark, HDFS, ElasticSearch), to traffic densities observed in such communications networks. We find that filtering on such large datasets is best done at a common upstream point instead of being pushed to, and repeated in, downstream components. To demonstrate the advantages of such an approach, we modify Apache Kafka to perform limited native data transformation and filtering, relieving the downstream Spark application from doing this. Our approach outperforms four prevalent analytics pipeline architectures with negligible overhead compared to standard Kafka.

Full Text
Peer Reviewed
Rule Compilation in Multi-Tenant Networks
Blaiech, Khalil; Hamadi, Salaheddine; Hommes, Stefan UL et al

in Rule Compilation in Multi-Tenant Networks (2017, May 18)

Full Text
Peer Reviewed
Detecting and predicting outages in mobile networks with log data.
Gurbani, Vijay K.; Kushnir, Dan; Mendiratta, Veena B. et al

in IEEE International Conference on Communications, ICC 2017 (2017, May)

Modern cellular networks are complex systems offering a wide range of services and present challenges in detecting anomalous events when they do occur. The networks are engineered for high reliability and, hence, the data from these networks is predominantly normal with a small proportion being anomalous. From an operations perspective, it is important to detect these anomalies in a timely manner, to correct vulnerabilities in the network and preclude the occurrence of major failure events. The objective of our work is anomaly detection in cellular networks in near real-time to improve network performance and reliability. We use performance data from a 4G LTE network to develop a methodology for anomaly detection in such networks. Two rigorous prediction models are proposed: a non-parametric approach (Chi-Square test), and a parametric one (Gaussian Mixture Models). These models are trained to detect differences between distributions to classify a target distribution as belonging to a normal period or abnormal period with high accuracy. We discuss the relative merits of the approaches and show that both provide a more nuanced view of the network than the simple thresholds of success/failure used by operators in production networks today.
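To make the non-parametric idea concrete, the sketch below computes a Pearson chi-square statistic for the binned counts of a target period against the distribution of a reference (normal) period, and flags the target when the statistic exceeds a critical value. It is only an illustration under stated assumptions, not the authors' model: the bin counts are invented, the reference period is treated as the known distribution, and 7.815 is the standard 5% critical value for 3 degrees of freedom (4 bins).

```python
def chi_square_statistic(reference_counts, target_counts):
    """Pearson chi-square statistic of target counts against the reference distribution."""
    if len(reference_counts) != len(target_counts):
        raise ValueError("both histograms must use the same bins")
    ref_total = sum(reference_counts)
    target_total = sum(target_counts)
    statistic = 0.0
    for ref, observed in zip(reference_counts, target_counts):
        # Expected count in this bin if the target followed the reference distribution.
        expected = target_total * ref / ref_total
        if expected == 0:
            raise ValueError("every reference bin must be non-empty")
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Chi-square critical value for 3 degrees of freedom at 5% significance.
CRITICAL_VALUE = 7.815


def is_abnormal(reference_counts, target_counts, critical_value=CRITICAL_VALUE):
    """Flag the target period when it deviates significantly from the reference."""
    return chi_square_statistic(reference_counts, target_counts) > critical_value


# Hypothetical histograms of a performance metric, binned into 4 ranges.
reference = [700, 200, 80, 20]      # counts observed during normal operation
normal_period = [68, 22, 8, 2]      # close to the reference shape
outage_period = [40, 25, 20, 15]    # mass shifted towards the upper bins

print(is_abnormal(reference, normal_period))
print(is_abnormal(reference, outage_period))
```

With real data, the bins would need enough expected counts for the chi-square approximation to hold, and the critical value depends on the number of bins.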

Full Text
Peer Reviewed
On non-parametric models for detecting outages in the mobile network
Falk, Eric UL; Camino, Ramiro Daniel UL; State, Radu UL et al

in Integrated Network and Service Management 2017 (2017, May)

The wireless/cellular communications network is composed of a complex set of interconnected computation units that form the mobile core network. The mobile core network is engineered to be fault tolerant and redundant; small errors that manifest themselves in the network are usually resolved automatically. However, some errors remain latent and, if discovered early enough, can provide warnings to the network operator about a pending service outage. For mobile network operators, it is of high interest to detect these minor anomalies in near real-time. In this work we use performance data from a 4G-LTE network carrier to train two parameter-free models. The first model relies on isolation forests, and the second is histogram based. The trained models represent the data characteristics for normal periods; new data is matched against the trained models to classify the new time period as being normal or abnormal. We show that the proposed methods can gauge the mobile network state with more subtlety than the standard success/failure thresholds used in real-world networks today.

Full Text
Peer Reviewed
Identifying Irregular Power Usage by Turning Predictions into Holographic Spatial Visualizations
Glauner, Patrick UL; Dahringer, Niklas; Puhachov, Oleksandr et al

in Proceedings of the 17th IEEE International Conference on Data Mining Workshops (ICDMW 2017) (2017)

Power grids are critical infrastructure assets that face non-technical losses (NTL) such as electricity theft or faulty meters. NTL may range up to 40% of the total electricity distributed in emerging countries. Industrial NTL detection systems are still largely based on expert knowledge when deciding whether to carry out costly on-site inspections of customers. Electricity providers are reluctant to move to large-scale deployments of automated systems that learn NTL profiles from data due to the latter's propensity to suggest a large number of unnecessary inspections. In this paper, we propose a novel system that combines automated statistical decision making with expert knowledge. First, we propose a machine learning framework that classifies customers as NTL or non-NTL using a variety of features derived from the customers' consumption data. The methodology used is specifically tailored to the level of noise in the data. Second, in order to allow human experts to feed their knowledge into the decision loop, we propose a method for visualizing prediction results at various granularity levels in a spatial hologram. Our approach allows domain experts to put the classification results into the context of the data and to incorporate their knowledge when making the final decisions about which customers to inspect. This work has resulted in appreciable results on a real-world data set of 3.6M customers. Our system is being deployed in commercial NTL detection software.

Full Text
Peer Reviewed
Recurrent Dynamical Projection for Time series-based Fraud detection
Antonelo, Eric Aislan UL; State, Radu UL

in ICANN 2017, Part II, LNCS 10614 (2017)

Full Text
Peer Reviewed
Distilling Provider-Independent Data for General Detection of Non-Technical Losses
Meira, Jorge Augusto UL; Glauner, Patrick UL; State, Radu UL et al

in Power and Energy Conference, Illinois 23-24 February 2017 (2017)

Non-technical losses (NTL) in electricity distribution have different causes, such as poor equipment maintenance, broken meters or electricity theft. NTL occurs especially but not exclusively in emerging countries. Developed countries, even though usually in smaller amounts, have to deal with NTL issues as well. In these countries the estimated annual losses are up to six billion USD. These facts have directed the focus of our work to NTL detection. Our approach is composed of two steps: 1) We compute several features and combine them in sets characterized by four criteria: temporal, locality, similarity and infrastructure. 2) We then use the sets of features to train three machine learning classifiers: random forest, logistic regression and support vector machine. Our hypothesis is that features derived only from provider-independent data are adequate for an accurate detection of non-technical losses.

Full Text
Peer Reviewed
The Challenge of Non-Technical Loss Detection using Artificial Intelligence: A Survey
Glauner, Patrick UL; Meira, Jorge Augusto UL; Valtchev, Petko UL et al

in International Journal of Computational Intelligence Systems (2017), 10(1), 760-775

Detection of non-technical losses (NTL), which include electricity theft, faulty meters or billing errors, has attracted increasing attention from researchers in electrical engineering and computer science. NTLs cause significant harm to the economy, as in some countries they may range up to 40% of the total electricity distributed. The predominant research direction is employing artificial intelligence to predict whether a customer causes NTL. This paper first provides an overview of how NTLs are defined and their impact on economies, which includes loss of revenue and profit for electricity providers and a decrease in the stability and reliability of electrical power grids. It then surveys the state-of-the-art research efforts in an up-to-date and comprehensive review of algorithms, features and data sets used. It finally identifies the key scientific and engineering challenges in NTL detection and suggests how they could be addressed in the future.

Full Text
Peer Reviewed
The Top 10 Topics in Machine Learning Revisited: A Quantitative Meta-Study
Glauner, Patrick UL; Du, Manxing UL; Paraschiv, Victor et al

in Proceedings of the 25th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2017) (2017)

Which topics of machine learning are most commonly addressed in research? This question was initially answered in 2007 by doing a qualitative survey among distinguished researchers. In our study, we revisit this question from a quantitative perspective. Concretely, we collect 54K abstracts of papers published between 2007 and 2016 in leading machine learning journals and conferences. We then use machine learning in order to determine the top 10 topics in machine learning. We not only include models, but also provide a holistic view across optimization, data, features, etc. This quantitative approach helps reduce the bias of surveys. It reveals new and up-to-date insights into what the 10 most prolific topics in machine learning research are. This allows researchers to identify popular topics as well as new and rising topics for their research.

Full Text
Peer Reviewed
Deep Learning on Big Data Sets in the Cloud with Apache Spark and Google TensorFlow
Glauner, Patrick UL; State, Radu UL

Scientific Conference (2016, December 09)

Machine learning is the branch of artificial intelligence giving computers the ability to learn patterns from data without being explicitly programmed. Deep Learning is a set of cutting-edge machine learning algorithms that are inspired by how the human brain works. It allows self-learning of feature hierarchies from the data rather than modeling hand-crafted features. It has proven to significantly improve performance in challenging data analytics problems. In this tutorial, we will first provide an introduction to the theoretical foundations of neural networks and Deep Learning. Second, we will demonstrate how to use Deep Learning in a cloud using a distributed environment for Big Data analytics. This combines Apache Spark and TensorFlow, Google’s in-house Deep Learning platform made for Big Data machine learning applications. Practical demonstrations will include character recognition and time series forecasting in Big Data sets. Attendees will be provided with code snippets that they can easily amend in order to analyze their own data. A related but shorter tutorial focusing on Deep Learning on a single computer was given at the Data Science Luxembourg Meetup in April 2016. It was attended by 70 people, making it the most attended event of this Meetup series in Luxembourg since its beginning.

Full Text
Peer Reviewed
Behavior Profiling for Mobile Advertising
Du, Manxing UL; State, Radu UL; Brorsson, Mats et al

in Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (2016, December)

Full Text
Peer Reviewed
Interpreting Finite Automata for Sequential Data
Hammerschmidt, Christian UL; Verwer, S.; Lin, Q. et al

in Interpretable Machine Learning for Complex Systems: NIPS 2016 workshop proceedings (2016)
