Lagraa, Sofiane. Poster (2018, August 20)
The last few years have witnessed a steady growth of interest in crypto-currencies and blockchains. They are receiving considerable attention from industry and the research community, the most popular one being Bitcoin. However, these crypto-currencies have so far been relatively poorly analyzed and investigated. Recently, many solutions, mostly ad-hoc engineered ones, have been developed to extract relevant analyses from crypto-currencies, but they are not sufficient to understand what lies behind crypto-currencies. In this paper, we provide a deep analysis of crypto-currencies by proposing a new knowledge discovery approach for each crypto-currency, across crypto-currencies, blockchains, and financial stocks. The novel approach is based on a conjoint use of data mining algorithms on imbalanced time series. It automatically reports co-variation dependency patterns of the time series. The experiments on public crypto-currency and financial stock market data also demonstrate the usefulness of the approach by discovering the different relationships across multiple time series sources and insights into the correlations behind crypto-currencies.

Steichen, Mathis. In The 2018 IEEE International Conference on Blockchain (Blockchain-2018) (2018, July 30)
Large files cannot be efficiently stored on blockchains. On the one hand, the blockchain becomes bloated with data that has to be propagated within the blockchain network.
On the other hand, since the blockchain is replicated on many nodes, a lot of storage space is required without serving an immediate purpose, especially if the node operator does not need to view every file that is stored on the blockchain. It furthermore increases the price of operating blockchain nodes because more data needs to be processed, transferred and stored. IPFS is a file sharing system that can be leveraged to store and share large files more efficiently. It relies on cryptographic hashes that can easily be stored on a blockchain. Nonetheless, IPFS does not permit users to share files with selected parties. This is necessary if sensitive or personal data needs to be shared. Therefore, this paper presents a modified version of the InterPlanetary Filesystem (IPFS) that leverages Ethereum smart contracts to provide access-controlled file sharing. The smart contract is used to maintain the access control list, while the modified IPFS software enforces it. For this, it interacts with the smart contract whenever a file is uploaded, downloaded or transferred. Using an experimental setup, the impact of the access-controlled IPFS is analyzed and discussed.

Norvill, Robert. In NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium (2018, July 09)
In this work we present E-EVM, a tool that emulates and visualises the execution of smart contracts on the Ethereum Virtual Machine. By working with the readily available bytecode of smart contracts we are able to display the program's control flow graph, opcodes and stack for each step of contract execution.
This tool is designed to aid the user's understanding of the Ethereum Virtual Machine as well as the analysis of any given smart contract. As such, it functions as both an analysis and a learning tool. It allows the user to view the code in each block of a smart contract and follow possible control flow branches. It is able to detect loops and suggest optimisation candidates. It is possible to step through a contract one opcode at a time. E-EVM achieved an average of 85.6% code coverage when tested.

Camino, Ramiro Daniel. Scientific Conference (2018, July)
We propose a method to train generative adversarial networks on multivariate feature vectors representing multiple categorical values. In contrast to the continuous domain, where GAN-based methods have delivered considerable results, GANs struggle to perform equally well on discrete data. We propose and compare several architectures based on multiple (Gumbel) softmax output layers taking into account the structure of the data. We evaluate the performance of our architecture on datasets with different sparsity, number of features, ranges of categorical values, and dependencies among the features. Our proposed architecture and method outperform existing models.

Charlier, Jérémy Henri J. In International Journal of Computer & Software Engineering (2018), 3(1)
Background: The past few months have seen the rise of blockchain and cryptocurrencies.
In this context, the Ethereum platform, an open-source blockchain-based platform using the Ether cryptocurrency, has been designed to use smart contract programs. These are self-executing blockchain contracts. Due to their high volume of transactions, analyzing their behavior is very challenging. We address this challenge in our paper. Methods: We develop for this purpose an innovative approach based on the non-negative tensor decomposition Paratuck2 combined with long short-term memory. The objective is to assess whether predictive analysis can forecast smart contract activities over time. Three statistical tests are performed on the predictive analytics: the mean absolute percentage error, the mean directional accuracy and the Jaccard distance. Results: Among dozens of GB of transactions, the Paratuck2 tensor decomposition allows asymmetric modeling of the smart contracts. Furthermore, it highlights time-dependent latent groups. The latent activities are modeled by the long short-term memory network for predictive analytics. The highly accurate predictions underline the accuracy of the method and show that blockchain activities are not purely random. Conclusion: Herein, we are able to detect the most active contracts and predict their behavior. In the context of future regulations, our approach opens new perspectives for monitoring blockchain activities.

Fiz Pontiveros, Beltran. In Proceedings of 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS) 2018 (2018, February)

Charlier, Jérémy Henri J. In Charlier, Jeremy; State, Radu; Hilger, Jean (Eds.), 2018 IEEE International Conference on Big Data and Smart Computing Proceedings (2018, January)
Smart contracts are programs stored and executed on a blockchain.
The Ethereum platform, an open source blockchain-based platform, has been designed to use these programs, offering secured protocols and reduced transaction costs. The Ethereum Virtual Machine runs smart contracts, where the execution of each contract is limited to the amount of gas required to execute the operations described in the code. Each gas unit must be paid using Ether, the crypto-currency of the platform. Because smart contract interactions evolve over time, analyzing the behavior of smart contracts is very challenging. We address this challenge in our paper. We develop for this purpose an innovative approach based on the nonnegative tensor decomposition PARATUCK2 combined with long short-term memory (LSTM) to assess whether predictive analysis can forecast smart contract interactions over time. To validate our methodology, we report results for two use cases. The main use case is related to analyzing smart contracts and sheds some light on the complex interactions among smart contracts. In order to show the generality of our method on other use cases, we also report its performance on video on demand recommendation.

Glauner, Patrick. In Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2018) (2018)
The underlying paradigm of big data-driven machine learning reflects the desire to derive better conclusions from simply analyzing more data, without the necessity of looking at theory and models. Is having simply more data always helpful? In 1936, The Literary Digest collected 2.3M filled-in questionnaires to predict the outcome of that year's US presidential election. The outcome of this big data prediction proved to be entirely wrong, whereas George Gallup only needed 3K handpicked people to make an accurate prediction. Generally, biases occur in machine learning whenever the distributions of training set and test set are different. In this work, we provide a review of different sorts of biases in (big) data sets in machine learning. We provide definitions and discussions of the most commonly appearing biases in machine learning: class imbalance and covariate shift. We also show how these biases can be quantified and corrected. This work is an introductory text for both researchers and practitioners to become more aware of this topic and thus to derive more reliable models for their learning problems.

Fiz Pontiveros, Beltran. In NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium (2018)
Mining pools are collections of workers that work together as a group in order to collaborate in the proof of work and reduce the variance of their rewards when mining. In order to achieve this, mining pools distribute the task of finding a block amongst the workers so that each worker works on a different subset of the candidate solutions.
In most mining pools the selection of transactions to be part of the next block is performed by the pool manager and thus becomes more centralized. A mining pool is expected to give priority to the most lucrative transactions in order to increase the block reward; however, changes to the transaction policy made without notifying workers would be difficult to detect. In this paper we treat the transaction selection policy performed by miners as a classification problem: for each block we create a dataset, separate the datasets by mining pool and apply feature selection techniques to extract a vector of importance for each feature. We then track variations in feature importance as new blocks arrive and show, using a generated scenario, how a change in policy by a mining pool could be detected.

Shbair, Wazen. In The First IEEE/IFIP International Workshop on Managing and Managed by Blockchain (Man2Block) colocated with IEEE/IFIP NOMS 2018 (2018)
Conducting experiments to evaluate blockchain applications is a challenging task for developers, because there is a range of configuration parameters that control blockchain environments. Many public testnets (e.g. Rinkeby Ethereum) can be used for testing; however, we cannot adjust their parameters (e.g. gas limit, mining difficulty) to further the understanding of the application in question and of the employed blockchain. This paper proposes an easy-to-use orchestration framework over the Grid'5000 platform. Grid'5000 is a highly reconfigurable and controllable large-scale testbed. We developed a tool that facilitates node reservation, deployment and blockchain configuration over the Grid'5000 platform.
In addition, our tool can fine-tune blockchain and network parameters before and between experiments. The proposed framework offers insights for private and consortium blockchain developers to identify performance bottlenecks and to assess the behavior of their applications in different circumstances.

Glauner, Patrick. In Proceedings 13th International FLINS Conference on Data Science and Knowledge Engineering for Sensing Decision Support (FLINS 2018) (2018)
In machine learning, a bias occurs whenever training sets are not representative of the test data, which results in unreliable models. The most common biases in data are arguably class imbalance and covariate shift. In this work, we aim to shed light on this topic in order to increase the overall attention to this issue in the field of machine learning. We propose a scalable novel framework for reducing multiple biases in high-dimensional data sets in order to train more reliable predictors. We apply our methodology to the detection of irregular power usage from real, noisy industrial data. In emerging markets, irregular power usage, and electricity theft in particular, may range up to 40% of the total electricity distributed. Biased data sets are a particular issue in this domain. We show that reducing these biases increases the accuracy of the trained predictors. Our models have the potential to generate significant economic value in a real world application, as they are being deployed in commercial software for the detection of irregular power usage.

Glauner, Patrick. Scientific Conference (2018)
Electricity losses are a frequently appearing problem in power grids.
Non-technical losses (NTL) appear during distribution and include, but are not limited to, the following causes: meter tampering in order to record lower consumption, bypassing meters by rigging lines from the power source, arranged false meter readings by bribing meter readers, faulty or broken meters, un-metered supply, and technical and human errors in meter readings, data processing and billing. NTLs are also reported to range up to 40% of the total electricity distributed in countries such as India, Pakistan, Malaysia, Brazil or Lebanon. This is an introductory level course to discuss how to predict whether a customer causes an NTL. In recent years, employing data analytics methods such as machine learning and data mining has evolved as the primary direction to solve this problem. This course will present and compare different approaches reported in the literature. Practical case studies on real data sets will be included. As an additional outcome, attendees will understand the open challenges of NTL detection and learn how these challenges could be solved in the coming years.

Glauner, Patrick. Scientific Conference (2018)

Varisteas, Georgios. In Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning (2018)
Python has evolved to become the most popular language for data science. It sports state-of-the-art libraries for analytics and machine learning, like scikit-learn.
However, Python lacks the computational performance that an industrial system requires for high-frequency real-time predictions. Building upon a year-long research project heavily based on scikit-learn (sklearn), we faced performance issues in deploying to production. Replacing sklearn with a better performing framework would require re-evaluating and tuning hyperparameters from scratch. Instead we developed a Python embedding in a C++ based server application that increased performance by up to 20x, achieving linear scalability up to a point of convergence. Our implementation was done for mainstream cost-effective hardware, which means we observed similar performance gains on small as well as large systems, from a laptop to an Amazon EC2 instance to a high-end server.

Yakubov, Alexander. In The First IEEE/IFIP International Workshop on Managing and Managed by Blockchain (Man2Block) colocated with IEEE/IFIP NOMS 2018, Taipei, Taiwan 23-27 April 2018 (2018)
Public-Key Infrastructure (PKI) is the cornerstone technology that facilitates secure information exchange over the Internet. However, PKI is exposed to risks due to potential failures of Certificate Authorities (CAs) that may be used to issue unauthorized certificates for end-users. Many recent breaches show that if a CA is compromised, the security of the corresponding end-users will be at risk. As an emerging solution, Blockchain technology potentially resolves the problems of traditional PKI systems, in particular the elimination of single points of failure and rapid reaction to CAs' shortcomings. Blockchain has the ability to store and manage digital certificates within a public and immutable ledger, resulting in a fully traceable history log.
In this paper we designed and developed a blockchain-based PKI management framework for issuing, validating and revoking X.509 certificates. Evaluation and experimental results confirm that the proposed framework provides more reliable and robust PKI systems with modest maintenance costs.

Glauner, Patrick. Scientific Conference (2018)
The field of Machine Learning grew out of the quest for artificial intelligence. It gives computers the ability to learn statistical patterns from data without being explicitly programmed. These patterns can then be applied to new data in order to make predictions. Machine Learning also allows automatic adaptation to changes in the data without amending the underlying model. We deal with Machine Learning applications dozens of times every day, such as when doing a Google search, using spam filters or face detection, speaking to voice recognition software or sitting in a self-driving car. In recent years, machine learning methods have evolved in the smart grid community. This change towards analyzing data rather than modeling specific problems has led to adaptable, more generic methods that require less expert knowledge and that are easier to deploy in a number of use cases. This is an introductory level course to discuss what machine learning is and how to apply it to data-driven smart grid applications. Practical case studies on real data sets, such as load forecasting, detection of irregular power usage and visualization of customer data, will be included. Therefore, attendees will not only understand, but rather experience, how to apply machine learning methods to smart grid data.

Khan, Nida. In NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium (2018)
Blockchain is an emerging foundational technology with the potential to create a novel economic and social system. The complexity of the technology poses many challenges, foremost amongst which are the monitoring and management of blockchain-based decentralized applications. In this paper, we design, implement and evaluate a novel system to enable management operations in smart contracts. A key aspect of our system is that it facilitates the integration of these operations through dedicated ’managing’ smart contracts to provide data filtering according to the role of the smart contract-based application user. We evaluate the overhead costs of such data filtering operations through post-deployment analyses of five categories of smart contracts on the Ethereum public testnet, Rinkeby. We also build a monitoring tool to display public blockchain data using a dashboard, coupled with a mechanism that notifies the administrator of the monitored decentralized application of any changes in private data.

Kaiafas, Georgios. In Kaiafas, Georgios; Varisteas, Georgios; Lagraa, Sofiane (Eds.) et al, IEEE/IFIP Network Operations and Management Symposium, 23-27 April 2018, Taipei, Taiwan Cognitive Management in a Cyber World (2018)

Falk, Eric. In Global Communications (2017)
Security in virtualised environments is becoming increasingly important for institutions, not only for a firm’s own on-site servers and network but also for data and sites that are hosted in the cloud. Today, security is either handled globally by the cloud provider, or each customer needs to invest in its own security infrastructure. This paper proposes a Virtual Security Operation Center (VSOC) that allows users to collect, analyse and visualize security related data from multiple sources. For instance, a user can forward log data from its firewalls, applications and routers in order to check for anomalies and other suspicious activities. The security analytics provided by the VSOC are comparable to those of commercial security incident and event management (SIEM) solutions, but are deployed as a cloud-based solution with the additional benefit of using big data processing tools to handle large volumes of data. This allows us to detect more complex attacks that cannot be detected with today's signature-based (i.e. rule-based) SIEM solutions.

Falk, Eric. In Advanced Data Mining and Applications - 13th International Conference, ADMA 2017 (2017, November)
Smartphones have become a person's constant companion. As the strictly personal devices they are, they gradually enable the replacement of well established activities such as payments, two factor authentication or personal assistants. In addition, Internet of Things (IoT) gadgets extend the capabilities of the latter even further. Devices such as body worn fitness trackers allow users to keep track of daily activities by periodically synchronizing data with the smartphone and ultimately with the vendor's computational centers in the cloud.
These fitness trackers are equipped with an array of sensors to measure the movements of the device, to derive information such as step counts or to make assessments about sleep quality. We capture the raw sensor data from wrist-worn activity trackers to model a biometric behavior profile of the carrier. We establish and present techniques to determine whether the original person, who trained the model, is currently wearing the bracelet or another individual. Our contribution is based on CANDECOMP/PARAFAC (CP) tensor decomposition, whose computational complexity allows either execution on light computational devices at low precision settings, or migration to stronger CPUs or to the cloud for high to very high granularity. This precision parameter allows the security layer to be adaptable, in order to be compliant with the requirements set by the use cases. We show that our approach identifies users with high confidence.