References of "Le Traon, Yves 50002182"
Peer Reviewed
A systematic review on the engineering of software for ubiquitous systems
Sanchez Guinea, Alejandro UL; Nain, Gregory; Le Traon, Yves UL

in Journal of Systems and Software (2016), 118

Context: Software engineering for ubiquitous systems has experienced significant and rapid growth; however, the vast research corpus makes it difficult to obtain valuable information from it. Objective: To identify, evaluate, and synthesize research about the most relevant approaches addressing the different phases of the software development life cycle for ubiquitous systems. Method: We conducted a systematic literature review of papers presenting and evaluating approaches for the different phases of the software development life cycle for ubiquitous systems. Approaches were classified according to the phase of the development cycle they addressed, identifying their main concerns and limitations. Results: We identified 128 papers reporting 132 approaches addressing issues related to different phases of the software development cycle for ubiquitous systems. Most approaches have been aimed at addressing the implementation, evolution/maintenance, and feedback phases, while other phases such as testing need more attention from researchers. Conclusion: We recommend following existing guidelines when conducting case studies to make the studies more reproducible and closer to real-life cases. While some phases of the development cycle have been extensively explored, there is still room for research in other phases, toward a more agile and integrated cycle, from requirements to testing and feedback.

Peer Reviewed
Time Series Classification with Discrete Wavelet Transformed Data: Insights from an Empirical Study
Li, Daoyuan UL; Bissyande, Tegawendé François D Assise UL; Klein, Jacques UL et al

in The 28th International Conference on Software Engineering and Knowledge Engineering (SEKE 2016) (2016, July)

Time series mining has become essential for extracting knowledge from the abundant data that flows out from many application domains. To overcome storage and processing challenges in time series mining, compression techniques are being used. In this paper, we investigate the loss/gain of performance of time series classification approaches when fed with lossy-compressed data. This empirical study is essential for reassuring practitioners, but also for providing more insights on how compression techniques can even be effective in reducing noise in time series data. From a knowledge engineering perspective, we show that time series may be compressed by 90% using discrete wavelet transforms and still achieve remarkable classification accuracy, and that residual details left by popular wavelet compression techniques can sometimes even help achieve higher classification accuracy than the raw time series data, as they better capture essential local features.
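
As an illustration of the kind of lossy wavelet compression the abstract refers to, the following is a minimal sketch that assumes the Haar wavelet and keeps only the approximation coefficients after a chosen number of levels. The function names `haar_step` and `haar_compress` are hypothetical, and the paper itself may use other wavelet families and coefficient-selection rules.

```python
import math


def haar_step(values):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail); each has half the input length.
    """
    if len(values) % 2 != 0:
        raise ValueError("length must be even")
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_compress(values, levels):
    """Keep only the approximation coefficients after `levels` Haar steps.

    Detail coefficients are discarded, so this is a lossy compression:
    each level halves the number of stored values.
    """
    approx = list(values)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx


series = [float(i) for i in range(16)]
compressed = haar_compress(series, 3)  # 16 values reduced to 2
```

Three levels keep one value in eight (87.5% reduction); a classifier would then be trained on the shortened series instead of the raw one.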

Peer Reviewed
DSCo: A Language Modeling Approach for Time Series Classification
Li, Daoyuan UL; Li, Li UL; Bissyande, Tegawendé François D Assise UL et al

in 12th International Conference on Machine Learning and Data Mining (MLDM 2016) (2016, July)

Time series data are abundant in various domains and are often characterized as large in size and high in dimensionality, leading to storage and processing challenges. Symbolic representation of time series – which transforms numeric time series data into texts – is a promising technique to address these challenges. However, these techniques are essentially lossy compression functions, and information is partially lost during transformation. To that end, we propose a novel approach named Domain Series Corpus (DSCo), which builds per-class language models from the symbolized texts. To classify unlabeled samples, we compute the fitness of each symbolized sample against all per-class models and choose the class represented by the model with the best fitness score. Our work innovatively takes advantage of mature techniques from both time series mining and NLP communities. Through extensive experiments on an open dataset archive, we demonstrate that it performs similarly to approaches working with original uncompressed numeric data.
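
The classification step described in this abstract can be sketched as follows. This is an illustrative sketch only: it assumes the series have already been symbolized into strings, uses character bigrams with add-one smoothing as the per-class language model, and uses log-likelihood as the fitness score. None of these choices, nor the function names, are taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train_bigram_models(samples):
    """Build one bigram model per class from (label, symbol_string) pairs."""
    models = defaultdict(lambda: {"bigrams": Counter(), "contexts": Counter()})
    alphabet = set()
    for label, text in samples:
        alphabet.update(text)
        for a, b in zip(text, text[1:]):
            models[label]["bigrams"][(a, b)] += 1
            models[label]["contexts"][a] += 1
    return dict(models), alphabet


def log_fitness(model, text, vocab_size):
    """Log-likelihood of `text` under one class model (add-one smoothing)."""
    score = 0.0
    for a, b in zip(text, text[1:]):
        num = model["bigrams"][(a, b)] + 1
        den = model["contexts"][a] + vocab_size
        score += math.log(num / den)
    return score


def classify(models, alphabet, text):
    """Return the class whose model gives `text` the best fitness score."""
    vocab_size = max(len(alphabet), 1)
    return max(models, key=lambda label: log_fitness(models[label], text, vocab_size))


training = [
    ("up", "abcdabcd"), ("up", "abcd"),
    ("down", "dcbadcba"), ("down", "dcba"),
]
models, alphabet = train_bigram_models(training)
predicted = classify(models, alphabet, "abcabc")
```

Here the unlabeled sample "abcabc" shares its frequent bigrams with the "up" class, so that class's model scores it highest.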

Peer Reviewed
Open Data Portal Quality Comparison using AHP
Kubler, Sylvain UL; Robert, Jérémy UL; Le Traon, Yves UL et al

in Proceedings of the 17th International Digital Government Research Conference on Digital Government Research (2016, June 07)

In recent years, more and more Open Data has become available and been used as part of the Open Data movement. However, there are reported issues with the quality of the metadata in data portals and the data itself. This is a serious risk that could disrupt the Open Data project, as well as e-government initiatives, since the data quality needs to be managed to guarantee the reliability of e-government to the public. The first quality assessment frameworks are emerging to evaluate the quality of a given dataset or portal along various dimensions (e.g., information completeness). Nonetheless, a common problem with such frameworks is to provide meaningful ranking mechanisms that are able to integrate several quality dimensions and user preferences (e.g., a portal provider is likely to have different quality preferences than a portal consumer). To address this multi-criteria decision making problem, our research work applies AHP (Analytic Hierarchy Process), which compares 146 active Open Data portals across 44 countries, powered by the CKAN software.
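
To make the AHP-based ranking idea concrete, here is a minimal sketch. It assumes the row geometric-mean method for deriving criterion weights from a pairwise comparison matrix (a common approximation of the principal eigenvector, exact for consistent matrices) and a simple weighted sum for scoring portals. The quality dimensions, portal names, and scores are invented for illustration and do not come from the paper.

```python
import math


def ahp_weights(pairwise):
    """Derive priority weights from a reciprocal pairwise comparison matrix.

    pairwise[i][j] states how much more important criterion i is than j.
    Uses the row geometric-mean method, then normalizes to sum to 1.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_portals(scores, weights):
    """Rank portals by the weighted sum of their per-criterion scores."""
    totals = {
        name: sum(w * s for w, s in zip(weights, values))
        for name, values in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical user preferences over three quality dimensions.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical per-dimension quality scores in [0, 1].
scores = {
    "portal_a": [0.9, 0.2, 0.1],
    "portal_b": [0.4, 0.9, 0.9],
}
ranking = rank_portals(scores, weights)
```

A different pairwise matrix, for example one reflecting a portal provider rather than a consumer, would yield different weights and potentially a different ranking.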

Peer Reviewed
AndroZoo: Collecting Millions of Android Apps for the Research Community
Allix, Kevin UL; Bissyande, Tegawendé François D Assise UL; Klein, Jacques UL et al

in Proceedings of the 13th International Workshop on Mining Software Repositories (2016, May)

We present a growing collection of Android Applications collected from several sources, including the official Google Play app market. Our dataset, AndroZoo, currently contains more than three million apps, each of which has been analysed by tens of different AntiVirus products to know which applications are detected as Malware. We provide this dataset to contribute to ongoing research efforts, as well as to enable new potential research topics on Android Apps. By releasing our dataset to the research community, we also aim at encouraging our fellow researchers to engage in reproducible experiments.

Static Analysis of Android Apps: A Systematic Literature Review
Li, Li UL; Bissyande, Tegawendé François D Assise UL; Papadakis, Mike UL et al

Report (2016)

Context: Static analysis approaches have been proposed to assess the security of Android apps, by searching for known vulnerabilities or actual malicious code. The literature thus has proposed a large body of works, each of which attempts to tackle one or more of the several challenges that program analyzers face when dealing with Android apps. Objective: We aim to provide a clear view of the state-of-the-art works that statically analyze Android apps, from which we highlight the trends of static analysis approaches, pinpoint where the focus has been put and enumerate the key aspects where future research is still needed. Method: We have performed a systematic literature review which involves studying around 90 research papers published in software engineering, programming languages and security venues. This review is performed mainly in five dimensions: problems targeted by the approach, fundamental techniques used by authors, static analysis sensitivities considered, Android characteristics taken into account and the scale of evaluation performed. Results: Our in-depth examination has led to several key findings: 1) Static analysis is largely performed to uncover security and privacy issues; 2) The Soot framework and the Jimple intermediate representation are the most adopted basic support tool and format, respectively; 3) Taint analysis remains the most applied technique in research approaches; 4) Most approaches support several analysis sensitivities, but very few approaches consider path-sensitivity; 5) There is no single work that has been proposed to tackle all challenges of static analysis that are related to Android programming; and 6) Only a small portion of state-of-the-art works have made their artifacts publicly available. Conclusion: The research community is still facing a number of challenges for building approaches that are aware altogether of implicit flows, dynamic code loading features, reflective calls, native code and multi-threading, in order to implement sound and highly precise static analyzers.

Peer Reviewed
Towards a Generic Framework for Automating Extensive Analysis of Android Applications
Li, Li UL; Li, Daoyuan UL; Bartel, Alexandre et al

in The 31st ACM/SIGAPP Symposium on Applied Computing (SAC 2016) (2016, April)

Despite much effort in the community, the momentum of Android research has not yet produced complete tools to perform thorough analysis on Android apps, leaving users vulnerable to malicious apps. Because it is hard for a single tool to efficiently address all of the various challenges of Android programming which make analysis difficult, we propose to instrument the app code to reduce the analysis complexity, e.g., transforming a hard problem into an easily resolvable one. To this end, we introduce in this paper Apkpler, a plugin-based framework for supporting such instrumentation. We evaluate Apkpler with two plugins, demonstrating the feasibility of our approach and showing that Apkpler can indeed be leveraged to reduce the analysis complexity of Android apps.

Peer Reviewed
Near Real-Time Electric Load Approximation in Low Voltage Cables of Smart Grids with Models@run.time
Hartmann, Thomas UL; Moawad, Assaad UL; Fouquet, François UL et al

in 31st Annual ACM Symposium on Applied Computing (SAC'16) (2016, April)

Micro-generations and future grid usages, such as charging of electric cars, raise major challenges to monitor the electric load in low-voltage cables. Due to the highly interconnected nature of the grid, real-time measurements are problematic, both economically and technically. This entails an overload risk in electricity networks when cables must be disconnected for maintenance reasons or are accidentally damaged. Therefore, it is of great interest for electricity grid providers to anticipate the load in networks and detect failures more quickly. However, computing the electric load in cables requires computationally intensive power flow calculations and live consumption measurements. Today’s view of the grid is usually based on on-field documentation of cables, fuses, and measurements by technicians and is therefore often outdated. Thus, the electric load is usually only simulated in case of major topology variations. However, live measurements of smart meters provide new opportunities. In this paper we present a novel approach for a near real-time electric load approximation by deriving the current electric topology and cable loads live from smart meter data. We leverage the models@run.time paradigm to combine live measurements with topology characteristics of the grid. Our approach makes it possible to approximate the load in cables, not only for the current grid topology, but also to simulate topology changes for maintenance purposes. We showed that this allows a near real-time approximation while remaining very accurate (average deviation of 1.89% compared to offline power-flow calculation tools). Developed with a grid operator, this approach will be integrated in a monitoring and warning system and as an embeddable solution for on-field simulation.

Peer Reviewed
An Investigation into the Use of Common Libraries in Android Apps
Li, Li UL; Bissyande, Tegawendé François D Assise UL; Klein, Jacques UL et al

in The 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2016) (2016, March)

The packaging model of Android apps requires the entire code necessary for the execution of an app to be shipped into one single apk file. Thus, an analysis of Android apps often visits code which is not part of the functionality delivered by the app. Such code is often contributed by the common libraries which are used pervasively by all apps. Unfortunately, Android analyses, e.g., for piggybacking detection and malware detection, can produce inaccurate results if they do not take into account the case of library code, which constitutes noise in app features. Despite some efforts on investigating Android libraries, the momentum of Android research has not yet produced a complete set of common libraries to further support in-depth analysis of Android apps. In this paper, we leverage a dataset of about 1.5 million apps from Google Play to harvest potential common libraries, including advertisement libraries. With several steps of refinements, we finally collect by far the largest set of 1,113 libraries supporting common functionality and 240 libraries for advertisement. We use the dataset to investigate several aspects of Android libraries, including their popularity and their proportion in Android app code. Based on these datasets, we have further performed several empirical investigations to confirm the motivations behind our work.

Peer Reviewed
Profiling household appliance electricity usage with n-gram language modeling
Li, Daoyuan UL; Bissyande, Tegawendé François D Assise UL; Kubler, Sylvain UL et al

in The 2016 IEEE International Conference on Industrial Technology (ICIT 2016) (2016, March)

Peer Reviewed
Parameter Values of Android APIs: A Preliminary Study on 100,000 Apps
Li, Li UL; Bissyande, Tegawendé François D Assise UL; Klein, Jacques UL et al

in The 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2016) (2016, March)

Parameter values are important elements for understanding how Application Programming Interfaces (APIs) are used in practice. In the context of Android, a small number of API methods are used pervasively by millions of apps, where these API methods provide app core functionality. In this paper, we present preliminary insights from ParamHarver, a purely static analysis approach for automatically extracting parameter values from Android apps. Investigations on 100,000 apps illustrate how an in-depth study of parameter values can be leveraged in various scenarios (e.g., to recommend relevant parameter values, or even, to some extent, to identify malicious apps).

Peer Reviewed
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Octeau, Damien; Jha, Somesh; Dering, Matthew et al

in The 43rd Symposium on Principles of Programming Languages (POPL 2016) (2016, January)

Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.

Peer Reviewed
Feature Location Benchmark for Software Families using Eclipse Community Releases
Martinez, Jabier UL; Ziadi, Tewfik; Papadakis, Mike UL et al

in Software Reuse: Bridging with Social-Awareness, ICSR 2016 Proceedings (2016)

Peer Reviewed
On the Lack of Consensus in Anti-Virus Decisions: Metrics and Insights on Building Ground Truths of Android Malware
Hurier, Médéric UL; Allix, Kevin UL; Bissyande, Tegawendé François D Assise UL et al

in Detection of Intrusions and Malware, and Vulnerability Assessment - 13th International Conference (2016)

There is generally a lack of consensus in Antivirus (AV) engines' decisions on a given sample. This challenges the building of authoritative ground-truth datasets. Instead, researchers and practitioners may rely on unvalidated approaches to build their ground truth, e.g., by considering decisions from a selected set of Antivirus vendors or by setting up a threshold number of positive detections before classifying a sample. Both approaches are biased as they implicitly either decide on ranking AV products, or they consider that all AV decisions have equal weights. In this paper, we extensively investigate the lack of agreement among AV engines. To that end, we propose a set of metrics that quantitatively describe the different dimensions of this lack of consensus. We show how our metrics can bring important insights by using the detection results of 66 AV products on 2 million Android apps as a case study. Our analysis focuses not only on AV binary decisions but also on the notoriously hard problem of labels that AVs associate with suspicious files, and highlights biases hidden in the collection of a malware ground truth, a foundation stone of any machine learning-based malware detection approach.
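
The two ground-truth heuristics that the abstract calls biased can be sketched as below. The engine names and the report are hypothetical, and the functions only illustrate the heuristics being criticized, not the metrics proposed in the paper.

```python
def label_by_threshold(detections, threshold):
    """Flag a sample as malware if at least `threshold` engines detect it.

    This treats every AV decision as having equal weight.
    """
    positives = sum(1 for flagged in detections.values() if flagged)
    return positives >= threshold


def label_by_trusted_vendors(detections, trusted):
    """Flag a sample as malware if any engine in `trusted` detects it.

    This implicitly ranks the selected vendors above all others.
    """
    return any(detections.get(name, False) for name in trusted)


# Hypothetical scan report for one app: engine name -> detected?
report = {"av_a": True, "av_b": False, "av_c": True}

by_threshold = label_by_threshold(report, threshold=2)
by_vendor = label_by_trusted_vendors(report, trusted={"av_b"})
```

With this report the threshold rule flags the sample while the trusted-vendor rule does not, which shows how the choice of heuristic alone can change the ground-truth label.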

Peer Reviewed
Comparing White-box and Black-box Test Prioritization
Henard, Christopher UL; Papadakis, Mike UL; Harman, Mark et al

in 38th International Conference on Software Engineering (ICSE'16) (2016)

Although white-box regression test prioritization has been well-studied, the more recently introduced black-box prioritization approaches have neither been compared against each other nor against more well-established white-box techniques. We present a comprehensive experimental comparison of several test prioritization techniques, including well-established white-box strategies and more recently introduced black-box approaches. We found that Combinatorial Interaction Testing and diversity-based techniques (Input Model Diversity and Input Test Set Diameter) perform best among the black-box approaches. Perhaps surprisingly, we found little difference between black-box and white-box performance (at most 4% fault detection rate difference). We also found the overlap between black- and white-box faults to be high: the first 10% of the prioritized test suites already agree on at least 60% of the faults found. These are positive findings for practicing regression testers who may not have source code available, thereby making white-box techniques inapplicable. We also found evidence that both black-box and white-box prioritization remain robust over multiple system releases.

Peer Reviewed
Dynamic Risk Analyses and Dependency-Aware Root Cause Model for Critical Infrastructures
Muller, Steve UL; Harpes, Carlo; Le Traon, Yves UL et al

in International Conference on Critical Information Infrastructures Security (2016)

Critical Infrastructures are known for their complexity and the strong interdependencies between the various components. As a result, cascading effects can have devastating consequences, while foreseeing the overall impact of a particular incident is far from straightforward and goes beyond performing a simple risk analysis. This work presents a graph-based approach for conducting dynamic risk analyses, which are programmatically generated from a threat model and an inventory of assets. In contrast to traditional risk analyses, they can be kept automatically up-to-date and show the risk currently faced by a system in real-time. The concepts are applied to and validated in the context of the smart grid infrastructure currently being deployed in Luxembourg.

Peer Reviewed
O-MI/O-DF Standards as Interoperability Enablers for Industrial Internet: a Performance Analysis
Robert, Jérémy UL; Kubler, Sylvain UL; Le Traon, Yves UL et al

in O-MI/O-DF Standards as Interoperability Enablers for Industrial Internet: a Performance Analysis (2016)

The Industrial Internet should provide means to create ad hoc and loosely coupled information flows between objects, users, services, and business domain systems. However, today’s technologies and products often feed ‘vertical silos’ (e.g., vertical/siloed apps), which inevitably result in multiple and non-interoperable systems. Standardization will play an ever-increasing part in enabling information to flow between such vertically-oriented closed systems. This paper presents recent IoT messaging standards, notably O-MI (Open Messaging Interface) and O-DF (Open Data Format), whose initial requirements were defined for enhanced collaboration and interoperability in product lifecycle management. A first analytical model of the minimal traffic load (in bytes) to fulfil the required/basic standard specifications is then proposed. A smart maintenance use case relying on the first version of the standard reference implementation is developed, based on which our analytical model is applied to evaluate the degree of deviation (w.r.t. the standard specifications) of this reference implementation.

Peer Reviewed
Mining Families of Android Applications for Extractive SPL Adoption
Li, Li UL; Martinez, Jabier UL; Ziadi, Tewfik et al

in The 20th International Systems and Software Product Line Conference (SPLC 2016) (2016)

The myriad smartphones around the globe have given rise to a vast proliferation of mobile applications. These applications target an increasing number of user profiles and tasks. In this context, Android is a leading technology for their development and on-line markets are the main means for their distribution. In this paper we motivate, from two perspectives, the mining of these markets with the objective to identify families of app variants in the wild. The first perspective is related to research activities where building realistic case studies for evaluating extractive SPL adoption techniques is needed. The second is related to a large-scale, world-wide and time-aware study of reuse practice in an industry which is now flourishing among all others within the software engineering community. This study is relevant to assess potential for SPLE practices adoption. We present initial implementations of the mining process and we discuss analyses of variant families.

Peer Reviewed
“Overloaded!” — A Model-based Approach to Database Stress Testing
Meira, Jorge Augusto UL; Almeira, Eduardo Cunha de; Kim, Dongsun UL et al

in International Conference on Database and Expert Systems Applications, Porto 5-8 September 2016 (2016)
