References of "Lo, David"
     in
Bookmark and Share    
Full Text
Peer Reviewed
See detailWatch out for This Commit! A Study of Influential Software Changes
Li, Daoyuan UL; Li, Li UL; Kim, Dongsun UL et al

in Journal of Software: Evolution and Process (2019)

One single code change can significantly influence a wide range of software systems and their users. For example, 1) adding a new feature can spread defects in several modules, while 2) changing an API ... [more ▼]

One single code change can significantly influence a wide range of software systems and their users. For example, 1) adding a new feature can spread defects in several modules, while 2) changing an API method can improve the performance of all client programs. Developers often may not clearly know whether their or others’ changes are influential at commit time. Rather, it turns out to be influential after affecting many aspects of a system later. This paper investigates influential software changes and proposes an approach to identify them early, i.e., immediately when they are applied. We first conduct a post- mortem analysis to discover existing influential changes by using intuitions such as isolated changes and changes referred by other changes in 10 open source projects. Then we re-categorize all identified changes through an open-card sorting process. Subsequently, we conduct a survey with 89 developers to confirm our influential change categories. Finally, from our ground truth we extract features, including metrics such as the complexity of changes, terms in commit logs and file centrality in co-change graphs, to build ma- chine learning classifiers. The experiment results show that our prediction model achieves overall with random samples 86.8% precision, 74% recall and 80.4% F-measure respectively. [less ▲]

Detailed reference viewed: 270 (22 UL)
Full Text
Peer Reviewed
See detailAugmenting and Structuring User Queries to Support Efficient Free-Form Code Search
Sirres, Raphael; Bissyande, Tegawendé François D Assise UL; Kim, Dongsun et al

in Empirical Software Engineering (2018), 90

Detailed reference viewed: 139 (6 UL)
Full Text
Peer Reviewed
See detailOn Locating Malicious Code in Piggybacked Android Apps
Li, Li UL; Li, Daoyuan UL; Bissyande, Tegawendé François D Assise UL et al

in Journal of Computer Science and Technology (2017)

To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to ... [more ▼]

To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to provide a framework for dissecting malware and locating malicious program fragments within app code in order to build a comprehensive dataset of malicious samples. Towards addressing this need, we propose in this work a tool-based approach called HookRanker, which provides ranked lists of potentially malicious packages based on the way malware behaviour code is triggered. With experiments on a ground truth of piggybacked apps, we are able to automatically locate the malicious packages from piggybacked Android apps with an accuracy@5 of 83.6% for such packages that are triggered through method invocations and an accuracy@5 of 82.2% for such packages that are triggered independently. [less ▲]

Detailed reference viewed: 230 (10 UL)
Full Text
Peer Reviewed
See detailUnderstanding Android App Piggybacking
Li, Li UL; Li, Daoyuan UL; Bissyande, Tegawendé François D Assise UL et al

Poster (2017, May)

The Android packaging model offers adequate opportunities for attackers to inject malicious code into popular benign apps, attempting to develop new malicious apps that can then be easily spread to a ... [more ▼]

The Android packaging model offers adequate opportunities for attackers to inject malicious code into popular benign apps, attempting to develop new malicious apps that can then be easily spread to a large user base. Despite the fact that the literature has already presented a number of tools to detect piggybacked apps, there is still lacking a comprehensive investigation on the piggybacking processes. To fill this gap, in this work, we collect a large set of benign/piggybacked app pairs that can be taken as benchmark apps for further investigation. We manually look into these benchmark pairs for understanding the characteristics of piggybacking apps and eventually we report 20 interesting findings. We expect these findings to initiate new research directions such as practical and scalable piggybacked app detection, explainable malware detection, and malicious code location. [less ▲]

Detailed reference viewed: 289 (12 UL)
Full Text
Peer Reviewed
See detailAutomatically Locating Malicious Packages in Piggybacked Android Apps
Li, Li UL; Li, Daoyuan UL; Bissyande, Tegawendé François D Assise UL et al

in Abstract book of the 4th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MobileSoft 2017) (2017, May)

To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to ... [more ▼]

To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to provide a framework for dissecting malware and locating malicious program fragments within app code in order to build a comprehensive dataset of malicious samples. Towards addressing this need, we propose in this work a tool-based approach called HookRanker, which provides ranked lists of potentially malicious packages based on the way malware behaviour code is triggered. With experiments on a ground truth set of piggybacked apps, we are able to automatically locate the malicious packages from piggybacked Android apps with an accuracy of 83.6% in verifying the top five reported items. [less ▲]

Detailed reference viewed: 334 (23 UL)
Full Text
Peer Reviewed
See detailUnderstanding Android App Piggybacking: A Systematic Study of Malicious Code Grafting
Li, Li UL; Li, Daoyuan UL; Bissyande, Tegawendé François D Assise UL et al

in IEEE Transactions on Information Forensics and Security (2017)

The Android packaging model offers ample opportunities for malware writers to piggyback malicious code in popular apps, which can then be easily spread to a large user base. Although recent research has ... [more ▼]

The Android packaging model offers ample opportunities for malware writers to piggyback malicious code in popular apps, which can then be easily spread to a large user base. Although recent research has produced approaches and tools to identify piggybacked apps, the literature lacks a comprehensive investigation into such phenomenon. We fill this gap by 1) systematically building a large set of piggybacked and benign apps pairs, which we release to the community, 2) empirically studying the characteristics of malicious piggybacked apps in comparison with their benign counterparts, and 3) providing insights on piggybacking processes. Among several findings providing insights, analysis techniques should build upon to improve the overall detection and classification accuracy of piggybacked apps, we show that piggybacking operations not only concern app code but also extensively manipulates app resource files, largely contradicting common beliefs. We also find that piggybacking is done with little sophistication, in many cases automatically, and often via library code. [less ▲]

Detailed reference viewed: 356 (28 UL)
Full Text
Peer Reviewed
See detailComprehending Malicious Android Apps By Mining Topic-Specific Data Flow Signatures
Yang, Xinli; Lo, David; Li, Li UL et al

in Information and Software Technology (2017)

Context: State-of-the-art works on automated detection of Android malware have leveraged app descriptions to spot anomalies w.r.t the functionality implemented, or have used data flow information as a ... [more ▼]

Context: State-of-the-art works on automated detection of Android malware have leveraged app descriptions to spot anomalies w.r.t the functionality implemented, or have used data flow information as a feature to discriminate malicious from benign apps. Although these works have yielded promising performance, we hypothesize that these performances can be improved by a better understanding of malicious behavior. Objective: To characterize malicious apps, we take into account both information on app descriptions, which are indicative of apps’ topics, and information on sensitive data flow, which can be relevant to discriminate malware from benign apps. Method: In this paper, we propose a topic-specific approach to malware comprehension based on app descriptions and data-flow information. First, we use an advanced topic model, adaptive LDA with GA, to cluster apps according to their descriptions. Then, we use information gain ratio of sensitive data flow information to build so-called “topic-specific data flow signatures”. Results: We conduct an empirical study on 3691 benign and 1612 malicious apps. We group them into 118 topics and generate topic-specific data flow signature. We verify the effectiveness of the topic-specific data flow signatures by comparing them with the overall data flow signature. In addition, we perform a deeper analysis on 25 representative topic-specific signatures and yield several implications. Conclusion: Topic-specific data flow signatures are efficient in highlighting the malicious behavior, and thus can help in characterizing malware. [less ▲]

Detailed reference viewed: 259 (13 UL)
Full Text
See detailAugmenting and Structuring User Queries to Support Efficient Free-Form Code Search
Sirres, Raphael; Bissyande, Tegawendé François D Assise UL; Kim, Dongsun UL et al

Report (2017)

Source code terms such as method names and variable types are often different from conceptual words mentioned in a search query. This vocabulary mismatch problem can make code search inefficient. In this ... [more ▼]

Source code terms such as method names and variable types are often different from conceptual words mentioned in a search query. This vocabulary mismatch problem can make code search inefficient. In this paper, we present Code voCABUlary (CoCaBu), an approach to resolving the vocabulary mismatch problem when dealing with free-form code search queries. Our approach leverages common developer questions and the associated expert answers to augment user queries with the relevant, but missing, structural code entities in order to improve the performance of matching relevant code examples within large code repositories. To instantiate this approach, we build GitSearch, a code search engine, on top of GitHub and StackOverflow Q\&A data. We evaluate GitSearch in several dimensions to demonstrate that (1) its code search results are correct with respect to user-accepted answers; (2) the results are qualitatively better than those of existing Internet-scale code search engines; (3) our engine is competitive against web search engines, such as Google, in helping users complete solve programming tasks; and (4) GitSearch provides code examples that are acceptable or interesting to the community as answers for StackOverflow questions. [less ▲]

Detailed reference viewed: 318 (31 UL)
Full Text
Peer Reviewed
See detailGot Issues? Who Cares About It? A Large Scale Investigation of Issue Trackers from GitHub
Bissyande, Tegawendé François D Assise UL; Lo, David; Jiang, Lingxiao et al

in Proceedings of the 24th International Symposium on Software Reliability Engineering (ISSRE 2013) (2013, November)

Detailed reference viewed: 227 (12 UL)
Full Text
Peer Reviewed
See detailAn Empirical Study of Adoption of Software Testing in Open Source Projects
Pavneet Singh, Kochaar; Bissyande, Tegawendé François D Assise UL; Lo, David et al

in Proceedings of the 13th International Conference on Quality Software (QSIC 2013) (2013)

Testing is an indispensable part of software development efforts. It helps to improve the quality of software systems by finding bugs and errors during development and deployment. Huge amount of resources ... [more ▼]

Testing is an indispensable part of software development efforts. It helps to improve the quality of software systems by finding bugs and errors during development and deployment. Huge amount of resources are spent on testing efforts. However, to what extent are they used in practice? In this study, we investigate the adoption of testing in open source projects. We study more than 20,000 non-trivial software projects and explore the correlation of test cases with various project development characteristics including: project size, development team size, number of bugs, number of bug reporters, and the programming languages of these projects. [less ▲]

Detailed reference viewed: 260 (3 UL)