Article (Scientific journals)
BugDoc: Iterative debugging and explanation of pipeline
DE PAULA LOURENCO, Raoni; Freire, Juliana; Simon, Eric et al.
2023, in VLDB Journal, 32 (1), p. 75 - 101
Peer Reviewed verified by ORBi
 

Files


Full Text
s00778-022-00733-5.pdf
Author postprint (2.19 MB)
Details



Keywords :
Data input; Data output; Enterprise analytics; Input and outputs; Input datas; Large scale simulations; Parameter data; Potential sources; Root cause; Software updates; Information Systems; Hardware and Architecture
Abstract :
[en] Applications in domains ranging from large-scale simulations in astrophysics and biology to enterprise analytics rely on computational pipelines. A pipeline consists of modules and their associated parameters, data inputs, and outputs, which are orchestrated to produce a set of results. If some modules derive unexpected outputs, the pipeline can crash or lead to incorrect results. Debugging these pipelines is difficult since there are many potential sources of errors including: bugs in the code, input data, software updates, and improper parameter settings. We present BugDoc, a system that automatically infers the root causes and derives succinct explanations of failures for black-box pipelines. BugDoc does so by using provenance from previous runs of a given pipeline to derive hypotheses for the errors, and then iteratively runs new pipeline configurations to test these hypotheses. Besides identifying issues associated with computational modules in a pipeline, we also propose methods for: “opportunistic group testing” to identify portions of data inputs that might be responsible for failed executions (what we call), helping users narrow down the cause of failure; and “selective instrumentation” to determine nodes in pipelines that should be instrumented to improve efficiency and reduce the number of iterations to test. Through a case study of deployed workflows at a software company and an experimental evaluation using synthetic pipelines, we assess the effectiveness of BugDoc and show that it requires fewer iterations to derive root causes and/or achieves higher quality results than previous approaches.
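The group-testing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not BugDoc's algorithm; it is plain adaptive bisection over the input data, under the assumptions that the pipeline is a black box, that the full input fails, and that any subset containing the culprit record also fails. The names `find_failing_item` and `toy_pipeline_fails` are hypothetical and exist only for this illustration.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def find_failing_item(items: Sequence[T],
                      fails: Callable[[Sequence[T]], bool]) -> T:
    """Bisect `items` to locate one element that makes `fails` return True.

    Assumption (not from the paper): any subset that contains the culprit
    fails, and subsets without it pass.
    """
    if not fails(items):
        raise ValueError("the full input does not fail")
    candidates = list(items)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        left = candidates[:mid]
        # Run the black box on one half; if it passes, the culprit
        # must be in the other half.
        candidates = left if fails(left) else candidates[mid:]
    return candidates[0]


def toy_pipeline_fails(rows: Sequence[int]) -> bool:
    # Hypothetical stand-in for a pipeline run: it "fails" on negative values.
    return any(r < 0 for r in rows)


rows = [3, 8, 1, -5, 7, 2]
culprit = find_failing_item(rows, toy_pipeline_fails)
```

Each loop iteration costs one pipeline run and halves the candidate set, so the number of runs grows logarithmically with the input size when a single record is responsible.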
Disciplines :
Computer science
Author, co-author :
DE PAULA LOURENCO, Raoni; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Freire, Juliana; Tandon School of Engineering, New York University, Brooklyn, United States
Simon, Eric; SAP, Levallois-Perret, France
Weber, Gabriel; Amazon, São Paulo, Brazil
Shasha, Dennis; Courant Institute of Mathematical Sciences, New York University, New York, United States
External co-authors :
yes
Language :
English
Title :
BugDoc: Iterative debugging and explanation of pipeline
Publication date :
2023
Journal title :
VLDB Journal
ISSN :
1066-8888
eISSN :
0949-877X
Publisher :
Springer Science and Business Media Deutschland GmbH
Volume :
32
Issue :
1
Pages :
75 - 101
Peer reviewed :
Peer Reviewed verified by ORBi
Funders :
National Science Foundation
Conselho Nacional de Desenvolvimento Científico e Tecnológico
Defense Advanced Research Projects Agency
Funding text :
We thank the Data X-Ray and Explanation Tables authors for sharing their code with us. We are also grateful to Fernando Chirigati, Neel Dey, and Peter Bailis for providing the real-world pipelines. This work has been supported in part by NSF grants IIS-1916505, IIS-2106888, IOS-1339362, MCB-1158273, MCB-1412232, and OAC-1934464; CNPq (Brazil) grant 209623/2014-4; the DARPA D3M program; and NYU WIRELESS. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of funding agencies.
Available on ORBilu :
since 22 November 2023

Statistics


Number of views
61 (3 by Unilu)
Number of downloads
79 (1 by Unilu)

Scopus citations®
4
Scopus citations® without self-citations
3
OpenCitations
1
OpenAlex citations
3
WoS citations
3
