Preprocessing is What You Need: Understanding and Predicting the Complexity of SAT-based Uniform Random Sampling

ZEYEN, Olivier; CORDY, Maxime; Perrouin, Gilles; Acher, Mathieu

doi:10.1145/3644033.3644371

Download

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

Preprocessing is What You Need: Understanding and Predicting the Complexity of SAT-based Uniform Random Sampling

ZEYEN, Olivier; CORDY, Maxime; Perrouin, Gilles et al.

2024 • In Proceedings - 2024 IEEE/ACM 12th International Conference on Formal Methods in Software Engineering, FormaliSE 2024

Peer reviewed

Permalink
https://hdl.handle.net/10993/61583

DOI
10.1145/3644033.3644371

Files (1)Send to Details Statistics Bibliography Similar publications

Files

Full Text

FormaliSE2024_UnderstandingSATScalability-2.pdf

Publisher postprint (616.08 kB)

Download

All documents in ORBilu are protected by a user license.

Send to

RIS BibTex APA Chicago Permalink X Linkedin

Details

Keywords :

Boolean formulae; Boolean satisfiability problems; Performance; Random sampling; SAT solvers; SAT-based; Search spaces; Software; Uniform Random Sampling

Abstract :

[en] Despite its NP-completeness, the Boolean satisfiability problem gave birth to highly efficient tools that are able to find solutions to a Boolean formula and compute their number. Boolean formulae compactly encode huge, constrained search spaces for variability-intensive systems, e.g., the possible configurations of the Linux kernel. These search spaces are generally too big to explore exhaustively, leading most testing approaches to sample a few solutions before analysing them. A desirable property of such samples is uniformity: each solution should get the same selection probability. This property motivated the design of uniform random samplers, relying on SAT solvers and counters and achieving different tradeoffs between uniformity and scalability. Though we can observe their performance in practice, understanding the complexity these tools face and accurately predicting it is an under-explored problem. Indeed, structural metrics such as the number of variables and clauses involved in a formula poorly predict the sampling complexity. More elaborated ones, such as minimal independent support (MIS), are intractable to compute on large formulae. We provide an efficient parallel algorithm to compute a related metric, the number of equivalence classes, and demonstrate that this metric is highly correlated to time and memory usage of uniform random sampling and model counting tools. We explore the role of formula preprocessing on various metrics and show its positive influence on correlations. Relying on these correlations, we train an efficient classifier (F1-score 0.97) to predict whether uniformly sampling a given formula will exceed a specified budget. Our results allow us to characterise the similarities and differences between (uniform) sampling, solving and counting. c 2024

Disciplines :

Computer science

Author, co-author :

ZEYEN, Olivier ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal

CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal

Perrouin, Gilles ; PReCISE/NaDI, University of Namur, Belgium

Acher, Mathieu ; Univ Rennes, Inria, CNRS, IRISA, France

External co-authors :

yes

Language :

English

Title :

Preprocessing is What You Need: Understanding and Predicting the Complexity of SAT-based Uniform Random Sampling

Publication date :

14 April 2024

Event name :

Proceedings of the 2024 IEEE/ACM 12th International Conference on Formal Methods in Software Engineering (FormaliSE)

Event place :

Lisbon, Prt

Event date :

14-04-2024 => 15-04-2024

Main work title :

Proceedings - 2024 IEEE/ACM 12th International Conference on Formal Methods in Software Engineering, FormaliSE 2024

Publisher :

Association for Computing Machinery, Inc

ISBN/EAN :

9798400705892

Peer reviewed :

Peer reviewed

Additional URL :

https://dl.acm.org/doi/pdf/10.1145/3644033.3644371

Funders :

Luxembourg National Research Fund (FNR)

Funding text :

This research was funded in whole, or in part, by the Luxembourg National Research Fund (FNR). Gilles Perrouin is a FNRS Research Associate. Maxime Cordy and Olivier Zeyen are supported by FNR Luxembourg (grants C19/IS/13566661/BEEHIVE/Cordy and AFR Grant 17047437).

Available on ORBilu :

since 10 July 2024

Statistics

Number of views

113 (7 by Unilu)

Number of downloads

50 (2 by Unilu)

More statistics

Scopus citations^®

Scopus citations^®
without self-citations

OpenAlex citations

Bibliography

Mathieu Acher, Gilles Perrouin, and Maxime Cordy. 2021. BURST: A benchmarking platform for uniform random sampling techniques. In SPLC '21: 25th ACM International Systems and Software Product Line Conference, Leicester, United Kindom, September 6-11, 2021, Volume B, Mohammad Reza Mousavi and Pierre-Yves Schobbens (Eds.). ACM, 36-40. https://doi.org/10.1145/3461002.3473070
D. Achlioptas, Zayd Hammoudeh, and P. Theodoropoulos. 2018. Fast Sampling of Perfectly Uniform Satisfying Assignments. In SAT.
Tasniem Nasser Alyahya, Mohamed El Bachir Menai, and Hassan Mathkour. 2022. On the Structure of the Boolean Satisfiability Problem: A Survey. ACM Comput. Surv. 55, 3, Article 46 (mar 2022), 34 pages. https://doi.org/10.1145/3491210
Anonym. 2023. EQV and preprocessing for URS prediction. https://github.com/serval-uni-lu/urs-scal.
Teodora Baluta, Zheng Leong Chua, Kuldeep S. Meel, and Prateek Saxena. 2021. Scalable Quantitative Verification For Deep Neural Networks. In 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Madrid, Spain, 22-30 May 2021. IEEE, 312-323. https://doi.org/10.1109/ICSE43902.2021.00039
D. Batory, D. Benavides, and A. Ruiz-Cortés. 2006. Automated Analysis of Feature Models: Challenges Ahead. Commun. ACM (December 2006).
Supratik Chakraborty, Daniel J. Fremont, Kuldeep S. Meel, Sanjit A. Seshia, and Moshe Y. Vardi. 2015. On Parallel Scalable Uniform SAT Witness Generation. In Tools and Algorithms for the Construction and Analysis of Systems TACAS'15 2015, London, UK, April 11-18, 2015. Proceedings. 304-319.
Supratik Chakraborty, Kuldeep S. Meel, and Moshe Y. Vardi. 2014. Balancing Scalability and Uniformity in SAT Witness Generator. In Proceedings of the 51st Annual Design Automation Conference (San Francisco, CA, USA) (DAC '14). ACM, New York, NY, USA, Article 60, 6 pages. https://doi.org/10.1145/2593069.2593097
Michael Codish, Yoav Fekete, and Amit Metodi. 2013. Backbones for Equality. In Haifa Verification Conference.
Maxime Cordy, Mike Papadakis, and Axel Legay. 2020. Statistical Model Checking for Variability-Intensive Systems. In Fundamental Approaches to Software Engineering-23rd International Conference, FASE 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings (Lecture Notes in Computer Science, Vol. 12076), Heike Wehrheim and Jordi Cabot (Eds.). Springer, 294-314. https://doi.org/10.1007/978-3-030-45234-6-15
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An efficient SMT solver. In TACAS'08 (Budapest, Hungary). Springer-Verlag, 337-340.
Axel de Perthuis de Laillevault, Benjamin Doerr, and Carola Doerr. 2015. Money for Nothing: Speeding Up Evolutionary Algorithms Through Better Initialization. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation (Madrid, Spain) (GECCO '15). ACM, New York, NY, USA, 815-822. https://doi.org/10.1145/2739480.2754760
Rafael Dutra, Kevin Laeufer, Jonathan Bachrach, and Koushik Sen. 2018. Efficient sampling of SAT solutions for testing. In Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, May 27-June 03, 2018. 549-559. https://doi.org/10.1145/3180155.3180248
Niklas Eén and Niklas Sörensson. 2003. An Extensible SAT-solver. In SAT'03. Springer, 502-518.
Vijay Ganesh and Moshe Y. Vardi. 2021. On The Unreasonable Effectiveness of SAT Solvers. In Beyond the worst-case analysis of algorithms, Tim Roughgarden (Ed.). Cambridge University Press, 547-563.
Priyanka Golia, M. Soos, Sourav Chakraborty, and Kuldeep S. Meel. 2021. Designing Samplers is Easy: The Boon of Testers. 2021 Formal Methods in Computer Aided Design (FMCAD) (2021), 222-230.
Axel Halin, Alexandre Nuttinck, Mathieu Acher, Xavier Devroey, Gilles Perrouin, and Benoit Baudry. 2018. Test them all, is it worth it? Assessing configuration sampling on the JHipsterWeb development stack. Empirical Software Engineering (17 Jul 2018). https://doi.org/10.1007/s10664-018-9635-4
Michael Hamann and Ben Strasser. 2015. Graph Bisection with Pareto Optimization. Journal of Experimental Algorithmics (JEA) 23 (2015), 1-34. https://api.semanticscholar.org/CorpusID:3395784
Alexander Ivrii, Sharad Malik, Kuldeep S Meel, and Moshe Y Vardi. 2016. On computing minimal independent support and its applications to sampling and counting. Constraints 21, 1 (2016), 41-58.
Martin Fagereng Johansen, Øystein Haugen, and Franck Fleurey. 2011. Properties of Realistic Feature Models Make Combinatorial Testing of Product Lines Feasible. In Model Driven Engineering Languages and Systems: 14th International Conference, MODELS '11, Jon Whittle, Tony Clark, and Thomas Köhne (Eds.). Springer, 638-652.
Christian Kaltenecker, Alexander Grebhahn, Norbert Siegmund, Jianmei Guo, and Sven Apel. 2019. Distance-based sampling of software configuration spaces. In Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019, Joanne M. Atlee, Tevfik Bultan, and Jon Whittle (Eds.). IEEE /ACM, 1084-1094.
K. Kang, S. Cohen, J. Hess, W. Novak, and S. Peterson. 1990. Feature-Oriented Domain Analysis (FODA). Technical Report CMU/SEI-90-TR-21. SEI.
Alexander Knöppel, Thomas Thöm, Stephan Mennicke, Jens Meinicke, and Ina Schaefer. 2017. Is there a mismatch between real-world feature models and product-line research?. In Proceedings of the 2017 11th JointMeeting on Foundations of Software Engineering, ESEC/FSE 2017, Paderborn, Germany, September 4-8, 2017. 291-302. https://doi.org/10.1145/3106237.3106252
Sebastian Krieter, Thomas Thöm, Sandro Schulze, Reimar Schröter, and Gunter Saake. 2018. Propagating configuration decisions with modal implication graphs. In Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, May 27-June 03, 2018. 898-909. https://doi.org/10. 1145/3180155.3180159
Jean-Marie Lagniez and Pierre Marquis. 2017. An Improved Decision-DNNF Compiler. In IJCAI.
Jia Hui Liang, Vijay Ganesh, Krzysztof Czarnecki, and Venkatesh Raman. 2015. SAT-based Analysis of Large Real-world Feature Models is Easy. In Proceedings of the 19th International Conference on Software Product Line (Nashville, Tennessee) (SPLC '15). ACM, New York, NY, USA, 91-100. https://doi.org/10.1145/2791060. 2791070
R. Mateescu. 2011. Treewidth in Industrial SAT Benchmarks.
Marcilio Mendonca, Andrzej Wasowski, and Krzysztof Czarnecki. 2009. SATbased analysis of feature models is easy. In SPLC'09 (San Francisco, California). IEEE, 231-240.
Rémi Monasson, Riccardo Zecchina, Scott Kirkpatrick, Bart Selman, and Lidror Troyansky. 1999. Determining computational complexity from characteristic 'phase transitions'. Nature 400, 6740 (1999), 133-137.
Jeho Oh, Don S. Batory, Margaret Myers, and Norbert Siegmund. 2017. Finding near-optimal configurations in product lines by random sampling. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ESEC/FSE 2017, Paderborn, Germany, September 4-8, 2017, Eric Bodden, Wilhelm Schäfer, Arie van Deursen, and Andrea Zisman (Eds.). ACM, 61-71. https://doi.org/10. 1145/3106237.3106273
Jeho Oh, Paul Gazzillo, and Don S. Batory. 2019.T-wise coverage by uniform sampling. In Proceedings of the 23rd International Systems and Software Product Line Conference, SPLC 2019, Volume A, Paris, France, September 9-13, 2019, Thorsten Berger, Philippe Collet, Laurence Duchien, Thomas Fogdal, Patrick Heymans, Timo Kehrer, Jabier Martinez, Raúl Mazo, Leticia Montalvillo, Camille Salinesi, Xhevahire Tërnava, Thomas Thöm, and Tewfik Ziadi (Eds.). ACM, 15:1-15:4.
Johan Oudinet, Alain Denise, Marie-Claude Gaudel, Richard Lassaigne, and Sylvain Peyronnet. 2011. Uniform Monte-Carlo Model Checking. In Fundamental Approaches to Software Engineering-14th International Conference, FASE 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbröcken, Germany, March 26-April 3, 2011. Proceedings. 127-140.
Umut Oztok and Adnan Darwiche. 2014. On Compiling CNF into Decision-DNNF. In CP.
Daniël Paulusma and Stefan Szeider. 2019. On the parameterized complexity of (k, s)-SAT. Inf. Process. Lett. 143 (2019), 34-36.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825-2830.
Tobias Pett, Thomas Thöm, Tobias Runge, Sebastian Krieter, Malte Lochau, and Ina Schaefer. 2019. Product Sampling for Product Lines: The Scalability Challenge. Proceedings of the 23rd International Systems and Software Product Line Conference-Volume A (2019). https://api.semanticscholar.org/CorpusID:85462202
Quentin Plazar, Mathieu Acher, Gilles Perrouin, Xavier Devroey, and Maxime Cordy. 2019. Uniform Sampling of SAT Solutions for Configurable Systems: Are We There Yet?. In 12th IEEE Conference on Software Testing, Validation and Verification, ICST 2019, Xi'an, China, April 22-27, 2019. 240-251.
Matt Raible. 2015. The JHipster mini-book. C4Media.
Pierre-Yves Schobbens, Patrick Heymans, and Jean-Christophe Trigaux. 2006. Feature Diagrams: A Survey and a Formal Semantics. In RE '06: Proceedings of the 14th IEEE International Requirements Engineering Conference (RE'06). IEEE Computer Society, Washington, DC, USA, 136-145. https://doi.org/10.1109/RE. 2006.23
Shubham Sharma, Rahul Gupta, Subhajit Roy, and Kuldeep S. Meel. 2018. Knowledge Compilation meets Uniform Sampling. In Proceedings of International Conference on Logic for Programming Artificial Intelligence and Reasoning (LPAR).
Mate Soos, Stephan Gocht, and Kuldeep S. Meel. 2020. Tinted, Detached, and Lazy CNF-XOR solving and its Applications to Counting and Sampling. In Proceedings of International Conference on Computer-Aided Verification (CAV).
Mate Soos and Kuldeep S Meel. 2021. Arjun: An Efficient Independent Support Computation Technique and its Applications to Counting and Sampling. arXiv preprint arXiv:2110.09026 (2021).
Marc Thurley. 2006. sharpSAT-Counting Models with Advanced Component Caching and Implicit BCP. In Theory and Applications of Satisfiability Testing-SAT 2006, Armin Biere and Carla P. Gomes (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 424-429.