Abstract:
The rapid evolution of machine learning model training has outpaced the development of corresponding measurement and testing tools, leading to two significant challenges. First, developers of deep learning frameworks struggle to identify operators that fail to meet precision criteria, as these issues may manifest in only a few data points. Second, model trainers lack effective methods to estimate the precision loss caused by operators during training. To address these issues, we introduce a Pythonic framework inspired by the layer definitions common in deep learning networks. Our framework includes two new layers designed to aid measurement and testing: the Fuzz Layer and the Check Layer. The Fuzz Layer introduces minor perturbations to the tensor inputs of any deterministic layer under test (LUT). The Check Layer then measures precision by analyzing the difference between the outputs before and after the perturbation. This approach estimates a lower bound of the maximal relative error and alerts developers or trainers to potential bugs if the difference exceeds a predefined tolerance. Check Layers can also be used independently to conduct precision tests for specific layers, verifying the precision of operators at runtime. Despite the additional memory and time requirements, this runtime testing ensures that the original model is trained properly. We demonstrate the utility of our framework, FuMi, through two experiments. First, we tested 21 torch operators across 9 popular machine learning models implemented in PyTorch for various tasks, and found that the conv2d and linear operators often fail to meet precision requirements. Second, to showcase the generalizability of our framework, we tested the ATTENTION operator. By comparing different implementations of state-of-the-art ATTENTION operators, we found that the maximum relative error of the ATTENTION operator is not less than 1%, which is 13 times larger than that measured by Predoo (a unit testing tool). This framework provides a robust solution for identifying precision issues in deep learning models, ensuring reliable and accurate model training.
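To illustrate the Fuzz/Check mechanism summarized above, the sketch below shows one way such layers could be expressed in PyTorch. It is a minimal illustration only: the class names FuzzLayer and CheckLayer, the perturbation scale epsilon, and the tolerance threshold are assumptions for exposition, not the paper's actual FuMi implementation.

import torch
import torch.nn as nn

class FuzzLayer(nn.Module):
    """Illustrative sketch: add a small relative perturbation to the input of the layer under test."""
    def __init__(self, epsilon=1e-6):  # epsilon is an assumed, illustrative perturbation scale
        super().__init__()
        self.epsilon = epsilon

    def forward(self, x):
        noise = torch.randn_like(x) * self.epsilon * x.abs()
        return x, x + noise  # keep both the clean and the perturbed copy

class CheckLayer(nn.Module):
    """Illustrative sketch: compare the LUT's outputs on clean vs. perturbed inputs."""
    def __init__(self, tolerance=1e-3):  # tolerance is an assumed, illustrative threshold
        super().__init__()
        self.tolerance = tolerance

    def forward(self, y_clean, y_fuzzed):
        denom = y_clean.abs().clamp_min(torch.finfo(y_clean.dtype).tiny)
        rel_err = ((y_fuzzed - y_clean).abs() / denom).max()
        if rel_err > self.tolerance:
            print(f"precision warning: max relative error {rel_err.item():.3e} "
                  f"exceeds tolerance {self.tolerance:.1e}")
        return rel_err

# Usage sketch: wrap a deterministic layer under test (LUT), e.g. nn.Linear.
lut = nn.Linear(64, 64)
fuzz, check = FuzzLayer(), CheckLayer()
x = torch.randn(8, 64)
x_clean, x_fuzzed = fuzz(x)
rel_err = check(lut(x_clean), lut(x_fuzzed))  # a lower bound on the maximal relative error

Under these assumptions, the measured rel_err serves only as a lower bound on the operator's maximal relative error, which is the quantity the framework reports.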