Article (Scientific journals)
FuMi: A Runtime Fuzz-based Machine Learning Precision Measurement and Testing Framework
ZHANG, Peng; PAPADAKIS, Mike; Zhou, Yuming
2025, in ACM Transactions on Software Engineering and Methodology

Full text :
3734866.pdf (author postprint, 1.88 MB)

Abstract :
[en] The rapid evolution of machine learning model training has outpaced the development of corresponding measurement and testing tools, leading to two significant challenges. First, developers of deep learning frameworks struggle to identify operators that fail to meet precision criteria, as these issues may manifest in only a few data points. Second, model trainers lack effective methods to estimate the precision loss caused by operators during training. To address these issues, we introduce a Pythonic framework inspired by the common network-layer definitions in deep learning. Our framework includes two new layers, the Fuzz Layer and the Check Layer, designed to aid measurement and testing. The Fuzz Layer introduces minor perturbations to the tensor inputs of any deterministic layer under test (LUT). The Check Layer then measures precision by analyzing the differences before and after the perturbation. This approach estimates a lower bound on the maximal relative error and alerts developers or trainers to potential bugs if the difference exceeds a predefined tolerance. Check Layers can also be used independently to run precision tests for specific layers, ensuring the precision of operators at runtime. Although this runtime testing adds memory and time overhead, it ensures that the original model is trained correctly. We demonstrate the utility of our framework, FuMi, through two experiments. First, we tested 21 torch operators across 9 popular PyTorch models covering a variety of tasks, finding that the conv2d and linear operators often fail to meet precision requirements. Second, to showcase the generalizability of our framework, we tested the ATTENTION operator. By comparing different implementations of state-of-the-art ATTENTION operators, we found that the maximum relative error of the ATTENTION operator is not less than 1%, which is 13 times larger than that measured by Predoo (a unit-test tool). This framework provides a robust solution for identifying precision issues in deep learning models, ensuring reliable and accurate model training.
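A minimal PyTorch sketch of the Fuzz/Check mechanism described in the abstract may help illustrate the idea. This is not the authors' implementation: the class names (FuzzLayer, CheckLayer), the perturbation scale, and the tolerance value are illustrative assumptions based only on the abstract's description.

import torch
import torch.nn as nn

class FuzzLayer(nn.Module):
    """Adds a tiny multiplicative perturbation to a tensor input (assumed scheme)."""
    def __init__(self, eps: float = 1e-6):
        super().__init__()
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Perturb every element by at most eps in relative terms.
        noise = (torch.rand_like(x) * 2 - 1) * self.eps
        return x * (1 + noise)

class CheckLayer(nn.Module):
    """Compares the LUT's outputs on original vs. perturbed inputs."""
    def __init__(self, tolerance: float = 1e-3):
        super().__init__()
        self.tolerance = tolerance

    def forward(self, y_ref: torch.Tensor, y_fuzz: torch.Tensor) -> torch.Tensor:
        # Maximum element-wise relative error; the observed value is a lower
        # bound on the true maximal relative error of the layer under test.
        rel_err = ((y_fuzz - y_ref).abs() / y_ref.abs().clamp_min(1e-12)).max().item()
        if rel_err > self.tolerance:
            print(f"[precision check] relative error {rel_err:.3e} "
                  f"exceeds tolerance {self.tolerance:.1e}")
        return y_ref  # pass the unperturbed result on to the rest of the model

# Usage: wrap a deterministic layer under test (LUT), e.g. a conv2d operator.
lut = nn.Conv2d(3, 8, kernel_size=3, padding=1)
fuzz, check = FuzzLayer(), CheckLayer(tolerance=1e-4)
x = torch.randn(1, 3, 32, 32)
_ = check(lut(x), lut(fuzz(x)))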
Disciplines :
Computer science
Author, co-author :
ZHANG, Peng ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
PAPADAKIS, Mike ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Zhou, Yuming ; State Key Laboratory for Novel Software Technology, Nanjing University, China
External co-authors :
yes
Language :
English
Title :
FuMi: A Runtime Fuzz-based Machine Learning Precision Measurement and Testing Framework
Publication date :
08 May 2025
Journal title :
ACM Transactions on Software Engineering and Methodology
ISSN :
1049-331X
Publisher :
Association for Computing Machinery (ACM)
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBilu :
since 30 June 2025
