Article (Scientific journals)
On the suitability of Hugging Face Hub for empirical studies
AIT-MIMOUNE FONOLLA, Adem; Izquierdo, JLC; CABOT, Jordi
2025. In Empirical Software Engineering, 30 (2)
Peer Reviewed verified by ORBi
 

Files


Full Text
EMSE___HF_for_Empirical_Studies-4.pdf
Author preprint (5.31 MB) Creative Commons License - Attribution, Non-Commercial, ShareAlike



Details



Keywords :
Mining software repositories; Data analysis; Empirical study; ML; Hugging Face Hub
Abstract :
[en] Context. Empirical studies in software engineering rely mainly on the data available on code-hosting platforms, with GitHub being the most representative. Nevertheless, in recent years, the emergence of Machine Learning (ML) has led to the development of platforms specifically designed for hosting ML-based projects, with Hugging Face Hub (HFH) as the most popular one. So far, no studies have evaluated the potential of HFH as a data source for such research. Objective. We aim to perform an exploratory study of the current state of HFH and its suitability as a source platform for empirical studies. Method. We conduct a qualitative and a quantitative analysis of HFH. The former compares the features of HFH with those of other code-hosting platforms, such as GitHub and GitLab. The latter analyzes the data available in HFH. Results. We propose a feature framework to characterize HFH and report on the current usage of the platform, both in terms of the number and types of projects (and their surrounding community) and the features they rely on most. Conclusions. The results confirm that HFH offers enough features and sufficiently diverse data to be the source of relevant empirical studies on the development, evolution and usage of AI-related projects. The results also triggered a discussion on aspects of HFH that should be considered when performing such empirical studies.
Disciplines :
Computer science
Author, co-author :
AIT-MIMOUNE FONOLLA, Adem  ;  University of Luxembourg
Izquierdo, JLC
CABOT, Jordi  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > PI Cabot
External co-authors :
yes
Language :
English
Title :
On the suitability of Hugging Face Hub for empirical studies
Publication date :
18 January 2025
Journal title :
Empirical Software Engineering
ISSN :
1382-3256
eISSN :
1573-7616
Publisher :
Springer Science and Business Media LLC
Volume :
30
Issue :
2
Peer reviewed :
Peer Reviewed verified by ORBi
FnR Project :
FNR16544475 - Better Smart Software Faster (Besser) - An Intelligent Low-code Infrastructure For Smart Software, 2020 (01/01/2022-...) - Jordi Cabot
Name of the research project :
U-AGR-7344 - P20/IS/16544475/BESSER/Cabot - CABOT Jordi
Funders :
Ministerio de Ciencia e Innovación
Fonds National de la Recherche Luxembourg
Funding text :
This work is part of the project TED2021-130331B-I00 funded by MCIN/AEI/10.13039/501100011033 and European Union NextGenerationEU/PRTR; and BESSER, funded by the Luxembourg National Research Fund (FNR) PEARL program, grant agreement 16544475.
Available on ORBilu :
since 07 February 2025

Statistics
Number of views :
115 (8 by Unilu)
Number of downloads :
17 (1 by Unilu)
Scopus citations :
2
Scopus citations without self-citations :
2
OpenCitations :
0
OpenAlex citations :
1
WoS citations :
1
