Feature Model; Large Language Model; Machine Learning; Mixture of Experts; Model Merging; Software Product Line; Feature models; Key factors; Language model; Large language model; Machine learning models; Machine-learning; Mixture of experts; Model learning; Model merging; Human-Computer Interaction; Computer Networks and Communications; Computer Vision and Pattern Recognition; Software
Abstract:
[en] The size of Large Language Models (LLMs), and of Machine Learning (ML) models in general, is a key factor in their capacity and in the quality of their responses. But it comes at a high cost, both during training and during model execution. Recently, various model merging techniques and Mixture of Experts (MoE) architectures have been gaining popularity, as they enable the creation of large models by combining existing ones (the "experts" in the MoE approach). Creating these combinations remains a deeply technical task with many possible configurations to consider. This paper aims to democratize the creation of combined ML models by presenting a product line approach to the specification and training of this type of ML architecture. The approach starts from an initial feature model that helps users define, among other aspects, the types of models they want to combine, the combination strategy and even, for the MoE approach, the tasks that should be associated with each expert.
Disciplines:
Computer science
Author, co-author:
Gomez-Vazquez, Marcos ; Luxembourg Institute of Science and Technology, Esch-sur-Alzette, Luxembourg
CABOT, Jordi ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > PI Cabot ; Luxembourg Institute of Science and Technology, Esch-sur-Alzette, Luxembourg
External co-authors:
no
Document language:
English
Title:
Exploring the Use of Software Product Lines for the Combination of Machine Learning Models
Publication date:
02 September 2024
Event name:
28th ACM International Systems and Software Product Line Conference
Event location:
Dommeldange, Lux
Event date:
02-09-2024 to 06-09-2024
Event scope:
International
Title of the main work:
SPLC 2024 - 28th ACM International Systems and Software Product Line Conference, Proceedings
Scientific editor:
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Publisher:
Association for Computing Machinery
ISBN/EAN :
9798400705939
Peer reviewed:
Peer reviewed
FnR project:
FNR16544475 - Better Smart Software Faster (Besser) - An Intelligent Low-code Infrastructure For Smart Software, 2020 (01/01/2022-...) - Jordi Cabot
Funding body:
Luxembourg National Research Fund Namur Digital Institute
Funding (details):
This project is supported by the Luxembourg National Research Fund (FNR) PEARL program, grant agreement 16544475.