Article (Scientific journals)
Atomic orbits in molecules and materials for improving machine learning force fields
CHARKIN-GORBULIN, Anton; KOKORIN, Artem; Sauceda, Huziel E. et al.
2025. In Machine learning: science and technology, 6 (3), p. 035005
Peer Reviewed verified by ORBi; Dataset
 

Files


Full Text
Charkin-Gorbulin_2025_Mach._Learn.__Sci._Technol._6_035005.pdf
Publisher postprint (2.16 MB) Creative Commons License - Attribution

Details



Keywords :
atomistic simulations; force fields; graphene; machine learning; molecular dynamics; perovskites; symmetry search; Atomistic simulations; Chemical environment; Data-driven approach; Force field models; Forcefields; Gradient domain; Graphenes; Machine-learning; Symmetry searches; Software; Artificial Intelligence
Abstract :
The accurate representation of atoms within their environment forms the backbone of any reliable machine learning force field (MLFF). While modern MLFFs treat atoms of the same type as indistinguishable, their identities can be further resolved by accounting for the composition of their chemical environment. This can improve the parameterization of the MLFF model in chemically diverse systems. In this work, we introduce a novel, data-driven approach designed to find permutation symmetries in isolated and periodic systems, delivering key insights that enable the identification of atomic ‘orbits’: atoms that share consistent chemical and structural environments throughout the dataset. We demonstrate the effectiveness of the orbit representation by incorporating it into the kernel-based symmetric gradient-domain ML (sGDML) model and the equivariant message-passing neural network, MACE. For sGDML, trained on ethanol, 1,8-naphthyridine, D-alanine, and D-histidine adsorbed on graphene, we establish a strong correlation between force prediction accuracy and chemical diversity, quantified by orbit count. The results for the Ac-Phe-Ala₅-Lys molecule further underscore the critical role of orbits in force reconstruction across various MLFF architectures. Incorporating orbits into MACE enables us to reduce the model size by an order of magnitude while preserving predictive accuracy, as demonstrated for the CsPbI₃ perovskite slab and graphene with a pyridinic-N defect. Overall, our approach provides a scalable and efficient solution for modeling complex chemical systems with state-of-the-art MLFFs.
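As a rough illustration of the orbit idea described in the abstract, the sketch below groups atom indices into orbits once a set of permutation symmetries is already known. It is not the authors' symmetry search, which is data-driven and is not reproduced here. The function name `find_orbits`, the ethanol atom ordering, and the two example permutations are all assumptions made for illustration only.

```python
def find_orbits(n_atoms, permutations):
    """Group atom indices into orbits under the given permutations.

    Each permutation is a list `perm` where atom i is mapped to perm[i].
    Two atoms fall into the same orbit if some chain of permutations
    maps one onto the other (computed here with union-find).
    """
    parent = list(range(n_atoms))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for perm in permutations:
        for i, j in enumerate(perm):
            ri, rj = find(i), find(j)
            if ri != rj:
                # Keep the smaller index as the representative of the orbit.
                parent[max(ri, rj)] = min(ri, rj)

    orbits = {}
    for i in range(n_atoms):
        orbits.setdefault(find(i), []).append(i)
    return sorted(orbits.values())


# Hypothetical ethanol indexing (an assumption, not taken from the paper):
# 0: methyl C, 1: methylene C, 2: O, 3-5: methyl H, 6-7: methylene H, 8: hydroxyl H
methyl_rotation = [0, 1, 2, 4, 5, 3, 6, 7, 8]   # cycles the three methyl hydrogens
methylene_swap = [0, 1, 2, 3, 4, 5, 7, 6, 8]    # swaps the two methylene hydrogens

orbits = find_orbits(9, [methyl_rotation, methylene_swap])
print(orbits)
print("orbit count:", len(orbits))
```

Under these assumed permutations, the three methyl hydrogens share one orbit and the two methylene hydrogens share another, while every other atom forms its own orbit. The orbit count is the quantity the abstract uses as a measure of chemical diversity.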
Disciplines :
Physical, chemical, mathematical & earth Sciences: Multidisciplinary, general & others
Author, co-author :
CHARKIN-GORBULIN, Anton ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Physics and Materials Science (DPHYMS) ; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Belgium
KOKORIN, Artem ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine > Gene Expression and Metabolism > Team Evan WILLIAMS
Sauceda, Huziel E. ; Instituto de Física Universidad Nacional Autónoma de México, México, Mexico ; BASLEARN, BASF-TU joint Lab, Technische Universität Berlin, Berlin, Germany
Chmiela, Stefan ; Machine Learning Group, Technische Universität Berlin, Berlin, Germany ; BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
Quarti, Claudio ; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Belgium
Beljonne, David ; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Belgium
TKATCHENKO, Alexandre ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Physics and Materials Science (DPHYMS)
Poltavsky, Igor ; Department of Physics and Materials Science, University of Luxembourg, Luxembourg, Luxembourg
External co-authors :
yes
Language :
English
Title :
Atomic orbits in molecules and materials for improving machine learning force fields
Publication date :
16 July 2025
Journal title :
Machine learning: science and technology
eISSN :
2632-2153
Publisher :
Institute of Physics
Volume :
6
Issue :
3
Pages :
035005
Peer reviewed :
Peer Reviewed verified by ORBi
European Projects :
HE - 101054629 - FITMOL - Field-Theory Approach to Molecular Interactions
FnR Project :
FNR13718694 - QML-FLEX - Quantum-based Machine Learning For Flexible Molecules, 2019 (01/09/2020-31/08/2023) - Igor Poltavskyi
FNR16521502 - PHANTASTIC - Physics- And Data-driven Multiscale Modelling Design Of Layered Lead Halide Perovskite Materials For Stable Photovoltaics, 2022 (01/09/2022-31/08/2025) - Alexandre Tkatchenko
Name of the research project :
ERCEA
Funders :
REA - European Commission. Research Executive Agency
F.R.S.-FNRS - Fonds de la Recherche Scientifique
FNR - Fonds National de la Recherche Luxembourg
M-ERA.NET
Dirección General de Asuntos del Personal Académico, Universidad Nacional Autónoma de México
LuxProvide
ERC - European Research Council
BMBF - Bundesministerium für Bildung und Forschung
European Union
Funding text :
The work at the University of Luxembourg was performed with funding from the European Research Council Executive Agency (ERCEA) under Project 101054629 (FITMOL) and from the Luxembourg National Research Fund (FNR) under the CORE project C19/MS/13718694/QML-FLEX. The work at the University of Mons and the University of Luxembourg was carried out within the framework of the M-ERA.NET project PHANTASTIC, supported by the Luxembourg National Research Fund (FNR) (INTER/MERA22/16521502/PHANTASTIC) and by the Belgian National Fund for Scientific Research (F.R.S.-FNRS) under Grant R.8003.22. H.E.S. acknowledges support from DGAPA-UNAM Project PAPIIT No. IA106023 and CONAHCyT project CF-2023-I-468. S.C. acknowledges support by the German Federal Ministry of Education and Research (BMBF) for BIFOLD (01IS18037A). C.Q. is a F.R.S.-FNRS Research Associate, and D.B. is a F.R.S.-FNRS Research Director. The simulations were performed on the Luxembourg national supercomputer MeluXina. The authors gratefully acknowledge the LuxProvide teams for their expert support. Computational resources were provided by the Consortium des Équipements de Calcul Intensif (CÉCI), funded by the F.R.S.-FNRS under Grant 2.5020.11. The present research also benefited from access to Lucia, the Tier-1 supercomputer of the Walloon Region, an infrastructure funded by the Walloon Region under Grant Agreement No. 1910247.
Data Set :
Atomic orbits in molecules and materials for improving machine learning force fields

The datasets and trained models used in the publication "Atomic orbits in molecules and materials for improving machine learning force fields".

Available on ORBilu :
since 10 February 2026

Statistics


Number of views
50 (1 by Unilu)
Number of downloads
22 (0 by Unilu)

Scopus citations®
0
Scopus citations® without self-citations
0
OpenCitations
0
OpenAlex citations
1
WoS citations
1
