atomistic simulations; force fields; graphene; machine learning; molecular dynamics; perovskites; symmetry search; Chemical environment; Data-driven approach; Force field models; Gradient domain; Software; Artificial Intelligence
Abstract :
The accurate representation of atoms within their environment forms the backbone of any reliable machine learning force field (MLFF). While modern MLFFs treat atoms of the same type as indistinguishable, their identities can be further resolved by accounting for the composition of their chemical environment. This can improve the parameterization of the MLFF model in chemically diverse systems. In this work, we introduce a novel, data-driven approach designed to find permutation symmetries in isolated and periodic systems, delivering key insights that enable the identification of atomic ‘orbits’, atoms that share consistent chemical and structural environments throughout the dataset. We demonstrate the effectiveness of the orbit representation by incorporating it into the kernel-based symmetric gradient-domain ML (sGDML) model and the equivariant message-passing neural network, MACE. For sGDML, trained on ethanol, 1,8-naphthyridine, D-alanine, and D-histidine adsorbed on graphene, we establish a strong correlation between force prediction accuracy and chemical diversity, quantified by orbit count. The results for the Ac-Phe-Ala₅-Lys molecule further underscore the critical role of orbits in force reconstruction across various MLFF architectures. Incorporating orbits into MACE enables us to reduce the model size by an order of magnitude while preserving predictive accuracy, as demonstrated for the CsPbI₃ perovskite slab and graphene with a pyridinic-N defect. Overall, our approach provides a scalable and efficient solution for modeling complex chemical systems with state-of-the-art MLFFs.
Disciplines :
Physical, chemical, mathematical & earth Sciences: Multidisciplinary, general & others
Author, co-author :
CHARKIN-GORBULIN, Anton ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Physics and Materials Science (DPHYMS) ; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Belgium
KOKORIN, Artem ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine > Gene Expression and Metabolism > Team Evan WILLIAMS
Sauceda, Huziel E. ; Instituto de Física Universidad Nacional Autónoma de México, México, Mexico ; BASLEARN, BASF-TU joint Lab, Technische Universität Berlin, Berlin, Germany
Chmiela, Stefan ; Machine Learning Group, Technische Universität Berlin, Berlin, Germany ; BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
Quarti, Claudio ; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Belgium
Beljonne, David ; Laboratory for Chemistry of Novel Materials, University of Mons, Mons, Belgium
TKATCHENKO, Alexandre ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Physics and Materials Science (DPHYMS)
Poltavsky, Igor ; Department of Physics and Materials Science, University of Luxembourg, Luxembourg, Luxembourg
External co-authors :
yes
Language :
English
Title :
Atomic orbits in molecules and materials for improving machine learning force fields
HE - 101054629 - FITMOL - Field-Theory Approach to Molecular Interactions
FnR Project :
FNR13718694 - QML-FLEX - Quantum-based Machine Learning For Flexible Molecules, 2019 (01/09/2020-31/08/2023) - Igor Poltavskyi
FNR16521502 - PHANTASTIC - Physics- And Data-driven Multiscale Modelling Design Of Layered Lead Halide Perovskite Materials For Stable Photovoltaics, 2022 (01/09/2022-31/08/2025) - Alexandre Tkatchenko
Name of the research project :
ERCEA
Funders :
REA - European Commission. Research Executive Agency; F.R.S.-FNRS - Fonds de la Recherche Scientifique; FNR - Fonds National de la Recherche Luxembourg; M-ERA.NET; Dirección General de Asuntos del Personal Académico, Universidad Nacional Autónoma de México; LuxProvide; ERC - European Research Council; BMBF - Bundesministerium für Bildung und Forschung; European Union
Funding text :
The work at the University of Luxembourg was performed with funding from the European Research Council Executive Agency (ERCEA) under Project 101054629 (FITMOL) and from the Luxembourg National Research Fund (FNR) under the CORE project C19/MS/13718694/QML-FLEX. The work at the University of Mons and the University of Luxembourg was carried out within the framework of the M-ERA.NET project PHANTASTIC, supported by the Luxembourg National Research Fund (FNR) (INTER/MERA22/16521502/PHANTASTIC) and by the Belgian National Fund for Scientific Research (F.R.S.-FNRS) under Grant R.8003.22. H.E.S. acknowledges support from DGAPA-UNAM Project PAPIIT No. IA106023 and CONAHCyT project CF-2023-I-468. S.C. acknowledges support by the German Federal Ministry of Education and Research (BMBF) for BIFOLD (01IS18037A). C.Q. is a F.R.S.-FNRS Research Associate, and D.B. is a F.R.S.-FNRS Research Director. The simulations were performed on the Luxembourg national supercomputer MeluXina. The authors gratefully acknowledge the LuxProvide teams for their expert support. Computational resources were provided by the Consortium des Équipements de Calcul Intensif (CÉCI), funded by the F.R.S.-FNRS under Grant 2.5020.11. The present research also benefited from access to Lucia, the Tier-1 supercomputer of the Walloon Region, an infrastructure funded by the Walloon Region under Grant Agreement No. 1910247.