Collaborative trial; Environmental exposure; Life course; Non-targeted; Open source tools; PubChem; Spectra; Spectral data; Spectral libraries; Workflows; Environmental Chemistry; Public Health, Environmental and Occupational Health; Management, Monitoring, Policy and Law; General Medicine
Abstract :
The term "exposome" is defined as a comprehensive study of life-course environmental exposures and the associated biological responses. Humans are exposed to many different chemicals, which can pose a major threat to the well-being of humanity. Targeted or non-targeted mass spectrometry techniques are widely used to identify and characterize various environmental stressors when linking exposures to human health. However, identification remains challenging due to the huge chemical space applicable to exposomics, combined with the lack of sufficient relevant entries in spectral libraries. Addressing these challenges requires cheminformatics tools and database resources to share curated open spectral data on chemicals to improve the identification of chemicals in exposomics studies. This article describes efforts to contribute spectra relevant for exposomics to the open mass spectral library MassBank (https://www.massbank.eu) using various open source software efforts, including the R packages RMassBank and Shinyscreen. The experimental spectra were obtained from ten mixtures containing toxicologically relevant chemicals from the US Environmental Protection Agency (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). Following processing and curation, 5582 spectra from 783 of the 1268 ENTACT compounds were added to MassBank, and through this to other open spectral libraries (e.g., MoNA, GNPS) for community benefit. Additionally, an automated deposition and annotation workflow was developed with PubChem to enable the display of all MassBank mass spectra in PubChem, which is rerun with each MassBank release. The new spectral records have already been used in several studies to increase the confidence in identification in non-target small molecule identification workflows applied to environmental and exposomics research.
Disciplines :
Chemistry
Author, co-author :
ELAPAVALORE, Anjana ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB) > Environmental Cheminformatics
KONDIC, Todor ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine > Environmental Cheminformatics > Team Emma SCHYMANSKI
SINGH, Randolph ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine > Environmental Cheminformatics > Team Emma SCHYMANSKI ; IFREMER (Institut Français de Recherche pour l'Exploitation de la Mer), Laboratoire Biogéochimie des Contaminants Organiques, Rue de l'Ile d'Yeu, BP 21105, Nantes Cedex 3, 44311, France
Shoemaker, Benjamin A ; National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
Thiessen, Paul A ; National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
Zhang, Jian ; National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
Bolton, Evan E ; National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
FNR12341006 - Environmental Cheminformatics To Identify Unknown Chemicals And Their Effects, 2018 (01/10/2018-30/09/2023) - Emma Schymanski
Funders :
Fonds National de la Recherche Luxembourg; U.S. National Library of Medicine; National Institutes of Health
Funding text :
E. L. S., A. E. and T. K. acknowledge funding support from the Luxembourg National Research Fund (FNR) for project A18/BM/12341006. The work of B. A. S., P. A. T., J. Z. and E. E. B. was supported by the National Center for Biotechnology Information of the National Library of Medicine (NLM), National Institutes of Health. We would like to thank Adelene Lai for assistance in restoring the stereochemistry information to the MS-ready SMILES (see the Discussion) and acknowledge the other members of the Environmental Cheminformatics (ECI) and PubChem teams, plus the MassBank consortium members and all other contributors to the open science efforts that supported this effort. We gratefully acknowledge the US EPA for providing the mixtures used here as part of the ENTACT trial.