Results 1-20 of 105.
Schymanski, Emma
Presentation (2023, March 22)
ZeroPM Webinar: Are there really 6 million PFAS in PubChem? The increasing concerns about poly- and perfluoroalkyl substances (PFAS) and calls for action upon them as a class have spurred intense debates on how to define and enumerate the “PFAS Chemical Space”. There are now >50 PFAS lists openly available, including the OECD PFAS list of ~4700 PFAS (ENV/JM/MONO(2018)7) and the US EPA PFASMASTER list of >12000 PFAS. However, searching the large open chemical collection PubChem (114 million chemicals, Feb. 2023) reveals that >6 million entries match the latest OECD PFAS definition where PFAS “contains at least one alkyl CF2 group” (ENV/CBC/MONO(2021)25). This webinar will introduce listeners to the new classification browser in PubChem designed to help navigate these incredible numbers, the “PFAS and Fluorinated Compounds in PubChem Tree” (“PubChem PFAS Tree” for short). The current version contains six main sections: OECD PFAS definition (>6 million PFAS), organofluorine compounds (>19 million compounds), other diverse fluorinated compounds, OECD PFAS by chemistry (>7 million PFAS including salts and mixtures), several PFAS collections (from CompTox, NORMAN-SLE, NIST, OntoChem and PubChem) and finally regulatory collections. We will walk listeners through the PubChem PFAS Tree and the many features it offers to help users explore the PFAS space in PubChem and look forward to lively discussions with the audience afterwards.

Schymanski, Emma
Presentation (2023, March 17)
Invited talk for the Environmental Chemistry and Biogeochemistry Seminar at Umeå University, 17 March 2023, Virtual Event. Many thanks to Andriy Rebryk for the invitation!

Schymanski, Emma
in Analytical Scientist (2023)
Why Open and FAIR data sharing in analytical research is important for public data availability, raising awareness of your data, and the very future of analytical science – according to Emma Schymanski.

Schymanski, Emma
in Nature Water (2023), 1(1), 4–6
Since water is a common good, the outcome of water-related research should be accessible to everyone. Since Open Science is more than just open access research articles, journals must work with the research community to enable fully open and FAIR science.

; ; et al
in TrAC: Trends in Analytical Chemistry (2023), 159
Non-target screening (NTS) methods are rapidly gaining in popularity, empowering researchers to search for an ever-increasing number of chemicals. Given this possibility, communicating the confidence of identification in an automated, concise and unambiguous manner is becoming increasingly important.
In this study, we compiled several pieces of evidence necessary for communicating NTS identification confidence and developed a machine learning approach for classification of the identifications as reliable or unreliable. The machine learning approach was trained using data generated by four laboratories equipped with different instrumentation. The model discarded substances with insufficient identification evidence efficiently, while revealing the relevance of different parameters for identification. Based on these results, a harmonized IP-based system is proposed. This new NTS-oriented system is compatible with the currently widely used five-level system. It increases the precision in reporting and the reproducibility of current approaches via the inclusion of evidence scores, while being suitable for automation.

Lai, Adelene
in Journal of Cheminformatics (2022), 14(85)
Homologous series are groups of related compounds that share the same core structure attached to a motif that repeats to different degrees. Compounds forming homologous series are of interest in multiple domains, including natural products, environmental chemistry, and drug design. However, many homologous compounds remain unannotated as such in compound datasets, which poses obstacles to understanding chemical diversity and their analytical identification via database matching. To overcome these challenges, an algorithm to detect homologous series within compound datasets was developed and implemented using the RDKit. The algorithm takes a list of molecules as SMILES strings and a monomer (i.e., repeating unit) encoded as SMARTS as its main inputs.
In an iterative process, substructure matching of repeating units, molecule fragmentation, and core detection lead to homologous series classification through grouping of identical cores. Three open compound datasets from environmental chemistry (NORMAN Suspect List Exchange, NORMAN-SLE), exposomics (PubChemLite for Exposomics), and natural products (the COlleCtion of Open NatUral producTs, COCONUT) were subjected to homologous series classification using the algorithm. Over 2,000, 12,000, and 5,000 series with CH2 repeating units were classified in NORMAN-SLE, PubChemLite, and COCONUT, respectively. Validation of classified series was performed using published homologous series and structure categories, including a comparison with a similar existing method for categorising PFAS compounds. The OngLai algorithm and its implementation for classifying homologues are openly available at: https://github.com/adelenelai/onglai-classify-homologues.

Talavera Andujar, Begona
in Analytical and Bioanalytical Chemistry (2022)
Parkinson’s disease (PD) is the second most prevalent neurodegenerative disease, with an increasing incidence in recent years due to the ageing population. Genetic mutations alone only explain <10% of PD cases, while environmental factors, including small molecules, may play a significant role in PD. In the present work, 22 plasma (11 PD, 11 control) and 19 feces samples (10 PD, 9 control) were analyzed by non-target high resolution mass spectrometry (NT-HRMS) coupled to two liquid chromatography (LC) methods (reversed phase (RP) and hydrophilic interaction liquid chromatography (HILIC)).
A cheminformatics workflow was optimized using open software (MS-DIAL and patRoon) and open databases (all public MSP-formatted spectral libraries for MS-DIAL, PubChemLite for Exposomics and the LITMINEDNEURO list for patRoon). Furthermore, five disease-specific databases and three suspect lists (on PD and related disorders) were developed, using PubChem functionality to identify relevant unknown chemicals. The results showed that non-target screening with the larger databases generally provided better results compared with smaller suspect lists. However, two suspect screening approaches with patRoon were also good options to study specific chemicals in PD. The combination of chromatographic methods (RP and HILIC) as well as two ionization modes (positive and negative) enhanced the coverage of chemicals in the biological samples. While most metabolomics studies in PD have focused on blood and cerebrospinal fluid, we found a higher number of relevant features in feces, such as alanine betaine or nicotinamide, which can be directly metabolized by gut microbiota. This highlights the potential role of gut dysbiosis in PD development.

; ; et al
in Environmental Science & Technology Letters (2022), 0(0)

Lai, Adelene
in Environmental Science and Technology (2022)
Substances of unknown or variable composition, complex reaction products, or biological materials (UVCBs) are over 70 000 “complex” chemical mixtures produced and used at significant levels worldwide. Due to their unknown or variable composition, applying chemical assessments originally developed for individual compounds to UVCBs is challenging, which impedes sound management of these substances.
Across the analytical sciences, toxicology, cheminformatics, and regulatory practice, new approaches addressing specific aspects of UVCB assessment are being developed, albeit in a fragmented manner. This review attempts to convey the “big picture” of the state of the art in dealing with UVCBs by holistically examining UVCB characterization and chemical identity representation, as well as hazard, exposure, and risk assessment. Overall, information gaps on chemical identities underpin the fundamental challenges concerning UVCBs, and better reporting and substance characterization efforts are needed to support subsequent chemical assessments. To this end, an information level scheme for improved UVCB data collection and management within databases is proposed. The development of UVCB testing shows early progress, in line with three main methods: whole substance, known constituents, and fraction profiling. For toxicity assessment, one option is a whole-mixture testing approach. If the identities of (many) constituents are known, grouping, read across, and mixture toxicity modeling represent complementary approaches to overcome data gaps in toxicity assessment. This review highlights continued needs for concerted efforts from all stakeholders to ensure proper assessment and sound management of UVCBs.

Frigerio, Gianfranco
in Molecules (2022), 27(8), 2580
Pooled quality controls (QCs) are usually implemented within untargeted methods to improve the quality of datasets by removing features either not detected or not reproducible.
However, this approach can be limiting in exposomics studies conducted on groups of exposed and nonexposed subjects, as compounds present at low levels only in exposed subjects can be diluted and thus not detected in the pooled QC. The aim of this work is to develop and apply an untargeted workflow for human biomonitoring in urine samples, implementing a novel separated approach for preparing pooled quality controls. An LC-MS/MS workflow was developed and applied to a case study of smoking and non-smoking subjects. Three different pooled quality controls were prepared by mixing an aliquot from every sample (QC-T), only from non-smokers (QC-NS), and only from smokers (QC-S). The feature tables were filtered using QC-T (T-feature list), QC-S, and QC-NS, separately. The last two feature lists were merged (SNS-feature list). A higher number of features was obtained with the SNS-feature list than with the T-feature list, resulting in the identification of a higher number of biologically significant compounds. The separated pooled QC strategy implemented here can improve nontargeted human biomonitoring for groups of exposed and nonexposed subjects.

Schymanski, Emma
Presentation (2022, January 10)
The multitude of chemicals to which we are exposed is ever increasing, with over 110 million chemicals in the largest open chemical databases, over 350,000 in global use inventories, and over 70,000 estimated to be in household use alone.
Detectable molecules in exposomics can be captured using non-target high resolution mass spectrometry (HRMS), but despite the size of the chemical space, scientists cannot yet identify most of the tens of thousands of features in each sample, leading to critical bottlenecks in identification and data interpretation. This talk will cover European and worldwide community initiatives and resources to help connect environmental expert knowledge and observations towards a better understanding of the exposome, including various open cheminformatics and computational mass spectrometry approaches such as the NORMAN Suspect List Exchange, MassBank, MetFrag and PubChemLite for Exposomics.

Schymanski, Emma
in ACS Environmental Au (2022), 2(4), 287–289
As the first half of 2022 comes to a close, it is an interesting time to reflect on some recent trends. In many ways, the world is “opening” up again, with many colleagues going to their first “in person” conferences since the start of the pandemic in early 2020. A significant leap forward for open chemistry was made in 2021, with the Chemical Abstracts Service (CAS) Registry embracing a hybrid model and releasing half a million chemicals as the CAS Common Chemistry set under an open license. (1) ACS Environmental Au continues to develop as one of the key gold open access journals for publishing work on environmental topics. (2) The European Union has just launched the €400 million European Partnership for the Assessment of Risks from Chemicals (PARC), with ∼200 partners (3) and a whole work package on FAIR (Findable, Accessible, Interoperable, Reusable) (4,5) and Open (6) data.
While these trends are cause for optimism, the CAS Registry continues to climb toward the 200 million chemical mark (7) and many of us were blown away by the sheer immensity of the chemical pollution problem at recent meetings. Other colleagues, e.g., those affected by war, by lockdowns, or with insufficient funds, are unable to share in the “post-pandemic” reopening, conferences, and travel. Others cannot afford the costs associated with open access or still do not see the benefits of open science. Why the focus on these disjoint subjects? Both chemical pollution and the COVID-19 pandemic are global challenges requiring global solutions, where failure to act comes with a high price. Landrigan et al. estimated that 9 million premature deaths (16% of the global total) were caused by pollution in 2015. (8) Worldwide deaths directly due to the COVID-19 pandemic are already over 6 million (9) (January 2020 to May 2022). While public awareness is high, individuals often feel powerless to tackle global challenges, yet the pandemic has proven that individual actions can make an incredible collective difference. The same applies to open data and the exchange of research results: the collective benefit from many individual contributions can be extraordinary.

; Schymanski, Emma
in Journal of Cheminformatics (2022), 14(1), 51

; Schymanski, Emma
in Nature Machine Intelligence (2022), 4(12), 1224–1237
Structural annotation of small molecules in biological samples remains a key bottleneck in untargeted metabolomics, despite rapid progress in predictive methods and tools during the past decade.
Liquid chromatography–tandem mass spectrometry, one of the most widely used analysis platforms, can detect thousands of molecules in a sample, the vast majority of which remain unidentified even with best-of-class methods. Here we present LC-MS2Struct, a machine learning framework for structural annotation of small-molecule data arising from liquid chromatography–tandem mass spectrometry (LC-MS2) measurements. LC-MS2Struct jointly predicts the annotations for a set of mass spectrometry features in a sample, using a novel structured prediction model trained to optimally combine the output of state-of-the-art MS2 scorers and observed retention orders. We evaluate our method on a dataset covering all publicly available reversed-phase LC-MS2 data in the MassBank reference database, including 4,327 molecules measured using 18 different LC conditions from 16 contributors, greatly expanding the chemical analytical space covered in previous multi-MS2 scorer evaluations. LC-MS2Struct obtains significantly higher annotation accuracy than earlier methods and improves the annotation accuracy of state-of-the-art MS2 scorers by up to 106%. The use of stereochemistry-aware molecular fingerprints improves prediction performance, which highlights limitations in existing approaches and has strong implications for future computational LC-MS2 developments.

; Schymanski, Emma
in Medizinische Genetik (2022), 34(2), 103–116

; ; Schymanski, Emma
in Environment International (2022), 170
Identification of bioaccumulating contaminants of emerging concern (CECs) via suspect and non-target screening remains a challenging task.
In this study, ion mobility separation with high-resolution mass spectrometry (IM-HRMS) was used to investigate the effects of drift time (DT) alignment on spectrum quality and peak annotation for screening of CECs in complex sample matrices using data independent acquisition (DIA). Data treatment approaches (Binary Sample Comparison) and prioritisation strategies (Halogen Match, co-occurrence of features in biota and the water phase) were explored in a case study on zebra mussel (Dreissena polymorpha) in Lake Mälaren, Sweden’s largest drinking water reservoir. DT alignment evidently improved the fragment spectrum quality by increasing the similarity score to reference spectra from an average (± standard deviation) of 0.33 ± 0.31 to 0.64 ± 0.30, thus positively influencing structure elucidation efforts. Thirty-two features were tentatively identified at confidence level 3 or higher using MetFrag coupled with the new PubChemLite database, which included predicted collision cross-section values from CCSbase. The implementation of predicted mobility data was found to support compound annotation. This study illustrates a quantitative assessment of the benefits of IM-HRMS on spectral quality, which will enhance the performance of future screening studies of CECs in complex environmental matrices.

; ; et al
in Digital Discovery (2022)
Extracting PFAS with open source cheminformatics toolkits reveals ~1.78 million PFAS in Google Patents, ~28 K in the CORE literature repository. The extraction of chemical information from documents is a demanding task in cheminformatics due to the variety of text and image-based representations of chemistry.
The present work describes the extraction of chemical compounds with unique chemical structures from the open access CORE (COnnecting REpositories) and Google Patents full text document repositories. The importance of structure normalization is demonstrated using three open access cheminformatics toolkits: the Chemistry Development Kit (CDK), RDKit and OpenChemLib (OCL). Each toolkit was used for structure parsing, normalization and subsequent substructure searching, using SMILES as structure representations of chemical molecules and International Chemical Identifiers (InChIs) for comparison. Per- and polyfluoroalkyl substances (PFAS) were chosen as a case study to perform the substructure search, due to their high environmental relevance, their presence in both literature and patent corpuses, and the current lack of community consensus on their definition. Three different structural definitions of PFAS were chosen to highlight the implications of various definitions from a cheminformatics perspective. Since CDK, RDKit and OCL implement different criteria and methods for SMILES parsing and normalization, different numbers of parsed compounds were extracted, which were then evaluated using the three PFAS definitions. A comparison of these toolkits and definitions is provided, along with a discussion of the implications for PFAS screening and text mining efforts in cheminformatics. Finally, the extracted PFAS (~1.7 M PFAS from patents and ~27 K from CORE) were compared against various existing PFAS lists and are provided in various formats for further community research efforts.

; ; Kondic, Todor
in Environment International (2022), 158
The diversity of hundreds of thousands of potential organic pollutants and the lack of (publicly available) information about many of them is a huge challenge for environmental sciences, engineering, and regulation. Suspect screening based on high-resolution liquid chromatography-mass spectrometry (LC-HRMS) has enormous potential to help characterize the presence of these chemicals in our environment, enabling the detection of known and newly emerging pollutants, as well as their potential transformation products (TPs). Here, suspect list creation (focusing on pesticides relevant for Luxembourg, incorporating data sources in four languages) was coupled to an automated retrieval of related TPs from PubChem based on high confidence suspect hits, to screen for pesticides and their TPs in Luxembourgish river samples. A computational workflow was established to combine LC-HRMS analysis and pre-screening of the suspects (including automated quality control steps), with spectral annotation to determine which pesticides and, in a second step, their related TPs may be present in the samples. The data analysis with Shinyscreen (https://gitlab.lcsb.uni.lu/eci/shinyscreen/), an open source software developed in house, coupled with custom-made scripts, revealed the presence of 162 potential pesticide masses and 96 potential TP masses in the samples. Further identification of these mass matches was performed using the open source approach MetFrag (https://msbi.ipb-halle.de/MetFrag/). Eventual target analysis of 36 suspects resulted in 31 pesticides and TPs confirmed at Level-1 (highest confidence), and five pesticides and TPs not confirmed due to different retention times. Spatio-temporal analysis of the results showed that TPs and pesticides followed similar trends, with a maximum number of potential detections in July. The highest detections were in the rivers Alzette and Mess and the lowest in the Sûre and Eisch.
This study (a) added pesticides, classification information and related TPs into the open domain, (b) developed automated open source retrieval methods - both enhancing FAIRness (Findability, Accessibility, Interoperability and Reusability) of the data and methods; and (c) will directly support “L’Administration de la Gestion de l’Eau” on further monitoring steps in Luxembourg.

; ; et al
in Journal of Open Source Software (2022), 7(71), 4029
Non-target analysis (NTA) via chromatography coupled to high resolution mass spectrometry (HRMS) is used to monitor and identify organic chemicals in the environment. Biotic and abiotic processes can transform original chemicals (parents) into transformation products (TPs). These TPs can be of equal or more concern than their parent compounds and are therefore critical to monitor and identify in the environment (Escher & Fenner, 2011; Farré et al., 2008), often with NTA. Given the amount of data generated by NTA, advanced automated data processing workflows are essential. The open-source, R-based (R Core Team, 2021) platform patRoon (Helmus, ter Laak, et al., 2021) offers automated, straightforward, flexible and comprehensive NTA workflows. This article describes improvements introduced in patRoon 2.0, including extensive TP screening and simultaneous processing of positive and negative HRMS data. The updated documentation and code are available via https://rickhelmus.github.io/patRoon and archived in Helmus, Velde, et al. (2021).

; Aho, Velma
E-print/Working paper (2022)
Patients with Parkinson’s disease (PD) exhibit differences in their gut microbiomes compared to healthy individuals. Although differences have most commonly been described in the abundances of bacterial taxa, changes to viral and archaeal populations have also been observed. Mechanistic links between gut microbes and PD pathogenesis remain elusive but could involve molecules that promote α-synuclein aggregation. Here, we show that 2-hydroxypyridine (2-HP) represents a key molecule for the pathogenesis of PD. We observe significantly elevated 2-HP levels in faecal samples from patients with PD or its prodrome, idiopathic REM sleep behaviour disorder (iRBD), compared to healthy controls. 2-HP is correlated with the archaeal species Methanobrevibacter smithii and with genes involved in methane metabolism, and it is detectable in isolate cultures of M. smithii. We demonstrate that 2-HP is selectively toxic to transgenic α-synuclein overexpressing yeast and increases α-synuclein aggregation in a yeast model as well as in human induced pluripotent stem cell derived enteric neurons. It also exacerbates PD-related motor symptoms, α-synuclein aggregation, and striatal degeneration when injected intrastriatally in transgenic mice overexpressing human α-synuclein. Our results highlight the effect of an archaeal molecule in relation to the gut-brain axis, which is critical for the diagnosis, prognosis, and treatment of PD.