Results 1-10 of 10.
Full Text
Supporting findability of COVID-19 research with large-scale text mining of scientific publications
Welter, Danielle UL; Vega Moreno, Carlos Gonzalo UL; Biryukov, Maria UL et al

Poster (2020, November 27)

When the COVID-19 pandemic hit in early 2020, many research efforts were quickly redirected towards studies on SARS-CoV-2 and COVID-19 disease, from the sequencing and assembly of viral genomes to the elaboration of robust testing methodologies and the development of treatment and vaccination strategies. At the same time, a flurry of scientific publications around SARS-CoV-2 and COVID-19 began to appear, making it increasingly difficult for researchers to stay up to date with the latest trends and developments in this rapidly evolving field. The BioKB platform is a pipeline which, by exploiting text mining and semantic technologies, helps researchers easily access the semantic content of thousands of abstracts and full-text articles. The content of the articles is analysed, and concepts from a range of contexts, including proteins, species, chemicals, diseases and biological processes, are tagged based on existing dictionaries of controlled terms. Co-occurring concepts are classified based on their asserted relationship, and the resulting subject-relation-object triples are stored in a publicly accessible human- and machine-readable knowledge base. All concepts in the BioKB dictionaries are linked to stable, persistent identifiers, either a resource accession such as an Ensembl, UniProt or PubChem ID for genes, proteins and chemicals, or an ontology term ID for diseases, phenotypes and other ontology terms. In order to improve COVID-19 related text mining, we extended the underlying dictionaries to include many additional viral species (via NCBI Taxonomy identifiers), phenotypes from the Human Phenotype Ontology (HPO), COVID-related concepts including clinical and laboratory tests from the COVID-19 ontology, as well as additional diseases (DO) and biological processes (GO). We also added all viral proteins found in UniProt and gene entries from EntrezGene to increase the sensitivity of the text mining pipeline to viral data.
To date, BioKB has indexed over 270’000 sentences from 21’935 publications relating to coronavirus infections, with publications dating from 1963 to 2021, 3’863 of which were published this year. We are currently working to further refine the text mining pipeline by training it on the extraction of increasingly complex relations such as protein-phenotype relationships. We are also regularly adding new terms to our dictionaries for areas where coverage is currently low, such as clinical and laboratory tests and procedures and novel drug treatments.

Peer Reviewed
Topics, buckets, and psychiatry. On the collective creation of a corpus exploration tool
Biryukov, Maria UL; Kalyakin, Roman; Andersen, Eva UL et al

Scientific Conference (2020, July)

Full Text
Peer Reviewed
How to read the 52.000 pages of the British Journal of Psychiatry? A collaborative approach to source exploration
Andersen, Eva UL; Biryukov, Maria UL; Kalyakin, Roman et al

in Journal of Data Mining and Digital Humanities (2020)

Historians are confronted with an overabundance of sources that require new perspectives and tools to make use of large-scale corpora. Based on a use case from the history of psychiatry this paper describes the work of an interdisciplinary team to tackle these challenges by combining different NLP tools with new visual interfaces that foster the exploration of the corpus. The paper highlights several research challenges in the preparation and processing of the corpus and sketches new insights for historical research that were gathered due to the use of the tools.

Full Text
Peer Reviewed
A patient-based model of RNA mis-splicing uncovers treatment targets in Parkinson's disease.
Boussaad, Ibrahim UL; Obermaier, Carolin D.; Hanss, Zoé et al

in Science translational medicine (2020), 12(560)

Parkinson's disease (PD) is a heterogeneous neurodegenerative disorder with monogenic forms representing prototypes of the underlying molecular pathology and reproducing to variable degrees the sporadic forms of the disease. Using a patient-based in vitro model of PARK7-linked PD, we identified a U1-dependent splicing defect causing a drastic reduction in DJ-1 protein and, consequently, mitochondrial dysfunction. Targeting defective exon skipping with genetically engineered U1-snRNA recovered DJ-1 protein expression in neuronal precursor cells and differentiated neurons. After prioritization of candidate drugs, we identified and validated a combinatorial treatment with the small-molecule compounds rectifier of aberrant splicing (RECTAS) and phenylbutyric acid, which restored DJ-1 protein and mitochondrial dysfunction in patient-derived fibroblasts as well as dopaminergic neuronal cell loss in mutant midbrain organoids. Our analysis of a large number of exomes revealed that U1 splice-site mutations were enriched in sporadic PD patients. Therefore, our study suggests an alternative strategy to restore cellular abnormalities in in vitro models of PD and provides a proof of concept for neuroprotection based on precision medicine strategies in PD.

Full Text
Peer Reviewed
Evaluation of Cell Line Suitability for Disease Specific Perturbation Experiments.
Biryukov, Maria UL; Antony, Paul UL; Krishna, Abhimanyu UL et al

in Lausen, Berthold; Krolak-Schwerdt, Sabine; Böhmer, Matthias (Eds.) Data Science, Learning by Latent Structures, and Knowledge Discovery (2015, February 20)

Cell lines are widely used in translational biomedical research to study the genetic basis of diseases. A major approach to experimental disease modeling is the genetic perturbation experiment, which aims to trigger selected cellular disease states. In this type of experiment, it is crucial to ensure that the targeted disease-related genes and pathways are intact in the cell line used. In this work we are developing a framework which integrates genetic sequence information and disease-specific network analysis for evaluating disease-specific cell line suitability.

Full Text
Peer Reviewed
Systems genomics evaluation of the SH-SY5Y neuroblastoma cell line as a model for Parkinson’s disease
Krishna, Abhimanyu UL; Biryukov, Maria UL; Trefois, Christophe UL et al

in BMC Genomics (2014), 15(1154)

Background: The human neuroblastoma cell line, SH-SY5Y, is a commonly used cell line in studies related to neurotoxicity, oxidative stress, and neurodegenerative diseases. Although this cell line is often used as a cellular model for Parkinson’s disease, the relevance of this cellular model in the context of Parkinson’s disease (PD) and other neurodegenerative diseases has not yet been systematically evaluated. Results: We have used a systems genomics approach to characterize the SH-SY5Y cell line using whole-genome sequencing to determine the genetic content of the cell line and used transcriptomics and proteomics data to determine molecular correlations. Further, we integrated genomic variants using a network analysis approach to evaluate the suitability of the SH-SY5Y cell line for perturbation experiments in the context of neurodegenerative diseases, including PD. Conclusions: The systems genomics approach showed consistency across different biological levels (DNA, RNA and protein concentrations). Most of the genes belonging to the major Parkinson’s disease pathways and modules were intact in the SH-SY5Y genome. Specifically, each analysed gene related to PD has at least one intact copy in SH-SY5Y. The disease-specific network analysis approach ranked the genetic integrity of SH-SY5Y as higher for PD than for Alzheimer’s disease but lower than for Huntington’s disease and Amyotrophic Lateral Sclerosis for loss of function perturbation experiments.

Full Text
Peer Reviewed
The Parkinson's Disease Map: A Framework for Integration, Curation and Exploration of Disease-related Pathways
Ostaszewski, Marek UL; Fujita, Kazuhiro; Matsuoka, Yukiko et al

Poster (2013, March 09)

Objectives: The pathogenesis of Parkinson's Disease (PD) is multi-factorial and age-related, implicating various genetic and environmental factors. It becomes increasingly important to develop new approaches to organize and explore the exploding knowledge of this field. Methods: The published knowledge on pathways implicated in PD, such as synaptic and mitochondrial dysfunction, alpha-synuclein pathobiology, failure of protein degradation systems and neuroinflammation, has been organized and represented using CellDesigner. This repository has been linked to a framework of bioinformatics tools including text mining, database annotation, large-scale data integration and network analysis. The interface for online curation of the repository has been established using the Payao tool. Results: We present the PD map, a computer-based knowledge repository, which includes molecular mechanisms of PD in a visually structured and standardized way. A bioinformatics framework that facilitates in-depth knowledge exploration, extraction and curation supports the map. We discuss the insights gained from PD map-driven text mining of a corpus of over 50 thousand full-text PD-related papers, integration and visualization of gene expression in post mortem brain tissue of PD patients with the map, as well as results of network analysis. Conclusions: The knowledge repository of disease-related mechanisms provides a global insight into relationships between different pathways and allows considering a given pathology in a broad context. Enrichment with available text and bioinformatics databases as well as integration of experimental data supports better understanding of complex mechanisms of PD and formulation of novel research hypotheses.

Full Text
Peer Reviewed
Functional Genomics, Proteomics, Metabolomics and Bioinformatics for Systems Biology
Ballereau, S.; Glaab, Enrico UL; Kolodkin, Alexey UL et al

in Prokop, Ales; Csukás, Bela (Eds.) Systems Biology: Integrative Biology and Simulation Tools (2013)

This chapter introduces systems biology, its context, aims, concepts and strategies. It then describes approaches and methods used for collection of high-dimensional structural and functional genomics data, including epigenomics, transcriptomics, proteomics, metabolomics and lipidomics, and how recent technological advances in these fields have moved the bottleneck from data production to data analysis and bioinformatics. Finally, the most advanced mathematical and computational methods used for clustering, feature selection, prediction analysis, text mining and pathway analysis in functional genomics and systems biology are reviewed and discussed in the context of use cases.

Full Text
Methods for Extracting Meta-Information from bibliographic databases
Biryukov, Maria UL

Doctoral thesis (2010)

Due to the intensive growth of electronically available publications, bibliographic databases have become widespread. They cover a large variety of knowledge fields and provide fast access to a wide variety of data. At the same time they contain a wealth of hidden knowledge that requires extra processing steps to infer. In this work we focus on extraction of such meta knowledge from the research bibliographic databases by looking at them from sociolinguistic, text mining and bibliometric perspectives. We choose the Digital Library and Bibliographic Database as a testbed for our experiments. In the framework of the sociolinguistic analysis we build a statistical system for the language identification of personal names. We also show that extending a purely statistical model with the co-author network boosts the system's performance. In the text mining scenario, we perform a number of experiments that focus on topic identification and ranking. While our topic detection approach remains generic and can be used for any kind of textual data, the topic ranking metrics are built upon the information provided by the bibliographic databases. The goal of our bibliometric study is to create a researcher's profile on DBLP and analyze some of the research communities defined by the different conferences, in terms of the publication activity, interdisciplinarity of research, collaboration trends and population stability. We also aim at exploring to what extent these aspects correlate with the conference rank. Each of the above topics constitutes a method of meta information extraction from bibliographic databases and other similarly structured data sources.
