Reference : VISUALISATION AND BINNING OF METAGENOMIC DATA
Dissertations and theses : Doctoral thesis
Life sciences : Microbiology
http://hdl.handle.net/10993/24369
English
Laczny, Cedric Christian [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB) > Eco-Systems Biology Group > Excellent]
3-Nov-2015
University of Luxembourg, Luxembourg
Docteur en Biologie
Wilmes, Paul
Sauter, Thomas
Meese, Eckart
Sczyrba, Alexander
Glaab, Enrico
Colombo, Nicolo
[en] binning ; metagenomics ; machine learning
[en] Metagenomic sequencing and assembly have become important approaches for the in situ characterisation of mixed microbial communities. Nevertheless, the data are typically fragmented and disconnected. The binning of individual sequence fragments into population-level genomic complements promotes the population-resolved synchronous study of community composition and functional potential. However, current binning approaches require a priori knowledge, scale poorly to larger datasets, or exclude human input.
In this work, a reference-independent approach for the visualisation and subsequent human-augmented binning of metagenomic sequence fragments, represented by their high-dimensional, oligonucleotide frequency-based signatures, is introduced. Due to the efficient and faithful representation of high-dimensional cluster structures in low-dimensional space, the described methodology facilitates the exploration and analysis of large datasets by a human user. Subsequently, a stand-alone software implementation, VizBin, is developed and described. This graphical user interface-based tool is designed to allow a user-friendly application of the herein introduced approach without the requirement of a bioinformatical background, special training, or exceptional computing resources. Following the software development, VizBin was applied for the analysis of human gastrointestinal tract-derived metagenomic sequencing data. This allowed the recovery of six virtually complete or partial genomes of hitherto uncharacterised and deeply branching microbial populations from four taxa including a potential butyrate-producing taxon.
In summary, this work illustrates how improved recovery of population-level microbial genomes is achieved by reference-independent binning of assembled metagenomic sequencing data using human input. The broad applicability and robustness of the herein introduced approach are furthermore demonstrated by using VizBin for the visualisation of state-of-the-art long-read sequencing data. Despite the increased sequence error rate of this emerging type of sequencing data, pertinent cluster structures are revealed, thus motivating the development of future read-level binning approaches. Targeted wet-lab validation of in silico recovered population-level genomes and comprehensive population-resolved analysis of microbial consortia in situ are key to advancing our knowledge and understanding of microbiota in different environments.
Luxembourg Centre for Systems Biomedicine (LCSB): Eco-Systems Biology (Wilmes Group)
Fonds National de la Recherche - FnR
Researchers ; Students


