References of "BioData Mining"
Full Text
Peer Reviewed
Unraveling genomic variation from next generation sequencing data
Pavlopoulos, Georgios A.; Oulas, Anastasis; Iacucci, Ernesto et al

in BioData Mining (2013), 6(1), 13

Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by these technological advances are the evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, analyzing and interpreting the vast amount of data they produce is not a trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to ease knowledge extraction are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. Finally, we comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field.

Full Text
Peer Reviewed
Caipirini: using gene sets to rank literature
Soldatos, Theodoros G.; O’Donoghue, S. I.; Satagopam, V. P. et al

in BioData Mining (2012), 5(1),

Background: Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists of abstracts that the user judges to be ‘interesting’. Some methods go further by allowing the user to provide a second input set of ‘uninteresting’ abstracts; these two input sets are then used to search and rank literature by relevance. In this work we present the service ‘Caipirini’ (http://caipirini.org) that also allows two input sets, but takes the novel approach of allowing ranking of literature based on one or more sets of genes. Results: To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. Conclusions: To our knowledge, Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments.

Full Text
Peer Reviewed
Using graph theory to analyze biological networks
Pavlopoulos, Georgios A.; Secrier, Maria; Moschopoulos, Charalampos N. et al

in BioData Mining (2011), 4(10), 1-27

Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need emerges to investigate a system not only as individual components but as a whole. This can be done by examining the elementary constituents individually and then how they are connected. The myriad components of a system and their interactions are best characterized as networks, which are mainly represented as graphs where thousands of nodes are connected by thousands of edges. In this article we demonstrate approaches, models and methods from the graph theory universe and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling, combined with knowledge extraction, will help us to better understand the biological significance of the system.

Full Text
Peer Reviewed
A reference guide for tree analysis and visualization
Pavlopoulos, Georgios A.; Soldatos, Theodoros G.; Barbosa Da Silva, Adriano UL et al

in BioData Mining (2010), 3(1), 1

The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies are becoming cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution are becoming more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data, since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review tools that are currently available for the visualization and analysis of biological trees, mainly developed during the last decade. We describe the uniform and standard computer-readable formats used to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers users various tree-representation methodologies for biological data analysis.

Full Text
Peer Reviewed
A survey of visualization tools for biological network analysis
Pavlopoulos, Georgios A.; Wegener, A. L.; Schneider, Reinhard UL

in BioData Mining (2008)

The analysis and interpretation of relationships between biological molecules, networks and concepts is becoming a major bottleneck in systems biology. Very often the sheer amount of data and their heterogeneity pose a challenge for the visualization of the data. There is a wide variety of graph representations available, which most often map the data on 2D graphs to visualize biological interactions. These methods are applicable to a wide range of problems; nevertheless, many of them reach a limit in terms of user friendliness when thousands of nodes and connections have to be analyzed and visualized. In this study we review visualization tools that are currently available for the visualization of biological networks, mainly developed in recent years. We comment on the functionality, the limitations and the specific strengths of these tools, and on how they could be further developed in the direction of data integration and information sharing.
