References of "Soldatos, Theodoros G"
Peer Reviewed
Caipirini: using gene sets to rank literature
Soldatos, Theodoros G.; O’Donoghue, S. I.; Satagopam, V. P. et al

in BioData Mining (2012), 5(1)

Background: Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists of abstracts that the user judges to be ‘interesting’. Some methods go further by allowing the user to provide a second input set of ‘uninteresting’ abstracts; these two input sets are then used to search and rank literature by relevance. In this work we present the service ‘Caipirini’ (http://caipirini.org) that also allows two input sets, but takes the novel approach of allowing ranking of literature based on one or more sets of genes. Results: To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. Conclusions: To our knowledge Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments.

Peer Reviewed
Using graph theory to analyze biological networks
Pavlopoulos, Georgios A.; Secrier, Maria; Moschopoulos, Charalampos N. et al

in BioData Mining (2011), 4(10), 1-27

Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need to investigate a system, not only as individual components but as a whole, emerges. This can be done by examining the elementary constituents individually and then how these are connected. The myriad components of a system and their interactions are best characterized as networks, and they are mainly represented as graphs in which thousands of nodes are connected by thousands of edges. In this article we demonstrate approaches, models and methods from the graph theory universe and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling combined with knowledge extraction will help us to better understand the biological significance of the system.

Peer Reviewed
LAITOR - Literature Assistant for Identification of Terms co-Occurrences and Relationships.
Barbosa Da Silva, Adriano; Soldatos, Theodoros G.; Magalhaes, Ivan L. F. et al

in BMC Bioinformatics (2010), 11

BACKGROUND: Biological knowledge is represented in scientific literature that often describes the function of genes/proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a particular research field. Therefore, the study of the current literature about a selected topic deposited in public databases facilitates the generation of novel hypotheses associating a set of bioentities with a common context. RESULTS: We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic. CONCLUSIONS: Text mining tools combining the extraction of interacting bioentities and biological concepts with network displays can be helpful in developing reasonable hypotheses in different scientific backgrounds.

Peer Reviewed
A reference guide for tree analysis and visualization
Pavlopoulos, Georgios A.; Soldatos, Theodoros G.; Barbosa Da Silva, Adriano et al

in BioData Mining (2010), 3(1), 1

The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies are becoming cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution are becoming more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review tools that are currently available for the visualization and analysis of biological trees, mainly developed during the last decade. We describe the uniform and standard computer-readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers users various tree-representation methodologies for biological data analysis.

Peer Reviewed
Martini: using literature keywords to compare gene sets.
Soldatos, Theodoros G.; O'Donoghue, Sean I.; Satagopam, Venkata et al

in Nucleic Acids Research (2010), 38(1), 26-38

Life scientists are often interested in comparing two gene sets to gain insight into differences between two distinct, but related, phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based, not on GO, but on keywords extracted from Medline abstracts; Martini also supports a much wider range of species than comparable tools. To evaluate Martini we created a benchmark based on the human cell cycle, and we tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave best or equal performance with three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.

Peer Reviewed
Arena3D: visualization of biological networks in 3D
Pavlopoulos, Georgios A.; O'Donoghue, Sean I.; Satagopam, Venkata et al

in BMC Systems Biology (2008), 2

Background: Complexity is a key problem when visualizing biological networks; as the number of entities increases, most graphical views become incomprehensible. Our goal is to enable many thousands of entities to be visualized meaningfully and with high performance. Results: We present a new visualization tool, Arena3D, which introduces a new concept of staggered layers in 3D space. Related data - such as proteins, chemicals, or pathways - can be grouped onto separate layers and arranged via layout algorithms, such as Fruchterman-Reingold, distance geometry, and a novel hierarchical layout. Data on a layer can be clustered via k-means, affinity propagation, Markov clustering, neighbor joining, tree clustering, or UPGMA ('unweighted pair-group method with arithmetic mean'). A simple input format defines the name and URL for each node, and defines connections or similarity scores between pairs of nodes. The use of Arena3D is illustrated with datasets related to Huntington's disease. Conclusion: Arena3D is a user-friendly visualization tool that is able to visualize biological or any other network in 3D space. It is free for academic use and runs on any platform. It can be downloaded or launched directly from http://arena3d.org. The Java3D library and Java 1.5 need to be pre-installed for the software to run.
