References of "Schneider, Reinhard 50003033"
Full Text
Peer Reviewed
Caipirini: using gene sets to rank literature
Soldatos, Theodoros G.; O’Donoghue, S. I.; Satagopam, V. P. et al

in BioData Mining (2012), 5(1)

Background: Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists of abstracts that the user judges to be ‘interesting’. Some methods go further by allowing the user to provide a second input set of ‘uninteresting’ abstracts; these two input sets are then used to search and rank literature by relevance. In this work we present the service ‘Caipirini’ (http://caipirini.org) that also allows two input sets, but takes the novel approach of allowing ranking of literature based on one or more sets of genes. Results: To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. Conclusions: To our knowledge Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments.

Full Text
Peer Reviewed
PathVar: analysis of gene and protein expression variance in cellular pathways using microarray data
Glaab, Enrico UL; Schneider, Reinhard UL

in Bioinformatics (2012)

Finding significant differences between the expression levels of genes or proteins across diverse biological conditions is one of the primary goals in the analysis of functional genomics data. However, existing methods for identifying differentially expressed genes or sets of genes by comparing measures of the average expression across predefined sample groups do not detect differential variance in the expression levels across genes in cellular pathways. Since corresponding pathway deregulations occur frequently in microarray gene or protein expression data, we present a new dedicated web application, PathVar, to analyze these data sources. The software ranks pathway-representing gene/protein sets in terms of the differences of the variance in the within-pathway expression levels across different biological conditions. Apart from identifying new pathway deregulation patterns, the tool exploits these patterns by combining different machine learning methods to find clusters of similar samples and build sample classification models.
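
The ranking idea described in this abstract can be illustrated with a minimal sketch. It is not PathVar's actual statistic: the scoring rule (absolute difference of the mean within-pathway variance between two sample groups), the function names and the toy data are all assumptions made for illustration.

```python
from statistics import mean, variance

# Hypothetical toy data: expr[gene][sample] = expression level.
expr = {
    "g1": {"a1": 1.0, "a2": 1.1, "b1": 0.0, "b2": 0.2},
    "g2": {"a1": 1.0, "a2": 0.9, "b1": 4.0, "b2": 3.8},
    "g3": {"a1": 2.0, "a2": 2.1, "b1": 2.0, "b2": 2.2},
    "g4": {"a1": 3.0, "a2": 2.9, "b1": 3.0, "b2": 3.1},
}
# Hypothetical pathway definitions (each needs at least two genes).
pathways = {"pathway_1": ["g1", "g2"], "pathway_2": ["g3", "g4"]}


def within_pathway_variance(expr, genes, samples):
    """Average, over samples, of the sample variance across the pathway's genes."""
    return mean([variance([expr[g][s] for g in genes]) for s in samples])


def rank_pathways(expr, pathways, group_a, group_b):
    """Sort pathways by the absolute difference in within-pathway variance
    between two sample groups (an assumed score, largest difference first)."""
    scores = {
        name: abs(
            within_pathway_variance(expr, genes, group_a)
            - within_pathway_variance(expr, genes, group_b)
        )
        for name, genes in pathways.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_pathways(expr, pathways, ["a1", "a2"], ["b1", "b2"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the genes of pathway_1 diverge strongly in group b while their averages stay similar, so a mean-based comparison would miss the change that the variance-based score ranks first.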

Full Text
Peer Reviewed
EnrichNet: network-based gene set enrichment analysis
Glaab, Enrico UL; Baudot, A.; Krasnogor, N. et al

in Bioinformatics (2012), 28(18), 451-457

Assessing functional associations between an experimentally derived gene or protein set of interest and a database of known gene/protein sets is a common task in the analysis of large-scale functional genomics data. For this purpose, a frequently used approach is to apply an over-representation-based enrichment analysis. However, this approach has four drawbacks: (i) it can only score functional associations of overlapping gene/protein sets; (ii) it disregards genes with missing annotations; (iii) it does not take into account the network structure of physical interactions between the gene/protein sets of interest and (iv) tissue-specific gene/protein set associations cannot be recognized. RESULTS: To address these limitations, we introduce an integrative analysis approach and web-application called EnrichNet. It combines a novel graph-based statistic with an interactive sub-network visualization to accomplish two complementary goals: improving the prioritization of putative functional gene/protein set associations by exploiting information from molecular interaction networks and tissue-specific gene expression data and enabling a direct biological interpretation of the results. By using the approach to analyse sets of genes with known involvement in human diseases, new pathway associations are identified, reflecting a dense sub-network of interactions between their corresponding proteins.
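
For context, the over-representation analysis that this abstract contrasts EnrichNet with is commonly implemented as a one-sided hypergeometric test. The sketch below shows that baseline only, not EnrichNet's graph-based statistic; the function names and the toy gene sets are assumptions.

```python
from math import comb


def overrepresentation_p(universe_size, reference_size, query_size, overlap):
    """One-sided hypergeometric tail: probability of seeing at least `overlap`
    reference genes when drawing `query_size` genes from the universe."""
    upper = min(reference_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(reference_size, k) * comb(universe_size - reference_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrichment_p(query, reference, universe):
    """Convenience wrapper working on gene identifier sets (hypothetical helper)."""
    universe = set(universe)
    query = set(query) & universe
    reference = set(reference) & universe
    return overrepresentation_p(
        len(universe), len(reference), len(query), len(query & reference)
    )


universe = set("abcdefghij")  # ten hypothetical gene identifiers
print(enrichment_p({"a", "b", "c"}, {"a", "b", "c"}, universe))  # full overlap
print(enrichment_p({"a", "b"}, {"c", "d"}, universe))            # no overlap
```

The second call illustrates drawback (i): when the two sets share no genes the test returns a p-value of 1, however closely their members might interact in a molecular network.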

Full Text
Peer Reviewed
Bioinformatics as a driver, not a passenger, of translational biomedical research: Perspectives from the 6th Benelux bioinformatics conference.
Azuaje, F. J.; Heymann, M.; Ternes, A. M. et al

in Journal of Clinical Bioinformatics (2012), 2(7)

The 6th Benelux Bioinformatics Conference (BBC11) held in Luxembourg on 12 and 13 December 2011 attracted around 200 participants, including internationally-renowned guest speakers and more than 100 peer-reviewed submissions from 3 continents. Researchers from the public and private sectors convened at BBC11 to discuss advances and challenges in a wide spectrum of application areas. A key theme of the conference was the contribution of bioinformatics to enable and accelerate translational and clinical research. The BBC11 stressed the need for stronger collaborative efforts across disciplines and institutions. The demonstration of the clinical relevance of systems approaches and of next-generation sequencing-based measurement technologies is among the existing opportunities for increasing impact in translational research. Translational bioinformatics will benefit from research models that strike a balance between the importance of protecting intellectual property and the need to openly access scientific and technological advances. The full conference proceedings are freely available at http://www.bbc11.lu.

Full Text
Peer Reviewed
Medusa: A tool for exploring and clustering biological networks.
Pavlopoulos, Georgios A.; Hooper, S. D.; Sifrim, A. et al

in BMC Research Notes (2011), 4(1), 384

Background: Biological processes such as metabolic pathways, gene regulation or protein-protein interactions are often represented as graphs in systems biology. The understanding of such networks, their analysis, and their visualization are today important challenges in life sciences. While a great variety of visualization tools that try to address most of these challenges already exists, only a few of them succeed in bridging the gap between visualization and network analysis. Findings: Medusa is a powerful tool for visualization and clustering analysis of large-scale biological networks. It is highly interactive and it supports weighted and unweighted multi-edged directed and undirected graphs. It combines a variety of layouts and clustering methods for comprehensive views and advanced data analysis. Its main purpose is to integrate visualization and analysis of heterogeneous data from different sources into a single network. Conclusions: Medusa provides a concise visual tool, which is helpful for network analysis and interpretation. Medusa is offered both as a standalone application and as an applet written in Java. It can be found at: https://sites.google.com/site/medusa3visualization.

Full Text
Peer Reviewed
Using graph theory to analyze biological networks
Pavlopoulos, Georgios A.; Secrier, Maria; Moschopoulos, Charalampos N. et al

in BioData Mining (2011), 4(10), 1-27

Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need to investigate a system, not only as individual components but as a whole, emerges. This can be done by examining the elementary constituents individually and then how these are connected. The myriad components of a system and their interactions are best characterized as networks and they are mainly represented as graphs where thousands of nodes are connected with thousands of edges. In this article we demonstrate approaches, models and methods from the graph theory universe and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling combined with knowledge extraction will help us to better understand the biological significance of the system.

Full Text
Peer Reviewed
Which clustering algorithm is better for predicting protein complexes?
Moschopoulos, Charalampos N.; Pavlopoulos, Georgios A.; Iacucci, Ernesto et al

in BMC Research Notes (2011), (4), 549

Background: Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions and the networks which they comprise is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull down assays and tandem affinity purification are used in order to detect protein interactions in an organism. Today, relatively new high-throughput methods like yeast two hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results: In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked with already known complexes stored in published databases. Conclusions: While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than other reviewed algorithms in absolute numbers. On the other hand the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of the geometrical accuracy while it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm

Full Text
Peer Reviewed
Defective Lamin A-Rb Signaling in Hutchinson-Gilford Progeria Syndrome and Reversal by Farnesyltransferase Inhibition
Marji, Jackleen; O'Donoghue, Sean I.; McClintock, Dayle et al

in PLoS ONE (2010), 5(6)

Hutchinson-Gilford Progeria Syndrome (HGPS) is a rare premature aging disorder caused by a de novo heterozygous point mutation G608G (GGC>GGT) within exon 11 of the LMNA gene encoding A-type nuclear lamins. This mutation elicits an internal deletion of 50 amino acids in the carboxyl-terminus of prelamin A. The truncated protein, progerin, retains a farnesylated cysteine at its carboxyl terminus, a modification involved in HGPS pathogenesis. Inhibition of protein farnesylation has been shown to improve abnormal nuclear morphology and phenotype in cellular and animal models of HGPS. We analyzed global gene expression changes in fibroblasts from human subjects with HGPS and found that a lamin A-Rb signaling network is a major defective regulatory axis. Treatment of fibroblasts with a protein farnesyltransferase inhibitor reversed the gene expression defects. Our study identifies Rb as a key factor in HGPS pathogenesis and suggests that its modulation could ameliorate premature aging and possibly complications of physiological aging.

Full Text
Peer Reviewed
Phenotypic profiling of the human genome by time-lapse microscopy reveals cell division genes
Neumann, Beate; Walter, Thomas; Heriche, Jean-Karim et al

in Nature (2010), 464(7289), 721-727

Despite our rapidly growing knowledge about the human genome, we do not know all of the genes required for some of the most basic functions of life. To start to fill this gap we developed a high-throughput phenotypic screening platform combining potent gene silencing by RNA interference, time-lapse microscopy and computational image processing. We carried out a genome-wide phenotypic profiling of each of the ~21,000 human protein-coding genes by two-day live imaging of fluorescently labelled chromosomes. Phenotypes were scored quantitatively by computational image processing, which allowed us to identify hundreds of human genes involved in diverse biological functions including cell division, migration and survival. As part of the Mitocheck consortium, this study provides an in-depth analysis of cell division phenotypes and makes the entire high-content data set available as a resource to the community.

Full Text
Peer Reviewed
Visualization of omics data for systems biology
Gehlenborg, Nils; O'Donoghue, Sean I.; Baliga, Nitin S. et al

in Nature Methods (2010), 7(3), 56-68

High-throughput studies of biological systems are rapidly accumulating a wealth of 'omics'-scale data. Visualization is a key aspect of both the analysis and understanding of these data, and users now have many visualization methods and tools to choose from. The challenge is to create clear, meaningful and integrated visualizations that give biological insight, without being overwhelmed by the intrinsic complexity of the data. In this review, we discuss how visualization tools are being used to help interpret protein interaction, gene expression and metabolic profile data, and we highlight emerging new directions.

Full Text
Peer Reviewed
GPCRs, G-proteins, Effectors and their interactions: Human-gpDB, a database employing advanced visualization tools and data integration techniques
Satagopam, Venkata UL; Theodoropoulou, Margarita C.; Stampolakis, Christos K. et al

in Database: the Journal of Biological Databases and Curation (2010)

G-protein coupled receptors (GPCRs) are a major family of membrane receptors in eukaryotic cells. They play a crucial role in the communication of a cell with the environment. Ligands bind to GPCRs on the outside of the cell, activating them by causing a conformational change, and allowing them to bind to G-proteins. Through their interaction with G-proteins, several effector molecules are activated leading to many kinds of cellular and physiological responses. The great importance of GPCRs and their corresponding signal transduction pathways is indicated by the fact that they take part in many diverse disease processes and that a large part of efforts towards drug development today is focused on them. We present Human-gpDB, a database which currently holds information about 713 human GPCRs, 36 human G-proteins and 99 human effectors. The collection of information about the interactions between these molecules was done manually and the current version of Human-gpDB holds information for about 1663 connections between GPCRs and G-proteins and 1618 connections between G-proteins and effectors. Major advantages of Human-gpDB are the integration of several external data sources and the support of advanced visualization techniques. Human-gpDB is a simple yet powerful tool for researchers in the life sciences field as it integrates an up-to-date, carefully curated collection of human GPCRs, G-proteins, effectors and their interactions. The database may be a reference guide for medical and pharmaceutical research, especially in the areas of understanding human diseases and chemical and drug discovery.

Full Text
Peer Reviewed
From experimental setup to bioinformatics: An RNAi screening platform to identify host factors involved in HIV-1 replication
Boerner, Kathleen; Hermle, Johannes; Sommer, Christoph et al

in Biotechnology Journal (2010), 5(1), 39-49

RNA interference (RNAi) has emerged as a powerful technique for studying loss-of-function phenotypes by specific down-regulation of gene expression, allowing the investigation of virus-host interactions by large-scale high-throughput RNAi screens. Here we present a robust and sensitive small interfering RNA screening platform consisting of an experimental setup, single-cell image and statistical analysis as well as bioinformatics. The workflow has been established to elucidate host gene functions exploited by viruses, monitoring both suppression and enhancement of viral replication simultaneously by fluorescence microscopy. The platform comprises a two-stage procedure in which potential host factors are first identified in a primary screen and afterwards re-tested in a validation screen to confirm true positive hits. Subsequent bioinformatics allows the identification of cellular genes participating in metabolic pathways and cellular networks utilised by viruses for efficient infection. Our workflow has been used to investigate host factor usage by the human immunodeficiency virus-1 (HIV-1), but can also be adapted to other viruses. Importantly, we expect that the description of the platform will guide further screening approaches for virus-host interactions. The ViroQuant-Cell Networks RNAi Screening core facility is an integral part of the recently founded BioQuant centre for systems biology at the University of Heidelberg and will provide service to external users in the near future.

Full Text
Peer Reviewed
A reference guide for tree analysis and visualization
Pavlopoulos, Georgios A.; Soldatos, Theodoros G.; Barbosa Da Silva, Adriano UL et al

in BioData Mining (2010), 3(1), 1

The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies become cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution become more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to apply to very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies project data in 2D and often lack the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display large-scale trees with more than a few thousand nodes in an easy-to-understand way. In this study, we review tools that are currently available for the visualization of biological trees and analysis, mainly developed during the last decade. We describe the uniform and standard computer readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers users various tree-representation methodologies for biological data analysis.

Full Text
Peer Reviewed
GPCRs, G-proteins, effectors and their interactions: human-gpDB, a database employing visualization tools and data integration techniques.
Satagopam, Venkata UL; Theodoropoulou, Margarita C.; Stampolakis, Christos K. et al

in Database: the Journal of Biological Databases and Curation (2010), 2010

G-protein coupled receptors (GPCRs) are a major family of membrane receptors in eukaryotic cells. They play a crucial role in the communication of a cell with the environment. Ligands bind to GPCRs on the outside of the cell, activating them by causing a conformational change, and allowing them to bind to G-proteins. Through their interaction with G-proteins, several effector molecules are activated leading to many kinds of cellular and physiological responses. The great importance of GPCRs and their corresponding signal transduction pathways is indicated by the fact that they take part in many diverse disease processes and that a large part of efforts towards drug development today is focused on them. We present Human-gpDB, a database which currently holds information about 713 human GPCRs, 36 human G-proteins and 99 human effectors. The collection of information about the interactions between these molecules was done manually and the current version of Human-gpDB holds information for about 1663 connections between GPCRs and G-proteins and 1618 connections between G-proteins and effectors. Major advantages of Human-gpDB are the integration of several external data sources and the support of advanced visualization techniques. Human-gpDB is a simple yet powerful tool for researchers in the life sciences field as it integrates an up-to-date, carefully curated collection of human GPCRs, G-proteins, effectors and their interactions. The database may be a reference guide for medical and pharmaceutical research, especially in the areas of understanding human diseases and chemical and drug discovery. Database URLs: http://schneider.embl.de/human_gpdb; http://bioinformatics.biol.uoa.gr/human_gpdb/

Full Text
Peer Reviewed
A series of PDB related databases for everyday needs
Joosten, R.; Beek, T.; Krieger, E. et al

in Nucleic Acids Research (2010), 39(1), 411-419

The Protein Data Bank (PDB) is the world-wide repository of macromolecular structure information. We present a series of databases that run parallel to the PDB. Each database holds one entry, if possible, for each PDB entry. DSSP holds the secondary structure of the proteins. PDBREPORT holds reports on the structure quality and lists errors. HSSP holds a multiple sequence alignment for all proteins. The PDBFINDER holds easy-to-parse summaries of the PDB file content, augmented with essentials from the other systems. PDB_REDO holds re-refined, and often improved, copies of all structures solved by X-ray. WHY_NOT summarizes why certain files could not be produced. All these systems are updated weekly. The data sets can be used for the analysis of properties of protein structures in areas ranging from structural genomics to cancer biology and protein design.

Full Text
Peer Reviewed
Live Coverage of Intelligent Systems for Molecular Biology
Lister, Allyson L.; Datta, Ruchira S.; Hofmann, Oliver et al

in PLoS Computational Biology (2010), 6

Full Text
Peer Reviewed
Live Coverage of Scientific Conferences Using Web Technologies
Lister, Allyson L.; Datta, Ruchira S.; Hofmann, Oliver et al

in PLoS Computational Biology (2010), 6(1), 1-2

Full Text
Peer Reviewed
LAITOR - Literature Assistant for Identification of Terms co-Occurrences and Relationships.
Barbosa Da Silva, Adriano UL; Soldatos, Theodoros G.; Magalhaes, Ivan L. F. et al

in BMC Bioinformatics (2010), 11

BACKGROUND: Biological knowledge is represented in scientific literature that often describes the function of genes/proteins (bioentities) in terms of their interactions (biointeractions). Such bioentities are often related to biological concepts of interest that are specific to a given research field. Therefore, the study of the current literature about a selected topic deposited in public databases facilitates the generation of novel hypotheses associating a set of bioentities to a common context. RESULTS: We created a text mining system (LAITOR: Literature Assistant for Identification of Terms co-Occurrences and Relationships) that analyses co-occurrences of bioentities, biointeractions, and other biological terms in MEDLINE abstracts. The method accounts for the position of the co-occurring terms within sentences or abstracts. The system detected abstracts mentioning protein-protein interactions in a standard test (BioCreative II IAS test data) with a precision of 0.82-0.89 and a recall of 0.48-0.70. We illustrate the application of LAITOR to the detection of plant response genes in a dataset of 1000 abstracts relevant to the topic. CONCLUSIONS: Text mining tools combining the extraction of interacting bioentities and biological concepts with network displays can be helpful in developing reasonable hypotheses in different scientific backgrounds.

Full Text
Peer Reviewed
Reflect: A practical approach to web semantics
O'Donoghue, Sean I.; Horn, Heiko; Pafilis, Evangelos et al

in Journal of Web Semantics (2010), 8(2-3), 182-189

To date, adding semantic capabilities to web content usually requires considerable server-side re-engineering, thus only a tiny fraction of all web content currently has semantic annotations. Recently, we announced Reflect (http://reflect.ws), a free service that takes a more practical approach: Reflect uses augmented browsing to allow end-users to add systematic semantic annotations to any web-page in real-time, typically within seconds. In this paper we describe the tagging process in detail and show how further entity types can be added to Reflect; we also describe how publishers and content providers can access Reflect programmatically using SOAP, REST (HTTP post), and JavaScript. Usage of Reflect has grown rapidly within the life sciences, and while currently only gene, protein and small molecule names are tagged, we plan to soon expand the scope to include a much broader range of terms (e.g., Wikipedia entries). The popularity of Reflect demonstrates the use and feasibility of letting end-users decide how and when to add semantic annotations. Ultimately, 'semantics is in the eye of the end-user', hence we believe end-user approaches such as Reflect will become increasingly important in semantic web technologies.

Full Text
Peer Reviewed
Martini: using literature keywords to compare gene sets.
Soldatos, Theodoros G.; O'Donoghue, Sean I.; Satagopam, Venkata UL et al

in Nucleic Acids Research (2010), 38(1), 26-38

Life scientists are often interested in comparing two gene sets to gain insight into differences between two distinct, but related, phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based, not on GO, but on keywords extracted from Medline abstracts; Martini also supports a much wider range of species than comparable tools. To evaluate Martini we created a benchmark based on the human cell cycle, and we tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave best or equal performance with three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.
