References of "Barbosa-Silva, A."
     in
Bookmark and Share    
Full Text
Peer Reviewed
See detailCaipirini: using gene sets to rank literature
Soldatos, Theodoros G.; O’Donoghue, S. I.; Satagopam, V. P. et al

in BioData Mining (2012), 5(1),

Background: Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists ... [more ▼]

Background: Keeping up-to-date with bioscience literature is becoming increasingly challenging. Several recent methods help meet this challenge by allowing literature search to be launched based on lists of abstracts that the user judges to be ‘interesting’. Some methods go further by allowing the user to provide a second input set of ‘uninteresting’ abstracts; these two input sets are then used to search and rank literature by relevance. In this work we present the service ‘Caipirini’ (http:// caipirini.org) that also allows two input sets, but takes the novel approach of allowing ranking of literature based on one or more sets of genes. Results: To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. Conclusions: To our knowledge Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments. [less ▲]

Detailed reference viewed: 107 (2 UL)
Full Text
Peer Reviewed
See detailA procedure to recruit members to enlarge protein family databases--the building of UECOG (UniRef-Enriched COG Database) as a model.
Fernandes, G. R.; Barbosa Da Silva, Adriano UL; Prosdocimi, F. et al

in Genetics and Molecular Research (2008), 7(3), 910-24

A procedure to recruit members to enlarge protein family databases is described here. The procedure makes use of UniRef50 clusters produced by UniProt. Current family entries are used to recruit ... [more ▼]

A procedure to recruit members to enlarge protein family databases is described here. The procedure makes use of UniRef50 clusters produced by UniProt. Current family entries are used to recruit additional members based on the UniRef50 clusters to which they belong. Only those additional UniRef50 members that are not fragments and whose length is within a restricted range relative to the original entry are recruited. The enriched dataset is then limited to contain only genomes from selected clades. We used the COG database - used for genome annotation and for studies of phylogenetics and gene evolution - as a model. To validate the method, a UniRef-Enriched COG0151 (UECOG) was tested with distinct procedures to compare recruited members with the recruiters: PSI-BLAST, secondary structure overlap (SOV), Seed Linkage, COGnitor, shared domain content, and neighbor-joining single-linkage, and observed that the former four agree in their validations. Presently, the UniRef50-based recruitment procedure enriches the COG database for Archaea, Bacteria and its subgroups Actinobacteria, Firmicutes, Proteobacteria, and other bacteria by 2.2-, 8.0-, 7.0-, 8.8-, 8.7-, and 4.2-fold, respectively, in terms of sequences, and also considerably increased the number of species. [less ▲]

Detailed reference viewed: 135 (2 UL)