Clustering of cognate proteins among distinct proteomes derived from multiple links to a single seed sequence.

Algorithms; Amino Acid Sequence; Cluster Analysis; Molecular Sequence Data; Multigene Family; Pattern Recognition, Automated/methods; Proteome/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Sequence Homology, Amino Acid

Abstract :

[en] BACKGROUND: Modern proteomes evolved by modification of pre-existing ones. It is extremely important to comparative biology that related proteins be identified as members of the same cognate group, since a characterized putative homolog can provide clues about the function of uncharacterized proteins from the same group. Typically, databases of related proteins focus on those from completely sequenced genomes. Unfortunately, relatively few organisms have had their genomes fully sequenced; accordingly, many proteins are ignored by the currently available databases of cognate proteins, despite the large number of important genes that have been functionally described only in these incomplete proteomes. RESULTS: We have developed a method to cluster cognate proteins from multiple organisms beginning with only one sequence, through connectivity saturation with that seed sequence. We show that the generated clusters agree with those of some other approaches based on full-genome comparison. CONCLUSION: The method produced results that are as reliable as those produced by conventional clustering approaches. Generating clusters based only on individual proteins of interest is less time-consuming than generating clusters for whole proteomes.

Research center :

- Luxembourg Centre for Systems Biomedicine (LCSB): Bioinformatics Core (R. Schneider Group)

Disciplines :

Biochemistry, biophysics & molecular biology

Identifiers :

UNILU:UL-ARTICLE-2012-041

Author, co-author :

BARBOSA DA SILVA, Adriano ; European Molecular Biology Laboratory - EMBL

SATAGOPAM, Venkata ; European Molecular Biology Laboratory - EMBL

SCHNEIDER, Reinhard ; European Molecular Biology Laboratory - EMBL

Ortega, J. Miguel

Language :

English

Title :

Clustering of cognate proteins among distinct proteomes derived from multiple links to a single seed sequence.

Publication date :

2008

Journal title :

BMC Bioinformatics

eISSN :

1471-2105

Publisher :

BioMed Central, United Kingdom

Volume :

Pages :

141

Peer reviewed :

Peer Reviewed verified by ORBi

Available on ORBilu :

since 30 June 2014

Statistics

Number of views

268 (6 by Unilu)

Number of downloads

144 (0 by Unilu)
