Amino Acid Motifs/genetics; Amino Acid Sequence; Animals; Base Sequence; Databases as Topic; Eukaryota/genetics; Genome, Protozoan; Molecular Sequence Data; Protein Structure, Tertiary/genetics; Protozoan Proteins/chemistry; Sequence Alignment
[en] Whole-genome sequencing of protozoan trypanosomatids revealed several predicted genes of unknown function that code for hypothetical proteins. Pairwise alignment-based computational methods available online are unable to identify the function of these sequences. To find clues to the function of hypothetical proteins, a user-friendly bioinformatic tool named PROTOzoan Gene Identification Motifs (PROTOGIM, available at http://www.biowebdb.org/protogim ) was developed. It allows the user to search for functional patterns in hypothetical proteins by screening the sequences against regular expressions. The analysis of 1,194 trypanosomatid hypothetical proteins with PROTOGIM identified motifs and domains in 98% of the cases, demonstrating the reliability and accuracy of the method. The added value of this tool is the possibility of modifying existing regular expressions or inserting new ones, and of running an analysis against one or several sequences at the same time. An in silico strategy, combined with biochemical and molecular characterization, creates new possibilities for finding the functions of hypothetical proteins in the postgenomic era.
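The regular-expression screening described in the abstract can be illustrated with a minimal Python sketch. This is not PROTOGIM's code, and the abstract does not list the tool's patterns: the function name `scan_sequences`, the two-entry motif dictionary, and the toy sequences are all assumptions made for illustration. The two patterns are PROSITE-style motifs (N-glycosylation site and the ATP/GTP-binding P-loop) written as regular expressions.

```python
import re

# Illustrative motif library (an assumption, not PROTOGIM's actual pattern set).
# Each entry maps a motif name to a regular expression over one-letter amino acid codes.
MOTIFS = {
    "N-glycosylation site": r"N[^P][ST][^P]",
    "P-loop (ATP/GTP-binding motif A)": r"[AG].{4}GK[ST]",
}


def scan_sequences(sequences, motifs=None):
    """Screen one or several protein sequences against every motif pattern.

    Returns a list of (sequence_id, motif_name, 1-based start, matched text).
    """
    if motifs is None:
        motifs = MOTIFS
    hits = []
    for seq_id, seq in sequences.items():
        seq = seq.upper()
        for name, pattern in motifs.items():
            # A lookahead wrapper lets overlapping occurrences be reported too.
            for match in re.finditer(f"(?=({pattern}))", seq):
                hits.append((seq_id, name, match.start() + 1, match.group(1)))
    return hits


# Toy sequences, invented for this example only.
toy_sequences = {
    "hyp1": "MKTAGESGSGKSTLLNATSA",
    "hyp2": "MNPSAAAA",
}

for hit in scan_sequences(toy_sequences):
    print(hit)
```

Because the motif library is an ordinary dictionary, a pattern can be modified or a new one added before the call, and the same call handles one sequence or many, which mirrors the flexibility the abstract attributes to the tool.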