May, Patrick in Nucleic Acids Research (2010), 38(Database issue), 326-30

With the growing amount of experimental data, the number of known protein structures also increases continuously. Classification of protein structures helps to understand relationships between protein structure and function. The main classification methods based on secondary structures are SCOP, CATH and TOPS, which all classify under different aspects and can therefore lead to different results. We developed a mathematically unique representation of protein structure topologies at a higher abstraction level, providing new aspects of classification and enabling a fast search through the data. Protein Topology Graph Library (PTGL; http://ptgl.zib.de) aims at providing a database on protein secondary structure topologies, including search facilities, visualization as intuitive topology diagrams as well as in the 3D structure, and additional information. Secondary structure-based protein topologies are represented uniquely as undirected labeled graphs in four different ways, allowing for exploration under different aspects. The linear notations, and the 2D and 3D diagrams of each notation, facilitate a deeper understanding of protein topologies. Several search functions for topologies and sub-topologies, a BLAST search option, and links to SCOP, CATH and PDBsum support individual and large-scale investigation of protein structures. Currently, PTGL comprises topologies of 54,859 protein structures. Main structural patterns for common structural motifs like TIM-barrel or Jelly Roll are pre-implemented and can easily be searched.

May, Patrick in Dubitzky, Werner; Schuster, Assaf; Sloot, Peter M. A. (Eds.)
et al Distributed, High-Performance and Grid Computing in Computational Biology (2007)

During the last few years, more and more functionalities of RNA have been discovered that were previously thought to be carried out by proteins alone. One of the most striking discoveries was the detection of microRNAs, a class of noncoding RNAs that play an important role in post-transcriptional gene regulation. Large-scale analyses are needed for the still-growing amount of sequence data derived from new experimental technologies. In this paper we present a framework for the detection of the distinctive precursor structure of microRNAs that is based on the well-known Smith-Waterman algorithm. By conducting the computation of the local alignment on an FPGA, we are able to gain a substantial speedup compared to a pure software implementation, bringing together supercomputer performance and bioinformatics research. We conducted experiments on real genomic data and found several new putative hits for microRNA precursor structures.

May, Patrick in Nagel, Wolfgang E.; Walter, Wolfgang V.; Lehner, Wolfgang (Eds.) Euro-Par 2006 Parallel Processing (2006)

In life sciences, scientists are confronted with an exponential growth of biological data, especially in the genomics and proteomics area. The efficient management and use of these data, and their transformation into knowledge, are basic requirements for biological research.
Therefore, integration of diverse applications and data from geographically distributed computing resources will become a major issue. We will present the status of our efforts for the realization of an automated protein prediction pipeline as an example for a complex biological workflow scenario in a Grid environment based on Web services. This case study demonstrates the ability of an easy orchestration of complex biological workflows based on Web services as building blocks and Triana as workflow engine.

May, Patrick Report (2006)

THESEUS, the ZIB threading environment, is a parallel implementation of protein threading based on a multi-queued branch-and-bound optimal search algorithm to find the best sequence-to-structure alignment through a library of template structures. THESEUS uses a template core model based on secondary structure definition and a scoring function based on knowledge-based potentials reflecting pairwise interactions and the chemical environment, as well as pseudo energies for homology detection, loop alignment, and secondary structure matching. The threading core is implemented in C++ as an SPMD parallelization architecture using MPI for communication. The environment is designed for generic testing of different scoring functions, e.g. different secondary structure prediction terms, different scoring matrices and information derived from multiple sequence alignments. A validation of the structure prediction results has been done on the basis of standard threading benchmark sets. THESEUS successfully participated in the 6th Critical Assessment of Techniques for Protein Structure Prediction (CASP) 2004.

May, Patrick in BMC Bioinformatics (2006), 7

BACKGROUND: Protein-structure alignment is a fundamental tool to study protein function, evolution and model building. In the last decade several methods for structure alignment were introduced, but most of them ignore that structurally similar proteins can share the same spatial arrangement of secondary structure elements (SSE) but differ in the underlying polypeptide chain connectivity (non-sequential SSE connectivity).

RESULTS: We perform protein-structure alignment using a two-level hierarchical approach implemented in the program GANGSTA. On the first level, pair contacts and relative orientations between SSEs (i.e. alpha-helices and beta-strands) are maximized with a genetic algorithm (GA). On the second level, residue pair contacts from the best SSE alignments are optimized. We have tested the method on visually optimized structure alignments of protein pairs (pairwise mode) and for database scans. For a given protein structure, our method is able to detect significant structural similarity of functionally important folds with non-sequential SSE connectivity. The performance for structure alignments with strictly sequential SSE connectivity is comparable to that of other structure alignment methods.

CONCLUSION: As demonstrated for several applications, GANGSTA finds meaningful protein-structure alignments independent of the SSE connectivity. GANGSTA is able to detect structural similarity of protein folds that are assigned to different superfamilies but nevertheless possess similar structures and perform related functions, even if these proteins differ in SSE connectivity.