Article (Scientific journals)
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.
Lewis, Steven; Csordas, Attila; Killcoyne, Sarah et al.
2012, in BMC Bioinformatics, 13, p. 324
Peer Reviewed verified by ORBi
 

Files


Full Text
hydra-paper.pdf
Publisher postprint (744.61 kB)
Details



Keywords :
Proteomics; High-performance computing
Abstract :
BACKGROUND: In shotgun mass-spectrometry-based proteomics, the most computationally expensive step is matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at a very high rate, and the scope of what is searched for is continually increasing. Solutions that improve our ability to perform these searches are therefore needed.

RESULTS: We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm and generates output comparable to that of the original implementation for the same input files. We demonstrate the scalability of the system and discuss the architecture required to develop such distributed processing.

CONCLUSION: The software scales in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, so throughput can expand with the available resources.
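The general idea of spreading a sequence database search over map and reduce steps can be illustrated with a small in-memory sketch. This is not Hydra's code and not the K-score algorithm: the function names (`map_peptide`, `map_spectrum`, `reduce_bin`, `run_search`), the integer mass binning, and the shared-peak count used as a score are all assumptions made for illustration, and the shuffle step is simulated with a dictionary rather than run on Hadoop.

```python
from collections import defaultdict

# Monoisotopic residue masses in daltons for a few amino acids (toy subset).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
                "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def b_ion_masses(peptide):
    """Singly charged b-ion m/z values (prefix masses plus one proton)."""
    total, ions = 0.0, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total + PROTON)
    return ions


def mass_bin(mass):
    """Hypothetical shuffle key: nearest integer mass.
    Simplification: candidates close to a bin boundary would be missed."""
    return int(round(mass))


def map_peptide(peptide):
    """Map step for the database side: key each peptide by its mass bin."""
    yield mass_bin(peptide_mass(peptide)), ("peptide", peptide)


def map_spectrum(spectrum_id, precursor_mass, peaks):
    """Map step for the data side: key each spectrum by its precursor mass bin."""
    yield mass_bin(precursor_mass), ("spectrum", (spectrum_id, peaks))


def reduce_bin(values, tolerance=0.5):
    """Reduce step: score every spectrum against every peptide in one bin.
    The score is a plain count of matched b-ions, a stand-in for a real scorer."""
    peptides = [v for tag, v in values if tag == "peptide"]
    spectra = [v for tag, v in values if tag == "spectrum"]
    for spectrum_id, peaks in spectra:
        for peptide in peptides:
            score = sum(
                any(abs(ion - peak) <= tolerance for peak in peaks)
                for ion in b_ion_masses(peptide)
            )
            yield spectrum_id, peptide, score


def run_search(peptides, spectra):
    """Simulate map, shuffle and reduce in memory; keep the best hit per spectrum."""
    shuffled = defaultdict(list)
    for peptide in peptides:
        for key, value in map_peptide(peptide):
            shuffled[key].append(value)
    for spectrum_id, precursor_mass, peaks in spectra:
        for key, value in map_spectrum(spectrum_id, precursor_mass, peaks):
            shuffled[key].append(value)

    best = {}
    for key in sorted(shuffled):
        for spectrum_id, peptide, score in reduce_bin(shuffled[key]):
            if spectrum_id not in best or score > best[spectrum_id][1]:
                best[spectrum_id] = (peptide, score)
    return best


if __name__ == "__main__":
    toy_peptides = ["GASP", "PSAG", "VLK"]
    toy_spectra = [("s1", 330.15, [58.03, 129.07, 216.10]),
                   ("s2", 358.26, [100.08, 213.16])]
    print(run_search(toy_peptides, toy_spectra))
```

Because each mass bin is reduced independently of the others, the bins could in principle be processed on separate workers, which is the property a cluster exploits when throughput is said to scale with the number of processors.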
Research center :
- Luxembourg Centre for Systems Biomedicine (LCSB): Computational Biology (Del Sol Group)
Disciplines :
Computer science
Biochemistry, biophysics & molecular biology
Author, co-author :
Lewis, Steven;  Institute for Systems Biology
Csordas, Attila;  EMBL European Bioinformatics Institute > PRIDE Group Proteomics Services Team
Killcoyne, Sarah;  University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Hermjakob, Henning;  EMBL European Bioinformatics Institute > PRIDE Group Proteomics Services Team
Hoopmann, Michael R.;  Institute for Systems Biology
Moritz, Robert L.;  Institute for Systems Biology
Deutsch, Eric W.;  Institute for Systems Biology
Boyle, John;  Institute for Systems Biology
External co-authors :
yes
Language :
English
Title :
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.
Publication date :
2012
Journal title :
BMC Bioinformatics
ISSN :
1471-2105
Publisher :
BioMed Central, United Kingdom
Volume :
13
Pages :
324
Peer reviewed :
Peer Reviewed verified by ORBi
European Projects :
FP7 - 260558 - PROTEOMEXCHANGE - International Data Exchange and Data Representation Standards for Proteomics
Funders :
NIGMS (USA) R01GM087221
NCI (USA) R01CA137442
CE - Commission Européenne [BE]
Available on ORBilu :
since 24 April 2013

Statistics


Number of views
112 (7 by Unilu)
Number of downloads
125 (2 by Unilu)

Scopus citations®
46
Scopus citations® without self-citations
45
OpenCitations
44
WoS citations
36
