References of "Killcoyne, Sarah 50002090"
Insilico genomes for high-throughput sequencing cancer-specific analysis
Killcoyne, Sarah UL

Doctoral thesis (2015)

As a genomic disease, cancer is unique in that the entire genome can be highly unstable, with new mutations accumulating at a rapid rate and massive alterations to the chromosomal structure. Structural aberrations can be highly significant to a patient’s disease, resulting in aberrant proteins that can drive a cancer to progress faster or metastasize. Such aberrations may also have more subtle effects, enabling the cellular population to more rapidly develop drug resistance or simply generate highly diverse populations within a tumor, making targeted therapies less effective. In fact, it is these diverse or heterogeneous cellular populations, with highly mutated and frequently structurally aberrant genomes, that make understanding the extent of a tumor genome’s variation so challenging. Large-scale sequencing efforts through the Cancer Genome Atlas and the International Cancer Genome Consortium have sequenced thousands of cancer genomes, and while small-scale variants have enabled researchers to begin to trace the evolutionary history and diversity of tumor genomes, large-scale structural variations have continued to be difficult to identify. Current methods and technologies for short-read sequencing generally rely on fitting genomes to a single reference assembly that is assumed to be representative of all individuals. Tumor genomes, which consist of heterogeneous cellular populations with unique aberrations, can vary significantly from a ‘normal’ genome. This means that such single references are poor representations of a cancerous cell population, and so methods that rely less directly on the reference offer better opportunities to investigate these aberrations. In this project, a new method for large-scale structural variant identification, called MultiSieve, is proposed. This method uses prior knowledge to generate and test multiple references for each patient genome.
Validation using simulated data establishes the utility of the method, and a comparison with commonly used methods demonstrates that MultiSieve is capable of finding variations often missed by traditional methods and that there are likely to be more structural variants in patients than have been identified previously.

Peer Reviewed
Identification of large-scale genomic variation in cancer genomes using in silico reference models
Killcoyne, Sarah UL; del Sol Mesa, Antonio UL

in Nucleic Acids Research (2015)

Identifying large-scale structural variation in cancer genomes continues to be a challenge to researchers. Current methods rely on genome alignments based on a reference that can be a poor fit to highly variant and complex tumor genomes. To address this challenge we developed a method that uses available breakpoint information to generate models of structural variations. We use these models as references to align previously unmapped and discordant reads from a genome. By using these models to align unmapped reads, we show that our method can help to identify large-scale variations that have been previously missed.

Peer Reviewed
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.
Lewis, Steven; Csordas, Attila; Killcoyne, Sarah UL et al

in BMC Bioinformatics (2012), 13

BACKGROUND: For shotgun mass spectrometry-based proteomics, the most computationally expensive step is matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore, solutions for improving our ability to perform these searches are needed. RESULTS: We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. CONCLUSION: The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.
