[en] Recent breakthroughs in genomic sequencing led to an enormous increase of DNA sampling rates, which in turn favored the use of clouds to efficiently process huge amounts of genomic data. However, while allowing possible achievements in personalized medicine and related areas, cloud-based processing of genomic information also entails significant privacy risks, asking for increased protection. In this paper, we focus on the first, but also most data-intensive, processing step of the genomics information processing pipeline: the alignment of raw genomic data samples (called reads) to a synthetic human reference genome. Even though privacy-preserving alignment solutions (e.g., based on homomorphic encryption) have been proposed, their slow performance encourages alternatives based on trusted execution environments, such as Intel SGX, to speed up secure alignment. Such alternatives have to deal with data structures whose size by far exceeds secure enclave memory, requiring the alignment code to reach out into untrusted memory. We highlight how sensitive genomic information can be leaked when those enclave-external alignment data structures are accessed, and suggest countermeasures to prevent privacy breaches. The overhead of these countermeasures indicate that the competitiveness of a privacy-preserving enclave-based alignment has yet to be precisely evaluated.
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Critical and Extreme Security and Dependability Research Group (CritiX)
FnR ; FNR8149128 > Paulo Esteves-Veríssimo > IISD > Strategic Rtnd Program On Information Infrastructure Security And Dependability > 01/01/2015 > 31/12/2019 > 2014