privacy-preserving genomic data processing; copy number variation; applied cryptography
Abstract :
[en] Innovative pharma-genomics and personalized medicine services are now possible thanks to the availability
of large amounts of genomic data for processing and analysis. Operating on such databases, it is possible
to test for predisposition to diseases by searching for genomic variants in whole genomes as well as in
exomes, which are collections of protein-coding regions called exons. Genomic data are therefore shared
amongst research institutes, public/private operators, and third parties, creating issues of privacy, ethics, and
data protection because genome data are strictly personal and identifying. To prevent the damage that could
follow a data breach, a likely threat nowadays, and to comply with current data protection regulations,
genomic data files should be encrypted, and the data processing algorithms should be privacy-preserving.
Such a migration is not always feasible: not all operations can be implemented straightforwardly in a privacy-preserving way; a privacy-preserving version of an algorithm may not be as accurate for the purpose of biomedical
analysis as the original; or the privacy-preserving version may not scale up when applied to genomic data
processing because of excessive computation time. In this work, we demonstrate that, at least for one well-known genomic data procedure, the analysis of copy number variations (CNV),
a privacy-preserving analysis is possible and feasible. Our algorithm relies on Homomorphic Encryption, a
cryptographic technique for performing calculations directly on encrypted data. We test our implementation for
performance and reliability, giving evidence that it is practical to study copy number variations and preserve
genomic data privacy. Our proof-of-concept application successfully and efficiently searches for a patient’s
somatic copy number changes by comparing the patient’s gene coverage across the whole exome with the
coverage of a healthy control exome. Since all the genomic data are securely encrypted, the data remain protected
even if they are transmitted or shared via an insecure environment such as a public cloud. As this is the first study
of privacy-preserving copy number variation analysis, it demonstrates the potential of recent Homomorphic Encryption tools in genomic applications.
Disciplines :
Computer science
Author, co-author :
DEMIRCI, Huseyin ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > IRiSC
LENZINI, Gabriele ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > IRiSC
External co-authors :
no
Language :
English
Title :
Privacy-preserving Copy Number Variation Analysis with Homomorphic Encryption
Publication date :
2022
Event name :
15th International Joint Conference on Biomedical Engineering Systems and Technologies - Scale-IT-up