![]() Pascoal, Túlio ![]() in Proceedings on Privacy Enhancing Technologies (2023) Genome-wide Association Studies (GWASes) identify genomic variations that are statistically associated with a trait, such as a disease, in a group of individuals. Unfortunately, careless sharing of GWAS ... [more ▼] Genome-wide Association Studies (GWASes) identify genomic variations that are statistically associated with a trait, such as a disease, in a group of individuals. Unfortunately, careless sharing of GWAS statistics might give rise to privacy attacks. Several works attempted to reconcile secure processing with privacy-preserving releases of GWASes. However, we highlight that these approaches remain vulnerable if GWASes utilize overlapping sets of individuals and genomic variations. In such conditions, we show that even when relying on state-of-the-art techniques for protecting releases, an adversary could reconstruct the genomic variations of up to 28.6% of participants, and that the released statistics of up to 92.3% of the genomic variations would enable membership inference attacks. We introduce I-GWAS, a novel framework that securely computes and releases the results of multiple possibly interdependent GWASes. I-GWAS continuously releases privacy-preserving and noise-free GWAS results as new genomes become available. [less ▲] Detailed reference viewed: 67 (12 UL)![]() Pascoal, Túlio ![]() ![]() in ACM/IFIP International Middleware Conference (2022, November) Genome-wide association studies (GWAS) identify correlations between the genetic variants and an observable characteristic such as a disease. Previous works presented privacy-preserving distributed ... [more ▼] Genome-wide association studies (GWAS) identify correlations between the genetic variants and an observable characteristic such as a disease. Previous works presented privacy-preserving distributed algorithms for a federation of genome data holders that spans multiple institutional and legislative domains to securely compute GWAS results. However, these algorithms have limited applicability, since they still require a centralized instance to decide whether GWAS results can be safely disclosed, which is in violation to privacy regulations, such as GDPR. In this work, we introduce GenDPR, a distributed middleware that leverages Trusted Execution Environments (TEEs) to securely determine a subset of the potential GWAS statistics that can be safely released. GenDPR achieves the same accuracy as centralized solutions, but requires transferring significantly less data because TEEs only exchange intermediary results but no genomes. Additionally, GenDPR can be configured to tolerate all-but-one honest-but-curious federation members colluding with the aim to expose genomes of correct members. [less ▲] Detailed reference viewed: 46 (16 UL)![]() Pascoal, Túlio ![]() Doctoral thesis (2022) Understanding the interplay between genomics and human health is a crucial step for the advancement and development of our society. Genome-Wide Association Study (GWAS) is one of the most popular methods ... [more ▼] Understanding the interplay between genomics and human health is a crucial step for the advancement and development of our society. Genome-Wide Association Study (GWAS) is one of the most popular methods for discovering correlations between genomic variations associated with a particular phenotype (i.e., an observable trait such as a disease). Leveraging genome data from multiple institutions worldwide nowadays is essential to produce more powerful findings by operating GWAS at larger scale. However, this raises several security and privacy risks, not only in the computation of such statistics, but also in the public release of GWAS results. To that extent, several solutions in the literature have adopted cryptographic approaches to allow secure and privacy-preserving processing of genome data for federated analysis. However, conducting federated GWAS in a secure and privacy-preserving manner is not enough since the public releases of GWAS results might be vulnerable to known genomic privacy attacks, such as recovery and membership attacks. The present thesis explores possible solutions to enable end-to-end privacy-preserving federated GWAS in line with data privacy regulations such as GDPR to secure the public release of the results of Genome-Wide Association Studies (GWASes) that are dynamically updated as new genomes become available, that might overlap with their genomes and considered locations within the genome, that can support internal threats such as colluding members in the federation and that are computed in a distributed manner without shipping actual genome data. While achieving these goals, this work created several contributions described below. First, the thesis proposes DyPS, a Trusted Execution Environment (TEE)-based framework that reconciles efficient and secure genome data outsourcing with privacy-preserving data processing inside TEE enclaves to assess and create private releases of dynamic GWAS. In particular, DyPS presents the conditions for the creation of safe dynamic releases certifying that the theoretical complexity of the solution space an external probabilistic polynomial-time (p.p.t.) adversary or a group of colluders (up to all-but-one parties) would need to infer when launching recovery attacks on the observation of GWAS statistics is large enough. Besides that, DyPS executes an exhaustive verification algorithm along with a Likelihood-ratio test to measure the probability of identifying individuals in studies. Thus, also protecting individuals against membership inference attacks. Only safe genome data (i.e., genomes and SNPs) that DyPS selects are further used for the computation and release of GWAS results. At the same time, the remaining (unsafe) data is kept secluded and protected inside the enclave until it eventually can be used. Our results show that if dynamic releases are not improperly evaluated, up to 8% of genomes could be exposed to genomic privacy attacks. Moreover, the experiments show that DyPS’ TEE-based architecture can accommodate the computational resources demanded by our algorithms and present practical running times for larger-scale GWAS. Secondly, the thesis offers I-GWAS that identifies the new conditions for safe releases when considering the existence of overlapping data among multiple GWASes (e.g., same individuals participating in several studies). Indeed, it is shown that adversaries might leverage information of overlapping data to make both recovery and membership attacks feasible again (even if they are produced following the conditions for safe single-GWAS releases). Our experiments show that up to 28.6% of genetic variants of participants could be inferred during recovery attacks, and 92.3% of these variants would enable membership attacks from adversaries observing overlapping studies, which are withheld by I-GWAS. Lastly yet importantly, the thesis presents GenDPR, which encompasses extensions to our protocols so that the privacy-verification algorithms can be conducted distributively among the federation members without demanding the outsourcing of genome data across boundaries. Further, GenDPR can also cope with collusion among participants while selecting genome data that can be used to create safe releases. Additionally, GenDPRproduces the same privacy guarantees as centralized architectures, i.e., it correctly identifies and selects the same data in need of protection as with centralized approaches. In the end, the thesis presents a homogenized framework comprising DyPS, I-GWAS and GenDPR simultaneously. Thus, offering a usable approach for conducting practical GWAS. The method chosen for protection is of a statistical nature, ensuring that the theoretical complexity of attacks remains high and withholding releases of statistics that would impose membership inference risks to participants using Likelihood-ratio tests, despite adversaries gaining additional information over time, but the thesis also relates the findings to techniques that can be leveraged to protect releases (such as Differential Privacy). The proposed solutions leverage Intel SGX as Trusted Execution Environment to perform selected critical operations in a performant manner, however, the work translates equally well to other trusted execution environments and other schemes, such as Homomorphic Encryption. [less ▲] Detailed reference viewed: 166 (14 UL)![]() Pascoal, Túlio ![]() ![]() ![]() Scientific Conference (2022, July 11) The popularization of large-scale federated Genome-Wide Association Study (GWAS) where multiple data owners share their genome data to conduct federated analytics uncovers new privacy issues that have ... [more ▼] The popularization of large-scale federated Genome-Wide Association Study (GWAS) where multiple data owners share their genome data to conduct federated analytics uncovers new privacy issues that have remained unnoticed or not given proper attention. Indeed, as soon as a diverse type of interested parties (e.g., private or public biocenters and governmental institutions from around the globe) and individuals from heterogeneous populations are participating in cooperative studies, interdependent and multi-party privacy appear as crucial issues that are currently not adequately assessed. In fact, in federated GWAS environments, the privacy of individuals and parties does not depend solely on their own behavior anymore but also on others, because a collaborative environment opens new credible adversary models. For instance, one might want to tailor the privacy guarantees to withstand the presence of potentially colluding federation members aiming to violate other members' data privacy and the privacy deterioration that might occur in the presence of interdependent genomic data (e.g., due to the presence of relatives in studies or the perpetuation of previous genomic privacy leaks in future studies). In this work, we catalog and discuss the features, unsolved problems, and challenges to tackle toward truly end-to-end private and practical federated GWAS. [less ▲] Detailed reference viewed: 85 (9 UL)![]() Pascoal, Túlio ![]() Presentation (2021, September 22) Genome-Wide Association Studies (GWAS) identify the genomic variations that are statistically associated with a particular phenotype (e.g., a disease). GWAS results, i.e., statistics, benefit research and ... [more ▼] Genome-Wide Association Studies (GWAS) identify the genomic variations that are statistically associated with a particular phenotype (e.g., a disease). GWAS results, i.e., statistics, benefit research and personalized medicine. The confidence in GWAS increases with the number of genomesanalyzed, which encourages federated computations where biocenters periodically include newly sequenced genomes. However, for legal and economical reasons, this collaboration can only happen if a release of GWAS results never jeopardizes the genomic privacy of data donors, if biocenters retain ownership and cannot learn each others’ data. Furthermore, given the reduced cost of sequencing DNA nowadays, there is now a need to update GWAS results in a dynamic manner, while enabling donors to withdraw consent at any time. Therefore, two challenges need to be simultaneously addressed to enable federated and dynamic GWAS: (i) the computation of GWAS statistics must rely on secure and privacy-preserving methods; and (ii) GWAS results that are publicly released should not allow any form of privacy attack. In this talk, we will introduce the problem we consider in more detail and present DyPS, the framework we have designed and recently presented at the Privacy Enhancing Technologies Symposium (PETS). We refer the reader to the full paper1 for the details we cannot cover in this short version. [less ▲] Detailed reference viewed: 125 (25 UL)![]() Pascoal, Túlio ![]() ![]() in Proceedings on Privacy Enhancing Technologies (2021) Genome-Wide Association Studies (GWAS) identify the genomic variations that are statistically associated with a particular phenotype (e.g., a disease). The confidence in GWAS results increases with the ... [more ▼] Genome-Wide Association Studies (GWAS) identify the genomic variations that are statistically associated with a particular phenotype (e.g., a disease). The confidence in GWAS results increases with the number of genomes analyzed, which encourages federated computations where biocenters would periodically share the genomes they have sequenced. However, for economical and legal reasons, this collaboration will only happen if biocenters cannot learn each others’ data. In addition, GWAS releases should not jeopardize the privacy of the individuals whose genomes are used. We introduce DyPS, a novel framework to conduct dynamic privacy-preserving federated GWAS. DyPS leverages a Trusted Execution Environment to secure dynamic GWAS computations. Moreover, DyPS uses a scaling mechanism to speed up the releases of GWAS results according to the evolving number of genomes used in the study, even if individuals retract their participation consent. Lastly, DyPS also tolerates up to all-but-one colluding biocenters without privacy leaks. We implemented and extensively evaluated DyPS through several scenarios involving more than 6 million simulated genomes and up to 35,000 real genomes. Our evaluation shows that DyPS updates test statistics with a reasonable additional request processing delay (11% longer) compared to an approach that would update them with minimal delay but would lead to 8% of the genomes not being protected. In addition, DyPS can result in the same amount of aggregate statistics as a static release (i.e., at the end of the study), but can produce up to 2.6 times more statistics information during earlier dynamic releases. Besides, we show that DyPS can support a larger number of genomes and SNP positions without any significant performance penalty. [less ▲] Detailed reference viewed: 385 (60 UL)![]() Pascoal, Tulio ![]() in Computer Networks (2020) Software Defined Networking (SDN) is a network paradigm that decouples the network’s control plane, delegated to the SDN controller, from the data plane, delegated to SDN switches. For increased ... [more ▼] Software Defined Networking (SDN) is a network paradigm that decouples the network’s control plane, delegated to the SDN controller, from the data plane, delegated to SDN switches. For increased efficiency, SDN switches use a high-performance Ternary Content-Addressable memory (TCAM) to install rules. However, due to the TCAM’s high cost and power consumption, switches have a limited amount of TCAM memory. Consequently, a limited number of rules can be installed. This limitation has been exploited to carry out Distributed Denial of Service (DDoS) attacks, such as Saturation attacks, that generate large amounts of traffic. Inspired by slow application layer DDoS attacks, this paper presents and investigates DDoS attacks on SDN that do not require large amounts of traffic, thus bypassing existing defenses that are triggered by traffic volume. In particular, we offer two slow attacks on SDN. The first attack, called Slow TCAM Exhaustion attack (Slow-TCAM), is able to consume all SDN switch’s TCAM memory by forcing the installation of new forwarding rules and maintaining them indeterminately active, thus disallowing new rules to be installed to serve legitimate clients. The second attack, called Slow Saturation attack, combines Slow-TCAM attack with a lower rate instance of the Saturation attack. A Slow Saturation attack is capable of denying service using a fraction of the traffic of typical Saturation attacks. Moreover, the Slow Saturation attack can also impact installed legitimate rules, thus causing a greater impact than the Slow-TCAM attack. In addition, it also affects the availability of other network’s components, e.g., switches, even the ones not being directly targeted by the attack, as has been proven by our experiments. We propose a number of variations of these attacks and demonstrate their effectiveness by means of an extensive experimental evaluation. The Slow-TCAM is able to deny service to legitimate clients requiring only 38 seconds and sending less than 40 packets per second without abruptly changing network resources, such as CPU and memory. Moreover, besides denying service as a Slow-TCAM attack, the Slow Saturation attack can also disrupt multiple SDN switches (not only the targeted ones) by sending a lower-rate traffic when compared to current known Saturation attacks. [less ▲] Detailed reference viewed: 95 (10 UL) |
||