et al., in The American Journal of Human Genetics (2021)
Summary: Both mild and severe epilepsies are influenced by variants in the same genes, yet an explanation for the resulting phenotypic variation is unknown. As part of the ongoing Epi25 Collaboration, we performed a whole-exome sequencing analysis of 13,487 epilepsy-affected individuals and 15,678 control individuals. While prior Epi25 studies focused on gene-based collapsing analyses, we asked how the pattern of variation within genes differs by epilepsy type. Specifically, we compared the genetic architectures of severe developmental and epileptic encephalopathies (DEEs) and two generally less severe epilepsies, genetic generalized epilepsy and non-acquired focal epilepsy (NAFE). Our gene-based rare variant collapsing analysis used geographic ancestry-based clustering that included broader ancestries than previously possible and revealed novel associations. Using the missense intolerance ratio (MTR), we found that variants in DEE-affected individuals are in significantly more intolerant genic sub-regions than those in NAFE-affected individuals. Only previously reported pathogenic variants absent in available genomic datasets showed a significant burden in epilepsy-affected individuals compared with control individuals, and the ultra-rare pathogenic variants associated with DEE were located in more intolerant genic sub-regions than variants associated with non-DEE epilepsies. MTR filtering improved the yield of ultra-rare pathogenic variants in affected individuals compared with control individuals.
Finally, analysis of variants in genes without a disease association revealed a significant burden of loss-of-function variants in the genes most intolerant to such variation, indicating additional epilepsy-risk genes yet to be discovered. Taken together, our study suggests that genic and sub-genic intolerance are critical characteristics for interpreting the effects of variation in genes that influence epilepsy.

et al., in American Journal of Human Genetics (2019)
Sequencing-based studies have identified novel risk genes associated with severe epilepsies and revealed an excess of rare deleterious variation in less-severe forms of epilepsy. To identify the shared and distinct ultra-rare genetic risk factors for different types of epilepsies, we performed a whole-exome sequencing (WES) analysis of 9,170 epilepsy-affected individuals and 8,436 controls of European ancestry. We focused on three phenotypic groups: severe developmental and epileptic encephalopathies (DEEs), genetic generalized epilepsy (GGE), and non-acquired focal epilepsy (NAFE). We observed that compared to controls, individuals with any type of epilepsy carried an excess of ultra-rare, deleterious variants in constrained genes and in genes previously associated with epilepsy; we saw the strongest enrichment in individuals with DEEs and the least strong in individuals with NAFE. Moreover, we found that inhibitory GABAA receptor genes were enriched for missense variants across all three classes of epilepsy, whereas no enrichment was seen in excitatory receptor genes. The larger gene groups for the GABAergic pathway or cation channels also showed a significant mutational burden in DEEs and GGE.
Although no single gene surpassed exome-wide significance among individuals with GGE or NAFE, highly constrained genes and genes encoding ion channels were among the lead associations; such genes included CACNA1G, EEF1A2, and GABRG2 for GGE and LGI1, TRIM3, and GABRG2 for NAFE. Our study, the largest epilepsy WES study to date, confirms a convergence in the genetics of severe and less-severe epilepsies associated with ultra-rare coding variation, and it highlights a ubiquitous role for GABAergic inhibition in epilepsy etiology.

May, Patrick. E-print/Working paper (2019)
Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene family members, and genetic variants within these regions are potentially more likely to confer risk to disease. Here, we generated 2,871 gene family protein sequence alignments involving 9,990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 65,034 missense variants from patients. With this gene family approach, we identified 398 regions enriched for patient variants spanning 33,887 amino acids in 1,058 genes. As a comparison, testing the same genes individually, we identified fewer patient-variant-enriched regions, involving only 2,167 amino acids and 180 genes. Next, we selected de novo variants from 6,753 patients with neurodevelopmental disorders and 1,911 unaffected siblings, and observed a 5.56-fold enrichment of patient variants in our identified regions (95% CI = 2.76-Inf, p-value = 6.66×10⁻⁸).
Using an independent ClinVar variant set, we found that missense variants inside the identified regions are 111-fold more likely to be classified as pathogenic than as benign (OR = 111.48, 95% CI = 68.09-195.58, p-value < 2.2×10⁻¹⁶). All patient-variant-enriched regions (PERs) identified are available online through a user-friendly platform for interactive data mining, visualization and download at http://per.broadinstitute.org. In summary, our gene family burden analysis approach identified novel patient-variant-enriched regions in protein sequences. This annotation can empower variant interpretation.

et al., in Biophysical Journal (2018, February 02), 114(3, Suppl. 1), 664
The functional interpretation of genetic variation in disease-associated genes is far outpaced by data generation. Existing algorithms for prediction of variant consequences do not adequately distinguish pathogenic variants from benign rare variants. This lack of adequate statistical and bioinformatic analyses, accompanied by an ever-increasing number of identified variants in biomedical research and clinical applications, has become a major challenge. Established methods to predict the functional effect of genetic variation use the degree of amino acid conservation across species in linear protein sequence alignments. More recent methods include the spatial distribution pattern of known patient and control variants.
Here, we propose to combine the linear conservation and spatial-constraint-based scores to devise a novel score that incorporates 3-dimensional structural properties of amino acid residues, such as the solvent-accessible surface area, degree of flexibility, secondary structure propensity and binding tendency, to quantify the effect of amino acid substitutions. For this study, we develop a framework for large-scale mapping of established linear sequence-based paralog and ortholog conservation scores onto the tertiary structures of human proteins. This framework can be utilized to map the spatial distribution of mutations on solved protein structures as well as homology models. As a proof of concept, using a homology model of the human Nav1.2 voltage-gated sodium channel structure, we observe spatial clustering, in distinct domains, of mutations associated with autism spectrum disorder (>20 variants) and epilepsy (>100 variants) that exert opposing effects on channel function. We are currently characterizing all variants (>300k individuals) found in ClinVar, the largest disease variant database, as well as variants identified in >140k individuals from the general population. The variant mapping framework and our score, informed by structural information, will be useful in identifying structural motifs of proteins associated with disease risk.