; May, Patrick in Genome Research (2020), 30(1), 62-71
Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene family members, and genetic variants within these regions are potentially more likely to confer risk to disease. Here, we generated 2,871 gene family protein sequence alignments involving 9,990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 76,153 missense variants from patients. With this gene family approach, we identified 465 regions enriched for patient variants spanning 41,463 amino acids in 1,252 genes. As a comparison, testing the same genes individually, we identified fewer patient-variant-enriched regions, involving only 2,639 amino acids and 215 genes. Next, we selected de novo variants from 6,753 patients with neurodevelopmental disorders and 1,911 unaffected siblings, and observed an 8.33-fold enrichment of patient variants in our identified regions (95% C.I. = 3.90-Inf, p-value = 2.72×10⁻¹¹). Using the complete ClinVar variant set, we found that missense variants inside the identified regions are 106-fold more likely to be classified as pathogenic than as benign (OR = 106.15, 95% C.I. = 70.66-Inf, p-value < 2.2×10⁻¹⁶). All pathogenic variant enriched regions (PERs) identified are available online through the “PER viewer”, a user-friendly online platform for interactive data mining, visualization and download. In summary, our gene family burden analysis approach identified novel pathogenic variant enriched regions in protein sequences.
This annotation can empower variant interpretation.

; May, Patrick in European Journal of Human Genetics (2019)
It is challenging to estimate genetic variant burden across different subtypes of epilepsy. Herein, we used a comparative approach to assess the diagnostic yield and genotype-phenotype correlations in the four most common brain lesions in patients with drug-resistant focal epilepsy. Targeted sequencing analysis was performed for a panel of 161 genes with a mean coverage of >400x. Lesional tissue was histopathologically reviewed and dissected from hippocampal sclerosis (n=15), ganglioglioma (n=16), dysembryoplastic neuroepithelial tumors (n=8) and focal cortical dysplasia type II (n=15). Peripheral blood (n=12) or surgical tissue samples histopathologically classified as lesion-free (n=42) were available for comparison. Variants were classified as pathogenic or likely pathogenic according to American College of Medical Genetics and Genomics guidelines. Overall, we identified pathogenic and likely pathogenic variants in 25.9% of patients with a mean coverage of 383x. The highest number of pathogenic/likely pathogenic variants was observed in patients with ganglioglioma (43.75%; all somatic) and dysembryoplastic neuroepithelial tumors (37.5%; all somatic), and in 20% of cases with focal cortical dysplasia type II (13.33% somatic, 6.67% germline). Pathogenic/likely pathogenic positive genes were disorder-specific, and BRAF V600E was the only recurrent pathogenic variant. This study represents a reference for diagnostic yield across the four most common lesion entities in patients with drug-resistant focal epilepsy.
The observed large variability in variant burden by epileptic lesion type calls for whole exome sequencing of histopathologically well-characterized tissue, both in a diagnostic setting and in research to discover novel disease-associated genes.

; May, Patrick E-print/Working paper (2019)
Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene family members, and genetic variants within these regions are potentially more likely to confer risk to disease. Here, we generated 2,871 gene family protein sequence alignments involving 9,990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 65,034 missense variants from patients. With this gene family approach, we identified 398 regions enriched for patient variants spanning 33,887 amino acids in 1,058 genes. As a comparison, testing the same genes individually, we identified fewer patient-variant-enriched regions, involving only 2,167 amino acids and 180 genes. Next, we selected de novo variants from 6,753 patients with neurodevelopmental disorders and 1,911 unaffected siblings, and observed a 5.56-fold enrichment of patient variants in our identified regions (95% C.I. = 2.76-Inf, p-value = 6.66×10⁻⁸). Using an independent ClinVar variant set, we found that missense variants inside the identified regions are 111-fold more likely to be classified as pathogenic than as benign (OR = 111.48, 95% C.I. = 68.09-195.58, p-value < 2.2×10⁻¹⁶).
All patient variant enriched regions (PERs) identified are available online through a user-friendly platform for interactive data mining, visualization and download at http://per.broadinstitute.org. In summary, our gene family burden analysis approach identified novel patient variant enriched regions in protein sequences. This annotation can empower variant interpretation.

; ; et al in Bioinformatics (2019)
The correct classification of missense variants as benign or pathogenic remains challenging. Pathogenic variants are expected to have higher deleterious prediction scores than benign variants in the same gene. However, most existing variant annotation tools do not reference the score range of benign population variants at the gene level. Here, we present a web application, Variant Score Ranker, which enables users to rapidly annotate variants and perform gene-specific variant score ranking on the population level. We also provide an intuitive example of how gene- and population-calibrated variant ranking scores can improve epilepsy variant prioritization.

; ; et al in Epilepsia (2018)
Objective: Increasing availability of surgically resected brain tissue from patients with focal epilepsy and Focal Cortical Dysplasia (FCD) or low-grade glio-neuronal tumors has fostered large-scale genetic examination. However, assessment of pathogenicity of germline and somatic variants remains difficult.
Here, we present a state-of-the-art evaluation of reported genes and variants associated with epileptic brain lesions. Methods: We critically re-evaluated the pathogenicity of all neuropathology-associated variants reported to date in the PubMed and ClinVar databases, including 101 neuropathology-associated missense variants encompassing 11 disease-related genes. We assessed gene variant tolerance and classified all identified missense variants according to guidelines from the American College of Medical Genetics and Genomics (ACMG). We further extended the bioinformatic variant prediction by introducing a novel gene-specific deleteriousness ranking for prediction scores. Results: Application of ACMG guidelines and in silico gene variant tolerance analysis classified only seven out of 11 genes as likely disease-associated according to the reported disease mechanism, while 61 (60.4%) of 101 variants of those genes were classified as of uncertain significance (VUS), 37 (36.6%) as likely pathogenic (LP) and 3 (3%) as pathogenic (P). Significance: We concluded that the majority of neuropathology-associated variants reported to date do not have enough evidence to be classified as pathogenic. Interpretation of lesion-associated variants remains challenging, and application of current ACMG guidelines is recommended for interpretation and prediction.