Neurodevelopmental disorders (NDDs), including severe pediatric epilepsy, autism, and intellectual disabilities, are heterogeneous conditions in which clinical genetic testing can often identify a pathogenic variant. For many of them, genetic therapies will enter clinical trials this year or in the coming years. In contrast to first-generation symptomatic treatments, the new disease-modifying precision medicines require a diagnosis informed by genetic testing before a patient can be enrolled in a clinical trial. However, even in 2022, most identified genetic variants in NDD genes are ‘Variants of Uncertain Significance’. To safely enroll patients in precision medicine clinical trials, it is important to increase our knowledge about which regions in NDD-associated proteins can ‘tolerate’ missense variants and which ones are ‘essential’ and will cause an NDD when mutated. In addition, knowledge about functionally indispensable regions in the three-dimensional (3D) structural context of proteins can provide insights into the molecular mechanisms of disease variants. We developed a novel consensus approach that overlays evolutionary and population-based genomic scores to identify 3D essential sites (Essential3D) on protein structures. After extensive benchmarking of AlphaFold-predicted and experimentally solved protein structures, we generated the currently largest expert-curated protein structure set for 242 NDDs and identified 14,377 Essential3D sites across 189 gene-disorder-associated proteins. We demonstrate that the consensus annotation of Essential3D sites improves prioritization of disease mutations over single annotations. The identified Essential3D sites were enriched for functional features such as intermembrane regions or active sites and revealed key inter-molecule interactions in protein complexes that were otherwise not annotated. Using the currently largest autism, developmental disorder, and epilepsy exome sequencing studies, including >360,000 NDD patients and population controls, we found that missense variants at Essential3D sites are 8-fold enriched in patients. In summary, we developed a comprehensive protein structure set for 242 neurodevelopmental disorders and identified 14,377 Essential3D sites in these. All data are available at https://es-ndd.broadinstitute.org for interactive visual inspection to enhance variant interpretation and development of mechanistic hypotheses for 242 NDD genes. The provided resources will enhance clinical variant interpretation and in silico drug target development for NDD-associated genes and encoded proteins.
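The abstract reports an 8-fold enrichment but does not state how that figure was computed. One common way to express such an enrichment is an odds ratio over a 2x2 table of variants (patients versus controls, at flagged sites versus elsewhere). The sketch below illustrates that calculation only; the function name `fold_enrichment` and all counts are hypothetical and are not taken from the study.

```python
def fold_enrichment(case_hits, case_total, control_hits, control_total):
    """Odds ratio for a variant falling at a flagged site, cases vs. controls.

    case_hits / control_hits: variants located at flagged sites.
    case_total / control_total: all variants considered in each group.
    """
    case_other = case_total - case_hits
    control_other = control_total - control_hits
    # The odds ratio is undefined when a denominator cell is zero.
    if case_other == 0 or control_hits == 0:
        raise ValueError("odds ratio undefined for a zero cell")
    return (case_hits * control_other) / (case_other * control_hits)


# Hypothetical counts for illustration only (NOT data from the study).
odds = fold_enrichment(case_hits=80, case_total=1000,
                       control_hits=10, control_total=1000)
print(f"fold enrichment (odds ratio): {odds:.2f}")
```

An odds ratio is used here because it is symmetric in how the two groups are sampled; a real analysis would also report a confidence interval or a significance test.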
Research center :
- Luxembourg Centre for Systems Biomedicine (LCSB): Bioinformatics Core (R. Schneider Group)
Disciplines :
Neurology
Genetics & genetic processes
Author, co-author :
Iqbal, Sumaiya
Brünger, Tobias
Pérez-Palma, Eduardo
Macnee, Marie
Brunklaus, Andreas
Daly, Mark J.
Campbell, Arthur J.
Hoksza, David
MAY, Patrick ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB) > Bioinformatics Core
Lal, Dennis
External co-authors :
yes
Language :
English
Title :
Delineation of functionally essential protein regions for 242 neurodevelopmental genes