Synonymous single nucleotide variants (sSNVs), traditionally seen as neutral, are now recognized for their biological impact. To assess their relevance, we developed SyMetrics, a framework that integrates predictors of splicing, RNA stability, evolutionary conservation, codon usage, synonymous variation effects, sequence properties, and allele frequency. We analyzed all possible sSNVs across the human genome, and our machine-learning model achieved 97% accuracy in distinguishing deleterious from benign variants, with a ROC–AUC of 0.89, outperforming individual predictors. Our estimates indicate that about 1.98 ± 0.17% of sSNVs absent from population databases are damaging (roughly 900 000 sSNVs), with an odds ratio of 3.87 for deleteriousness compared to common sSNVs (P < 0.05). To validate predictions, we performed functional assays on selected sSNVs in the AVPR2 gene and additionally used available large-scale mutagenesis screens of RAD51C and BAP1 variants. In a clinical cohort, we identified 15 predicted deleterious sSNVs in genes linked to patient phenotypes; 9 were classified as (likely) pathogenic, while 6 were variants of uncertain significance (VUS) per American College of Medical Genetics guidelines. For three VUS, segregation data supported their suspected inheritance patterns (de novo, X-linked). Our findings underscore the functional importance of sSNVs. To support further research and clinical applications, we provide a Python package and web application (https://symetrics.org/) for evaluating these variants comprehensively.
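The two population-level figures in the abstract can be unpacked with a short calculation. The sketch below is illustrative only and is not taken from the SyMetrics package: the function names are hypothetical, the 2x2 counts passed to `odds_ratio` are invented and do not reproduce the reported value of 3.87, and the implied denominator assumes that the figure of roughly 900 000 is simply 1.98% of all sSNVs absent from population databases.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table (hypothetical helper).

    a: database-absent sSNVs predicted deleterious
    b: database-absent sSNVs predicted benign
    c: common sSNVs predicted deleterious
    d: common sSNVs predicted benign
    """
    return (a * d) / (b * c)


def implied_total(count: float, fraction: float) -> float:
    """Size of the full set, given a subset count and its fraction."""
    return count / fraction


# Figures quoted in the abstract.
damaging_fraction = 0.0198   # 1.98%
uncertainty = 0.0017         # +/- 0.17%
damaging_count = 900_000     # "roughly 900 000 sSNVs"

# Assumption: the count is the fraction applied to all database-absent sSNVs.
total_absent = implied_total(damaging_count, damaging_fraction)

# Range of damaging sSNVs spanned by the stated uncertainty.
low = total_absent * (damaging_fraction - uncertainty)
high = total_absent * (damaging_fraction + uncertainty)

print(f"implied database-absent sSNVs: {total_absent:,.0f}")
print(f"damaging sSNVs, lower bound:   {low:,.0f}")
print(f"damaging sSNVs, upper bound:   {high:,.0f}")

# Invented counts, for illustration of the calculation only.
example_or = odds_ratio(a=20, b=80, c=10, d=90)
print(f"example odds ratio: {example_or:.2f}")
```

Under these assumptions the abstract's numbers imply on the order of 45 million database-absent sSNVs. An odds ratio above 1 means that predicted deleterious variants are over-represented among database-absent sSNVs relative to common ones.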
Research center :
Luxembourg Centre for Systems Biomedicine (LCSB): Bioinformatics Core (R. Schneider Group)
Disciplines :
Genetics & genetic processes
Author, co-author :
Bundalian, Linnaeus; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig, Saxony 04103; Institute for Clinical Genetics, Technische Universität Dresden, and National Cancer Center (NCT) Dresden, Dresden, Saxony 01307
Strnadová, Martina Schmidt; Rudolf Schönheimer Institute of Biochemistry, Medical Faculty, University of Leipzig, Leipzig, Saxony 04103
Garten, Felix; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig, Saxony 04103
Horn, Susanne; Rudolf Schönheimer Institute of Biochemistry, Medical Faculty, University of Leipzig, Leipzig, Saxony 04103
Stenzel, Udo; Rudolf Schönheimer Institute of Biochemistry, Medical Faculty, University of Leipzig, Leipzig, Saxony 04103
Popp, Denny; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig, Saxony 04103
Lemke, Johannes R; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig, Saxony 04103
Garten, Antje; Hospital for Children and Adolescents and Center for Pediatric Research (CPL), University of Leipzig, Leipzig, Saxony 04103
Thor, Doreen; Rudolf Schönheimer Institute of Biochemistry, Medical Faculty, University of Leipzig, Leipzig, Saxony 04103
Schulz, Angela; Rudolf Schönheimer Institute of Biochemistry, Medical Faculty, University of Leipzig, Leipzig, Saxony 04103
Hentschel, Julia; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig, Saxony 04103
Kelso, Janet; Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Saxony 04103
Schöneberg, Torsten; Rudolf Schönheimer Institute of Biochemistry, Medical Faculty, University of Leipzig, Leipzig, Saxony 04103; School of Medicine, University of Global Health Equity, Kigali 6955
Le Duc, Diana; Institute of Human Genetics, University of Leipzig Medical Center, Leipzig, Saxony 04103; Institute for Clinical Genetics, Technische Universität Dresden, and National Cancer Center (NCT) Dresden, Dresden, Saxony 01307; Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Saxony 04103; Center for Diagnostics GmbH, Department of Genetics, Chemnitz Clinics, Chemnitz, Saxony 09116
Funders :
Else Kröner-Fresenius-Stiftung; German Research Foundation; Deutsche Forschungsgemeinschaft
Funding text :
This study is funded by the Else Kröner-Fresenius-Stiftung (2020_EKEA.42) to D.L.D., the German Research Foundation (SFB 1052, project B10) to D.L.D. and A.G., and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through CRC 1423/2 (project number 421152132) to T.S. and D.T.