Scientific journals : Article
Life sciences : Genetics & genetic processes
Gene family information facilitates variant interpretation and identification of disease-associated genes in neurodevelopmental disorders
Lal, Dennis []
May, Patrick [University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)]
Perez-Palma, Eduardo []
Samocha, Kaitlin E. []
Kosmicki, Jack A. []
Robinson, Elise B. []
Møller, Rikke S. []
Krause, Roland [University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)]
Nürnberg, Peter []
Weckhuysen, Sarah []
De Jonghe, Peter []
Guerrini, Renzo []
Niestroj, Lisa M. []
Du, Juliana []
Marini, Carla []
EuroEPINOMICS-RES Consortium []
Ware, James S. []
Kurki, Mitja []
Gormley, Padhraig []
Tang, Sha []
Wu, Sitao []
Biskup, Saskia []
Poduri, Annapurna []
Neubauer, Bernd A. []
Koeleman, Bobby P.C. []
Helbig, Katherine L. []
Weber, Yvonne G. []
Helbig, Ingo []
Majithia, Amit R. []
Palotie, Aarno []
Daly, Mark J. []
Genome Medicine
BioMed Central
United Kingdom
[en] Gene families ; variant interpretation ; neurodevelopmental disease ; paralogs ; missense variants
[en] Background: Classifying the pathogenicity of missense variants is a major challenge in clinical practice during the diagnosis of rare and genetically heterogeneous neurodevelopmental disorders (NDDs). While orthologous gene conservation is commonly employed in variant annotation, approximately 80% of known disease-associated genes belong to gene families. The use of gene family information for disease gene discovery and variant interpretation has not yet been investigated on a genome-wide scale. We empirically evaluate whether paralog conserved or non-conserved sites in human gene families are important in NDDs.
Methods: Gene family information was collected from Ensembl. Paralog conserved sites were defined based on paralog sequence alignments. In total, 10,068 NDD patients and 2,078 controls were statistically evaluated for de novo variant burden in gene families.
Results: We demonstrate that disease-associated missense variants are enriched at paralog conserved sites across all disease groups and inheritance models tested. We developed a gene family de novo enrichment framework that identified 43 exome-wide enriched gene families, including 98 de novo variant-carrying genes in NDD patients. Of these genes, 28 represent novel candidate genes for NDD that are brain-expressed and under evolutionary constraint.
Conclusion: This study represents the first method to incorporate gene family information into a statistical framework to interpret variant data for NDDs and to discover new NDD-associated genes.
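The abstract states only that paralog conserved sites were defined from paralog sequence alignments; it does not give the scoring used. As a minimal illustrative sketch, not the authors' method, the Python below marks an alignment column as conserved when a sufficient fraction of paralogs share the same residue. The function name `paralog_conserved_columns`, the `min_identity` threshold, and the toy sequences are all assumptions made up for illustration.

```python
from collections import Counter


def paralog_conserved_columns(alignment, min_identity=1.0):
    """Return 0-based indices of alignment columns conserved across paralogs.

    alignment    : list of equal-length aligned protein sequences ('-' is a gap).
    min_identity : hypothetical threshold; the fraction of sequences that must
                   share the most common residue for a column to count.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    conserved = []
    for col in range(length):
        residues = [seq[col] for seq in alignment]
        residue, count = Counter(residues).most_common(1)[0]
        # A column dominated by gaps is never treated as conserved.
        if residue != "-" and count / len(residues) >= min_identity:
            conserved.append(col)
    return conserved


# Toy alignment of three made-up sequences (not real paralogs).
toy_alignment = ["MKTAL-VG", "MKSALQVG", "MKTALQIG"]
conserved = paralog_conserved_columns(toy_alignment)
```

With the default threshold only fully identical columns are returned; lowering `min_identity` relaxes the definition. Missense variant positions could then be tallied at conserved versus non-conserved columns before any enrichment test.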
Luxembourg Centre for Systems Biomedicine (LCSB): Bioinformatics Core (R. Schneider Group) ; University of Luxembourg: High Performance Computing - ULHPC

File(s) associated to this reference

Fulltext file(s):

Limited access
Lal2020.GenomeMedicine.pdf ; Publisher postprint ; 1.61 MB

