Article (Scientific journals)
Identification of pathogenic variant enriched regions across genes and gene families
Perez-Palma, Eduardo; May, Patrick; Iqbal, Sumaiya et al.
2020. In Genome Research, 30 (1), p. 62-71
Peer Reviewed verified by ORBi
 

Abstract :
Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene family members, and genetic variants within these regions are potentially more likely to confer risk to disease. Here, we generated 2,871 gene family protein sequence alignments involving 9,990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 76,153 missense variants from patients. With this gene family approach, we identified 465 regions enriched for patient variants spanning 41,463 amino acids in 1,252 genes. By comparison, testing the same genes individually, we identified fewer patient variant enriched regions, involving only 2,639 amino acids and 215 genes. Next, we selected de novo variants from 6,753 patients with neurodevelopmental disorders and 1,911 unaffected siblings, and observed an 8.33-fold enrichment of patient variants in our identified regions (95% C.I. = 3.90-Inf, p-value = 2.72 × 10⁻¹¹). Using the complete ClinVar variant set, we found that missense variants inside the identified regions are 106-fold more likely to be classified as pathogenic than as benign (OR = 106.15, 95% C.I. = 70.66-Inf, p-value < 2.2 × 10⁻¹⁶). All pathogenic variant enriched regions (PERs) identified are available online through the “PER viewer”, a user-friendly online platform for interactive data mining, visualization and download. In summary, our gene family burden analysis approach identified novel pathogenic variant enriched regions in protein sequences. This annotation can empower variant interpretation.
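The odds ratios and one-sided confidence intervals quoted in the abstract are the kind of output an enrichment test on a 2x2 table produces. The following is a minimal sketch, assuming a one-sided Fisher's exact test on counts of pathogenic and benign variants inside and outside a region. The abstract does not name the test, the function names are hypothetical, and the counts are illustrative only, not data from the paper. The cross-product odds ratio used here may also differ from the estimator the authors used.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample (cross-product) odds ratio for the 2x2 table [[a, b], [c, d]]."""
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact p-value, P(X >= a), with all margins fixed.

    X follows a hypergeometric distribution under the null hypothesis
    of no association between rows and columns.
    """
    row1 = a + b          # e.g. all variants inside the region
    col1 = a + c          # e.g. all pathogenic variants
    n = a + b + c + d     # all variants
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical counts, for illustration only (not from the paper):
#                  pathogenic  benign
# inside region        40         2
# outside region       60        98
inside_path, inside_benign = 40, 2
outside_path, outside_benign = 60, 98

or_estimate = odds_ratio(inside_path, inside_benign, outside_path, outside_benign)
p_value = fisher_greater(inside_path, inside_benign, outside_path, outside_benign)
print(f"odds ratio = {or_estimate:.2f}, one-sided p-value = {p_value:.3g}")
```

An upper confidence bound of "Inf", as reported in the abstract, is what a one-sided test of enrichment yields, since only the lower bound is estimated.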
Research center :
- Luxembourg Centre for Systems Biomedicine (LCSB): Bioinformatics Core (R. Schneider Group)
Disciplines :
Genetics & genetic processes
Author, co-author :
Perez-Palma, Eduardo
May, Patrick; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Iqbal, Sumaiya
Niestroj, Lisa-Marie
Du, Juanjiangmeng
Heyne, Henrike O.
Castrillon, Jessica A.
O'Donnell-Luna, Anne
Nürnberg, Peter
Palotie, Aarno
Daly, Mark
Lal, Dennis
External co-authors :
yes
Language :
English
Title :
Identification of pathogenic variant enriched regions across genes and gene families
Publication date :
January 2020
Journal title :
Genome Research
ISSN :
1549-5469
Publisher :
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, United States - New York
Volume :
30
Issue :
1
Pages :
62-71
Peer reviewed :
Peer Reviewed verified by ORBi
Focus Area :
Systems Biomedicine
Available on ORBilu :
since 03 January 2020
