References of "Iqbal, Sumaiya"
Insights into protein structural, physicochemical, and functional consequences of missense variants in 1,330 disease-associated human genes
Iqbal, Sumaiya; Jespersen, Jakob B.; Perez-Palma, Eduardo et al

E-print/Working paper (2019)

Inference of the structural and functional consequences of amino acid-altering missense variants is challenging and not yet scalable. Clinical and research applications of the colossal number of identified missense variants are thus limited. Here we describe the aggregation and analysis of large-scale genomic variation and structural biology data for 1,330 disease-associated genes. Comparing the burden of 40 structural, physicochemical, and functional protein features of altered amino acids with 3-dimensional coordinates, we found 18 and 14 features that are associated with pathogenic and population missense variants, respectively. Separate analyses of variants from 24 protein functional classes revealed novel function-dependent vulnerable features. We then devised a quantitative spectrum, identifying variants with higher pathogenic variant-associated features. Finally, we developed a web resource (MISCAST; http://miscast.broadinstitute.org/) for interactive analysis of variants on linear and tertiary protein structures. The biological impact of missense variants available through the webtool will assist researchers in hypothesizing variant pathogenicity and disease trajectories.

Identification of pathogenic variant enriched regions across genes and gene families
Pérez-Palma, Eduardo; May, Patrick UL; Iqbal, Sumaiya et al

E-print/Working paper (2019)

Missense variant interpretation is challenging. Essential regions for protein function are conserved among gene family members, and genetic variants within these regions are potentially more likely to confer risk to disease. Here, we generated 2,871 gene family protein sequence alignments involving 9,990 genes and performed missense variant burden analyses to identify novel essential protein regions. We mapped 2,219,811 variants from the general population into these alignments and compared their distribution with 65,034 missense variants from patients. With this gene family approach, we identified 398 regions enriched for patient variants spanning 33,887 amino acids in 1,058 genes. As a comparison, testing the same genes individually, we identified fewer patient variant enriched regions, involving only 2,167 amino acids and 180 genes. Next, we selected de novo variants from 6,753 patients with neurodevelopmental disorders and 1,911 unaffected siblings, and observed a 5.56-fold enrichment of patient variants in our identified regions (95% C.I. = 2.76-Inf, p-value = 6.66×10^-8). Using an independent ClinVar variant set, we found that missense variants inside the identified regions are 111-fold more likely to be classified as pathogenic than as benign (OR = 111.48, 95% C.I. = 68.09-195.58, p-value < 2.2e-16). All patient variant enriched regions identified (PERs) are available online through a user-friendly platform for interactive data mining, visualization and download at http://per.broadinstitute.org. In summary, our gene family burden analysis approach identified novel patient variant enriched regions in protein sequences. This annotation can empower variant interpretation.
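The odds ratio and confidence interval reported in the abstract above can be illustrated with a minimal sketch. The abstract does not say which statistical test was used, so this example assumes a plain 2x2 contingency table and the Wald (log-normal) approximation for the interval. The function name `odds_ratio_ci` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval for a 2x2 table.

    Hypothetical cell layout (an assumption for illustration):
      a: pathogenic variants inside enriched regions
      b: benign variants inside enriched regions
      c: pathogenic variants outside enriched regions
      d: benign variants outside enriched regions
    z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Made-up counts, only to show the call shape.
estimate, lower, upper = odds_ratio_ci(10, 5, 5, 10)
print(f"OR = {estimate:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

With small or zero cell counts the Wald interval is unreliable, and an exact method such as Fisher's exact test would be the more cautious choice.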

Peer Reviewed
Variant Score Ranker - a web application for intuitive missense variant prioritization
Du, Juanjiangmeng; Sudarsanam, Monica; Pérez-Palma, Eduardo et al

in Bioinformatics (2019)

The correct classification of missense variants as benign or pathogenic remains challenging. Pathogenic variants are expected to have higher deleterious prediction scores than benign variants in the same gene. However, most of the existing variant annotation tools do not reference the score range of benign population variants at the gene level. Here, we present a web application, Variant Score Ranker, which enables users to rapidly annotate variants and perform gene-specific variant score ranking on the population level. We also provide an intuitive example of how gene- and population-calibrated variant ranking scores can improve epilepsy variant prioritization.
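The idea of ranking a variant's score against population variants in the same gene can be sketched as a percentile rank. This is only an illustration of the general concept and is not the Variant Score Ranker implementation. The function name `gene_percentile` and the example scores are hypothetical.

```python
from bisect import bisect_right


def gene_percentile(score, population_scores):
    """Fraction of population-variant scores in one gene that are <= score.

    `population_scores` is assumed to hold deleteriousness prediction
    scores for general-population variants in the same gene.
    """
    if not population_scores:
        raise ValueError("need at least one population score for the gene")
    ranked = sorted(population_scores)
    # bisect_right counts how many sorted values are <= score.
    return bisect_right(ranked, score) / len(ranked)


# Made-up scores: the candidate outranks 3 of 4 population variants.
print(gene_percentile(0.8, [0.1, 0.2, 0.5, 0.9]))
```

A value near 1.0 means the candidate scores higher than almost all population variants in that gene, which is the situation the abstract expects for pathogenic variants.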

Peer Reviewed
Functional Interpretation of Single Amino Acid Substitutions in 1,330 Disease-Associated Genes
Iqbal, Sumaiya; Jespersen, Jakob Berg; Perez-Palma, Eduardo et al

in Biophysical Journal (2019, February 15), 116(3), 420-421

Elucidating molecular consequences of amino-acid-altering missense variants at scale is challenging. In this work, we explored whether features derived from three-dimensional (3D) protein structures can characterize patient missense variants across different protein classes with similar molecular level activities. The identified disease-associated features can advance our understanding of how a single amino acid substitution can lead to the etiology of monogenic disorders. For 1,330 disease-associated genes (>80%, 1,077/1,330 implicated in Mendelian disorders), we collected missense variants from the general population (gnomAD database, N=164,915) and patients (ClinVar and HGMD databases, N=32,923). We in silico mapped the variant positions onto >14k human protein 3D structures. We annotated the protein positions of variants with 40 structural, physicochemical, and functional features. We then grouped the genes into 24 protein classes based on their molecular functions and performed statistical association analyses with the features of population and patient variants. We identified 18 (out of 40) features that are associated with patient variants in general. Specifically, patient variants are less exposed to solvent (p<1.0e-100), enriched on β-sheets (p<2.37e-39), frequently mutate aromatic residues (p<1.0e-100), occur in ligand binding sites (p<1.0e-100) and are spatially close to phosphorylation sites (p<1.0e-100). We also observed differential protein-class-specific features. For three protein classes (signaling molecules, proteases and hydrolases), patient variants significantly perturb the disulfide bonds (p<1.0e-100). Only in immunity proteins, patient variants are enriched in flexible coils (p<1.65e-06). Kinases and cell junction proteins exhibit enrichment of patient variants around SUMOylation (p<1.0e-100) and methylation sites (p<9.29e-11), respectively.
In summary, we studied shared and unique features associated with patient variants on protein structure across 24 protein classes, providing novel mechanistic insights. We generated an online resource that contains amino-acid-wise feature annotation tracks for 1,330 genes, summarizes the patient-variant-associated features at the residue level, and can guide variant interpretation.

Predicting Functional Effects of Missense Variants in Voltage-Gated Sodium and Calcium Channels
Heyne, Henrike O.; Baez-Nieto, David; Iqbal, Sumaiya et al

E-print/Working paper (2019)

Malfunctions of voltage-gated sodium and calcium channels (SCN and CACNA1 genes) have been associated with severe neurologic, psychiatric, cardiac and other diseases. Altered channel activity is frequently grouped into gain or loss of ion channel function (GOF or LOF, respectively), which corresponds not only to clinical disease manifestations but also to differences in drug response. Experimental studies of channel function are therefore important, but laborious and usually focus only on a few variants at a time. Based on known gene-disease mechanisms, we here infer LOF (518 variants) and GOF (309 variants) of likely pathogenic variants from disease phenotypes of variant carriers. We show regional clustering of inferred GOF and LOF variants, respectively, across the alignment of the entire gene family, suggesting shared pathomechanisms in the SCN/CACNA1 genes. By training a machine learning model on sequence- and structure-based features, we predict LOF- or GOF-associated disease phenotypes (ROC = 0.85) of likely pathogenic missense variants. We then successfully validate the GOF versus LOF prediction on 87 functionally tested variants in SCN1/2/8A and CACNA1I (ROC = 0.73) and in exome-wide data from >100,000 cases and controls. Ultimately, functional prediction of missense variants in clinically relevant genes will facilitate precision medicine in clinical practice.
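Assuming the "ROC" values in the abstract above denote the area under the ROC curve (the abstract does not spell this out), the metric can be sketched as the probability that a randomly chosen positive example receives a higher model score than a randomly chosen negative one. Treating GOF as the positive class is an arbitrary choice for this illustration, and the function name `roc_auc` and the scores are hypothetical.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up model scores: GOF variants as positives, LOF as negatives.
gof_scores = [0.9, 0.8, 0.3]
lof_scores = [0.5, 0.2]
print(round(roc_auc(gof_scores, lof_scores), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic loop is fine for small validation sets such as the 87 variants mentioned above, while a rank-based formulation would scale better.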

Peer Reviewed
Identification and Characterization of Variant Intolerant Sites across Human Protein 3-Dimensional Structures
Iqbal, Sumaiya; Berg Jespersen, Jakob; Perez-Palma, Eduardo et al

in Biophysical Journal (2018, February 02), 114(3, Suppl. 1), 664

The functional interpretation of genetic variation in disease-associated genes is far outpaced by data generation. Existing algorithms for prediction of variant consequences do not adequately distinguish pathogenic variants from benign rare variants. This lack of statistical and bioinformatics analyses, accompanied by an ever-increasing number of identified variants in biomedical research and clinical applications, has become a major challenge. Established methods to predict the functional effect of genetic variation use the degree of amino acid conservation across species in linear protein sequence alignment. More recent methods include the spatial distribution pattern of known patient and control variants. Here, we propose to combine the linear conservation and spatial constraint-based scores to devise a novel score that incorporates 3-dimensional structural properties of amino acid residues, such as the solvent-accessible surface area, degree of flexibility, secondary structure propensity and binding tendency, to quantify the effect of amino acid substitutions. For this study, we develop a framework for large-scale mapping of established linear sequence-based paralog and ortholog conservation scores onto the tertiary structures of human proteins. This framework can be utilized to map the spatial distribution of mutations on solved protein structures as well as homology models. As a proof of concept, using a homology model of the human Nav1.2 voltage-gated sodium channel structure, we observe spatial clustering in distinct domains of mutations, associated with Autism Spectrum Disorder (>20 variants) and Epilepsy (>100 variants), that exert opposing effects on channel function. We are currently characterizing all variants (>300k individuals) found in ClinVar, the largest disease variant database, as well as variants identified in >140k individuals from the general population.
The variant mapping framework and our score, informed with structural information, will be useful in identifying structural motifs of proteins associated with disease risk.
