Complex diseases often share genetic susceptibility factors, molecular pathways, and pathological mechanisms. Understanding these commonalities through systematic cross-disease comparisons can reveal both disease-specific and shared biomarkers, potentially suggesting new therapeutic targets and opportunities for drug repurposing.
In recent years, the growth of multi-omics datasets across diverse diseases, coupled with advances in computational systems biology, has enabled sophisticated cross-disease analyses. New methodological frameworks have emerged for integrating and comparing disease-specific molecular signatures, from gene-level analyses to complex network-based approaches.
Here, we present a comprehensive framework for computational cross-disease comparison and integration of omics data, systematically covering established and emerging methodologies. These include gene-level comparative analyses, pathway-based approaches, network biology methods, matrix factorization techniques, and machine learning approaches. We examine important aspects of data preprocessing, normalization, and integration, suggesting practical solutions to common technical challenges. We provide a detailed overview of relevant software tools and databases, discussing their strengths, limitations, and optimal use cases for cross-disease analysis. Finally, we explore current trends in cross-disease omics analysis, particularly through deep learning methods, highlighting new opportunities for methodological innovation and biological discovery in this field.
This compilation of computational methods and practical insights aims to serve as a resource both for bioinformaticians seeking guidance on method selection and for biomedical researchers interested in applied cross-disease analyses. In addition to highlighting practical recommendations and common pitfalls, it provides an entry point to the extensive literature in the field, helping readers identify and explore methods suited to their research needs.
Research center :
Luxembourg Centre for Systems Biomedicine (LCSB): Biomedical Data Science (Glaab Group)
Disciplines :
Human health sciences: Multidisciplinary, general & others
Life sciences: Multidisciplinary, general & others
Author, co-author :
Svinin, G.
Loo, R.T.J.
Soudy, M.
Nasta, F.
Le Bars, S.
GLAAB, Enrico ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB) > Biomedical Data Science
External co-authors :
no
Language :
English
Title :
Computational systems biology methods for cross-disease comparison of omics data
Y. Li, S. Jin, L. Lei, Z. Pan, and X. Zou, “Deciphering Deterioration Mechanisms of Complex Diseases Based on the Construction of Dynamic Networks and Systems Analysis,” Scientific Reports 5, no. 1 (2015): 9283.
H. Fang and L. Jiang, “Genetic Prioritization, Therapeutic Repositioning and Cross-Disease Comparisons Reveal Inflammatory Targets Tractable for Kidney Stone Disease,” Frontiers in Immunology 12 (2021): 687291.
A. Arakelyan, L. Nersisyan, D. Poghosyan, et al., “Autoimmunity and Autoinflammation: A Systems View on Signaling Pathway Dysregulation Profiles,” PLoS One 12, no. 11 (2017): e0187572.
J. Caroli, M. Dori, and S. Bicciato, “Computational Methods for the Integrative Analysis of Genomics and Pharmacological Data,” Frontiers in Oncology 10 (2020): 185.
C. C. Serdar, M. Cihan, D. Yücel, and M. A. Serdar, “Sample Size, Power and Effect Size Revisited: Simplified and Practical Approaches in Pre-Clinical, Clinical and Laboratory Studies,” Biochemia Medica(Zagreb) 31, no. 1 (2021): 10502.
Y. Nan, J. D. Ser, S. Walsh, et al., “Data Harmonisation for Information Fusion in Digital Healthcare: A State-Of-The-Art Systematic Review, Meta-Analysis and Future Research Directions,” Information Fusion 82 (2022): 99–122.
Y. Yu, Y. Mai, Y. Zheng, and L. Shi, “Assessing and Mitigating Batch Effects in Large-Scale Omics Studies,” Genome Biology 25, no. 1 (2024): 254.
E. Clough and T. Barrett, “The Gene Expression Omnibus Database,” Methods in Molecular Biology 1418 (2016): 93–110.
K. Tomczak, P. Czerwińska, and M. Wiznerowicz, “The Cancer Genome Atlas (TCGA): An Immeasurable Source of Knowledge,” Contemporary Oncology/Współczesna Onkologia 19, no. 1A (2015): A68–A77.
J. Zhang, R. Bajari, D. Andric, et al., “The international cancer genome consortium data portal,” Nature Biotechnology 37, no. 4 (2019): 367–369.
M. W. Weiner, P. S. Aisen, C. R. Jack, Jr., et al., “The Alzheimer's Disease Neuroimaging Initiative: Progress Report and Future Plans,” Alzheimer's & Dementia 6, no. 3 (2010): 202–211.
M. Orth, European Huntington's Disease Network, O. J. Handley, et al., “Observing Huntington's Disease: The European Huntington's Disease Network's REGISTRY,” Journal of Neurology, Neurosurgery, and Psychiatry 82, no. 12 (2011): 1409–1412.
K. Marek, S. Chowdhury, A. Siderowf, et al., “The Parkinson's Progression Markers Initiative (PPMI) - Establishing a PD Biomarker Cohort,” Annals of Clinical Translational Neurology 5, no. 12 (2018): 1460–1477.
T. S. P. Heng, M. W. Painter, and Immunological Genome Project Consortium, “The Immunological Genome Project: Networks of Gene Expression in Immune Cells,” Nature Immunology 9, no. 10 (2008): 1091–1094.
L. Martens, H. Hermjakob, P. Jones, et al., “PRIDE: The Proteomics Identifications Database,” Proteomics 5, no. 13 (2005): 3537–3545.
E. W. Deutsch, N. Bandeira, Y. Perez-Riverol, et al., “The ProteomeXchange Consortium at 10 Years: 2023 Update,” Nucleic Acids Research 51, no. D1 (2023): D1539–D1548.
O. Yurekten, T. Payne, N. Tejera, et al., “MetaboLights: Open Data Repository for Metabolomics,” Nucleic Acids Research 52, no. D1 (2024): D640–D646.
M. Sud, E. Fahy, D. Cotter, et al., “Metabolomics Workbench: An International Repository for Metabolomics Data and Metadata, Metabolite Standards, Protocols, Tutorials and Training, and Analysis Tools,” Nucleic Acids Research 44, no. D1 (2016): D463–D470.
J. C. Keen and H. M. Moore, “The Genotype-Tissue Expression (GTEx) Project: Linking Clinical Data With Molecular Analysis to Advance Personalized Medicine,” Journal of Personalized Medicine 5, no. 1 (2015): 22–29.
ENCODE Project Consortium, “The ENCODE (ENCyclopedia of DNA Elements) Project,” Science 306, no. 5696 (2004): 636–640.
P. Moreno, N. Huang, J. R. Manning, et al., “User-Friendly, Scalable Tools and Workflows for Single-Cell RNA-Seq Analysis,” Nature Methods 18, no. 4 (2021): 327–328.
M. Kanehisa and S. Goto, “KEGG: Kyoto Encyclopedia of Genes and Genomes,” Nucleic Acids Research 28, no. 1 (2000): 27–30.
M. Milacic, D. Beavers, P. Conley, et al., “The Reactome Pathway Knowledgebase 2024,” Nucleic Acids Research 52, no. D1 (2024): D672–D678.
A. Agrawal, H. Balcı, K. Hanspers, et al., “WikiPathways 2024: Next Generation Pathway Database,” Nucleic Acids Research 52, no. D1 (2024): D679–D689.
P. D. Karp, R. Billington, R. Caspi, et al., “The BioCyc Collection of Microbial Genomes and Metabolic Pathways,” Briefings in Bioinformatics 20, no. 4 (2019): 1085–1093.
R. Caspi, R. Billington, I. M. Keseler, et al., “The MetaCyc Database of Metabolic Pathways and Enzymes - A 2019 Update,” Nucleic Acids Research 48, no. D1 (2020): D445–D453.
K. Kandasamy, S. S. Mohan, R. Raju, et al., “NetPath: A Public Resource of Curated Signal Transduction Pathways,” Genome Biology 11, no. 1 (2010): R3.
M. Ashburner, C. A. Ball, J. A. Blake, et al., “Gene Ontology: Tool for the Unification of Biology,” Nature Genetics 25, no. 1 (2000): 25–29.
I. Rodchenkov, O. Babur, A. Luna, et al., “Pathway Commons 2019 Update: Integration, Analysis and Exploration of Pathway Data,” Nucleic Acids Research 48, no. D1 (2020): D489–D497.
A. Liberzon, C. Birger, H. Thorvaldsdóttir, M. Ghandi, J. P. Mesirov, and P. Tamayo, “The Molecular Signatures Database (MSigDB) Hallmark Gene Set Collection,” Cell Systems 1, no. 6 (2015): 417–425.
D. Szklarczyk, R. Kirsch, M. Koutrouli, et al., “The STRING Database in 2023: Protein-Protein Association Networks and Functional Enrichment Analyses for any Sequenced Genome of Interest,” Nucleic Acids Research 51, no. D1 (2023): D638–D646.
D. Szklarczyk, K. Nastou, M. Koutrouli, et al., “The STRING Database in 2025: Protein Networks With Directionality of Regulation,” Nucleic Acids Research 53, no. D1 (2025): D730–D737.
R. Oughtred, J. Rust, C. Chang, et al., “The BioGRID Database: A Comprehensive Biomedical Resource of Curated Protein, Genetic, and Chemical Interactions,” Protein Science 30, no. 1 (2021): 187–200.
L. Salwinski, C. S. Miller, A. J. Smith, F. K. Pettit, J. U. Bowie, and D. Eisenberg, “The Database of Interacting Proteins: 2004 Update,” Nucleic Acids Research 32, no. Database issue (2004): D449–D451.
L. Licata, L. Briganti, D. Peluso, et al., “MINT, the Molecular Interaction Database: 2012 Update,” Nucleic Acids Research 40, no. D1 (2012): D857–D861.
S. Orchard, M. Ammari, B. Aranda, et al., “The MIntAct Project: IntAct as a Common Curation Platform for 11 Molecular Interaction Databases,” Nucleic Acids Research 42, no. D1 (2014): D358–D363.
D. S. Wishart, Y. D. Feunang, A. C. Guo, et al., “DrugBank 5.0: A Major Update to the DrugBank Database for 2018,” Nucleic Acids Research 46, no. D1 (2018): D1074–D1082.
D. Mendez, A. Gaulton, A. P. Bento, et al., “ChEMBL: Towards Direct Deposition of Bioassay Data,” Nucleic Acids Research 47, no. D1 (2019): D930–D940.
D. Szklarczyk, A. Santos, C. von Mering, L. J. Jensen, P. Bork, and M. Kuhn, “STITCH 5: Augmenting Protein-Chemical Interaction Networks With Tissue and Affinity Data,” Nucleic Acids Research 44, no. D1 (2016): D380–D384.
V. Matys, O. V. Kel-Margoulis, E. Fricke, et al., “TRANSFAC and Its Module TRANSCompel: Transcriptional Gene Regulation in Eukaryotes,” Nucleic Acids Research 34, no. Database issue (2006): D108–D110.
I. Rauluseviciute, R. Riudavets-Puig, R. Blanc-Mathieu, et al., “JASPAR 2024: 20th Anniversary of the Open-Access Database of Transcription Factor Binding Profiles,” Nucleic Acids Research 52, no. D1 (2024): D174–D182.
Z.-P. Liu, C. Wu, H. Miao, and H. Wu, “RegNetwork: An Integrated Database of Transcriptional and Post-Transcriptional Regulatory Networks in Human and Mouse,” Database: The Journal of Biological Databases and Curation 30, no. 2015 (2015): bav095.
D. Warde-Farley, S. L. Donaldson, O. Comes, et al., “The GeneMANIA Prediction Server: Biological Network Integration for Gene Prioritization and Predicting Gene Function,” Nucleic Acids Research 38, no. suppl_2 (2010): W214–W220.
G. Alanis-Lobato, J. S. Möllmann, M. H. Schaefer, and M. A. Andrade-Navarro, “MIPPIE: The Mouse Integrated Protein-Protein Interaction Reference,” Database: The Journal of Biological Databases and Curation 2020 (2020): baaa035, https://doi.org/10.1093/database/baaa035.
K. Breuer, A. K. Foroushani, M. R. Laird, et al., “InnateDB: Systems Biology of Innate Immunity and Beyond: Recent Updates and Continuing Curation,” Nucleic Acids Research 41, no. D1 (2013): D1228–D1233.
L. M. Schriml and E. Mitraka, “The Disease Ontology: Fostering Interoperability Between Biological and Clinical Human Disease-Related Data,” Mammalian Genome 26, no. 9–10 (2015): 584–589.
C. E. Lipscomb, “Medical Subject Headings (MeSH),” Bulletin of the Medical Library Association 88, no. 3 (2000): 265–266.
N. A. Vasilevsky, N. A. Matentzoglu, S. Toro, et al., “Mondo: Unifying Diseases for the World, by the World,” Preprint at medRxiv (2022). 2022.04.13.22273750.
M. A. Gargano, N. Matentzoglu, B. Coleman, et al., “The Human Phenotype Ontology in 2024: Phenotypes Around the World,” Nucleic Acids Research 52, no. D1 (2024): D1333–D1346.
R. Hoehndorf, P. N. Schofield, and G. V. Gkoutos, “PhenomeNET: A Whole-Phenome Approach to Disease Gene Discovery,” Nucleic Acids Research 39, no. 18 (2011): e119.
M. Whirl-Carrillo, E. M. McDonagh, J. M. Hebert, et al., “Pharmacogenomics Knowledge for Personalized Medicine,” Clinical Pharmacology and Therapeutics 92, no. 4 (2012): 414–417.
A. Buniello, D. Suveges, C. Cruz-Castillo, et al., “Open Targets Platform: Facilitating Therapeutic Hypotheses Building in Drug Discovery,” Nucleic Acids Research 53, no. D1 (2025): D1467–D1475.
S. L. Freshour, S. Kiwala, K. C. Cotto, et al., “Integration of the Drug-Gene Interaction Database (DGIdb 4.0) With Open Crowdsource Efforts,” Nucleic Acids Research 49, no. D1 (2021): D1144–D1151.
M. J. Landrum, J. M. Lee, M. Benson, et al., “ClinVar: Public Archive of Interpretations of Clinically Relevant Variants,” Nucleic Acids Research 44, no. D1 (2016): D862–D868.
J. G. Tate, S. Bamford, H. C. Jubb, et al., “COSMIC: The Catalogue of Somatic Mutations in Cancer,” Nucleic Acids Research 47, no. D1 (2019): D941–D947.
S. T. Sherry, M. H. Ward, M. Kholodov, et al., “dbSNP: The NCBI Database of Genetic Variation,” Nucleic Acids Research 29, no. 1 (2001): 308–311.
J. Piñero, J. M. Ramírez-Anguita, J. Saüch-Pitarch, et al., “The DisGeNET Knowledge Platform for Disease Genomics: 2019 Update,” Nucleic Acids Research 48, no. D1 (2020): D845–D855.
J. S. Amberger and A. Hamosh, “Searching Online Mendelian Inheritance in Man (OMIM): A Knowledgebase of Human Genes and Genetic Phenotypes,” Current Protocols in Bioinformatics 58, no. 1 (2017): 1.2.1–1.2.12.
A. Santos, K. Tsafou, C. Stolte, S. Pletscher-Frankild, S. I. O'Donoghue, and L. J. Jensen, “Comprehensive Comparison of Large-Scale Tissue Expression Datasets,” PeerJ 30, no. 3 (2015): e1054.
C. Hu, T. Li, Y. Xu, et al., “CellMarker 2.0: An Updated Database of Manually Curated Cell Markers in Human/Mouse and Web Tools Based on scRNA-Seq Data,” Nucleic Acids Research 51, no. D1 (2023): D870–D876.
C. Hutter and J. C. Zenklusen, “The Cancer Genome Atlas: Creating Lasting Value Beyond Its Data,” Cell 173, no. 2 (2018): 283–285.
H. Hermjakob and R. Apweiler, “The Proteomics Identifications Database (PRIDE) and the ProteomExchange Consortium: Making Proteomics Data Accessible,” Expert Review of Proteomics 3, no. 1 (2006): 1–3.
GTEx Consortium, “Human Genomics. The Genotype-Tissue Expression (GTEx) Pilot Analysis: Multitissue Gene Regulation in Humans,” Science 348, no. 6235 (2015): 648–660.
ENCODE Project Consortium, “An Integrated Encyclopedia of DNA Elements in the Human Genome,” Nature 489, no. 7414 (2012): 57–74.
P. Moreno, S. Fexova, N. George, et al., “Expression Atlas Update: Gene and Protein Expression in Multiple Species,” Nucleic Acids Research 50, no. D1 (2022): D129–D140.
N. Del Toro, A. Shrivastava, E. Ragueneau, et al., “The IntAct Database: Efficient Access to Fine-Grained Molecular Interaction Data,” Nucleic Acids Research 50, no. D1 (2022): D648–D653.
O. Palasca, A. Santos, C. Stolte, J. Gorodkin, and L. J. Jensen, “TISSUES 2.0: An Integrative Web Resource on Mammalian Tissue Expression,” Database (Oxford) 2018 (2018): bay003.
D. Chicco, F. Cumbo, and C. Angione, “Ten Quick Tips for Avoiding Pitfalls in Multi-Omics Data Integration Analyses,” PLoS Computational Biology 19, no. 7 (2023): e1011224.
D. S. DeLuca, J. Z. Levin, A. Sivachenko, et al., “RNA-SeQC: RNA-Seq Metrics for Quality Control and Process Optimization,” Bioinformatics 28, no. 11 (2012): 1530–1532.
W.-M. Liu, S. K. Ro, and W. H. Koch, “Making Sense of DNA Microarray Data,” Methods in Molecular Medicine 113 (2005): 293–304.
C. Bielow, G. Mastrobuoni, and S. Kempa, “Proteomics Quality Control: Quality Control Software for MaxQuant Results,” Journal of Proteome Research 15, no. 3 (2016): 777–787.
D. Broadhurst, R. Goodacre, S. N. Reinke, et al., “Guidelines and Considerations for the Use of System Suitability and Quality Control Samples in Mass Spectrometry Assays Applied in Untargeted Clinical Metabolomic Studies,” Metabolomics 14, no. 6 (2018): 72.
Q. Li, J. B. Brown, H. Huang, and P. J. Bickel, “Measuring Reproducibility of High-Throughput Experiments,” Annals of Applied Statistics 5, no. 3 (2011): 1752–1779.
R. Bourgon, R. Gentleman, and W. Huber, “Independent Filtering Increases Detection Power for High-Throughput Experiments,” In Proceedings of the National Academy of Sciences of the United States of America 107, no. 21 (2010): 9546–9551.
R. Wei, J. Wang, M. Su, et al., “Missing Value Imputation Approach for Mass Spectrometry-Based Metabolomics Data,” Scientific Reports 8, no. 1 (2018): 663.
Z. Cao, M. McCabe, P. Callas, et al., “Recalibrating Single-Study Effect Sizes Using Hierarchical Bayesian Models,” Frontiers in Neuroimaging 2 (2023): 1138193.
S. Zhao, Z. Ye, and R. Stanton, “Misuse of RPKM or TPM Normalization When Comparing Across Samples and Sequencing Protocols,” RNA 26, no. 8 (2020): 903–909.
S. Anders and W. Huber, “Differential Expression Analysis for Sequence Count Data,” Genome Biology 11, no. 10 (2010): R106.
M. D. Robinson and A. Oshlack, “A Scaling Normalization Method for Differential Expression Analysis of RNA-Seq Data,” Genome Biology 11, no. 3 (2010): R25.
W. Huber, A. von Heydebreck, H. Sültmann, A. Poustka, and M. Vingron, “Variance Stabilization Applied to Microarray Data Calibration and to the Quantification of Differential Expression,” Bioinformatics 18, no. Suppl 1 (2002): S96–S104.
R. A. Irizarry, B. Hobbs, F. Collin, et al., “Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data,” Biostatistics 4, no. 2 (2003): 249–264.
Z. Wu, R. A. Irizarry, R. Gentleman, F. Martinez-Murillo, and F. Spencer, “A Model-Based Background Adjustment for Oligonucleotide Expression Arrays,” Journal of the American Statistical Association 99, no. 468 (2004): 909–917.
H. V. Trinh, J. Grossmann, P. Gehrig, et al., “ITRAQ-Based and Label-Free Proteomics Approaches for Studies of Human Adenovirus Infections,” International Journal of Proteomics 2013 (2013): 1862.
O. Pagel, L. Kollipara, and A. Sickmann, “Quantitative Proteome Data Analysis of Tandem Mass Tags Labeled Samples,” Methods in Molecular Biology 2228 (2021): 409–417.
W. B. Dunn, D. Broadhurst, P. Begley, et al., “Procedures for Large-Scale Metabolic Profiling of Serum and Plasma Using Gas Chromatography and Liquid Chromatography Coupled to Mass Spectrometry,” Nature Protocols 6, no. 7 (2011): 1060–1083.
W. E. Johnson, C. Li, and A. Rabinovic, “Adjusting Batch Effects in Microarray Expression Data Using Empirical Bayes Methods,” Biostatistics 8, no. 1 (2007): 118–127.
J. T. Leek and J. D. Storey, “Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis,” PLoS Genetics 3, no. 9 (2007): 1724–1735.
T. E. Sweeney, H. R. Wong, and P. Khatri, “Robust Classification of Bacterial and Viral Infections via Integrated Host Gene Expression Diagnostics,” Science Translational Medicine 8, no. 346 (2016): 346ra91.
A. A. Shabalin, H. Tjelmeland, C. Fan, C. M. Perou, and A. B. Nobel, “Merging Two Gene-Expression Studies via Cross-Platform Normalization,” Bioinformatics 24, no. 9 (2008): 1154–1160.
M. Benito, J. Parker, Q. Du, et al., “Adjustment of Systematic Microarray Data Biases,” Bioinformatics 20, no. 1 (2004): 105–114.
Y. Zhao, M.-C. Li, M. M. Konaté, et al., “TPM, FPKM, or Normalized Counts? A Comparative Study of Quantification Measures for the Analysis of RNA-Seq Data From the NCI Patient-Derived Models Repository,” Journal of Translational Medicine 19, no. 1 (2021): 269.
J. Vandesompele, K. de Preter, F. Pattyn, et al., “Accurate Normalization of Real-Time Quantitative RT-PCR Data by Geometric Averaging of Multiple Internal Control Genes,” Genome Biology 3, no. 7 (2002): RESEARCH0034.
B. M. Bolstad, R. A. Irizarry, M. Astrand, and T. P. Speed, “A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Variance and Bias,” Bioinformatics 19, no. 2 (2003): 185–193.
C. Evans, J. Hardin, and D. M. Stoebel, “Selecting Between-Sample RNA-Seq Normalization Methods From the Perspective of Their Assumptions,” Briefings in Bioinformatics 19, no. 5 (2018): 776–792.
Y. Chen, A. T. L. Lun, and G. K. Smyth, “From Reads to Genes to Pathways: Differential Expression Analysis of RNA-Seq Experiments Using Rsubread and the edgeR Quasi-Likelihood Pipeline,” F1000Research 5 (2016): 1438.
M. I. Love, W. Huber, and S. Anders, “Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data With DESeq2,” Genome Biology 15, no. 12 (2014): 550.
J. D. Silver, M. E. Ritchie, and G. K. Smyth, “Microarray Background Correction: Maximum Likelihood Estimation for the Normal-Exponential Convolution,” Biostatistics 10, no. 2 (2009): 352–363.
D. Abdueva, D. Skvortsov, and S. Tavaré, “Non-Linear Analysis of GeneChip Arrays,” Nucleic Acids Research 34, no. 15 (2006): e105.
A. Behdenna, M. Colange, J. Haziza, et al., “pyComBat, a Python Tool for Batch Effects Correction in High-Throughput Molecular Data Using Empirical Bayes Methods,” BMC Bioinformatics 24, no. 1 (2023): 459.
S. L. Raymond, M. C. López, H. V. Baker, et al., “Unique Transcriptomic Response to Sepsis Is Observed Among Patients of Different Age Groups,” PLoS One 12, no. 9 (2017): e0184159.
B. Wang and H. Zou, “Another Look at Distance-Weighted Discrimination,” Journal of the Royal Statistical Society, Series B (Statistical Methodology) 80, no. 1 (2018): 177–198.
H. Huang, X. Lu, Y. Liu, P. Haaland, and J. S. Marron, “R/DWD: Distance-Weighted Discrimination for Classification, Visualization and Batch Adjustment,” Bioinformatics 28, no. 8 (2012): 1182–1183.
S. C. Hicks and R. A. Irizarry, “When to use quantile normalization?,” Preprint at bioRxiv (2014): 012203, https://doi.org/10.1101/012203.
S. Falcon and R. Gentleman, “Hypergeometric Testing Used for Gene Set Enrichment Analysis,” in Bioconductor Case Studies (Springer New York, 2008), 207–220.
D. Sanchez-Taltavull, T. J. Perkins, N. Dommann, et al., “Bayesian Correlation Is a Robust Gene Similarity Measure for Single-Cell RNA-Seq Data,” NAR Genomics and Bioinformatics 2, no. 1 (2020): lqaa002.
A. Subramanian, P. Tamayo, V. K. Mootha, et al., “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles,” In Proceedings of the National Academy of Sciences of the United States of America 102, no. 43 (2005): 15545–15550.
R. DerSimonian and N. Laird, “Meta-Analysis in Clinical Trials,” Controlled Clinical Trials 7, no. 3 (1986): 177–188.
F. Mosteller and R. R. Bush, “Selected Quantitative Techniques,” Handbook of Social Psychology 1 (1954): 289–334.
T. Lipták, “On the Combination of Independent Tests,” Magyar Tudományos Akadémia Matematikai Kutató Intézetének Közleményei 3 (1958): 171–197.
Statistical Methods for Research Workers, “Statistical Methods for Research Workers,” Nature 131, no. 3307 (1933): 383.
S. A. Stouffer, E. A. Suchman, L. C. Devinney, S. A. Star, and R. M. Williams, Jr., The American Soldier: Adjustment During Army Life. (Studies in Social Psychology in World War II) (Princeton University Press, 1949).
T. B. Huedo-Medina, J. Sánchez-Meca, F. Marín-Martínez, and J. Botella, “Assessing Heterogeneity in Meta-Analysis: Q Statistic or I2 Index?,” Psychological Methods 11, no. 2 (2006): 193–206.
S. M. Urbut, G. Wang, P. Carbonetto, and M. Stephens, “Flexible Statistical Methods for Estimating and Testing Effects in Genomic Studies With Multiple Conditions,” Nature Genetics 51, no. 1 (2019): 187–195.
S. Park and S. N. Beretvas, “Synthesizing Effects for Multiple Outcomes per Study Using Robust Variance Estimation Versus the Three-Level Model,” Behavior Research Methods 51, no. 1 (2019): 152–171.
M. Bersanelli, E. Mosca, D. Remondini, et al., “Methods for the Integration of Multi-Omics Data: Mathematical Aspects,” BMC Bioinformatics 17, no. S2 (2016): 15.
Y. Lu, Y.-T. Chang, E. P. Hoffman, et al., “Integrated Identification of Disease Specific Pathways Using Multi-Omics Data,” Preprint at bioRxiv (2019): 666065, https://doi.org/10.1101/666065.
P. J. Rousseeuw and M. Hubert, “Robust Statistics for Outlier Detection,” WIREs Data Mining and Knowledge Discovery 1, no. 1 (2011): 73–79.
J. A. Berlin, C. B. Begg, and T. A. Louis, “An Assessment of Publication Bias Using a Sample of Published Clinical Trials,” Journal of the American Statistical Association 84, no. 406 (1989): 381–392.
T.-M. Nguyen, A. Shafi, T. Nguyen, and S. Draghici, “Identifying Significantly Impacted Pathways: A Comprehensive Review and Assessment,” Genome Biology 20, no. 1 (2019): 203.
L. Klebanov, G. Glazko, P. Salzman, A. Yakovlev, and Y. Xiao, “A Multivariate Extension of the Gene Set Enrichment Analysis,” Journal of Bioinformatics and Computational Biology 5, no. 5 (2007): 1139–1153.
K.-L. Tiong and C.-H. Yeang, “MGSEA - a Multivariate Gene Set Enrichment Analysis,” BMC Bioinformatics 20, no. 1 (2019): 145.
S. Hänzelmann, R. Castelo, and J. Guinney, “GSVA: Gene Set Variation Analysis for Microarray and RNA-Seq Data,” BMC Bioinformatics 14, no. 1 (2013): 7.
A. Alexeyenko, W. Lee, M. Pernemalm, et al., “Network Enrichment Analysis: Extension of Gene-Set Enrichment Analysis to Gene Networks,” BMC Bioinformatics 13, no. 1 (2012): 226.
K. Shen and G. C. Tseng, “Meta-Analysis for Pathway Enrichment Analysis When Combining Multiple Genomic Studies,” Bioinformatics 26, no. 10 (2010): 1316–1323.
Y. Drier, M. Sheffer, and E. Domany, “Pathway-Based Personalized Analysis of Cancer,” Proceedings of the National Academy of Sciences of the United States of America 110, no. 16 (2013): 6388–6393.
H. Yang, C. Cheng, and W. Zhang, “Average Rank-Based Score to Measure Deregulation of Molecular Pathway Gene Sets,” PLoS One 6, no. 11 (2011): e27579.
E. Lee, H.-Y. Chuang, J.-W. Kim, T. Ideker, and D. Lee, “Inferring Pathway Activity Toward Precise Disease Classification,” PLoS Computational Biology 4, no. 11 (2008): e1000217.
J. Tomfohr, J. Lu, and T. B. Kepler, “Pathway Level Analysis of Gene Expression Using Singular Value Decomposition,” BMC Bioinformatics 6, no. 1 (2005): 225.
G. J. Odom, Y. Ban, A. Colaprico, et al., “PathwayPCA: An R/Bioconductor Package for Pathway Based Integrative Analysis of Multi-Omics Data,” Proteomics 20, no. 21–22 (2020): e1900409.
A. Hukku, C. Quick, F. Luca, R. Pique-Regi, and X. Wen, “BAGSE: A Bayesian Hierarchical Model Approach for Gene Set Enrichment Analysis,” Bioinformatics 36, no. 6 (2020): 1689–1695.
J. C. Vivar, P. Pemu, R. McPherson, and S. Ghosh, “Redundancy Control in Pathway Databases (ReCiPa): An Application for Improving Gene-Set Enrichment Analysis in Omics Studies and “Big Data” Biology,” OMICS 17, no. 8 (2013): 414–422.
B. P. Hejblum, J. Skinner, and R. Thiébaut, “Time-Course Gene Set Analysis for Longitudinal Gene Expression Data,” PLoS Computational Biology 11, no. 6 (2015): e1004310.
D. Merico, R. Isserlin, O. Stueker, A. Emili, and G. D. Bader, “Enrichment Map: A Network-Based Method for Gene-Set Enrichment Visualization and Interpretation,” PLoS One 5, no. 11 (2010): e13984.
M. Kutmon, M. P. van Iersel, A. Bohler, et al., “PathVisio 3: An Extendable Pathway Analysis Toolbox,” PLoS Computational Biology 11, no. 2 (2015): e1004085.
P. Shannon, A. Markiel, O. Ozier, et al., “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks,” Genome Research 13, no. 11 (2003): 2498–2504.
T. Liu, P. Salguero, M. Petek, et al., “PaintOmics 4: New Tools for the Integrative Analysis of Multi-Omics Datasets Supported by Multiple Pathway Databases,” Nucleic Acids Research 50, no. W1 (2022): W551–W559.
C. Ogris, M. Castresana-Aguirre, and E. L. L. Sonnhammer, “PathwAX II: Network-Based Pathway Analysis With Interactive Visualization of Network Crosstalk,” Bioinformatics 38, no. 9 (2022): 2659–2660.
M. Ostaszewski, S. Gebel, I. Kuperstein, et al., “Community-Driven Roadmap for Integrated Disease Maps,” Briefings in Bioinformatics 20, no. 2 (2019): 659–670.
M. M. Arroyo, A. Berral-González, S. Bueno-Fortes, D. Alonso-López, and J. D. L. Rivas, “Mining Drug-Target Associations in Cancer: Analysis of Gene Expression and Drug Activity Correlations,” Biomolecules 10, no. 5 (2020): 667.
M. Strickert, F.-M. Schleif, T. Villmann, and U. Seiffert, “Unleashing Pearson Correlation for Faithful Analysis of Biomedical Data,” in Lecture Notes in Computer Science (Springer Berlin Heidelberg, 2009), 70–91.
L. Sun, Y. Yu, T. Huang, et al., “Associations Between Ionomic Profile and Metabolic Abnormalities in Human Population,” PLoS One 7, no. 6 (2012): e38845.
Q. Tan, “Generalized Measure of Dependency for Analysis of Omics Data,” Journal of Data Mining in Genomics & Proteomics 07, no. 1 (2016): 1–2.
F. Monti, D. Stewart, A. Surendra, et al., “Signed Distance Correlation (SiDCo): An Online Implementation of Distance Correlation and Partial Distance Correlation for Data-Driven Network Analysis,” Bioinformatics 39, no. 5 (2023): btad210.
A. C. Skelly, J. R. Dettori, and E. D. Brodt, “Assessing Bias: The Importance of Considering Confounding,” Evidence-Based Spine-Care Journal 3, no. 1 (2012): 9–12.
R. Mazumder and T. Hastie, “The Graphical Lasso: New Insights and Alternatives,” Electronic Journal of Statistics 6 (2012): 2125–2149.
P. Langfelder and S. Horvath, “Fast R Functions for Robust Correlations and Hierarchical Clustering,” Journal of Statistical Software 46, no. 11 (2012): 1–17.
C.-H. Zheng, L. Yuan, W. Sha, and Z.-L. Sun, “Gene Differential Coexpression Analysis Based on Biweight Correlation and Maximum Clique,” BMC Bioinformatics 15, no. S15 (2014): S3.
C. Angione, M. Conway, and P. Lió, “Multiplex Methods Provide Effective Integration of Multi-Omic Data in Genome-Scale Models,” BMC Bioinformatics 17, no. S4 (2016): 83.
M. Chierici, N. Bussola, A. Marcolini, et al., “Integrative Network Fusion: A Multi-Omics Approach in Molecular Profiling,” Frontiers in Oncology 10 (2020): 1065.
M. Scutari and R. Nagarajan, “Identifying Significant Edges in Graphical Models of Molecular Networks,” Artificial Intelligence in Medicine 57, no. 3 (2013): 207–217.
T. Bröhl and K. Lehnertz, “Centrality-Based Identification of Important Edges in Complex Networks,” Chaos 29, no. 3 (2019): 33115.
W. Liu, K. A. Pratte, P. J. Castaldi, et al., “A Generalized Higher-Order Correlation Analysis Framework for Multi-Omics Network Inference,” PLoS Computational Biology 21, no. 4 (2025): e1011842.
A. Calderone, M. Formenti, F. Aprea, et al., “Comparing Alzheimer's and Parkinson's Diseases Networks Using Graph Communities Structure,” BMC Systems Biology 10 (2016): 25.
N. T. Suresh and K. U. Vimina, “Topology Driven Analysis of Protein - Protein Interactome for Prioritizing Key Comorbid Genes via Sub Graph Based Average Path Length Centrality,” IEEE/ACM Transactions on Computational Biology and Bioinformatics 20, no. 1 (2023): 742–751.
J. Dopazo and C. Erten, “Graph-Theoretical Comparison of Normal and Tumor Networks in Identifying BRCA Genes,” BMC Systems Biology 11, no. 1 (2017): 110.
R. Anglani, T. M. Creanza, V. C. Liuzzi, et al., “Loss of Connectivity in Cancer Co-Expression Networks,” PLoS One 9, no. 1 (2014): e87075.
D. Lee and K.-H. Cho, “Topological Estimation of Signal Flow in Complex Signaling Networks,” Scientific Reports 8, no. 1 (2018): 1–11.
Y. Wang, D. Xie, S. Ma, N. Shao, X. Zhang, and X. Wang, “Exploring the Common Mechanism of Vascular Dementia and Inflammatory Bowel Disease: A Bioinformatics-Based Study,” Frontiers in Immunology 15 (2024): 1347415, https://doi.org/10.3389/fimmu.2024.1347415.
D. Koschützki and F. Schreiber, “Centrality Analysis Methods for Biological Networks and Their Application to Gene Regulatory Networks,” Gene Regulation and Systems Biology 2 (2008): 193–201.
H. Yu, P. M. Kim, E. Sprecher, V. Trifonov, and M. Gerstein, “The Importance of Bottlenecks in Protein Networks: Correlation With Gene Essentiality and Expression Dynamics,” PLoS Computational Biology 3, no. 4 (2007): e59.
A. J. Bishara and J. B. Hittner, “Confidence Intervals for Correlations When Data Are Not Normal,” Behavior Research Methods 49, no. 1 (2017): 294–309.
S.-A. Lee, T. T.-H. Tsao, K.-C. Yang, et al., “Construction and Analysis of the Protein-Protein Interaction Networks for Schizophrenia, Bipolar Disorder, and Major Depression,” BMC Bioinformatics 12, no. S13 (2011): S20.
L. Meng, A. Striegel, and T. Milenković, “Local Versus Global Biological Network Alignment,” Bioinformatics 32, no. 20 (2016): 3155–3164.
D. L. Gibbs, L. Gralinski, R. S. Baric, and S. K. McWeeney, “Multi-Omic Network Signatures of Disease,” Frontiers in Genetics 4 (2014): 309.
M. K. Arici and N. Tuncbag, “Unveiling Hidden Connections in Omics Data via pyPARAGON: An Integrative Hybrid Approach for Disease Network Construction,” Briefings in Bioinformatics 25, no. 5 (2024): 1–12, https://doi.org/10.1093/bib/bbae399.
P. Langfelder, R. Luo, M. C. Oldham, and S. Horvath, “Is My Network Module Preserved and Reproducible?,” PLoS Computational Biology 7, no. 1 (2011): e1001057.
D. Li, J. B. Brown, L. Orsini, Z. Pan, G. Hu, and S. He, “MODA: MOdule Differential Analysis for Weighted Gene Co-Expression Network,” Preprint at arXiv:1605.04739 (2016), https://doi.org/10.48550/ARXIV.1605.04739.
J. Ruan, A. K. Dean, and W. Zhang, “A General Co-Expression Network-Based Approach to Gene Expression Analysis: Comparison and Applications,” BMC Systems Biology 4, no. 1 (2010): 8.
X. Xing, F. Yang, H. Li, et al., “Multi-Level Attention Graph Neural Network Based on Co-Expression Gene Modules for Disease Diagnosis and Prognosis,” Bioinformatics 38, no. 8 (2022): 2178–2186.
A. Baptista, A. Gonzalez, and A. Baudot, “Universal Multilayer Network Exploration by Random Walk With Restart,” Communications Physics 5 (2022): 170.
X. Fan, P. Zhu, and X.-Q. Tang, “VD-Analysis: A Dynamic Network Framework for Analyzing Disease Progressions,” IEEE Access 8 (2020): 153202–153214.
M. Anděl, J. Kléma, and Z. Krejčík, “Network-Constrained Forest for Regularized Classification of Omics Data,” Methods 83 (2015): 88–97.
N. Vlassis and E. Glaab, “GenePEN: Analysis of Network Activity Alterations in Complex Diseases via the Pairwise Elastic Net,” Statistical Applications in Genetics and Molecular Biology 14, no. 2 (2015): 221–224.
M. Ringnér, “What Is Principal Component Analysis?,” Nature Biotechnology 26, no. 3 (2008): 303–304.
K. Devarajan, “Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology,” PLoS Computational Biology 4, no. 7 (2008): e1000029.
J.-P. Brunet, P. Tamayo, T. R. Golub, and J. P. Mesirov, “Metagenes and Molecular Pattern Discovery Using Matrix Factorization,” Proceedings of the National Academy of Sciences of the United States of America 101, no. 12 (2004): 4164–4169.
P. Fogel, C. Geissler, N. Morizet, and G. Luta, “On Rank Selection in Non-Negative Matrix Factorization Using Concordance,” Mathematics 11, no. 22 (2023): 4611.
A. Hyvärinen and E. Oja, “Independent Component Analysis: Algorithms and Applications,” Neural Networks 13, no. 4–5 (2000): 411–430.
J. Miettinen, K. Nordhausen, and S. Taskinen, “fICA: FastICA Algorithms and Their Improved Variants,” R Journal 10, no. 2 (2018): 148–158.
D. Langlois, S. Chartier, and D. Gosselin, “An Introduction to Independent Component Analysis: InfoMax and FastICA Algorithms,” Tutorials in Quantitative Methods for Psychology 6, no. 1 (2010): 31–38.
M. Moosmann, T. Eichele, H. Nordby, K. Hugdahl, and V. D. Calhoun, “Joint Independent Component Analysis for Simultaneous EEG-fMRI: Principle and Simulation,” International Journal of Psychophysiology: Official Journal of the International Organization of Psychophysiology 67, no. 3 (2007): 212.
Z. Li, S. E. Safo, and Q. Long, “Incorporating Biological Information in Sparse Principal Component Analysis With Application to Genomic Data,” BMC Bioinformatics 18, no. 1 (2017): 332.
H. Kim and H. Park, “Sparse Non-Negative Matrix Factorizations via Alternating Non-Negativity-Constrained Least Squares for Microarray Data Analysis,” Bioinformatics 23, no. 12 (2007): 1495–1502.
S. Zhang, C.-C. Liu, W. Li, H. Shen, P. W. Laird, and X. J. Zhou, “Discovery of Multi-Dimensional Modules by Integrative Analysis of Cancer Genomic Data,” Nucleic Acids Research 40, no. 19 (2012): 9379–9391.
J. C. Chang, P. Fletcher, J. Han, et al., “Sparse Encoding for More-Interpretable Feature-Selecting Representations in Probabilistic Matrix Factorization,” Preprint at arXiv:2012.04171 (2020).
J. Fang, “Tightly Integrated Genomic and Epigenomic Data Mining Using Tensor Decomposition,” Bioinformatics 35, no. 1 (2019): 112–118.
U. Mor, Y. Cohen, R. Valdés-Mas, D. Kviatcovsky, E. Elinav, and H. Avron, “Dimensionality Reduction of Longitudinal ‘Omics Data Using Modern Tensor Factorizations,” PLoS Computational Biology 18, no. 7 (2022): e1010212.
R. Bro, “PARAFAC. Tutorial and Applications,” Chemometrics and Intelligent Laboratory Systems 38, no. 2 (1997): 149–171.
M. Mørup, L. K. Hansen, and S. M. Arnfred, “Algorithms for Sparse Nonnegative Tucker Decompositions,” Neural Computation 20, no. 8 (2008): 2112–2131.
P. Lourenço, B. J. Guerreiro, P. Batista, P. Oliveira, and C. Silvestre, “Uncertainty Characterization of the Orthogonal Procrustes Problem With Arbitrary Covariance Matrices,” Pattern Recognition 61 (2017): 210–220.
M. E. Ritchie, B. Phipson, D. Wu, et al., “Limma Powers Differential Expression Analyses for RNA-Sequencing and Microarray Studies,” Nucleic Acids Research 43, no. 7 (2015): e47.
T. M. J. Fruchterman and E. M. Reingold, “Graph Drawing by Force-Directed Placement,” Software: Practice and Experience 21, no. 11 (1991): 1129–1164.
M. Jacomy, T. Venturini, S. Heymann, and M. Bastian, “ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software,” PLoS One 9, no. 6 (2014): e98679.
B. Zhang and S. Horvath, “A General Framework for Weighted Gene Co-Expression Network Analysis,” Statistical Applications in Genetics and Molecular Biology 4 (2005): 17.
E. Acar, G. Gürdeniz, M. A. Rasmussen, D. Rago, L. O. Dragsted, and R. Bro, “Coupled Matrix Factorization with Sparse Factors to Identify Potential Biomarkers in Metabolomics,” in 2012 IEEE 12th International Conference on Data Mining Workshops (IEEE, 2012), 1–8.
Y. Benjamini and Y. Hochberg, “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,” Journal of the Royal Statistical Society, Series B (Statistical Methodology) 57, no. 1 (1995): 289–300.
G. Hackeling, Mastering Machine Learning With Scikit-Learn, 2nd ed. (Packt Publishing, 2023), 1.
R. Gaujoux and C. Seoighe, “A Flexible R Package for Nonnegative Matrix Factorization,” BMC Bioinformatics 11 (2010): 367.
J. Kossaifi, Y. Panagakis, A. Anandkumar, and M. Pantic, “TensorLy: Tensor Learning in Python,” Journal of Machine Learning Research 20 (2019): 1–6.
T. D. Sherman, T. Gao, and E. J. Fertig, “CoGAPS 3: Bayesian Non-Negative Matrix Factorization for Single-Cell Analysis With Asynchronous Updates and Sparse Data Structures,” BMC Bioinformatics 21, no. 1 (2020): 453.
E. F. Lock, J. Y. Park, and K. A. Hoadley, “Bidimensional Linked Matrix Factorization for Pan-Omics Pan-Cancer Analysis,” Annals of Applied Statistics 16, no. 1 (2022): 193–215.
R. Argelaguet, B. Velten, D. Arnol, et al., “Multi-Omics Factor Analysis-a Framework for Unsupervised Integration of Multi-Omics Data Sets,” Molecular Systems Biology 14, no. 6 (2018): e8124.
R. Argelaguet, D. Arnol, D. Bredikhin, et al., “MOFA+: A Statistical Framework for Comprehensive Integration of Multi-Modal Single-Cell Data,” Genome Biology 21, no. 1 (2020): 111.
K. A. Hoadley, C. Yau, D. M. Wolf, et al., “Multi-Platform Analysis of 12 Cancer Types Reveals Molecular Classification Within and Across Tissues-of-Origin,” Cell 158, no. 4 (2014): 929.
N. Altman and M. Krzywinski, “Clustering,” Nature Methods 14, no. 6 (2017): 545–546.
A. Goder and V. Filkov, “Consensus Clustering Algorithms: Comparison and Refinement,” in 2008 Proceedings of the Tenth Workshop on Algorithm Engineering and Experiments (ALENEX) (Society for Industrial and Applied Mathematics, 2008), 109–117.
R. Caruana, M. Elhawary, N. Nguyen, and C. Smith, “Meta Clustering,” in 2006 Sixth International Conference on Data Mining (ICDM'06), 107–118.
G. Brière, É. Darbo, P. Thébault, and R. Uricaru, “Consensus Clustering Applied to Multi-Omics Disease Subtyping,” BMC Bioinformatics 22 (2021): 361.
R. Tibshirani, G. Walther, and T. Hastie, “Estimating the Number of Clusters in a Data Set via the Gap Statistic,” Journal of the Royal Statistical Society, Series B (Statistical Methodology) 63, no. 2 (2001): 411–423.
P. J. Rousseeuw, “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis,” Journal of Computational and Applied Mathematics 20 (1987): 53–65.
M. Hahsler, M. Piekenbrock, and D. Doran, “Dbscan: Fast Density-Based Clustering With R,” Journal of Statistical Software 91, no. 1 (2019): 1–30.
M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander, “OPTICS: Ordering Points to Identify the Clustering Structure,” SIGMOD Record 28, no. 2 (1999): 49–60.
J. Zhao, Q. Guan, C. Zheng, and Q. Cao, “ADSVAE: An Adaptive Density-Aware Spectral Clustering Method for Multi-Omics Data Based on Variational Autoencoder,” Current Bioinformatics 18, no. 6 (2023): 527–536.
T. Kohonen, “The Self-Organizing Map,” Proceedings of the IEEE 78, no. 9 (1990): 1464–1480.
D. G. Covell, A. Wallqvist, A. A. Rabow, and N. Thanki, “Molecular Classification of Cancer: Unsupervised Self-Organizing Map Analysis of Gene Expression Microarray Data,” Molecular Cancer Therapeutics 2, no. 3 (2003): 317–332.
S. Laghaee, M. Eskandarian, M. Fereidoon, and S. Koohi, “scVAG: Unified Single-Cell Clustering via Variational-Autoencoder Integration With Graph Attention Autoencoder,” Heliyon 10, no. 23 (2024): e40732.
H. Yang, R. Chen, D. Li, and Z. Wang, “Subtype-GAN: A Deep Learning Approach for Integrative Cancer Subtyping of Multi-Omics Data,” Bioinformatics 37, no. 16 (2021): 2231–2237.
C. Sirocchi, M. Urschler, and B. Pfeifer, “Feature Graphs for Interpretable Unsupervised Tree Ensembles: Centrality, Interaction, and Application in Disease Subtyping,” BioData Mining 18 (2024): 15, http://arxiv.org/abs/2404.17886.
S. Monti, P. Tamayo, J. Mesirov, and T. Golub, “Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data,” Machine Learning 52, no. 1 (2003): 91–118.
F. Morais-Rodrigues, R. Silvério-Machado, R. B. Kato, et al., “Analysis of the Microarray Gene Expression for Breast Cancer Progression After the Application Modified Logistic Regression,” Gene 726 (2020): 144168.
B. Wang, A. M. Mezlini, F. Demir, et al., “Similarity Network Fusion for Aggregating Data Types on a Genomic Scale,” Nature Methods 11, no. 3 (2014): 333–337.
H. Yuan, X. Fan, Y. Jin, et al., “Development of Heart Failure Risk Prediction Models Based on a Multi-Marker Approach Using Random Forest Algorithms,” Chinese Medical Journal 132, no. 7 (2019): 819–826.
X. Xu, J. Wang, T. Chen, et al., “Deciphering Novel Mitochondrial Signatures: Multi-Omics Analysis Uncovers Cross-Disease Markers and Oligodendrocyte Pathways in Alzheimer's Disease and Glioblastoma,” Frontiers in Aging Neuroscience 17 (2025): 1536142.
P. Gong, L. Cheng, Z. Zhang, et al., “Multi-Omics Integration Method Based on Attention Deep Learning Network for Biomedical Data Classification,” Computer Methods and Programs in Biomedicine 231 (2023): 107377.
T.-C. Hsu and C. Lin, “Learning From Small Medical Data-Robust Semi-Supervised Cancer Prognosis Classifier With Bayesian Variational Autoencoder,” Bioinformatics Advances 3, no. 1 (2023): vbac100.
S. H. Oh, S. Back, and J. Park, “Measuring Patient Similarity on Multiple Diseases by Joint Learning via a Convolutional Neural Network,” Sensors 22, no. 1 (2021): 131.
Y. Gu, Z. Ge, C. P. Bonnington, and J. Zhou, “Progressive Transfer Learning and Adversarial Domain Adaptation for Cross-Domain Skin Disease Classification,” IEEE Journal of Biomedical and Health Informatics 24, no. 5 (2020): 1379–1393.
P. S. Reel, S. Reel, E. Pearson, E. Trucco, and E. Jefferson, “Using Machine Learning Approaches for Multi-Omics Data Analysis: A Review,” Biotechnology Advances 49 (2021): 107739.
A. Rau, G. Marot, and F. Jaffrézic, “Differential Meta-Analysis of RNA-Seq Data From Multiple Studies,” BMC Bioinformatics 15, no. 1 (2014): 91.
S. Lu, J. Li, C. Song, K. Shen, and G. C. Tseng, “Biomarker Detection in the Integration of Multiple Multi-Class Genomic Studies,” Bioinformatics 26, no. 3 (2010): 333–340.
W. A. Haynes, F. Vallania, C. Liu, et al., “Empowering Multi-Cohort Gene Expression Analysis to Increase Reproducibility,” Pacific Symposium on Biocomputing 22 (2017): 144–153.
I. S. Piras, M. Manchia, M. J. Huentelman, et al., “Peripheral Biomarkers in Schizophrenia: A Meta-Analysis of Microarray Gene Expression Datasets,” International Journal of Neuropsychopharmacology 22, no. 3 (2019): 186–193.
R. Isserlin, D. Merico, V. Voisin, and G. D. Bader, “Enrichment Map: A Cytoscape App to Visualize and Explore OMICs Pathway Enrichment Results,” F1000Research 3 (2014): 141.
A. L. Tarca, S. Draghici, P. Khatri, et al., “A Novel Signaling Pathway Impact Analysis,” Bioinformatics 25, no. 1 (2009): 75–82.
C. J. Vaske, S. C. Benz, J. Z. Sanborn, et al., “Inference of Patient-Specific Pathway Activities From Multi-Dimensional Cancer Genomics Data Using PARADIGM,” Bioinformatics 26, no. 12 (2010): i237–i245.
P. Langfelder and S. Horvath, “WGCNA: An R Package for Weighted Correlation Network Analysis,” BMC Bioinformatics 9, no. 1 (2008): 559.
A. Fukushima, “DiffCorr: An R Package to Analyze and Visualize Differential Correlations in Biological Networks,” Gene 518, no. 1 (2013): 209–214.
J. Wang, Z. Ma, S. A. Carr, et al., “Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction,” Molecular & Cellular Proteomics 16, no. 1 (2017): 121–134.
G. Neubig, C. Dyer, Y. Goldberg, et al., “DyNet: The Dynamic Neural Network Toolkit,” Preprint at arXiv:1701.03980 (2017).
M. J. Cowley, M. Pinese, K. S. Kassahn, et al., “PINA v2.0: Mining Interactome Modules,” Nucleic Acids Research 40, no. D1 (2012): D862–D865.
E. F. Lock, K. A. Hoadley, J. S. Marron, and A. B. Nobel, “Joint and Individual Variation Explained (JIVE) for Integrated Analysis of Multiple Data Types,” Annals of Applied Statistics 7, no. 1 (2013): 523–542.
F. Ertam and G. Aydin, “Data Classification with Deep Learning Using Tensorflow,” in 2017 International Conference on Computer Science and Engineering (UBMK) (IEEE, 2017), 755–758.
M. Brooks, PyTorch Deep Learning: Build and Deploy Models from CNNs to Multimodal Architectures, LLMs, and Beyond (Independently Published, 2025), 1.
A. Singh, C. P. Shannon, B. Gautier, et al., “DIABLO: An Integrative Approach for Identifying Key Molecular Drivers From Multi-Omics Assays,” Bioinformatics 35, no. 17 (2019): 3055–3062.
R. Lopez, J. Regier, M. B. Cole, M. I. Jordan, and N. Yosef, “Deep Generative Modeling for Single-Cell Transcriptomics,” Nature Methods 15, no. 12 (2018): 1053–1058.
J. M. Perkel, “Terra Takes the Pain out of ‘Omics’ Computing in the Cloud,” Nature 601, no. 7891 (2022): 154–155.
P. J. A. Cock, B. A. Grüning, K. Paszkiewicz, and L. Pritchard, “Galaxy Tools and Workflows for Sequence Analysis With Applications in Molecular Plant Pathology,” PeerJ 1 (2013): e167.
S. Boluki, M. S. Esfahani, X. Qian, and E. R. Dougherty, “Incorporating Biological Prior Knowledge for Bayesian Learning via Maximal Knowledge-Driven Information Priors,” BMC Bioinformatics 18, no. Suppl 14 (2017): 552.
T. Sanavia, F. Aiolli, G. Da San Martino, A. Bisognin, and B. Di Camillo, “Improving Biomarker List Stability by Integration of Biological Knowledge in the Learning Process,” BMC Bioinformatics 13, no. S4 (2012): S22.
Q. Wang, M. He, L. Guo, and H. Chai, “AFEI: Adaptive Optimized Vertical Federated Learning for Heterogeneous Multi-Omics Data Integration,” Briefings in Bioinformatics 24, no. 5 (2023): bbad269, https://doi.org/10.1093/bib/bbad269.
J. Zhou, S. Chen, Y. Wu, et al., “PPML-Omics: A Privacy-Preserving Federated Machine Learning Method Protects Patients' Privacy in Omic Data,” Science Advances 10, no. 5 (2024): eadh8601.
Q. Shi, X. Chen, and Z. Zhang, “Decoding Human Biology and Disease Using Single-Cell Omics Technologies,” Genomics, Proteomics & Bioinformatics 21, no. 5 (2023): 926–949.
F. S. Dezem, W. Arjumand, H. DuBose, N. S. Morosini, and J. Plummer, “Spatially Resolved Single-Cell Omics: Methods, Challenges, and Future Perspectives,” Annual Review of Biomedical Data Science 7, no. 1 (2024): 131–153.
A. A. Metwally, T. Zhang, S. Wu, et al., “Robust Identification of Temporal Biomarkers in Longitudinal Omics Studies,” Bioinformatics 38, no. 15 (2022): 3802–3811.
B. B. Misra, C. Langefeld, M. Olivier, and L. A. Cox, “Integrated Omics: Tools, Advances and Future Approaches,” Journal of Molecular Endocrinology 62, no. 1 (2019): R21–R45.
N. Selvaraj, A. K. Swaroop, B. S. S. Nidamanuri, R. R. Kumar, J. Natarajan, and J. Selvaraj, “Network-Based Drug Repurposing: A Critical Review,” Current Drug Research Reviews 14, no. 2 (2022): 116–131.
X. Zhang, J. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Integrated Multi-Omics Analysis Using Variational Autoencoders: Application to Pan-Cancer Classification,” in 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (IEEE, 2019), 765–769, https://doi.org/10.1109/bibm47256.2019.8983228.
R. Wei and A. Mahmood, “Recent Advances in Variational Autoencoders With Representation Learning for Biomedical Informatics: A Survey,” IEEE Access 9 (2021): 4939–4956.
J.-H. Park and Y.-R. Cho, “Computational Disease-Gene Association Prioritization Using Graph Neural Networks and Attention Mechanisms,” in 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (IEEE, 2024), 6294–6298.
Z. Liu and T. Park, “DMOIT: Denoised Multi-Omics Integration Approach Based on Transformer Multi-Head Self-Attention Mechanism,” Frontiers in Genetics 10, no. 15 (2024): 1488683.
L. Pan, P. Rong, D. Liu, P. Qin, X. Zeng, and S. Peng, “PACS: Prediction and Analysis of Cancer Subtypes from Multi-Omics Data Based on a Multi-Head Attention Mechanism Model,” in 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (IEEE, 2023), 904–909.
Y. Park, A.-C. Hauschild, and D. Heider, “Transfer Learning Compensates Limited Data, Batch-Effects, and Technological Heterogeneity in Single-Cell Sequencing,” NAR Genomics and Bioinformatics 3, no. 4 (2021): lqab104.
J. M. A. Miñoza, J. A. Rico, P. R. F. Zamora, et al., “Biomarker Discovery for Meta-Classification of Melanoma Metastatic Progression Using Transfer Learning,” Genes (Basel) 13, no. 12 (2022): 2303.
Y. Choi, W. Yu, M. B. Nagarajan, et al., “Translating AI to Clinical Practice: Overcoming Data Shift With Explainability,” Radiographics 43, no. 5 (2023): e220105.
G. Arjunan, “Implementing Explainable AI in Healthcare: Techniques for Interpretable Machine Learning Models in Clinical Decision-Making,” International Journal of Scientific Research and Management 9, no. 5 (2021): 597–603.
A. Sadilek, L. Liu, D. Nguyen, et al., “Privacy-First Health Research With Federated Learning,” npj Digital Medicine 4, no. 1 (2021): 132.
A. Raheem, Y. Zhen, H. Yu, F. Sabah, S. Ahmed, and M. Yaqub, “Empowering Biomedical Health with Federated Learning: Addressing Privacy and Data Sharing for Enhanced Disease Detection and Diagnosis,” in 2023 8th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC) (IEEE, 2023), 36–40, https://doi.org/10.1109/ic-nidc59918.2023.10388495.
S. Varrette, H. Cartiaux, S. Peter, E. Kieffer, T. Valette, and A. Olloh, “Management of an Academic HPC & Research Computing Facility: The ULHPC Experience 2.0,” in Proceedings of the 2022 6th High Performance Computing and Cluster Technologies Conference (Association for Computing Machinery, 2022), 14–24.