Machine learning (ML) methodology used in the social and health sciences needs to fit the intended research purposes of description, prediction, or causal inference. This paper provides a comprehensive, systematic meta-mapping of research questions in the social and health sciences to appropriate ML approaches, incorporating the requirements that statistical analysis must meet in these disciplines. We map the established classification into description, prediction, counterfactual prediction, and causal structural learning onto common research goals, such as estimating the prevalence of adverse social or health outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes, and we explain common ML performance metrics. Such a mapping may help researchers fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences, and may help accelerate the uptake of ML applications to advance both basic and applied social and health sciences research.
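As a small illustration of the kind of performance metrics the paper discusses for risk prediction, the sketch below computes two common ones, classification accuracy and the area under the ROC curve (AUC), in plain Python. The labels and risk scores are hypothetical and chosen only for demonstration; they are not taken from the paper.

```python
# Minimal sketch of two common ML performance metrics for a binary
# risk-prediction task. All data below are hypothetical.

def accuracy(y_true, y_score, threshold=0.5):
    """Share of cases classified correctly when predicted risks are
    dichotomized at the given threshold."""
    hits = sum((s >= threshold) == bool(t) for t, s in zip(y_true, y_score))
    return hits / len(y_true)

def auc(y_true, y_score):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case receives a higher risk score than a randomly chosen
    negative case (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1, 0, 1]                 # observed outcomes (hypothetical)
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]   # predicted risks (hypothetical)

print(accuracy(y_true, y_score))  # ≈ 0.83 (5 of 6 correct at threshold 0.5)
print(auc(y_true, y_score))       # ≈ 0.89 (8 of 9 positive/negative pairs ranked correctly)
```

Unlike accuracy, the AUC does not depend on a decision threshold, which is one reason it is widely reported for risk-prediction models with imbalanced outcomes.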
Research center:
- Integrative Research Unit: Social and Individual Development (INSIDE) > PEARL Institute for Research on Socio-Economic Inequality (IRSEI)
Disciplines:
- Social & behavioral sciences, psychology: Multidisciplinary, general & others
- Sociology & social sciences
- Public health, health care sciences & services
Author, co-author:
Leist, Anja ; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Social Sciences (DSOC)
Klee, Matthias ; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Social Sciences (DSOC)
Kim, Jung Hyun ; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Social Sciences (DSOC)
Rehkopf, David
Bordas, Stéphane ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Engineering (DoE)
Muniz-Terrera, Graciela
Wade, Sarah
External co-authors:
yes
Language:
English
Title:
Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences
Publication date:
2022
Journal title:
Science Advances
ISSN:
2375-2548
Publisher:
American Association for the Advancement of Science (AAAS), Washington, DC, United States
Volume:
8
Pages:
eabk1942
Peer reviewed:
Peer Reviewed verified by ORBi
Focus Area:
Computational Sciences
European Projects:
H2020 - 803239 - CRISP - Cognitive Aging: From Educational Opportunities to Individual Risk Profiles