Paul Meehl; crowdsourcing hypothesis test; dissonance theory; empirical adequacy; meta-analysis; personality research; precognition; theory construction; Psychology (all); General Psychology
Abstract :
[en] The identification of an empirically adequate theoretical construct requires determining whether a theoretically predicted effect is sufficiently similar to an observed effect. To this end, we propose a simple similarity measure, describe its application in different research designs, and use computer simulations to estimate the necessary sample size for a given observed effect. As our main example, we apply this measure to recent meta-analytical research on precognition. Results suggest that the evidential basis is too weak for a predicted precognition effect of d = 0.20 to be considered empirically adequate. As additional examples, we apply this measure to object-level experimental data from dissonance theory and a recent crowdsourcing hypothesis test, as well as to meta-analytical data on the correlation of personality traits and life outcomes.
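The abstract's simulation-based approach (estimating the sample size needed for an observed effect to count as close to a predicted one) can be illustrated with a generic Monte Carlo sketch. This is a hypothetical illustration only, not the authors' actual similarity measure: it simulates two-group experiments with a true effect of d = 0.20 and estimates, for several per-group sample sizes, how often the observed Cohen's d lands within an assumed tolerance of the prediction.

```python
import numpy as np

rng = np.random.default_rng(42)

def observed_d(n, true_d):
    """Cohen's d from one simulated two-group experiment (n per group)."""
    a = rng.normal(true_d, 1.0, n)
    b = rng.normal(0.0, 1.0, n)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

def prob_within(n, true_d=0.20, tol=0.05, reps=2000):
    """Monte Carlo estimate of P(|observed d - predicted d| <= tol).

    tol = 0.05 is an assumed illustrative tolerance, not taken
    from the article.
    """
    ds = np.array([observed_d(n, true_d) for _ in range(reps)])
    return float(np.mean(np.abs(ds - true_d) <= tol))

for n in (100, 500, 2000):
    print(n, round(prob_within(n), 3))
```

As the sampling error of d shrinks roughly with 1/sqrt(n), the probability of the observed effect falling within the tolerance band rises with sample size, which is the intuition behind requiring a sufficiently large evidential basis before declaring a predicted effect empirically adequate.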
Disciplines :
Social & behavioral sciences, psychology: Multidisciplinary, general & others
Author, co-author :
Witte, Erich H; Institute for Psychology, University of Hamburg, Hamburg, Germany
STANCIU, Adrian ; University of Luxembourg > Faculty of Humanities, Education and Social Sciences (FHSE) > Department of Behavioural and Cognitive Sciences (DBCS) > Lifespan Development, Family and Culture ; Data and Research on Society, GESIS-Leibniz Institute for the Social Sciences, Mannheim, Germany
Zenker, Frank; Department of Philosophy, Boğaziçi University, Istanbul, Turkey
External co-authors :
yes
Language :
English
Title :
Predicted as observed? How to identify empirically adequate theoretical constructs.
References :
Andreas H. (2021). “Theoretical Terms in Science,” in The Stanford Encyclopedia of Philosophy (Fall 2021 Edition). ed. Zalta E. N. Available at: https://plato.stanford.edu/archives/fall2021/entries/theoretical-terms-science/
Bem D. (2011). Feeling the future: experimental evidence for anomalous retroactive influences on cognition and affect. J. Pers. Soc. Psychol. 100, 407–425. doi: 10.1037/a0021524, PMID: 21280961
Bem D. Tressoldi P. Rabeyron T. Duggan M. (2016). Feeling the future: a meta-analysis of 90 experiments on the anticipation of random future events [version 2; referees: 2 approved]. F1000Research 4:1188. Available at: https://f1000researchdata.s3.amazonaws.com/datasets/7177/9efe17e0-4b70-4f10-9945-a309e42de2c4_TableA1.xlsx
Bollen K. A. Bauer D. J. Christ S. L. Edwards M. C. (2010). “An overview of structural equations models and recent extensions” in Recent developments in social science statistics. eds. Kolenikov S. Steinley D. Thombs L. (Hoboken: Wiley), 37–80.
Burnham K. P. Anderson D. R. (2004). Multimodel inference: understanding AIC and BIC in model selection. Sociol. Methods Res. 33, 261–304.
Cardeña E. (2018). The experimental evidence for parapsychological phenomena: a review. Am. Psychol. 73, 663–677. doi: 10.1037/amp0000236, PMID: 29792448
Cohen J. (1977). Statistical power analysis for the behavioral sciences. New York: Academic Press.
Cornelissen J. Höllerer M. A. Seidl D. (2021). What theory is and can be: forms of theorizing in organizational scholarship. Organ. Theory 2:263178772110203.
Eronen M. I. Bringmann L. F. (2021). The theory crisis in psychology: how to move forward. Perspect. Psychol. Sci. 16, 779–788. doi: 10.1177/1745691620970586, PMID: 33513314
Eronen M. I. Romeijn J. W. (2020). Philosophy of science and the formalization of psychological theory. Theory Psychol. 30, 786–799.
Fiedler K. Prager J. (2018). The regression trap and other pitfalls of replication science—illustrated by the report of the Open Science collaboration. Basic Appl. Soc. Psychol. 40, 115–124. doi: 10.1080/01973533.2017.1421953
Fleck L. (1935). Genesis and development of a scientific fact. Chicago: The University of Chicago Press.
Gelman A. (2018). The failure of null hypothesis significance testing when studying incremental changes, and what to do about it. Personal. Soc. Psychol. Bull. 44, 16–23.
Gelman A. Carlin J. (2014). Beyond power calculations: assessing type S (sign) and type M (magnitude) errors. Perspect. Psychol. Sci. 9, 641–651.
Gervais W. M. (2021). Practical methodological reform needs good theory. Perspect. Psychol. Sci. 16, 827–843. doi: 10.1177/1745691620977471, PMID: 33513312
Gigerenzer G. (1998). Surrogates for theories. Theory Psychol. 8, 195–204.
Hempel C. G. (1988). Provisoes: a problem concerning the inferential function of scientific theories. Erkenntnis 28, 147–164. doi: 10.1007/BF00166441
Henderson L. (2020). “The Problem of Induction,” in The Stanford Encyclopedia of Philosophy. ed. Zalta E. N. Stanford: Metaphysics Research Lab, Stanford University. Available at: https://plato.stanford.edu/entries/induction-problem/
Hume D. (1739). A treatise of human nature. Oxford: Oxford University Press.
Hunter J. E. Schmidt F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings. 2nd Edn. Thousand Oaks: Sage Publications.
Irvine E. (2021). The role of replication studies in theory building. Perspect. Psychol. Sci. 16, 844–853. doi: 10.1177/1745691620970558, PMID: 33440125
Kerr N. L. (1998). HARKing: hypothesizing after the results are known. Personal. Soc. Psychol. Rev. 2, 196–217.
Kish L. (1965). Survey sampling. New York: Wiley.
Klein S. B. (2014). What can recent replication failures tell us about the theoretical commitments of psychology? Theory Psychol. 24, 326–338.
Klein R. A. Vianello M. Hasselman F. Adams B. G. Adams R. B. Jr. Alper S. et al. (2018). Many labs 2: investigating variation in replicability across sample and setting. Adv. Methods Pract. Psychol. Sci. 1, 443–490.
Krefeld-Schwalb A. Witte E. H. Zenker F. (2018). Hypothesis-testing demands trustworthy data—a simulation approach to statistical inference advocating the research program strategy. Front. Psychol. 9:460. doi: 10.3389/fpsyg.2018.00460, PMID: 29740363
Kuhn T. (1962). The structure of scientific revolutions. Chicago: The University of Chicago Press.
Lakatos I. (1978). The methodology of scientific research Programmes. Cambridge: Cambridge University Press.
Lakens D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Front. Psychol. 4:863. doi: 10.3389/fpsyg.2013.00863, PMID: 24324449
Lakens D. Scheel A. M. Isager P. M. (2018). Equivalence testing for psychological research: a tutorial. Adv. Methods Pract. Psychol. Sci. 1, 259–269.
Linden A. H. Hönekopp J. (2021). Heterogeneity of research results: a new perspective from which to assess and promote progress in psychological science. Perspect. Psychol. Sci. 16, 358–376. doi: 10.1177/1745691620964193, PMID: 33400613
Lord F. M. Novick M. R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.
Meehl P. E. (1978). Theoretical risks and tabular asterisks: sir Karl, sir Ronald, and the slow progress of soft psychology. J. Consult. Clin. Psychol. 46, 806–834. doi: 10.1037/0022-006X.46.4.806
Meehl P. E. (1990). Appraising and amending theories: the strategy of Lakatosian defense and two principles that warrant it. Psychol. Inq. 1, 108–141. doi: 10.1207/s15327965pli0102_1
Meehl P. E. (1992). Cliometric metatheory: the actuarial approach to empirical, history-based philosophy of science. Psychol. Rep. 91, 339–404. doi: 10.2466/pr0.2002.91.2.339
Meehl P. E. (1997). “The problem is epistemology, not statistics: replace significance tests by confidence intervals and quantify accuracy of risky numeral predictions” in What if there were no significance tests? eds. Harlow L. L. Mulaik S. A. Steiger J. H. (Mahwah: Erlbaum), 393–425.
Miłkowski M. Hohol M. Nowakowski P. (2019). Mechanisms in psychology: the road towards unity? Theory Psychol. 29, 567–578.
Morris T. P. White I. R. Crowther M. J. (2019). Using simulation studies to evaluate statistical methods. Stat. Med. 38, 2074–2102. doi: 10.1002/sim.8086, PMID: 30652356
Muthukrishna M. Henrich J. (2019). A problem in theory. Nat. Hum. Behav. 3, 221–229. doi: 10.1038/s41562-018-0522-1
Myung I. J. (2000). The importance of complexity in model selection. J. Math. Psychol. 44, 190–204.
Nosek B. A. Hardwicke T. E. Moshontz H. Allard A. Corker K. S. Dreber A. et al. (2022). Replicability, robustness, and reproducibility in psychological science. Annu. Rev. Psychol. 73, 719–748. doi: 10.1146/annurev-psych-020821-114157, PMID: 34665669
Oberauer K. Lewandowsky S. (2019). Addressing the theory crisis in psychology. Psychon. Bull. Rev. 26, 1596–1618. doi: 10.3758/s13423-019-01645-2, PMID: 31515732
Olsson-Collentine A. Wicherts J. M. van Assen M. A. (2020). Heterogeneity in direct replications in psychology and its association with effect size. Psychol. Bull. 146, 922–940. doi: 10.1037/bul0000294, PMID: 32700942
Peirce C. S. (1931–1958). Collected papers of Charles Sanders Peirce. Vols. 1–8. eds. Hartshorne C. Weiss P. Burks A. W. Cambridge, MA: Harvard University Press.
Perez-Gil J. A. Moscoso S. C. Rodriguez R. M. (2000). Validez de constructo: el uso de análisis factorial exploratorio-confirmatorio para obtener evidencias de validez [construct validity: the use of exploratory-confirmatory factor analysis in determining validity evidence]. Psicothema 12, 442–446.
Popper K. R. (1959). The logic of scientific discovery. London: Routledge.
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria. Available at: https://www.R-project.org/
Schäfer T. Schwarz M. A. (2019). The meaningfulness of effect sizes in psychological research: differences between sub-disciplines and the impact of potential biases. Front. Psychol. 10:813. doi: 10.3389/fpsyg.2019.00813, PMID: 31031679
Schauer J. M. Hedges L. V. (2020). Assessing heterogeneity and power in replications of psychological experiments. Psychol. Bull. 146, 701–719. doi: 10.1037/bul0000232, PMID: 32271029
Schulze R. (2004). Meta-analysis: A comparison of approaches. Cambridge: Hogrefe & Huber.
Simmons J. P. Nelson L. D. Simonsohn U. (2013). Life after p-hacking. Meet. Soc. Pers. Soc. Psychol. doi: 10.2139/ssrn.2205186
Szucs D. Ioannidis J. P. A. (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biol. 15:e2000797. doi: 10.1371/journal.pbio.2000797, PMID: 28253258
Torchiano M. (2020). effsize: efficient effect size computation (R package version 0.8.1). doi: 10.5281/zenodo.1480624
van Fraassen B. (1980). The scientific image. Oxford: Oxford University Press. doi: 10.1093/0198244274.001.0001
van Rooij I. Baggio G. (2020). Theory before the test: how to build high-verisimilitude explanatory theories in psychological science. Perspect. Psychol. Sci. 16, 682–697. doi: 10.1177/1745691620970604, PMID: 33404356
Wagenmakers E. Farrell S. (2004). AIC model selection using Akaike weights. Psychon. Bull. Rev. 11, 192–196. doi: 10.3758/BF03206482, PMID: 15117008
Wickham H. Averick M. Bryan J. Chang W. D’Agostino McGowan L. Francois R. et al. (2019). Welcome to the tidyverse. J. Open Source Softw. 4:1686. doi: 10.21105/joss.01686
Wickham H. Francois R. Henry L. Müller K. (2021). dplyr: a grammar of data manipulation (R package version 1.0.7). Available at: https://CRAN.R-project.org/package=dplyr
Witte E. H. Heitkamp I. (2006). Quantitative Rekonstruktionen (Retrognosen) als Instrument der Theorienbildung und Theorienprüfung in der Sozialpsychologie [quantitative reconstructions (retrognoses) as an instrument of theory construction and theory testing in social psychology]. Z. Sozialpsychol. 37, 205–214. doi: 10.1024/0044-3514.37.3.205
Witte E. H. Zenker F. (2017). From discovery to justification: outline of an ideal research program in empirical psychology. Front. Psychol. 8:1847. doi: 10.3389/fpsyg.2017.01847, PMID: 29163256