Revisiting the Replication Crisis and the Untrustworthiness of Empirical Evidence

Spanos, Aris

Date

2025-05-20

Authors

Spanos, Aris

Publisher

MDPI

Abstract

The current replication crisis relating to the non-replicability and the untrustworthiness of published empirical evidence is often viewed through the lens of the Positive Predictive Value (PPV) in the context of the Medical Diagnostic Screening (MDS) model. The PPV is misconstrued as a measure that evaluates ‘the probability of rejecting $H_0$ when false’, after being metamorphosed by replacing its false positive/negative probabilities with the type I/II error probabilities. This perspective gave rise to a widely accepted diagnosis that the untrustworthiness of published empirical evidence stems primarily from abuses of frequentist testing, including p-hacking, data-dredging, and cherry-picking. It is argued that the metamorphosed PPV misrepresents frequentist testing and misdiagnoses the replication crisis, promoting ill-chosen reforms. The primary source of untrustworthiness is statistical misspecification: invalid probabilistic assumptions imposed on one’s data. This is symptomatic of the much broader problem of the uninformed and recipe-like implementation of frequentist statistics without proper understanding of (a) the invoked probabilistic assumptions and their validity for the data used, (b) the reasoned implementation and interpretation of the inference procedures and their error probabilities, and (c) warranted evidential interpretations of inference results. A case is made that Fisher’s model-based statistics offers a more pertinent and incisive diagnosis of the replication crisis, and provides a well-grounded framework for addressing the issues (a)–(c), which would unriddle the non-replicability/untrustworthiness problems.
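For readers unfamiliar with the quantity under discussion, the following is a minimal sketch of the substitution the abstract describes. It assumes the standard Bayes-rule form of the PPV from diagnostic screening; the function names, the parameter `prior_true` (the assumed proportion of tested hypotheses for which $H_0$ is false), and the numerical values are illustrative assumptions, not taken from the paper. The sketch reproduces the calculation that the paper argues against and should not be read as an endorsement of it.

```python
def ppv(prevalence: float, false_positive: float, false_negative: float) -> float:
    """Bayes-rule PPV from diagnostic screening: P(condition present | positive result)."""
    true_positive = 1.0 - false_negative            # sensitivity of the screening test
    numerator = true_positive * prevalence
    denominator = numerator + false_positive * (1.0 - prevalence)
    return numerator / denominator


def metamorphosed_ppv(prior_true: float, alpha: float, beta: float) -> float:
    """The substitution described in the abstract (hypothetical helper):
    the false positive probability is replaced by the type I error probability (alpha)
    and the false negative probability by the type II error probability (beta)."""
    return ppv(prior_true, false_positive=alpha, false_negative=beta)


if __name__ == "__main__":
    # Illustrative values only: 10% of tested nulls assumed false, alpha = 0.05, power = 0.8.
    print(round(metamorphosed_ppv(prior_true=0.1, alpha=0.05, beta=0.2), 2))  # prints 0.64
```

Note that the calculation needs a prevalence-like input (`prior_true`) that diagnostic screening estimates from a population but that frequentist testing does not supply; the sketch simply takes it as given.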

Keywords

replication crisis, untrustworthy evidence, non-replicability, false positive/negative rates, medical diagnostic testing, Positive Predictive Value, type I/II error probabilities, Neyman-Pearson testing, p-value, statistical adequacy, post-data severity evaluation

Citation

Spanos, A. Revisiting the Replication Crisis and the Untrustworthiness of Empirical Evidence. Stats 2025, 8, 41.

Persistent link

https://hdl.handle.net/10919/135589

Collections

Journal Articles, Multidisciplinary Digital Publishing Institute (MDPI)
Scholarly Works, Economics
