VTechWorks staff will be away for the Thanksgiving holiday beginning at noon on Wednesday, November 27, through Friday, November 29. We will resume normal operations on Monday, December 2. Thank you for your patience.
 

How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims

TR Number

Date

2024-01-22

Journal Title

Journal ISSN

Volume Title

Publisher

MDPI

Abstract

The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence, which is the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing p-values with observed confidence intervals and effects sizes, and [ii] redefining statistical significance, will not address the untrustworthiness of evidence problem since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing between unduly data-dependant ‘statistical results’, such as a point estimate, a p-value, and accept/reject H0, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject H0 results, converts them into evidence for or against germane inferential claims. These claims can be used to address/elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large n problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[iii], and oppugns [iii] the alleged arbitrariness of framing H0 and H1 which is often exploited to undermine the credibility of frequentist testing.

Description

Keywords

replication, untrustworthy evidence, statistical misspecification, statistical vs. substantive significance, pre-data vs. post-data error probabilities, p-hacking, post-data severity evaluation, observed confidence intervals, effect sizes

Citation

Spanos, A. How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims. Entropy 2024, 26, 95.