How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims

Spanos, Aris
2024-02-01; 2024-02-01; 2024-01-22
Spanos, A. How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims. Entropy 2024, 26, 95.
https://hdl.handle.net/10919/117785

The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence, which is the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing <i>p</i>-values with observed confidence intervals and effect sizes, and [ii] redefining statistical significance, will not address the untrustworthiness of evidence problem since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing unduly data-dependent ‘statistical results’, such as a point estimate, a <i>p</i>-value, and accept/reject <i>H</i><sub>0</sub>, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject <i>H</i><sub>0</sub> results converts them into evidence for or against germane inferential claims. These claims can be used to address/elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large n problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[iii], and oppugns [iii] the alleged arbitrariness of framing <i>H</i><sub>0</sub> and <i>H</i><sub>1</sub>, which is often exploited to undermine the credibility of frequentist testing.

application/pdf
en
Creative Commons Attribution 4.0 International
replication; untrustworthy evidence; statistical misspecification; statistical vs. substantive significance; pre-data vs. post-data error probabilities; p-hacking; post-data severity evaluation; observed confidence intervals; effect sizes
Article - Refereed
2024-01-26
Entropy
https://doi.org/10.3390/e26010095