How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims
dc.contributor.author | Spanos, Aris | en |
dc.date.accessioned | 2024-02-01T14:30:05Z | en |
dc.date.available | 2024-02-01T14:30:05Z | en |
dc.date.issued | 2024-01-22 | en |
dc.date.updated | 2024-01-26T14:10:52Z | en |
dc.description.abstract | The paper makes a case that the current discussions on replicability and the abuse of significance testing have overlooked a more general contributor to the untrustworthiness of published empirical evidence, which is the uninformed and recipe-like implementation of statistical modeling and inference. It is argued that this contributes to the untrustworthiness problem in several different ways, including [a] statistical misspecification, [b] unwarranted evidential interpretations of frequentist inference results, and [c] questionable modeling strategies that rely on curve-fitting. What is more, the alternative proposals to replace or modify frequentist testing, including [i] replacing <i>p</i>-values with observed confidence intervals and effect sizes, and [ii] redefining statistical significance, will not address the untrustworthiness-of-evidence problem since they are equally vulnerable to [a]–[c]. The paper calls for distinguishing unduly data-dependent ‘statistical results’, such as a point estimate, a <i>p</i>-value, and accept/reject <i>H</i><sub>0</sub>, from ‘evidence for or against inferential claims’. The post-data severity (SEV) evaluation of the accept/reject <i>H</i><sub>0</sub> results converts them into evidence for or against germane inferential claims. These claims can be used to address/elucidate several foundational issues, including (i) statistical vs. substantive significance, (ii) the large <i>n</i> problem, and (iii) the replicability of evidence. Also, the SEV perspective sheds light on the impertinence of the proposed alternatives [i]–[iii], and oppugns [iii] the alleged arbitrariness of framing <i>H</i><sub>0</sub> and <i>H</i><sub>1</sub>, which is often exploited to undermine the credibility of frequentist testing. | en |
dc.description.version | Published version | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.citation | Spanos, A. How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims. Entropy 2024, 26, 95. | en |
dc.identifier.doi | https://doi.org/10.3390/e26010095 | en |
dc.identifier.uri | https://hdl.handle.net/10919/117785 | en |
dc.language.iso | en | en |
dc.publisher | MDPI | en |
dc.rights | Creative Commons Attribution 4.0 International | en |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en |
dc.subject | replication | en |
dc.subject | untrustworthy evidence | en |
dc.subject | statistical misspecification | en |
dc.subject | statistical vs. substantive significance | en |
dc.subject | pre-data vs. post-data error probabilities | en |
dc.subject | p-hacking | en |
dc.subject | post-data severity evaluation | en |
dc.subject | observed confidence intervals | en |
dc.subject | effect sizes | en |
dc.title | How the Post-Data Severity Converts Testing Results into Evidence for or against Pertinent Inferential Claims | en |
dc.title.serial | Entropy | en |
dc.type | Article - Refereed | en |
dc.type.dcmitype | Text | en |