Where do statistical models come from? Revisiting the problem of specification
R. A. Fisher founded modern statistical inference in 1922 and identified its fundamental problems to be: specification, estimation and distribution. Since then the problem of statistical model specification has received scant attention in the statistics literature. The paper traces the history of statistical model specification, focusing primarily on pioneers like Fisher, Neyman, and more recently Lehmann and Cox, and attempts a synthesis of their views in the context of the Probabilistic Reduction (PR) approach. As argued by Lehmann , a major stumbling block for a general approach to statistical model specification has been the delineation of the appropriate role for substantive subject matter information. The PR approach demarcates the interrelated but complemenatry roles of substantive and statistical information summarized ab initio in the form of a structural and a statistical model, respectively. In an attempt to preserve the integrity of both sources of information, as well as to ensure the reliability of their fusing, a purely probabilistic construal of statistical models is advocated. This probabilistic construal is then used to shed light on a number of issues relating to specification, including the role of preliminary data analysis, structural vs. statistical models, model specification vs. model selection, statistical vs. substantive adequacy and model validation.