A Comparison of Discrete and Continuous Survival Analysis
Abstract
Researchers are often uncertain whether to choose a discrete or a continuous survival model. This study aimed to provide empirical outcomes from the two models in educational contexts and to suggest a guideline for researchers who need to select a suitable survival model. For model specification, the study focused on three factors: time metrics, censoring proportions, and sample sizes. To arrive at a comprehensive understanding of the three factors, the study investigated their separate and combined effects. Furthermore, to understand how the factors interact, the study examined their role in determining hazard rates, which are known to cause discrepancies between discrete and continuous survival models. To provide empirical evidence from different combinations of the factors, the study built a series of discrete and continuous survival models using secondary data and simulated data. In the first study, using empirical data from the National Longitudinal Survey of Youth 1997 (NLSY97), the results of the two models were compared across time metrics of different sizes. In the second study, datasets were simulated under various combinations of the other two factors, censoring proportion and sample size, and the results of the two models fitted to each dataset were compared. The major finding is that discrete models are recommended under conditions of large time units, low censoring proportions, or small sample sizes. In particular, the discrete model produced better outcomes for conditions with a low censoring proportion (20%) and a small number (i.e., four) of large time units (i.e., years), regardless of sample size. Close examination of those conditions of time metric, censoring proportion, and sample size showed that they resulted in high hazards (i.e., 0.20).
In conclusion, to determine a proper model, researchers are advised to examine the hazard in each time unit in light of the specific time metric, censoring proportion, and sample size.
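As an illustration of what examining the hazard in each time unit can look like in practice, the following is a minimal sketch of a life-table style estimate of the discrete-time hazard, computed as the number of events in a period divided by the number of cases still at risk at the start of that period. It is not the study's own procedure. The function name `discrete_hazards`, the convention that censored cases count as at risk through their last observed period, and the ten-case dataset are all assumptions made for illustration; the data are invented and are not drawn from NLSY97 or from the study's simulations.

```python
def discrete_hazards(durations, events, n_periods):
    """Estimate the hazard in each discrete period 1..n_periods.

    durations: last period in which each case was observed (1-based).
    events:    1 if the event occurred in that period, 0 if censored.
    Assumption: a censored case is treated as at risk through its
    last observed period.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    hazards = []
    for t in range(1, n_periods + 1):
        # Cases still under observation at the start of period t.
        at_risk = sum(1 for d in durations if d >= t)
        # Cases that experienced the event during period t.
        failed = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        hazards.append(failed / at_risk if at_risk else None)
    return hazards


# Hypothetical data: ten cases followed for up to four yearly periods.
durations = [1, 1, 2, 2, 3, 3, 4, 4, 4, 4]
events = [1, 0, 1, 1, 0, 1, 1, 0, 0, 0]

hazards = discrete_hazards(durations, events, n_periods=4)
for period, hazard in enumerate(hazards, start=1):
    label = "n/a" if hazard is None else f"{hazard:.3f}"
    print(f"period {period}: hazard = {label}")
```

Per-period hazards obtained this way can then be read against the level the abstract reports as high (0.20) when weighing a discrete against a continuous model, though treating that value as a cutoff is an interpretation rather than a rule stated in the study.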