Recurrent-Event Models for Change-Points Detection


Virginia Tech


The driving risk of novice teenage drivers is highest during the initial period after licensure but decreases rapidly. This dissertation develops recurrent-event change-point models to detect the time at which driving risk decreases significantly for novice teenage drivers. The dissertation consists of three major parts: the first applies recurrent-event change-point models with identical change-points for all subjects; the second proposes a hierarchical Bayesian finite mixture model that allows change-points to vary among drivers; the third develops a non-parametric Bayesian model with a Dirichlet process prior. In the first part, two recurrent-event change-point models are developed to detect the time of change in driving risk. The models are based on a non-homogeneous Poisson process with piecewise constant intensity functions. It is shown that the change-points occur only at event times and that the maximum likelihood estimators are consistent. The proposed models are applied to the Naturalistic Teenage Driving Study, which continuously recorded the in situ driving behaviour of 42 novice teenage drivers for the first 18 months after licensure using sophisticated in-vehicle instrumentation. The results indicate that the crash and near-crash rate decreases significantly after 73 hours of independent driving after licensure. The models in part one assume identical change-points for all drivers. However, several studies have shown that different patterns of risk change over time might exist among teenagers, which implies that the change-points might not be identical among drivers. In the second part, change-points are allowed to vary among drivers through a hierarchical Bayesian finite mixture model, under the assumption that clusters exist among the teenagers. The prior for the mixture proportions is a Dirichlet distribution, and a Markov chain Monte Carlo algorithm is developed to sample from the posterior distributions.
The deviance information criterion (DIC) is used to determine the best number of clusters. In the simulation study, the model performs well under different scenarios. For the Naturalistic Teenage Driving Study data, three clusters exist among the teenagers: the change-points are 52.30, 108.99 and 150.20 hours of driving after first licensure for the three clusters, respectively; the intensity rates increase for the first cluster but decrease for the other two clusters; the change-point of the first cluster is the earliest and its average intensity rate is the highest. In the second part, model selection is conducted to determine the number of clusters. An alternative is the Bayesian non-parametric approach. In the third part, a Dirichlet process mixture model is proposed, in which the change-points are assigned a Dirichlet process prior. A Markov chain Monte Carlo algorithm is developed to sample from the posterior distributions. The model is expected to cluster drivers automatically by change-point, without requiring the number of latent clusters to be specified. The Dirichlet process mixture model identifies three clusters among the teenage drivers in the Naturalistic Teenage Driving Study. The change-points of the three clusters are 96.31, 163.83, and 279.19 hours. The results provide critical information for safety education, safety countermeasure development, and Graduated Driver Licensing policy making.
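
As an illustration of the first part, the following is a minimal Python sketch, not the dissertation's implementation, of maximum likelihood estimation for a single change-point shared by all drivers under a piecewise constant Poisson intensity. It assumes exactly one change-point, counts an event that falls exactly on the change-point as belonging to the earlier period, and searches only over observed event times, which the result stated in the abstract permits. The function name `fit_change_point`, its arguments, and the toy data are hypothetical.

```python
import math


def fit_change_point(events, followup):
    """Profile-likelihood search for one change-point shared by all subjects.

    events   : list of lists, event times in hours per subject (hypothetical input)
    followup : list of total observed hours per subject (hypothetical input)
    Returns (tau, rate_before, rate_after, loglik), or None if no candidate fits.
    """
    total = sum(len(subj) for subj in events)
    # Candidate change-points are restricted to the observed event times.
    candidates = sorted({t for subj in events for t in subj})
    best = None
    for tau in candidates:
        # Event counts and total exposure before and after the candidate tau.
        n1 = sum(1 for subj in events for t in subj if t <= tau)
        n2 = total - n1
        e1 = sum(min(T, tau) for T in followup)
        e2 = sum(max(T - tau, 0.0) for T in followup)
        if e1 <= 0 or e2 <= 0:
            continue
        # For a fixed tau, the rate MLEs are events divided by exposure.
        lam1, lam2 = n1 / e1, n2 / e2
        # Poisson-process log-likelihood evaluated at the rate MLEs:
        # n * log(lam) - lam * e, where lam * e equals n.
        loglik = 0.0
        if n1:
            loglik += n1 * math.log(lam1) - n1
        if n2:
            loglik += n2 * math.log(lam2) - n2
        if best is None or loglik > best[3]:
            best = (tau, lam1, lam2, loglik)
    return best


# Toy data (invented for illustration): frequent early events, sparse later ones.
toy_events = [[1.0, 2.0, 3.0, 4.0, 5.0, 20.0, 40.0],
              [0.5, 1.5, 2.5, 3.5, 4.5, 30.0]]
toy_followup = [50.0, 50.0]
result = fit_change_point(toy_events, toy_followup)
```

The exhaustive search recomputes counts for every candidate, which is quadratic in the number of events but keeps the sketch short; a cumulative-count version would be preferable for large data sets.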



Bayesian Finite Mixture Model, Clustering, Constant Piecewise Intensity, Dirichlet Process Mixture Model, Maximum Likelihood Estimate, Naturalistic Teenage Driving Study, Non-Homogeneous Poisson Process