Bayesian Factor Models for Clustering and Spatiotemporal Analysis
Abstract
Multivariate data are prevalent in modern applications, yet they often present significant analytical challenges. Factor models offer an effective tool for addressing the issues associated with large-scale datasets. In this dissertation, we propose two novel Bayesian factor models. Because the number of latent factors is typically much smaller than the dimension of the observation vectors, both models achieve substantial dimension reduction. Our first model is for spatiotemporal areal data. In this setting, the region of interest is divided into subregions, and at each time point there is one univariate observation per subregion. Our model writes the vector of observations at each time point in factor model form, as the product of a matrix of factor loadings and a vector of common factors plus a vector of errors. The common factors evolve through time according to a dynamic linear model. To represent the spatial relationships among subregions, each column of the factor loadings matrix is assigned an intrinsic conditional autoregressive (ICAR) prior. We therefore call our approach the Dynamic ICAR Spatiotemporal Factor Models (DIFM). Our second model, the Bayesian Clustering Factor Model (BCFM), assumes that both latent factors and clusters are present in the data. We place a Gaussian mixture model on the common factors to discover clusters. For both models, we develop Markov chain Monte Carlo (MCMC) algorithms to explore the posterior distribution of the parameters. To select the number of factors and, for the clustering model, the number of clusters, we develop model selection criteria that use the Laplace-Metropolis estimator of the predictive density and BIC with integrated likelihood.
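The factor model form described in the abstract can be illustrated with a small simulation. The sketch below is not the dissertation's implementation: the function name, the dimensions, the noise scales, and the random-walk evolution of the factors (the simplest dynamic linear model) are all assumptions made for illustration. In particular, the loadings are drawn from a standard normal distribution as a placeholder rather than from an ICAR prior, and no posterior inference is performed.

```python
import numpy as np


def simulate_dynamic_factor_data(n_regions=6, n_factors=2, n_times=50, seed=0):
    """Simulate y_t = B x_t + e_t with factors x_t following a random walk.

    Hypothetical helper for illustration only; every default value is an
    assumption, not a setting taken from the dissertation.
    """
    rng = np.random.default_rng(seed)

    # Factor loadings matrix B (n_regions x n_factors).
    # Placeholder: standard normal draws, NOT the ICAR prior from the abstract.
    loadings = rng.normal(size=(n_regions, n_factors))

    # Common factors x_t evolve as a random walk, the simplest dynamic
    # linear model (an assumption; the abstract does not specify the form).
    factors = np.zeros((n_times, n_factors))
    factors[0] = rng.normal(size=n_factors)
    for t in range(1, n_times):
        factors[t] = factors[t - 1] + rng.normal(scale=0.1, size=n_factors)

    # One univariate observation per subregion at each time point.
    errors = rng.normal(scale=0.5, size=(n_times, n_regions))
    observations = factors @ loadings.T + errors
    return observations, loadings, factors


y, B, x = simulate_dynamic_factor_data()
print(y.shape, B.shape, x.shape)  # (50, 6) (6, 2) (50, 2)
```

Here each row of `y` is the observation vector for one time point, so the six-dimensional observations are driven by only two latent factors, which is the dimension reduction the abstract refers to.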