Bayesian Factor Models for Clustering and Spatiotemporal Analysis

dc.contributor.author: Shin, Hwasoo (en)
dc.contributor.committeechair: Ferreira, Marco Antonio Rosa (en)
dc.contributor.committeemember: Tegge, Allison (en)
dc.contributor.committeemember: Franck, Christopher Thomas (en)
dc.contributor.committeemember: Kim, Inyoung (en)
dc.contributor.department: Statistics (en)
dc.date.accessioned: 2024-05-29T08:00:51Z (en)
dc.date.available: 2024-05-29T08:00:51Z (en)
dc.date.issued: 2024-05-28 (en)
dc.description.abstract: Multivariate data is prevalent in modern applications, yet it often presents significant analytical challenges. Factor models offer an effective tool for addressing issues associated with large-scale datasets. In this dissertation, we propose two novel Bayesian factor models. Because the number of latent factors is typically much smaller than the dimension of the observation vectors, the proposed models achieve substantial dimension reduction. Our first model is for spatiotemporal areal data. In this case, the region of interest is divided into subregions, and at each time point there is one univariate observation per subregion. Our model writes the vector of observations at each time point in factor model form, as the product of a matrix of factor loadings and a vector of common factors plus a vector of errors. The common factors evolve through time according to a dynamic linear model. To represent the spatial relationships among subregions, each column of the factor loadings matrix is assigned an intrinsic conditional autoregressive (ICAR) prior. Therefore, we call our approach the Dynamic ICAR Spatiotemporal Factor Model (DIFM). Our second model, the Bayesian Clustering Factor Model (BCFM), assumes that latent factors and clusters are present in the data. We apply Gaussian mixture models to the common factors to discover clusters. For both models, we develop Markov chain Monte Carlo (MCMC) algorithms to explore the posterior distribution of the parameters. To select the number of factors and, in the case of clustering methods, the number of clusters, we develop model selection criteria that use the Laplace-Metropolis estimator of the predictive density and BIC with integrated likelihood. (en)
dc.description.abstractgeneral: Understanding large-scale datasets has recently emerged as one of the most significant challenges for researchers. This is particularly true for datasets that are inherently complex and nontrivial to analyze. In this dissertation, we present two novel classes of Bayesian factor models for two classes of complex datasets. Frequently, the number of factors is much smaller than the number of variables, and therefore factor models can be an effective approach to handling multivariate datasets. First, we develop the Dynamic ICAR Spatiotemporal Factor Model (DIFM) for datasets collected over time on a partition of a spatial domain of interest. The DIFM accounts for spatiotemporal correlation and provides predictions of future trends. Second, we develop the Bayesian Clustering Factor Model (BCFM) for multivariate data that cluster in a space of lower dimension than the vector of observations. The BCFM enables researchers to identify the distinguishing characteristics of subgroups, offering valuable insights into their underlying structure. (en)
dc.description.degree: Doctor of Philosophy (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:40390 (en)
dc.identifier.uri: https://hdl.handle.net/10919/119143 (en)
dc.language.iso: en (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Bayesian Factors Model (en)
dc.subject: Spatiotemporal Model (en)
dc.subject: Clustering Methods (en)
dc.subject: Dimension Reduction (en)
dc.title: Bayesian Factor Models for Clustering and Spatiotemporal Analysis (en)
dc.type: Dissertation (en)
thesis.degree.discipline: Statistics (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: doctoral (en)
thesis.degree.name: Doctor of Philosophy (en)
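
Illustrative sketch (not part of the dissertation record)

The abstract describes both models only in words, so the following is a minimal simulation sketch of the two data-generating structures it mentions, not the dissertation's actual implementation. Several choices here are assumptions made purely for illustration: the function names are hypothetical, the dynamic linear model is taken to be a simple random walk, the ICAR structure is written as D - W for a binary neighbor matrix W (with D the diagonal matrix of neighbor counts), cluster labels are drawn uniformly, and all variances are arbitrary.

```python
import numpy as np


def icar_precision(adjacency):
    """ICAR precision structure D - W (up to a scale factor).

    W is a symmetric 0/1 neighbor matrix; D holds the neighbor counts.
    """
    W = np.asarray(adjacency, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def simulate_dynamic_factors(loadings, n_times, state_sd=0.1, noise_sd=0.5, seed=0):
    """Simulate y_t = B x_t + e_t with x_t = x_{t-1} + w_t.

    The random-walk evolution is an assumed, simplest-case dynamic linear model.
    """
    rng = np.random.default_rng(seed)
    n_regions, n_factors = loadings.shape
    steps = rng.normal(scale=state_sd, size=(n_times, n_factors))
    factors = np.cumsum(steps, axis=0)  # random walk started at zero
    errors = rng.normal(scale=noise_sd, size=(n_times, n_regions))
    observations = factors @ loadings.T + errors
    return observations, factors


def simulate_clustered_factors(loadings, cluster_means, n_obs, noise_sd=0.5, seed=0):
    """Simulate y_i = B x_i + e_i, with x_i drawn from a Gaussian mixture.

    Labels are uniform over clusters and component covariances are identity
    matrices; both are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(cluster_means, dtype=float)
    labels = rng.integers(0, means.shape[0], size=n_obs)
    factors = means[labels] + rng.normal(size=(n_obs, means.shape[1]))
    errors = rng.normal(scale=noise_sd, size=(n_obs, loadings.shape[0]))
    return factors @ loadings.T + errors, labels


# Hypothetical example: four subregions arranged in a line, two common factors.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Q = icar_precision(W)
B = np.random.default_rng(1).normal(size=(4, 2))
y_dyn, x_dyn = simulate_dynamic_factors(B, n_times=30)
y_clu, z_clu = simulate_clustered_factors(B, [[-3.0, 0.0], [3.0, 0.0]], n_obs=40)
```

Each row of `Q` sums to zero, which reflects the fact that the ICAR prior is improper: it penalizes differences between neighboring subregions but not the overall level. In both simulations the observations have dimension 4 while the latent factors have dimension 2, which is the dimension reduction the abstract refers to.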

Files

Original bundle
Name: Shin_H_D_2024.pdf
Size: 1.94 MB
Format: Adobe Portable Document Format