Prediction and Anomaly Detection Techniques for Spatial Data

dc.contributor.author: Liu, Xutong (en)
dc.contributor.committeechair: Lu, Chang-Tien (en)
dc.contributor.committeemember: Chen, Ing-Ray (en)
dc.contributor.committeemember: Xuan, Jianhua (en)
dc.contributor.committeemember: Ramakrishnan, Naren (en)
dc.contributor.committeemember: Li, Qi (en)
dc.contributor.department: Computer Science (en)
dc.date.accessioned: 2013-06-12T08:00:37Z (en)
dc.date.available: 2013-06-12T08:00:37Z (en)
dc.date.issued: 2013-06-11 (en)
dc.description.abstract: With increasing public sensitivity to and concern about environmental issues, huge amounts of spatial data have been collected from sources ranging from location-based social network applications to scientific data collections. This has encouraged the formation of large spatial data sets and generated considerable interest in identifying novel and meaningful patterns. Allowing correlated observations weakens the usual statistical assumption of independent observations and complicates spatial analysis. This research focuses on the construction of efficient and effective approaches for three main mining tasks: spatial outlier detection, robust inference for spatial data sets, and spatial prediction for large multivariate non-Gaussian data.
Spatial outlier analysis, which aims at detecting abnormal objects in spatial contexts, can help extract important knowledge in many applications. Most existing approaches suffer from the well-known masking and swamping problems and still cannot satisfy certain requirements that have arisen recently. This research develops spatial outlier detection techniques in three areas: spatial numerical outlier detection, spatial categorical outlier detection, and identification of the number of spatial numerical outliers. First, this work introduces random walk based approaches to identify spatial numerical outliers. A Bipartite weighted graph and an Exhaustive Combination weighted graph are modeled on spatial and/or non-spatial attributes, and random walk techniques are then performed on the graphs to compute the relevance among objects. Objects with lower relevance are recognized as outliers. Second, an entropy-based method is proposed to estimate the optimum number of outliers. According to entropy theory, we expect that, as outliers are incrementally removed, the entropy value will decrease sharply and reach a stable state once all the outliers have been removed. Finally, this research designs several Pair Correlation Function based methods to detect spatial categorical outliers (SCOs) in both single-attribute and multiple-attribute data. In these methods, a Pair Correlation Ratio (PCR) is defined and estimated for each pair of categorical combinations based on their co-occurrence frequency at different spatial distances. Observations with lower PCRs are diagnosed as potential SCOs.
Spatial kriging is a widely used predictive model whose predictive accuracy can be significantly compromised if the observations are contaminated by outliers. Also, due to spatial heterogeneity, observations are often of different types. The prediction of multivariate spatial processes plays an important role when there are cross-spatial dependencies between multiple responses. In addition, given the large volume of spatial data, prediction is computationally challenging. These issues raise three research topics: 1) robust prediction for spatial data sets; 2) prediction of multivariate spatial observations; and 3) efficient processing of large data sets. First, the robustness of a spatial kriging model can be systematically increased by integrating heavy-tailed distributions; however, inference then becomes analytically intractable. Here, we present a novel Robust and Reduced Rank Spatial Kriging Model (R$^3$-SKM), which is resilient to the influence of outliers and allows for fast spatial inference. Second, this research introduces a flexible hierarchical Bayesian framework that permits the simultaneous modeling of mixed-type variables. Specifically, the mixed-type attributes are mapped to latent numerical random variables that are multivariate Gaussian in nature. Finally, knot-based techniques are used to model the predictive process as a reduced rank spatial process, which projects the process realizations of the spatial model onto a lower-dimensional subspace. This projection significantly reduces the computational cost. (en)
dc.description.degree: Ph. D. (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:985 (en)
dc.identifier.uri: http://hdl.handle.net/10919/23201 (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Spatial (en)
dc.subject: Multivariate (en)
dc.subject: Robust Inference (en)
dc.subject: Anomaly Detection (en)
dc.title: Prediction and Anomaly Detection Techniques for Spatial Data (en)
dc.type: Dissertation (en)
thesis.degree.discipline: Computer Science and Applications (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: doctoral (en)
thesis.degree.name: Ph. D. (en)
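
Illustrative note on the abstract: the idea that objects with low random-walk relevance are outliers can be sketched roughly as follows. This is only a minimal illustration, not the dissertation's method: it replaces the Bipartite and Exhaustive Combination graphs with a plain spatial-neighbourhood graph, and it uses a damped (PageRank-style) random walk. The function name `random_walk_relevance`, the parameters `radius`, `sigma`, and `damping`, and the toy grid data are all assumptions made up for this example.

```python
import numpy as np

def random_walk_relevance(coords, values, radius=1.5, sigma=2.0,
                          damping=0.85, n_iter=200):
    """Relevance score per object; low scores suggest outliers (hypothetical sketch)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)

    # Spatial neighbourhood: objects within `radius` of each other, excluding self.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    neighbors = (dist <= radius) & ~np.eye(n, dtype=bool)

    # Edge weight: similarity of the non-spatial attribute between neighbours.
    diff = values[:, None] - values[None, :]
    weights = np.where(neighbors, np.exp(-diff ** 2 / (2 * sigma ** 2)), 0.0)

    # Column-normalise into a transition matrix; isolated objects jump uniformly.
    col_sums = weights.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    transition = np.where(col_sums > 0, weights / safe_sums, 1.0 / n)

    # Power iteration for the damped random walk with uniform restart.
    scores = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        scores = damping * (transition @ scores) + (1.0 - damping) / n
    return scores

# Toy data (assumed): a 5 x 5 grid with a smooth attribute and one injected spike.
coords = np.array([(x, y) for x in range(5) for y in range(5)], dtype=float)
values = coords.sum(axis=1)
values[12] = 50.0  # object at grid position (2, 2)
scores = random_walk_relevance(coords, values)
suspect = int(np.argmin(scores))  # index of the least relevant object
```

In this toy setting the walk almost never enters the spiked object, because its attribute differs sharply from every spatial neighbour, so its relevance is left with little more than the restart mass. The neighbourhood radius and similarity bandwidth are tuning choices here, not values taken from the dissertation.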

Files

Original bundle

Name: Liu_X_D_2013.pdf
Size: 5.87 MB
Format: Adobe Portable Document Format

Name: Liu_X_D_2013_support_1.pdf
Size: 5.06 MB
Format: Adobe Portable Document Format
Description: Supporting documents