Spatio-temporal Event Detection and Forecasting in Social Media
Nowadays, knowledge discovery on social media is attracting growing interest. Social media has become more than a communication tool, effectively functioning as a social sensor for our society.
This dissertation focuses on the development of methods for social media-based spatiotemporal event detection and forecasting for a variety of event topics and assumptions. Five methods are proposed, namely dynamic query expansion for event detection, a generative framework for event forecasting, multi-task learning for spatiotemporal event forecasting, multi-source spatiotemporal event forecasting, and deep learning based epidemic modeling for forecasting influenza outbreaks. For the first of these methods, existing solutions for spatiotemporal event detection are mostly supervised and lack the flexibility to handle the dynamic keywords used in social media. The contributions of this work are: (1) Develop an unsupervised framework; (2) Design a novel dynamic query expansion method; and (3) Propose an innovative local modularity spatial scan algorithm.
For the second of these methods, traditional solutions are unable to capture the spatiotemporal context, model mixed-type observations, or utilize prior geographical knowledge. The contributions of this work include: (1) Propose a novel generative model for spatial event forecasting; (2) Design an effective algorithm for model parameter inference; and (3) Develop a new sequence likelihood calculation method. For the third method, traditional solutions cannot deal with spatial heterogeneity or handle the dynamics of social media data effectively. This work's contributions include: (1) Formulate a multi-task learning framework for event forecasting; (2) simultaneously model static and dynamic terms; and (3) Develop efficient parameter optimization algorithms.
For the fourth method, traditional multi-source solutions typically fail to consider the geographical hierarchy or cope with incomplete data blocks among different sources. The contributions here are: (1) Design a framework for event forecasting based on hierarchical multi-source indicators; (2) Propose a robust model for geo-hierarchical feature selection; and (3) Develop an efficient algorithm for model parameter optimization.
For the last method, existing work on epidemic modeling either cannot ensure timeliness, or cannot characterize the underlying epidemic propagation mechanisms. The contributions of this work include: (1) Propose a novel integrated framework for computational epidemiology and social media mining; (2) Develop a semi-supervised multilayer perceptron for mining epidemic features; and (3) Design an online training algorithm.