Automated Vocabulary Building for Characterizing and Forecasting Elections using Social Media Analytics

Mahendiran, Aravindan

Files

Mahendiran_A_T_2014.pdf (2.52 MB)

Date

2014-02-12

Authors

Mahendiran, Aravindan

Publisher

Virginia Tech

Abstract

Twitter has become a popular data source over the past decade and has garnered significant attention as a surrogate data source for many important forecasting problems. Strong correlations have been observed between Twitter indicators and real-world trends spanning elections, stock markets, book sales, and flu outbreaks. A key ingredient of every method that uses Twitter for forecasting is a domain-specific vocabulary for tracking the pertinent tweets, which is typically provided by subject matter experts (SMEs). The language used on Twitter differs drastically from other forms of online discourse, such as news articles and blogs, and it evolves constantly as users adopt popular hashtags to express their opinions. The vocabulary used by forecasting algorithms therefore needs to be dynamic and should capture emerging trends over time. This thesis proposes a novel unsupervised learning algorithm that builds a dynamic vocabulary using Probabilistic Soft Logic (PSL), a framework for probabilistic reasoning over relational domains. Using eight presidential elections from Latin America, we show how our query expansion methodology improves the performance of traditional election forecasting algorithms. With this approach we achieve close to a two-fold increase in the number of tweets retrieved for predictions and a 36.90% reduction in prediction error.
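
The idea of growing a tracking vocabulary from a small seed set can be illustrated with a minimal sketch. This is not the PSL-based algorithm the thesis proposes: it swaps in a simple hashtag co-occurrence score as a stand-in, and the function name, the threshold, and the toy tweets are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def expand_vocabulary(seed_terms, tweets, min_score=0.5, max_rounds=3):
    """Iteratively grow a tracking vocabulary from seed terms.

    Illustrative stand-in only (not PSL): a hashtag is added when the
    fraction of its tweets that also contain a current vocabulary term
    is at least min_score.
    """
    vocabulary = {term.lower() for term in seed_terms}
    tokenized = [set(tweet.lower().split()) for tweet in tweets]

    for _ in range(max_rounds):
        total = Counter()    # number of tweets containing each candidate hashtag
        matched = Counter()  # of those, tweets that also match the vocabulary
        for tokens in tokenized:
            hit = bool(tokens & vocabulary)
            for token in tokens:
                if token.startswith("#") and token not in vocabulary:
                    total[token] += 1
                    if hit:
                        matched[token] += 1

        new_terms = {
            tag for tag in total if matched[tag] / total[tag] >= min_score
        }
        if not new_terms:
            break  # vocabulary has stabilized
        vocabulary |= new_terms

    return vocabulary


# Hypothetical toy data; "candidate_a" stands in for an SME-provided seed term.
tweets = [
    "candidate_a wins the debate #vote2013",
    "#vote2013 #election today",
    "#football match today",
    "#election results",
]
vocabulary = expand_vocabulary(["candidate_a"], tweets)
```

With these toy inputs the vocabulary first picks up #vote2013 (it co-occurs with the seed term), then #election (it co-occurs with #vote2013), while #football is never added. The expanded vocabulary matches three of the four tweets where the seed alone matched one, which mirrors in miniature the increase in retrieved tweets that the abstract describes.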

Keywords

Election Forecasting, Twitter, Query Expansion, Social Group Modeling, Probabilistic Soft Logic

Persistent link

http://hdl.handle.net/10919/25430

Collections

Masters Theses
