Unsupervised Event Extraction from News and Twitter

Abstract

Living in the age of big data, we face a massive volume of information every day, especially from mainstream news and social networks. Because of this volume, identifying the key information that really matters can be frustrating, so summarizing the key information in large collections of news and tweets becomes essential. To address this problem, this project explores approaches for extracting key events from newswire and Twitter data in an unsupervised manner, applying topic modeling and named entity recognition. Different methods were tried to account for the different traits of news and tweets. The relevance between news events and the corresponding Twitter events is studied as well. Tools were developed to implement and evaluate these methods. Our experiments show that these tools can effectively extract key events from the news and tweet data sets. The tools, documents, and data sets can be used for educational purposes and as part of the IDEAL project at Virginia Tech.

Description
We appreciate the help of our client, Mohamed Magdy, a Ph.D. student at DLRL, Virginia Tech. We also thank the NSF, which funded the IDEAL project (NSF IIS-1319578). Because we are working on publications based on this project, we are not sharing our source code at this moment. We will consider sharing it after the papers are published.
Keywords
Unsupervised event extraction, Topic model, Named entity recognition, Newstream and Twitter, Deep learning
Citation