Event Trend Detector

Abstract

The Global Event and Trend Archive Research (GETAR) project is supported by NSF (IIS-1619028 and 1619371) through 2019. GETAR will devise interactive, integrated, digital library/archive systems coupled with linked and expert-curated webpage/tweet collections. In support of GETAR, the 2017 project built a tool that scrapes the news to identify important global events. The tool generates seeds (URLs of relevant webpages, as well as Twitter-related hashtags, keywords, and mentions). A display of the results can be seen from the hall outside 2030 Torgersen Hall.

This project extends that work in three ways. First, the quality of the results has been improved through changes to the clustering algorithm and to the user interface that displays clusters of global events. Second, in addition to events reported in the news, trends are now identified, and a database of trends and related events was built, with a corresponding user interface designed to the client’s preferences. Third, the detection results are connected to software for collecting tweets and crawling webpages, so that automated daily runs find and archive webpages related to each trend and event.

The final deliverables include a trend detection feature based on Reddit news, integration of Google Trends into trend detection, an improved clustering algorithm that produces more accurate k-means clusters, an improved UI for important global events that reflects the client's requests, and an aesthetically pleasing UI for displaying trend information. Work accomplished included setting up a table of tagged entities for trend detection, configuring the database for clustering and trends to run on our personal machines, and completing the deliverables. Many lessons were learned about the importance of using existing tools, starting early, doing research, holding regular meetings, and keeping good documentation.

Keywords
Trend Detection, Trends, Python, GETAR, Reddit, Google, News trends