A Cost-Effective Semi-Automated Approach for Comprehensive Event Extraction

Saraf, Parang

Files

Saraf_P_D_2018.pdf (34.81 MB)

Date

2018-04-26

Authors

Saraf, Parang

Publisher

Virginia Tech

Abstract

Automated event extraction from free text remains an open problem, particularly when the goal is to identify all relevant events. Manual extraction is currently the only alternative for comprehensive and reliable extraction. What is needed, therefore, is a system that can comprehensively extract the events reported in news articles (high recall) and that is scalable enough to handle a large number of articles.

In this dissertation, we explore various methods to develop an event extraction system that can mitigate these challenges. We primarily investigate three major problems related to event extraction, as follows. (i) What are the strengths and weaknesses of automated event extractors? A thorough understanding of what can be automated with high success and what leads to common pitfalls is crucial before we can develop a superior event extraction system. (ii) How can we build a hybrid event extraction system that bridges the gap between manual and automated event extraction? Hybrid extraction is a semi-automated approach that uses an ecosystem of machine learning models along with a carefully designed user interface for extracting events. Since this method is semi-automated, it also requires a meticulous understanding of user behavior in order to identify the tasks that humans can perform with ease while diverting the more tedious tasks to the machine learning methods. (iii) Finally, we explore methods for displaying extracted events that could simplify the analytical and inference generation processes for an analyst. We particularly aim to develop visualizations that allow analysts to perform macro- and micro-level analysis of significant societal events.

Keywords

Event Extraction, Visual Analytics, News Analytics, Civil Unrest

Persistent link

http://hdl.handle.net/10919/82926

Collections

Doctoral Dissertations
