
dc.contributor.author: Saraf, Parang [en]
dc.date.accessioned: 2018-04-27T08:00:25Z [en]
dc.date.available: 2018-04-27T08:00:25Z [en]
dc.date.issued: 2018-04-26 [en]
dc.identifier.other: vt_gsexam:15211 [en]
dc.identifier.uri: http://hdl.handle.net/10919/82926 [en]
dc.description.abstract: Automated event extraction from free text remains an open problem, particularly when the goal is to identify all relevant events. Manual extraction is currently the only option for comprehensive and reliable extraction. A system is therefore needed that can comprehensively extract events reported in news articles (high recall) and that scales to a large number of articles. In this dissertation, we explore various methods for developing an event extraction system that can mitigate these challenges. We primarily investigate three major problems related to event extraction. (i) What are the strengths and weaknesses of automated event extractors? A thorough understanding of what can be automated with high success, and of what leads to common pitfalls, is crucial before we can develop a superior event extraction system. (ii) How can we build a hybrid event extraction system that bridges the gap between manual and automated event extraction? Hybrid extraction is a semi-automated approach that uses an ecosystem of machine learning models along with a carefully designed user interface for extracting events. Because this method is semi-automated, it also requires a meticulous understanding of user behavior in order to identify tasks that humans can perform with ease while diverting the more tedious tasks to the machine learning methods. (iii) Finally, we explore methods for displaying extracted events that could simplify the analytical and inference generation processes for an analyst. In particular, we aim to develop visualizations that allow analysts to perform macro- and micro-level analysis of significant societal events. [en]
dc.format.medium: ETD [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Event Extraction [en]
dc.subject: Visual Analytics [en]
dc.subject: News Analytics [en]
dc.subject: Civil Unrest [en]
dc.title: A Cost-Effective Semi-Automated Approach for Comprehensive Event Extraction [en]
dc.type: Dissertation [en]
dc.contributor.department: Computer Science [en]
dc.description.degree: Ph. D. [en]
thesis.degree.name: Ph. D. [en]
thesis.degree.level: doctoral [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.discipline: Computer Science and Applications [en]
dc.contributor.committeechair: Ramakrishnan, Naren [en]
dc.contributor.committeemember: House, Leanna L. [en]
dc.contributor.committeemember: Corley, Courtney [en]
dc.contributor.committeemember: North, Christopher L. [en]
dc.contributor.committeemember: Lu, Chang Tien [en]

