Towards Explainable Event Detection and Extraction
dc.contributor.author | Mehta, Sneha | en |
dc.contributor.committeechair | Ramakrishnan, Narendran | en |
dc.contributor.committeemember | Riloff, Ellen | en |
dc.contributor.committeemember | Prakash, B. Aditya | en |
dc.contributor.committeemember | Lu, Chang-Tien | en |
dc.contributor.committeemember | Rangwala, Huzefa | en |
dc.contributor.department | Computer Science | en |
dc.date.accessioned | 2021-07-23T08:00:09Z | en |
dc.date.available | 2021-07-23T08:00:09Z | en |
dc.date.issued | 2021-07-22 | en |
dc.description.abstract | Event extraction refers to extracting specific knowledge of incidents from natural language text and consolidating it into a structured form. Important applications of event extraction include search, retrieval, question answering, and event forecasting. However, before events can be extracted, they must first be detected: that is, one must identify which documents in a large collection contain events of interest and, from those documents, extract the sentences that might contain event-related information. This task is challenging because it is easier to obtain labels at the document level than fine-grained annotations at the sentence level. Current approaches to this task are suboptimal because they directly aggregate sentence probabilities estimated by a classifier to obtain document probabilities, which results in error propagation. To alleviate this problem, we propose to leverage recent advances in representation learning by using attention mechanisms. Specifically, for event detection we propose a method that computes document embeddings from sentence embeddings using attention and trains a document classifier on those embeddings to mitigate the error propagation problem. However, we find that existing attention mechanisms are ill-suited to this task because they are either suboptimal or use a large number of parameters. To address this problem, we propose a lean attention mechanism that is effective for event detection. Current approaches to event extraction rely on fine-grained labels in specific domains. Extending extraction to new domains is challenging because fine-grained data are difficult to collect. Machine reading comprehension (MRC) based approaches, which enable zero-shot extraction, struggle with syntactically complex sentences and long-range dependencies. To mitigate this problem, we propose a syntactic sentence simplification approach that is guided by an MRC model to improve its performance on event extraction. | en |
dc.description.abstractgeneral | Event extraction is the task of extracting events of societal importance from natural language texts. The task has a wide range of applications, from search, retrieval, and question answering to forecasting population-level events such as civil unrest and disease occurrences with reasonable accuracy. Before events can be extracted, it is necessary to identify the documents that are likely to contain the events of interest and extract the sentences that mention those events. This is termed event detection. Current approaches to event detection are suboptimal. They assume that events are neatly partitioned into sentences and obtain document-level event probabilities directly from predicted sentence-level probabilities. In this dissertation, under the same assumption, we leverage representation learning to mitigate some of the shortcomings of previous event detection methods. Current approaches to event extraction are limited to restricted domains and require fine-grained labeled corpora for training. One way to extend event extraction to new domains is by enabling zero-shot extraction. Machine reading comprehension (MRC) based approaches provide a promising way forward for zero-shot extraction. However, this approach suffers from the long-range dependency problem and has difficulty handling syntactically complex sentences with multiple clauses. To mitigate this problem, we propose a syntactic sentence simplification algorithm that is guided by the MRC system to improve its performance. | en |
dc.description.degree | Doctor of Philosophy | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:31611 | en |
dc.identifier.uri | http://hdl.handle.net/10919/104359 | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | deep learning | en |
dc.subject | natural language processing | en |
dc.subject | information extraction | en |
dc.subject | representation learning | en |
dc.title | Towards Explainable Event Detection and Extraction | en |
dc.type | Dissertation | en |
thesis.degree.discipline | Computer Science and Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | doctoral | en |
thesis.degree.name | Doctor of Philosophy | en |