Towards Explainable Event Detection and Extraction

Mehta, Sneha

Towards Explainable Event Detection and Extraction

dc.contributor.author	Mehta, Sneha	en
dc.contributor.committeechair	Ramakrishnan, Narendran	en
dc.contributor.committeemember	Riloff, Ellen	en
dc.contributor.committeemember	Prakash, B. Aditya	en
dc.contributor.committeemember	Lu, Chang-Tien	en
dc.contributor.committeemember	Rangwala, Huzefa	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2021-07-23T08:00:09Z	en
dc.date.available	2021-07-23T08:00:09Z	en
dc.date.issued	2021-07-22	en
dc.description.abstract	Event extraction refers to extracting specific knowledge of incidents from natural language text and consolidating it into a structured form. Some important applications of event extraction include search, retrieval, question answering and event forecasting. However, before events can be extracted it is imperative to detect events i.e. identify which documents from a large collection contain events of interest and from those extracting the sentences that might contain the event related information. This task is challenging because it is easier to obtain labels at the document level than finegrained annotations at the sentence level. Current approaches for this task are suboptimal because they directly aggregate sentence probabilities estimated by a classifier to obtain document probabilities resulting in error propagation. To alleviate this problem we propose to leverage recent advances in representation learning by using attention mechanisms. Specifically, for event detection we propose a method to compute document embeddings from sentence embeddings by leveraging attention and training a document classifier on those embeddings to mitigate the error propagation problem. However, we find that existing attention mechanisms are inept for this task, because either they are suboptimal or they use a large number of parameters. To address this problem we propose a lean attention mechanism which is effective for event detection. Current approaches for event extraction rely on finegrained labels in specific domains. Extending extraction to new domains is challenging because of difficulty of collecting finegrained data. Machine reading comprehension(MRC) based approaches, that enable zero-shot extraction struggle with syntactically complex sentences and long-range dependencies. To mitigate this problem, we propose a syntactic sentence simplification approach that is guided by MRC model to improve its performance on event extraction.	en
dc.description.abstractgeneral	Event extraction is the task of extracting events of societal importance from natural language texts. The task has a wide range of applications from search, retrieval, question answering to forecasting population level events like civil unrest, disease occurrences with reasonable accuracy. Before events can be extracted it is imperative to identify the documents that are likely to contain the events of interest and extract the sentences that mention those events. This is termed as event detection. Current approaches for event detection are suboptimal. They assume that events are neatly partitioned into sentences and obtain document level event probabilities directly from predicted sentence level probabilities. In this dissertation, under the same assumption by leveraging representation learning we mitigate some of the shortcomings of the previous event detection methods. Current approaches to event extraction are only limited to restricted domains and require finegrained labeled corpora for their training. One way to extend event extraction to new domains in by enabling zero-shot extraction. Machine reading comprehension (MRC) based approach provides a promising way forward for zero-shot extraction. However, this approach suffers from the long-range dependency problem and faces difficulty in handling syntactically complex sentences with multiple clauses. To mitigate this problem we propose a syntactic sentence simplification algorithm that is guided by the MRC system to improves its performance.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:31611	en
dc.identifier.uri	http://hdl.handle.net/10919/104359	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	deep learning	en
dc.subject	natural language processing	en
dc.subject	information extraction	en
dc.subject	representation learning	en
dc.title	Towards Explainable Event Detection and Extraction	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Science and Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Mehta_S_D_2021.pdf
Size:: 1.21 MB
Format:: Adobe Portable Document Format

Download

Collections

Doctoral Dissertations