N-ary Cross-sentence Relation Extraction: From Supervised to Unsupervised Learning

Yuan, Chenhan

dc.contributor.author	Yuan, Chenhan	en
dc.contributor.committeechair	Eldardiry, Hoda	en
dc.contributor.committeemember	Lourentzou, Ismini	en
dc.contributor.committeemember	Huang, Lifu	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2022-11-11T07:00:13Z	en
dc.date.available	2022-11-11T07:00:13Z	en
dc.date.issued	2021-05-19	en
dc.description.abstract	Relation extraction is the problem of extracting relations between entities described in text. Relations identify a common "fact" described by distinct entities. Conventional relation extraction approaches focus on supervised binary intra-sentence relations, assuming that relations exist only between two entities within the same sentence. These approaches have three key limitations. First, binary intra-sentence relation extraction methods cannot extract a relation in a fact that is described by more than two entities. Second, these methods cannot extract relations that span more than one sentence, which becomes more common as the number of entities increases. Third, these methods assume a supervised setting and are therefore unable to extract relations in the absence of sufficient labeled training data. This work aims to overcome these limitations by developing n-ary cross-sentence relation extraction methods for both supervised and unsupervised settings. Our work has three main goals: (1) two unsupervised binary intra-sentence relation extraction methods, (2) a supervised n-ary cross-sentence relation extraction method, and (3) an unsupervised n-ary cross-sentence relation extraction method. To achieve these goals, our work makes the following contributions: (1) an automatic labeling method for n-ary cross-sentence data, which is essential for model training, (2) a reinforcement learning-based sentence distribution estimator to minimize the impact of noise on model training, (3) a generative clustering-based technique for intra-sentence unsupervised relation extraction, (4) a variational autoencoder-based technique for unsupervised n-ary cross-sentence relation extraction, and (5) a sentence group selector that identifies groups of sentences that form relations.	en
dc.description.abstractgeneral	In this work, we designed multiple models to automatically extract relations from text. These relations represent the semantic connection between two or more proper nouns. Previous work includes models that can only extract relations between two proper nouns in a single sentence, while the methods proposed in this thesis can extract relations between two or more proper nouns in multiple sentences. We propose three models. The first model can automatically remove erroneous annotations in training data, thereby making the models more credible. We also propose a more effective model that can automatically extract relations between two proper nouns in a single sentence without the need for data annotation. We later extend this model so that it can extract relations between two or more proper nouns in multiple sentences.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:30539	en
dc.identifier.uri	http://hdl.handle.net/10919/112572	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Relation Extraction	en
dc.subject	Unsupervised Learning	en
dc.title	N-ary Cross-sentence Relation Extraction: From Supervised to Unsupervised Learning	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science and Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: Yuan_C_T_2021.pdf
Size: 1.31 MB
Format: Adobe Portable Document Format

Collections

Masters Theses