Few-Shot and Zero-Shot Learning for Information Extraction
Files
TR Number
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Information extraction aims to automatically extract structured information from unstructured texts. Supervised information extraction requires large quantities of labeled training data, which is time-consuming and labor-intensive. This dissertation focuses on information extraction, especially relation extraction and attribute-value extraction in e-commerce, with few labeled (few-shot learning) or even no labeled (zero-shot learning) training data. We explore multi-source auxiliary information and novel learning techniques to integrate semantic auxiliary information with the input text to improve few-shot learning and zero-shot learning. For zero-shot and few-shot relation extraction, the first method explores the existing data statistics and leverages auxiliary information including labels, synonyms of labels, keywords, and hypernyms of name entities to enable zero-shot learning for the unlabeled data. We build an automatic hypernym extraction framework to help acquire hypernyms of different entities directly from the web. The second method explores the relations between seen classes and new classes. We propose a prompt-based model with semantic knowledge augmentation to recognize new relation triplets under the zero-shot setting. In this method, we transform the problem of zero-shot learning into supervised learning with the generated augmented data for new relations. We design the prompts for training using the auxiliary information based on an external knowledge graph to integrate semantic knowledge learned from seen relations. The third work utilizes auxiliary information from images to enhance few-shot learning. We propose a multi-modal few-shot relation extraction model that leverages both textual and visual semantic information to learn a multi-modal representation jointly. To supplement the missing contexts in text, this work integrates both local features (object-level) and global features (pixel-level) from different modalities through image-guided attention, object-guided attention, and hybrid feature attention to solve the problem of sparsity and noise. We then explore the few-shot and zero-shot aspect (attribute-value) extraction in the e-commerce application field. The first work studies the multi-label few-shot learning by leveraging the auxiliary information of anchor (label) and category description based on the prototypical networks, where the hybrid attention helps alleviate ambiguity and capture more informative semantics by calculating both the label-relevant and query-related weights. A dynamic threshold is learned by integrating the semantic information from support and query sets to achieve multi-label inference. The second work explores multi-label zero-shot learning via semi-inductive link prediction of the heterogeneous hypergraph. The heterogeneous hypergraph is built with higher-order relations (generated by the auxiliary information of user behavior data and product inventory data) to capture the complex and interconnected relations between users and the products.