Neural Sequence Modeling for Domain-Specific Language Processing: A Systematic Approach

Date

2023-08-14

Publisher

Virginia Tech

Abstract

In recent years, deep learning-based sequence modeling (neural sequence modeling) techniques have made substantial progress in many tasks, including information retrieval, question answering, information extraction, and machine translation. Benefiting from the highly scalable attention-based Transformer architecture and enormous amounts of open-access online data, large-scale pre-trained language models have shown great modeling and generalization capacity for sequential data. However, not all domains benefit equally from the rapid development of neural sequence modeling. Domains like healthcare and software engineering have vast amounts of sequential data containing rich knowledge, yet remain under-explored due to a number of challenges: 1) the distribution of sequences in specific domains differs from that of the general domain; 2) effective comprehension of domain-specific data usually relies on domain knowledge; and 3) labelled data is usually scarce and expensive to obtain in domain-specific settings. In this thesis, we focus on the research problem of applying neural sequence modeling methods to address both common and domain-specific challenges in the healthcare and software engineering domains. We systematically investigate neural-based machine learning approaches to address the above challenges in three research directions: 1) learning with long sequences, 2) learning from domain knowledge, and 3) learning under limited supervision. Our work can also potentially benefit other domains with large amounts of sequential data.

Keywords

Machine Learning for Natural Language Processing, Machine Learning for Code, Machine Learning for Healthcare, Information Retrieval, Question Answering, Entity Linking, Program Translation, Code Refinement, Sequence-to-Sequence Models
