Neural Sequence Modeling for Domain-Specific Language Processing: A Systematic Approach

dc.contributor.author: Zhu, Ming
dc.contributor.committeechair: Lourentzou, Ismini
dc.contributor.committeechair: Yao, Danfeng
dc.contributor.committeemember: Ahmad, Wasi Uddin
dc.contributor.committeemember: Brown, Dwayne Christian
dc.contributor.committeemember: Fox, Edward A.
dc.contributor.department: Computer Science and Applications
dc.date.accessioned: 2023-08-15T08:00:15Z
dc.date.available: 2023-08-15T08:00:15Z
dc.date.issued: 2023-08-14
dc.description.abstract: In recent years, deep learning-based sequence modeling (neural sequence modeling) techniques have made substantial progress in many tasks, including information retrieval, question answering, information extraction, and machine translation. Benefiting from the highly scalable attention-based Transformer architecture and enormous open-access online data, large-scale pre-trained language models have shown great modeling and generalization capacity for sequential data. However, not all domains benefit equally from the rapid development of neural sequence modeling. Domains like healthcare and software engineering have vast amounts of sequential data containing rich knowledge, yet they remain under-explored due to a number of challenges: 1) the distribution of sequences in specific domains differs from that of the general domain; 2) effective comprehension of domain-specific data usually relies on domain knowledge; and 3) labeled data is usually scarce and expensive to obtain in domain-specific settings. In this thesis, we focus on the research problem of applying neural sequence modeling methods to address both common and domain-specific challenges in the healthcare and software engineering domains. We systematically investigate neural machine learning approaches to address the above challenges in three research directions: 1) learning with long sequences, 2) learning from domain knowledge, and 3) learning under limited supervision. Our work can also potentially benefit other domains with large amounts of sequential data.
dc.description.abstractgeneral: In the last few years, computer programs that learn and understand human languages (an area called machine learning for natural language processing) have improved significantly. These advances are visible in areas such as retrieving information, answering questions, extracting key details from texts, and translating between languages. A key to these successes has been a type of neural network known as the "Transformer", which can process and learn from the vast amounts of information found online. However, these successes are not uniform across all areas. Two fields, healthcare and software engineering, still present unique challenges despite having a wealth of information. These challenges include the distinct types of information in these fields, the need for specific expertise to understand this information, and the shortage of labeled data, which is crucial for training machine learning models. In this thesis, we focus on using machine learning methods for natural language processing to address these challenges in the healthcare and software engineering fields. Our research investigates learning with long documents, learning from domain-specific expertise, and learning when there is a shortage of labeled data. The insights and techniques from our work could potentially be applied to other fields that also have large amounts of sequential data.
dc.description.degree: Doctor of Philosophy
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:38221
dc.identifier.uri: http://hdl.handle.net/10919/116034
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Machine Learning for Natural Language Processing
dc.subject: Machine Learning for Code
dc.subject: Machine Learning for Healthcare
dc.subject: Information Retrieval
dc.subject: Question Answering
dc.subject: Entity Linking
dc.subject: Program Translation
dc.subject: Code Refinement
dc.subject: Sequence-to-Sequence Models
dc.title: Neural Sequence Modeling for Domain-Specific Language Processing: A Systematic Approach
dc.type: Dissertation
thesis.degree.discipline: Computer Science and Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle
Name: Zhu_M_D_2023.pdf
Size: 5.77 MB
Format: Adobe Portable Document Format

Name: Zhu_M_D_2023_support_1.pdf
Size: 27.92 KB
Format: Adobe Portable Document Format
Description: Supporting documents