Browsing by Author "Zhu, Ming"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- Neural Sequence Modeling for Domain-Specific Language Processing: A Systematic ApproachZhu, Ming (Virginia Tech, 2023-08-14)In recent years, deep learning based sequence modeling (neural sequence modeling) techniques have made substantial progress in many tasks, including information retrieval, question answering, information extraction, machine translation, etc. Benefiting from the highly scalable attention-based Transformer architecture and enormous open access online data, large-scale pre-trained language models have shown great modeling and generalization capacity for sequential data. However, not all domains benefit equally from the rapid development of neural sequence modeling. Domains like healthcare and software engineering have vast amounts of sequential data containing rich knowledge, yet remain under-explored due to a number of challenges: 1) the distribution of the sequences in specific domains is different from the general domain; 2) the effective comprehension of domain-specific data usually relies on domain knowledge; and 3) the labelled data is usually scarce and expensive to get in domain-specific settings. In this thesis, we focus on the research problem of applying neural sequence modeling methods to address both common and domain-specific challenges from the healthcare and software engineering domains. We systematically investigate neural-based machine learning approaches to address the above challenges in three research directions: 1) learning with long sequences, 2) learning from domain knowledge and 3) learning under limited supervision. Our work can also potentially benefit more domains with large amounts of sequential data.
- StructCoder: Structure-Aware Transformer for Code GenerationTipirneni, Sindhu; Zhu, Ming; Reddy, Chandan (ACM, 2024)There has been a recent surge of interest in automating software engineering tasks using deep learning. This paper addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description. Most state-of-the-art deep learning models for code generation use training strategies primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model where both the encoder and decoder are explicitly trained to recognize the syntax and data flow in the source and target codes, respectively. We not only make the encoder structure-aware by leveraging the source code?s syntax tree and data flow graph, but we also support the decoder in preserving the syntax and data flow of the target code by introducing two novel auxiliary tasks: AST (Abstract Syntax Tree) paths prediction and data flow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder that models both syntax and data flow to enhance the quality of generated code. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark, and improves over baselines of similar size on the APPS code generation benchmark. Our code is publicly available at https://github.com/reddy-lab-code-research/StructCoder/.