StructCoder: Structure-Aware Transformer for Code Generation

dc.contributor.authorTipirneni, Sindhuen
dc.contributor.authorZhu, Mingen
dc.contributor.authorReddy, Chandanen
dc.date.accessioned2024-03-01T13:19:15Zen
dc.date.available2024-03-01T13:19:15Zen
dc.date.issued2024en
dc.date.updated2024-01-01T08:54:38Zen
dc.description.abstractThere has been a recent surge of interest in automating software engineering tasks using deep learning. This paper addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description. Most state-of-the-art deep learning models for code generation use training strategies primarily designed for natural language. However, understanding and generating code requires a more rigorous comprehension of the code syntax and semantics. With this motivation, we develop an encoder-decoder Transformer model where both the encoder and decoder are explicitly trained to recognize the syntax and data flow in the source and target codes, respectively. We not only make the encoder structure-aware by leveraging the source code?s syntax tree and data flow graph, but we also support the decoder in preserving the syntax and data flow of the target code by introducing two novel auxiliary tasks: AST (Abstract Syntax Tree) paths prediction and data flow prediction. To the best of our knowledge, this is the first work to introduce a structure-aware Transformer decoder that models both syntax and data flow to enhance the quality of generated code. The proposed StructCoder model achieves state-of-the-art performance on code translation and text-to-code generation tasks in the CodeXGLUE benchmark, and improves over baselines of similar size on the APPS code generation benchmark. Our code is publicly available at https://github.com/reddy-lab-code-research/StructCoder/.en
dc.description.versionAccepted versionen
dc.identifier.doihttps://doi.org/10.1145/3636430en
dc.identifier.urihttps://hdl.handle.net/10919/118237en
dc.language.isoenen
dc.publisherACMen
dc.rightsIn Copyrighten
dc.rights.holderThe author(s)en
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/en
dc.titleStructCoder: Structure-Aware Transformer for Code Generationen
dc.typeArticle - Refereeden
dc.type.dcmitypeTexten

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
3636430.pdf
Size:
1.01 MB
Format:
Adobe Portable Document Format
Description:
Accepted version
License bundle
Now showing 1 - 1 of 1
Name:
license.txt
Size:
1.5 KB
Format:
Item-specific license agreed upon to submission
Description: