Discrete Diffusion for Text Infilling
dc.contributor.author | Zhang, Andrew Xinghua | en |
dc.contributor.committeechair | Thomas, Christopher Lee | en |
dc.contributor.committeemember | Wang, Xuan | en |
dc.contributor.committeemember | Yanardag Delul, Pinar | en |
dc.contributor.department | Computer Science & Applications | en |
dc.date.accessioned | 2025-07-11T08:00:32Z | en |
dc.date.available | 2025-07-11T08:00:32Z | en |
dc.date.issued | 2025-07-10 | en |
dc.description.abstract | Generative modeling of text is a fundamental challenge in natural language processing. While autoregressive models have achieved remarkable success, they face limitations in parallelizability and flexible control. Discrete diffusion models offer a promising alternative paradigm, leveraging iterative refinement and potentially enabling bidirectional context use, parallel generation, and flexible prompting. However, existing discrete text diffusion models typically assume fixed token positions, hindering their application to tasks requiring dynamic sequence lengths, such as unconstrained text infilling where ground-truth positional information is absent. This thesis introduces Discrete Diffusion with Optimal Transport Position Coupling (DDOT) to overcome this critical limitation. DDOT is presented as the first discrete diffusion framework capable of handling flexible-length text infilling. At its core, DDOT employs a novel diffusion process that jointly models discrete token identities and continuous token positions. To maintain sequence coherence during the iterative generation process, a sample-level optimal transport (OT) coupling is integrated, ensuring consistent relative ordering of tokens. The methodology developed in this thesis is designed to be compatible with various underlying discrete diffusion techniques and pretrained denoising models. Comprehensive experimental validation on challenging constrained text generation benchmarks demonstrates DDOT's effectiveness. Results show that DDOT achieves performance competitive with state-of-the-art non-autoregressive methods, nears the quality of autoregressive models, and provides significant gains in training efficiency and flexibility for position-aware generation tasks. This research thus advances the capabilities of discrete diffusion models for complex text generation scenarios. | en |
dc.description.abstractgeneral | This thesis addresses the challenge of automatically filling in missing text segments of arbitrary length, from a single word to entire passages, without prior knowledge of token positions. Traditional generation methods proceed in a fixed order, selecting one token at a time and assuming positions are given. The proposed framework, Discrete Diffusion with Optimal Transport Position Coupling (DDOT), treats both token identity and placement as part of a unified iterative refinement process. At each step, DDOT refines its guesses for the words in the blank regions and determines their positions, allowing it to perform variable-length infilling without constraints. To ensure that token arrangements remain coherent and respect natural word order, DDOT uses a sample-level optimal transport coupling. This mechanism softly aligns tentative token placements with plausible relative positions, guiding elements toward a correct spatial configuration. Integrating this transport-based guidance into the discrete diffusion denoising steps preserves sentence fluency even when reconstructing heavily disrupted inputs. Extensive experiments on standard infilling benchmarks show that DDOT matches leading non-autoregressive methods and approaches the performance of strong autoregressive baselines. At the same time, it offers advantages in training efficiency and flexibility for tasks with incomplete or variable positional information. These results demonstrate that DDOT is a significant advance in position-aware text generation, with potential applications in research and real-world text editing tools. | en |
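As an informal illustration of the sample-level coupling described in the abstracts above, the Python sketch below pairs noisy token positions with target positions using one-dimensional optimal-transport matching (sorted-order assignment), which is the property that preserves relative token order. The function and variable names are illustrative assumptions, not code from the thesis.

# Illustrative sketch (not the thesis implementation): a sample-level
# optimal-transport coupling between noisy and target token positions.
# In one dimension, the OT plan between equal-sized point sets under a
# convex cost is the monotone (sorted-order) matching, which keeps the
# relative ordering of tokens intact during denoising.
import numpy as np

def ot_couple_positions(noise_positions: np.ndarray,
                        target_positions: np.ndarray) -> np.ndarray:
    """Pair each noisy position with a target position via 1D OT.

    Returns `pairing` such that noise_positions[i] is coupled to
    target_positions[pairing[i]]; sorted-order matching never swaps order.
    """
    noise_rank = np.argsort(np.argsort(noise_positions))  # rank of each noisy position
    target_order = np.argsort(target_positions)           # target indices in ascending order
    return target_order[noise_rank]

# Example: three infilled tokens with randomly initialized positions.
rng = np.random.default_rng(0)
noise = rng.uniform(0.0, 1.0, size=3)      # positions sampled from noise
target = np.array([0.20, 0.45, 0.80])      # normalized ground-truth positions
pairing = ot_couple_positions(noise, target)
# Interpolating each noisy position toward its coupled target preserves ordering.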
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:44258 | en |
dc.identifier.uri | https://hdl.handle.net/10919/135961 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | Creative Commons Attribution 4.0 International | en |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en |
dc.subject | Discrete Diffusion | en |
dc.subject | Text Modeling | en |
dc.subject | Text Infilling | en |
dc.subject | Masked Diffusion | en |
dc.title | Discrete Diffusion for Text Infilling | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science & Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |