Adapting Transformers for Structured Data Domains

dc.contributor.author: Tipirneni, Sai Sindhura
dc.contributor.committeechair: Reddy, Chandan K.
dc.contributor.committeemember: Lourentzou, Ismini
dc.contributor.committeemember: Huang, Lifu
dc.contributor.committeemember: Yuan, Changhe
dc.contributor.committeemember: Subbian, Vignesh
dc.contributor.department: Computer Science & Applications
dc.date.accessioned: 2025-05-31T08:04:07Z
dc.date.available: 2025-05-31T08:04:07Z
dc.date.issued: 2025-05-30
dc.description.abstract: This research aims to enhance the adaptability and effectiveness of Transformers in structured data domains beyond their traditional use in natural language processing (NLP). We revisit key elements of the Transformer framework - including input representations, attention formulations, auxiliary tasks, prediction layers, and loss functions - and adapt them to better suit the structure and semantics of specific data domains. Focusing on four structured domains - (i) sparse and irregularly sampled multivariate time-series, (ii) general-purpose programming languages, (iii) short text clustering, and (iv) natural language interfaces to relational databases - this dissertation proposes novel domain-specific Transformer-based models. For the first domain, we present STraTS, a self-supervised Transformer that represents data as observation triplets and adds forecasting as an auxiliary task to improve mortality prediction on multivariate clinical time-series. An interpretable version of this model is also proposed to enhance its utility for critical applications like healthcare. In the programming domain, we build StructCoder, an encoder-decoder Transformer designed to effectively capture source code structures and concurrently handle auxiliary tasks associated with predictions on target code structures. For short text clustering, we develop CACTUS, a Transformer for context-aware supervised clustering. This model incorporates efficient inter-entity interactions through sparse attention, employs a specialized loss function tailored for supervised clustering, and integrates a novel self-supervised clustering task to enhance performance on the primary clustering task. Finally, we present RAFT-S3, a framework for reasoning-aware finetuning of small language models (SLMs) on the text-to-SQL task. RAFT-S3 uses large language models (LLMs) to collect synthetic text-to-SQL data with diverse schemas, along with intermediate reasoning traces, which are incorporated into a two-stage finetuning process. We conduct extensive experiments comparing the proposed methods to competitive baselines in each domain, perform ablation studies, and discuss qualitative results. This research contributes to an improved understanding of Transformer architectures and opens opportunities for more applications across a spectrum of structured data domains.
dc.description.abstractgeneral: Transformers are a type of machine learning model known for their success in applications like chatbots and language translation. However, using them for other types of structured data, such as time-series and programming code, requires significant adjustments to the framework. This research explores how to redesign Transformers so they can better understand and work with these more complex data domains. It examines four specific domains: (1) medical time-series data, where patient measurements are taken irregularly over time; (2) source code from programming languages; (3) short texts that need to be grouped into categories; and (4) translating natural language questions into database queries. For each of these, new Transformer-based models were developed. For medical time-series, the STraTS model helps predict patient outcomes by analyzing irregular and incomplete time-series information, and includes an interpretable version that explains why certain predictions were made, which is crucial for clinical use. In the programming domain, the StructCoder model was built to better understand the structure of code and improve code generation. For grouping short texts into categories, which has applications in e-commerce, we propose the CACTUS model, which uses richer context to make smarter, dynamic grouping decisions. Finally, the RAFT-S3 framework teaches small language models to convert natural language into SQL database queries by using synthetic examples and incorporating the logical reasoning behind each example. Each of these innovations was tested against existing methods and showed improved results. Together, they demonstrate how Transformers can be adapted to a wide range of practical applications beyond tasks based on human language.
dc.description.degree: Doctor of Philosophy
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:43868
dc.identifier.uri: https://hdl.handle.net/10919/134958
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: transformer
dc.subject: attention
dc.subject: supervised learning
dc.subject: self-supervision
dc.subject: clinical time-series
dc.subject: code generation
dc.subject: text clustering
dc.subject: LLMs
dc.subject: synthetic data
dc.title: Adapting Transformers for Structured Data Domains
dc.type: Dissertation
thesis.degree.discipline: Computer Science & Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle
Name: Tipirneni_S_D_2025.pdf
Size: 2.34 MB
Format: Adobe Portable Document Format