Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces
| dc.contributor.author | Yousuf, Raquib Bin | en |
| dc.contributor.committeechair | Ramakrishnan, Narendran | en |
| dc.contributor.committeemember | Wang, Xuan | en |
| dc.contributor.committeemember | Muthiah, Sathappan | en |
| dc.contributor.committeemember | Lu, Chang Tien | en |
| dc.contributor.committeemember | North, Christopher L. | en |
| dc.contributor.department | Computer Science & Applications | en |
| dc.date.accessioned | 2026-05-21T08:00:30Z | en |
| dc.date.available | 2026-05-21T08:00:30Z | en |
| dc.date.issued | 2026-05-20 | en |
| dc.description.abstract | Large Language Models (LLMs) excel at fluent language generation but face critical challenges in high-stakes domains that require reasoning over long contexts, structured information use, grounded retrieval, and human-verifiable outputs. This dissertation explores how to improve LLM performance on complex, context-rich tasks through four contributions. First, we introduce memory-augmented architectures for multi-document reasoning, highlighting gaps between summarization and true inference. Second, we benchmark relational reasoning by reconstructing latent graphs from long texts, revealing a limitation we term "memory drift." Third, we show that incorporating structured metadata as a first-class signal in retrieval-augmented generation (RAG) systems improves retrieval consistency in large, repetitive corpora by better disambiguating context. Finally, we present a human-in-the-loop system for structured data analysis that enables transparent, code-centric interaction and supports iterative sensemaking over complex datasets. Together, these efforts advance LLM capabilities in analytical synthesis, structured retrieval, long-context evaluation, and explainability, offering practical tools for building more trustworthy and effective AI systems in real-world applications. | en |
| dc.description.abstractgeneral | Large Language Models (LLMs) can generate fluent text, but they often struggle with real-world analytical tasks that require following information over long contexts, using structured details, retrieving the right evidence, and allowing users to verify results. This dissertation explores how to make LLMs more reliable for such tasks. We first develop methods to help LLMs organize and connect information across multiple documents. We then show that LLMs have difficulty retaining and using relationships over long inputs, introducing a new way to measure this limitation, called "memory drift." Next, we improve how LLM systems retrieve relevant information by incorporating structured details that help distinguish similar documents. Finally, we present an interactive system that allows users to guide and refine structured data analysis, making the process more transparent, inspectable, and reliable. Together, these contributions show that improving LLMs requires not only better models, but also better ways to structure information, retrieve relevant context, and involve users in the analysis process. | en |
| dc.description.degree | Doctor of Philosophy | en |
| dc.format.medium | ETD | en |
| dc.identifier.other | vt_gsexam:46600 | en |
| dc.identifier.uri | https://hdl.handle.net/10919/143122 | en |
| dc.language.iso | en | en |
| dc.publisher | Virginia Tech | en |
| dc.rights | In Copyright | en |
| dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
| dc.subject | Large Language Models | en |
| dc.subject | Long-Context Reasoning | en |
| dc.subject | Retrieval-Augmented Generation | en |
| dc.subject | Metadata-Aware Retrieval | en |
| dc.subject | Human-in-the-Loop Systems | en |
| dc.title | Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces | en |
| dc.type | Dissertation | en |
| thesis.degree.discipline | Computer Science & Applications | en |
| thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
| thesis.degree.level | doctoral | en |
| thesis.degree.name | Doctor of Philosophy | en |