Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces

Yousuf, Raquib Bin

dc.contributor.author	Yousuf, Raquib Bin	en
dc.contributor.committeechair	Ramakrishnan, Narendran	en
dc.contributor.committeemember	Wang, Xuan	en
dc.contributor.committeemember	Muthiah, Sathappan	en
dc.contributor.committeemember	Lu, Chang Tien	en
dc.contributor.committeemember	North, Christopher L.	en
dc.contributor.department	Computer Science & Applications	en
dc.date.accessioned	2026-05-21T08:00:30Z	en
dc.date.available	2026-05-21T08:00:30Z	en
dc.date.issued	2026-05-20	en
dc.description.abstract	Large Language Models (LLMs) excel at fluent language generation but face critical challenges in high-stakes domains that require reasoning over long contexts, structured information use, grounded retrieval, and human-verifiable outputs. This dissertation explores how to improve LLM performance on complex, context-rich tasks through four contributions. First, we introduce memory-augmented architectures for multi-document reasoning, highlighting gaps between summarization and true inference. Second, we benchmark relational reasoning by reconstructing latent graphs from long texts, revealing a limitation we term "memory drift." Third, we show that incorporating structured metadata as a first-class signal in retrieval-augmented generation (RAG) systems improves retrieval consistency in large, repetitive corpora by better disambiguating context. Finally, we present a human-in-the-loop system for structured data analysis that enables transparent, code-centric interaction and supports iterative sensemaking over complex datasets. Together, these efforts advance LLM capabilities in analytical synthesis, structured retrieval, long-context evaluation, and explainability, offering practical tools for building more trustworthy and effective AI systems in real-world applications.	en
dc.description.abstractgeneral	Large Language Models (LLMs) can generate fluent text, but they often struggle with real-world analytical tasks that require following information over long contexts, using structured details, retrieving the right evidence, and allowing users to verify results. This dissertation explores how to make LLMs more reliable for such tasks. We first develop methods to help LLMs organize and connect information across multiple documents. We then show that LLMs have difficulty retaining and using relationships over long inputs, introducing a new way to measure this limitation, called "memory drift." Next, we improve how LLM systems retrieve relevant information by incorporating structured details that help distinguish similar documents. Finally, we present an interactive system that allows users to guide and refine structured data analysis, making the process more transparent, inspectable, and reliable. Together, these contributions show that improving LLMs requires not only better models, but also better ways to structure information, retrieve relevant context, and involve users in the analysis process.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:46600	en
dc.identifier.uri	https://hdl.handle.net/10919/143122	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Large Language Models	en
dc.subject	Long-Context Reasoning	en
dc.subject	Retrieval-Augmented Generation	en
dc.subject	Metadata-Aware Retrieval	en
dc.subject	Human-in-the-Loop Systems	en
dc.title	Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Science & Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Name: Yousuf_R_D_2026.pdf
Size: 29.15 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations