Improving LLM Reasoning and Retrieval for Structured and Complex Information Spaces

Yousuf, Raquib Bin

Files

Yousuf_R_D_2026.pdf (29.15 MB)

Date

2026-05-20

Authors

Yousuf, Raquib Bin

Publisher

Virginia Tech

Abstract

Large Language Models (LLMs) excel at fluent language generation but face critical challenges in high-stakes domains that require reasoning over long contexts, structured information use, grounded retrieval, and human-verifiable outputs. This dissertation explores how to improve LLM performance on complex, context-rich tasks through four contributions. First, we introduce memory-augmented architectures for multi-document reasoning, highlighting gaps between summarization and true inference. Second, we benchmark relational reasoning by reconstructing latent graphs from long texts, revealing a limitation we term "memory drift." Third, we show that incorporating structured metadata as a first-class signal in retrieval-augmented generation (RAG) systems improves retrieval consistency in large, repetitive corpora by better disambiguating context. Finally, we present a human-in-the-loop system for structured data analysis that enables transparent, code-centric interaction and supports iterative sensemaking over complex datasets. Together, these efforts advance LLM capabilities in analytical synthesis, structured retrieval, long-context evaluation, and explainability, offering practical tools for building more trustworthy and effective AI systems in real-world applications.

Keywords

Large Language Models, Long-Context Reasoning, Retrieval-Augmented Generation, Metadata-Aware Retrieval, Human-in-the-Loop Systems

Persistent link

https://hdl.handle.net/10919/143122

Collections

Doctoral Dissertations
