QuOTE: Question-Oriented Text Embeddings

Neeser, Andrew Kyle

dc.contributor.author	Neeser, Andrew Kyle	en
dc.contributor.committeechair	Ramakrishnan, Narendran	en
dc.contributor.committeemember	Latimer, Chris	en
dc.contributor.committeemember	Lu, Chang Tien	en
dc.contributor.department	Computer Science & Applications	en
dc.date.accessioned	2025-06-14T08:02:08Z	en
dc.date.available	2025-06-14T08:02:08Z	en
dc.date.issued	2025-06-13	en
dc.description.abstract	We present QuOTE (Question-Oriented Text Embeddings), a novel enhancement to retrieval-augmented generation (RAG) systems, aimed at improving document representation for accurate and nuanced retrieval. Unlike traditional RAG pipelines, which rely on embedding raw text chunks, QuOTE augments chunks with hypothetical questions that the chunk can potentially answer, enriching the representation space. This better aligns document embeddings with user query semantics, and helps address issues such as ambiguity and context-dependent relevance. Through extensive experiments across diverse benchmarks, we demonstrate that QuOTE significantly enhances retrieval accuracy, including in multi-hop question-answering tasks. Our findings highlight the versatility of question generation as a fundamental indexing strategy, opening new avenues for integrating question generation into retrieval-based AI pipelines.	en
dc.description.abstractgeneral	Modern artificial intelligence tools often help users by searching through large collections of documents and then using those search results to generate answers. This process can sometimes misinterpret a question or miss important connections in the text. In our work, we introduce QuOTE, a simple yet powerful method that teaches the system to think in terms of questions: each piece of text is paired with relevant, hypothetical questions it could answer. By organizing information around questions and answers, QuOTE creates clearer, more meaningful representations of documents. In tests that include cases where answers require combining information from different parts of a document, QuOTE consistently retrieves more accurate and relevant information than traditional approaches. This question-based indexing approach makes search-and-answer systems more reliable and could enhance a wide range of everyday tools, from virtual assistants to online help desks.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:44058	en
dc.identifier.uri	https://hdl.handle.net/10919/135522	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Retrieval-Augmented Generation	en
dc.subject	Synthetic Question Generation	en
dc.subject	Document Representation	en
dc.subject	Information Retrieval	en
dc.title	QuOTE: Question-Oriented Text Embeddings	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science & Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en
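
The indexing idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the bag-of-words `embed` function stands in for a real embedding model, the questions are written by hand where the thesis would generate them, and embedding each question concatenated with its chunk is only one possible reading of "augments chunks with hypothetical questions". All function and variable names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks, questions_for):
    """Create one entry per (question, chunk) pair, each pointing back to its chunk."""
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for question in questions_for(chunk):
            # Assumption: the question is prepended to the chunk before embedding.
            index.append((embed(question + " " + chunk), chunk_id))
    return index


def retrieve(query, index, chunks, k=1):
    """Rank entries by similarity to the query and return the top-k distinct chunks."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    seen, results = set(), []
    for _, chunk_id in ranked:
        if chunk_id not in seen:  # several entries may map to the same chunk
            seen.add(chunk_id)
            results.append(chunks[chunk_id])
        if len(results) == k:
            break
    return results


chunks = [
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
    "Photosynthesis converts sunlight, water, and carbon dioxide into glucose and oxygen.",
]

# Hand-written hypothetical questions (assumption: a language model would produce these).
QUESTIONS = {
    chunks[0]: ["When was the Eiffel Tower built?", "Where is the Eiffel Tower located?"],
    chunks[1]: ["How do plants make food?", "What does photosynthesis produce?"],
}

index = build_index(chunks, QUESTIONS.__getitem__)
print(retrieve("How do plants make their food?", index, chunks, k=1))
```

In this toy setup the query shares no words with the raw photosynthesis chunk, so a raw-chunk index would score it at zero; the question-augmented entry is what lets the query match. Real embedding models capture more than word overlap, so the size of any such effect in practice is something the thesis's experiments address, not this sketch.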

Files

Original bundle

Name: Neeser_AK_T_2025.pdf
Size: 1.21 MB
Format: Adobe Portable Document Format

Collections

Masters Theses