QuOTE: Question-Oriented Text Embeddings

Field | Value | Language
dc.contributor.author | Neeser, Andrew Kyle | en
dc.contributor.committeechair | Ramakrishnan, Narendran | en
dc.contributor.committeemember | Latimer, Chris | en
dc.contributor.committeemember | Lu, Chang Tien | en
dc.contributor.department | Computer Science & Applications | en
dc.date.accessioned | 2025-06-14T08:02:08Z | en
dc.date.available | 2025-06-14T08:02:08Z | en
dc.date.issued | 2025-06-13 | en
dc.description.abstract | We present QuOTE (Question-Oriented Text Embeddings), a novel enhancement to retrieval-augmented generation (RAG) systems, aimed at improving document representation for accurate and nuanced retrieval. Unlike traditional RAG pipelines, which rely on embedding raw text chunks, QuOTE augments chunks with hypothetical questions that the chunk can potentially answer, enriching the representation space. This better aligns document embeddings with user query semantics, and helps address issues such as ambiguity and context-dependent relevance. Through extensive experiments across diverse benchmarks, we demonstrate that QuOTE significantly enhances retrieval accuracy, including in multi-hop question-answering tasks. Our findings highlight the versatility of question generation as a fundamental indexing strategy, opening new avenues for integrating question generation into retrieval-based AI pipelines. | en
dc.description.abstractgeneral | Modern artificial intelligence tools often help users by searching through large collections of documents and then using those search results to generate answers. This process can sometimes misinterpret a question or miss important connections in the text. In our work, we introduce QuOTE, a simple yet powerful method that teaches the system to think in terms of questions: each piece of text is paired with relevant, hypothetical questions it could answer. By organizing information around questions and answers, QuOTE creates clearer, more meaningful representations of documents. In tests that include cases where answers require combining information from different parts of a document, QuOTE consistently retrieves more accurate and relevant information than traditional approaches. This question-based indexing approach makes search-and-answer systems more reliable and could enhance a wide range of everyday tools, from virtual assistants to online help desks. | en
dc.description.degree | Master of Science | en
dc.format.medium | ETD | en
dc.identifier.other | vt_gsexam:44058 | en
dc.identifier.uri | https://hdl.handle.net/10919/135522 | en
dc.language.iso | en | en
dc.publisher | Virginia Tech | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | Retrieval-Augmented Generation | en
dc.subject | Synthetic Question Generation | en
dc.subject | Document Representation | en
dc.subject | Information Retrieval | en
dc.title | QuOTE: Question-Oriented Text Embeddings | en
dc.type | Thesis | en
thesis.degree.discipline | Computer Science & Applications | en
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en
thesis.degree.level | masters | en
thesis.degree.name | Master of Science | en
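
The indexing idea summarized in the abstract (pairing each chunk with hypothetical questions it could answer, then embedding those pairs) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the bag-of-words `embed` function stands in for a real embedding model, `generate_questions` is a hypothetical stub that returns hand-written questions where a real system would presumably call a language model, and the choices to concatenate question and chunk and to deduplicate results by chunk are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks, generate_questions):
    """Store one entry per (question, chunk) pair, each pointing back to its chunk."""
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for question in generate_questions(chunk):
            # Assumption: the question is simply prepended to the chunk text.
            index.append((embed(question + " " + chunk), chunk_id))
    return index


def retrieve(index, chunks, query, k=1):
    """Rank all entries by similarity to the query, then keep the first k distinct chunks."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    seen, results = set(), []
    for _, chunk_id in ranked:
        if chunk_id not in seen:
            seen.add(chunk_id)
            results.append(chunks[chunk_id])
        if len(results) == k:
            break
    return results


# Toy corpus and hand-written questions (hypothetical; a real pipeline would generate them).
chunks = [
    "Water boils at 100 degrees Celsius at sea level.",
    "The Eiffel Tower was completed in 1889 for the World's Fair.",
]
canned_questions = {
    chunks[0]: ["At what temperature does water boil?",
                "What happens to water at 100 degrees Celsius?"],
    chunks[1]: ["When was the Eiffel Tower completed?",
                "Why was the Eiffel Tower built?"],
}


def generate_questions(chunk):
    """Hypothetical stub: look up canned questions instead of calling a model."""
    return canned_questions[chunk]


index = build_index(chunks, generate_questions)
top = retrieve(index, chunks, "When was the Eiffel Tower completed?", k=1)
print(top)
```

Because several index entries can point to the same chunk, the sketch deduplicates by chunk ID before returning results, so a chunk that matches through more than one of its questions is still returned only once.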

Files

Original bundle
Name: Neeser_AK_T_2025.pdf
Size: 1.21 MB
Format: Adobe Portable Document Format
