QuOTE: Question-Oriented Text Embeddings
dc.contributor.author | Neeser, Andrew Kyle | en |
dc.contributor.committeechair | Ramakrishnan, Narendran | en |
dc.contributor.committeemember | Latimer, Chris | en |
dc.contributor.committeemember | Lu, Chang Tien | en |
dc.contributor.department | Computer Science & Applications | en |
dc.date.accessioned | 2025-06-14T08:02:08Z | en |
dc.date.available | 2025-06-14T08:02:08Z | en |
dc.date.issued | 2025-06-13 | en |
dc.description.abstract | We present QuOTE (Question-Oriented Text Embeddings), a novel enhancement to retrieval-augmented generation (RAG) systems, aimed at improving document representation for accurate and nuanced retrieval. Unlike traditional RAG pipelines, which rely on embedding raw text chunks, QuOTE augments chunks with hypothetical questions that the chunk can potentially answer, enriching the representation space. This better aligns document embeddings with user query semantics, and helps address issues such as ambiguity and context-dependent relevance. Through extensive experiments across diverse benchmarks, we demonstrate that QuOTE significantly enhances retrieval accuracy, including in multi-hop question-answering tasks. Our findings highlight the versatility of question generation as a fundamental indexing strategy, opening new avenues for integrating question generation into retrieval-based AI pipelines. | en |
dc.description.abstractgeneral | Modern artificial intelligence tools often help users by searching through large collections of documents and then using those search results to generate answers. This process can sometimes misinterpret a question or miss important connections in the text. In our work, we introduce QuOTE, a simple yet powerful method that teaches the system to think in terms of questions: each piece of text is paired with relevant, hypothetical questions it could answer. By organizing information around questions and answers, QuOTE creates clearer, more meaningful representations of documents. In tests that include cases where answers require combining information from different parts of a document, QuOTE consistently retrieves more accurate and relevant information than traditional approaches. This question-based indexing approach makes search-and-answer systems more reliable and could enhance a wide range of everyday tools, from virtual assistants to online help desks. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:44058 | en |
dc.identifier.uri | https://hdl.handle.net/10919/135522 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Retrieval-Augmented Generation | en |
dc.subject | Synthetic Question Generation | en |
dc.subject | Document Representation | en |
dc.subject | Information Retrieval | en |
dc.title | QuOTE: Question-Oriented Text Embeddings | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science & Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |
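The indexing idea described in the abstract (pairing each chunk with hypothetical questions it could answer, then embedding the augmented text) can be illustrated with a minimal sketch. This is not the thesis implementation. The names `embed`, `build_index`, `retrieve`, and `HYPOTHETICAL_QUESTIONS` are hypothetical, the bag-of-words `embed` function is a toy stand-in for a real embedding model, the hard-coded question table stands in for an LLM question generator, and concatenating each question with its chunk and deduplicating results by chunk are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks, generate_questions):
    """Create one index entry per (hypothetical question, chunk) pair.

    Each entry embeds the question concatenated with the chunk text
    (an assumed augmentation scheme) and remembers the chunk's position.
    """
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for question in generate_questions(chunk):
            index.append((embed(question + " " + chunk), chunk_id))
    return index


def retrieve(query, index, chunks, k=1):
    """Return the top-k distinct chunks whose entries best match the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    seen, results = set(), []
    for _, chunk_id in ranked:
        if chunk_id not in seen:  # several entries can point at the same chunk
            seen.add(chunk_id)
            results.append(chunks[chunk_id])
        if len(results) == k:
            break
    return results


chunks = [
    "The Eiffel Tower was completed in 1889 in Paris.",
    "Mount Everest is 8849 metres tall.",
]

# Hard-coded questions standing in for an LLM question generator (assumption).
HYPOTHETICAL_QUESTIONS = {
    chunks[0]: ["When was the Eiffel Tower completed?", "Where is the Eiffel Tower?"],
    chunks[1]: ["How tall is Mount Everest?", "What is the height of Mount Everest?"],
}

index = build_index(chunks, HYPOTHETICAL_QUESTIONS.get)
print(retrieve("When was the Eiffel Tower finished?", index, chunks))
```

Because every chunk contributes several index entries, the retrieval step deduplicates by chunk before returning results. In a real pipeline the toy vectors would be replaced by dense embeddings and the question table by generated questions.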