Oreo: A Plug-in Context Reconstructor to Enhance Retrieval-Augmented Generation

Date

2025-07-18

Publisher

ACM

Abstract

Retrieval-Augmented Generation (RAG) augments the capabilities of Large Language Models (LLMs) by retrieving and incorporating external documents or chunks prior to generation. However, even with improved retriever relevance, retrieved content can include erroneous or contextually distracting information, undermining the effectiveness of RAG in downstream tasks. We introduce a compact, efficient, and pluggable module that refines retrieved chunks before they are used for generation. The module extracts and reorganizes the most relevant and supportive information into a concise, query-specific format. Through a three-stage training paradigm, comprising supervised fine-tuning, contrastive multi-task learning, and reinforcement learning-based alignment, it prioritizes critical knowledge and aligns it with the generator's preferences. This approach enables LLMs to produce outputs that are more accurate, reliable, and contextually appropriate.
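The plug-in role described above can be sketched as a refinement step between retrieval and generation. The snippet below is an illustrative stand-in only: it uses a simple lexical-overlap heuristic to select supportive chunks, whereas the paper's module is a trained model; the function name and scoring rule are assumptions, not the authors' implementation.

```python
def refine_chunks(query, chunks, top_k=2):
    """Illustrative context reconstructor: score each retrieved chunk by
    lexical overlap with the query and keep the top_k most supportive ones,
    concatenated into a concise, query-specific context.

    NOTE: a hypothetical heuristic sketch; the actual module is a learned
    model trained via SFT, contrastive multi-task learning, and RL alignment.
    """
    q_tokens = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_tokens & set(c.lower().split())),
        reverse=True,
    )
    return " ".join(scored[:top_k])

# The refined context would then be placed in the generator LLM's prompt
# in place of the raw retrieved chunks.
chunks = [
    "The Eiffel Tower is in Paris.",
    "Bananas are yellow.",
    "Paris is the capital of France.",
]
context = refine_chunks("What city is the Eiffel Tower in?", chunks)
```

Because the module sits between any retriever and any generator and only rewrites the context string, it can be dropped into an existing RAG pipeline without modifying either component.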
