Teaching RAG to "Remember": How MemoRAG Boosts Question-Answering Skills Through Memory
Existing RAG systems are limited when dealing with complex or ambiguous information needs that cannot be directly retrieved from external databases. For instance, traditional RAG systems excel in well-structured question-answering tasks, but they struggle when the task requires an implicit understanding of the underlying query or when the data to be retrieved is unstructured.
Today, we look at a new study, "MemoRAG," which proposes a novel approach to this issue: a long-term memory system that recalls relevant information based on context, significantly enhancing retrieval quality for complex tasks.
Overview
As illustrated in Figure 1, on the left, Standard RAG struggles to accurately locate the necessary evidence due to the implicit nature of the input query, resulting in a less precise answer. On the right, MemoRAG constructs a global memory across the entire database. When presented with the query, MemoRAG first recalls relevant clues, enabling useful information to be retrieved and thus leading to a precise and comprehensive answer.
Compared to standard RAG systems, MemoRAG's memory-based mechanism enables it to handle ambiguous queries and unstructured knowledge more effectively. Traditional RAG methods rely heavily on lexical and semantic matching, which can fall short when handling implicit information needs. MemoRAG, on the other hand, employs a light but long-range LLM to form a global memory of the database, allowing it to recall key clues and generate a comprehensive response.
In the broader sense, MemoRAG combines memory-inspired knowledge discovery with RAG to tackle both complex and simple retrieval tasks. Its key innovation lies in forming a memory model that generates clues based on a compressed representation of the database. These clues help the system retrieve relevant information even in cases where traditional keyword-based retrieval would fail.
Detailed Mechanism
MemoRAG introduces a structured and multi-step process that involves both memory and retrieval mechanisms to improve RAG systems. Its detailed workflow can be broken down into the following key stages.
Input Query and Initial Memory Generation
When a query is presented to MemoRAG, it is first processed by the memory model, which generates a "staging answer" or a set of clues y based on the global memory formed over the database D.
This initial output is not the final answer but a rough draft or outline that helps guide the retrieval process by identifying what type of information needs to be retrieved. This memory mechanism allows MemoRAG to go beyond simple keyword-based retrieval, bridging the gap between implicit queries and the underlying database.
In the case of a query like “How does the Harry Potter series convey the theme of love?”, the memory module recalls key clues like “Lily Potter’s sacrifice,” “The Weasley family,” and “Harry’s romantic relationship with Ginny Weasley”.
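The recall step can be pictured as a function from a query to a set of clues drawn from the global memory. The sketch below is a toy illustration of that interface only: the `MemoryModel` class and its lookup table are hypothetical stand-ins, not the paper's trained memory model, which compresses the whole database into memory tokens.

```python
class MemoryModel:
    """Toy stand-in for MemoRAG's memory model: maps themes "memorized"
    from the database to the clues it would recall for them."""

    def __init__(self):
        # In the real system this is a compressed token representation
        # of the database D; here it is a simple lookup table.
        self.memory = {
            "love": [
                "Lily Potter's sacrifice",
                "The Weasley family",
                "Harry's romantic relationship with Ginny Weasley",
            ],
        }

    def recall_clues(self, query: str) -> list:
        """Return a 'staging answer': clues y relevant to the query."""
        clues = []
        for theme, theme_clues in self.memory.items():
            if theme in query.lower():
                clues.extend(theme_clues)
        return clues


memory = MemoryModel()
clues = memory.recall_clues(
    "How does the Harry Potter series convey the theme of love?"
)
```

The point of the interface is that the clues, not the raw query, are what get handed to the retriever in the next stage.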
Clue-Based Retrieval
Once the memory model generates clues, MemoRAG uses them to initiate a precise retrieval process from the database D. These clues guide the retriever to locate the most relevant portions of the dataset, enabling MemoRAG to handle queries where information is scattered across different parts of the database. This process is highly advantageous for tasks involving ambiguous or implicit queries, where traditional retrieval systems might struggle to identify the relevant context without explicit instructions.
For example, given a financial query such as "Which year had the highest revenue in the past three years?", the system breaks it down into sub-queries like "Revenue of Year 2021," "Revenue of Year 2022," and "Revenue of Year 2023". This allows MemoRAG to retrieve the specific pieces of data required to form an accurate response.
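The retrieval stage can be sketched as scoring database chunks against each clue and keeping the best matches. This is a minimal lexical-overlap retriever for illustration only; the paper's retriever and the revenue figures below are assumptions, not taken from the source.

```python
def retrieve_by_clues(clues, chunks, top_k=2):
    """Score each chunk by word overlap with its best-matching clue
    and return the top_k chunks (toy retriever, not MemoRAG's own)."""
    def score(chunk):
        chunk_words = set(chunk.lower().split())
        return max(len(chunk_words & set(c.lower().split())) for c in clues)
    return sorted(chunks, key=score, reverse=True)[:top_k]


# Hypothetical database chunks with made-up revenue figures.
chunks = [
    "Revenue of Year 2021 was $1.2M.",
    "Revenue of Year 2022 was $1.8M.",
    "Revenue of Year 2023 was $1.5M.",
    "The company was founded in 1999.",
]
clues = ["Revenue of Year 2021", "Revenue of Year 2022", "Revenue of Year 2023"]
hits = retrieve_by_clues(clues, chunks, top_k=3)
```

Because each sub-query clue matches a different chunk, scattered evidence is collected even though the original question never mentions a specific year.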
Final Answer Generation
After retrieving the necessary information, MemoRAG uses an expressive generation model to synthesize the final answer Y. The generation model takes the retrieved context C, which includes the retrieved evidence text, and produces a comprehensive and precise answer to the original query. The global memory acts as a context provider that enhances the overall quality of the generated answer.
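Concretely, the generator's input is just the retrieved context C combined with the original query. The prompt template below is an assumption for illustration; the paper does not specify its exact prompt format.

```python
def build_generation_prompt(query, retrieved_context):
    """Assemble the input to the generation model: the retrieved
    evidence C followed by the original query q. The wording of the
    template is hypothetical."""
    context_block = "\n".join(f"- {c}" for c in retrieved_context)
    return (
        "Answer the question using the evidence below.\n\n"
        f"Evidence:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )


prompt = build_generation_prompt(
    "Which year had the highest revenue in the past three years?",
    ["Revenue of Year 2022 was $1.8M."],
)
```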
Training the Memory Module
The memory module in MemoRAG is essential for effectively storing and recalling large amounts of information. Its training involves two key stages: pre-training and supervised fine-tuning (SFT).
1. Pre-training with Long Contexts
In the pre-training phase, the memory model learns to compress raw input into memory tokens using long-context data from the RedPajama dataset.
This process enables the model to retain important semantic information while discarding less relevant details. Essentially, it mimics how humans remember key points after reading lengthy articles.
2. Supervised Fine-tuning (SFT)
After pre-training, the memory module undergoes supervised fine-tuning, where it is trained on labeled datasets containing queries and their corresponding answers. This stage focuses on generating task-specific clues that guide information retrieval.
The labeled datasets for SFT are constructed using the six steps illustrated in Figure 4.
The training objective maximizes the probability of generating the next token based on previous tokens and memory. For instance, in a financial analysis task, the model may generate clues like "revenue growth," aiding the retrieval of relevant information.
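In symbols, the objective described above is standard next-token prediction conditioned on the memory. The notation below is a reconstruction consistent with that description, not an equation copied from the paper:

```latex
\max_{\Theta_{\text{mem}}} \; \sum_{i} \log P\left( x_{i+1} \mid x_{\le i},\, m;\; \Theta_{\text{mem}} \right)
```

where \(x_{\le i}\) are the previous tokens, \(m\) is the memory representation, and \(\Theta_{\text{mem}}\) are the memory module's parameters.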
Evaluation
MemoRAG was evaluated using a comprehensive benchmark called ULTRADOMAIN, which includes tasks from diverse domains such as law, finance, and education.
MemoRAG outperformed traditional RAG models, especially in handling tasks involving long input contexts and implicit information needs. For example, in legal and financial tasks, MemoRAG achieved significantly higher accuracy and precision compared to other models. This demonstrates the effectiveness of MemoRAG in both complex and straightforward question-answering tasks.
Conclusion and Insights
This article presented MemoRAG, an advanced Retrieval-Augmented Generation system that introduces memory-based knowledge discovery to enhance retrieval quality for complex and ambiguous tasks.
In summary, MemoRAG’s innovation lies in its ability to:
Form and use a global memory for long-context retrieval.
Generate retrieval clues to improve information access for ambiguous queries.
Significantly outperform standard RAG systems in tasks with long or complex inputs.
However, I believe that challenges remain.
Memory Accuracy: Although the memory module can store essential information, ensuring the generated clues are both accurate and useful is a persistent challenge. This issue can affect the system's overall performance when handling complex queries.
Real-World Application Challenges: Implementing MemoRAG in real-world scenarios may encounter issues such as rapidly changing data and diverse information sources. The system must adapt quickly to new data to sustain its performance.