A New Approach to Optimizing Query Generation in RAG

Aug 04, 2024

In the fast-evolving field of natural language processing, large language models (LLMs) have demonstrated impressive performance across various language tasks. However, the challenge of generating incorrect information, known as "hallucinations," has become increasingly critical.

To address this challenge, Retrieval-Augmented Generation (RAG) systems leverage document retrieval to provide more accurate answers. Despite their promise, existing RAG systems still face challenges due to vague queries.

This article introduces a new study titled "Optimizing Query Generation for Enhanced Document Retrieval in RAG". This study aimes to enhance document retrieval in RAG by optimizing query generation.

Solution

A method called QOQA (Query Optimization using Query Expansion) was proposed, as illustrated in Figure 1.

Figure 1: Concept figure of QOQA. Given expansion query with top-k docs, we add top-3 rephrased queries and scores to LLM. We optimize the query based on the scores and generate the rephrased query. Source: https://arxiv.org/pdf/2407.12325.

The corresponding prompt is shown in Figure 2.

Figure 2: Prompt template used in QOQA. The black texts describe instructions of the optimizing task. The blue texts are original query with top-N retrieved documents with the original query. The purple texts are revised queries by LLM optimizer and scores. Source: https://arxiv.org/pdf/2407.12325.

The QOQA method optimizes query generation through the following steps, thereby improving the document retrieval accuracy in RAG systems.

Query Expansion and Reconstruction

The core idea of QOQA is to utilize large language models (LLMs) to rephrase queries. First, the system retrieves the top N relevant documents using the original query. These retrieved documents, together with the original query, form an expanded query. Then, LLMs generate rephrased queries based on these expanded queries.

Query-Document Alignment Score

To evaluate and optimize the generated queries, the researchers introduced a query-document alignment score. This scoring system includes three evaluation criteria: BM25 score, dense score, and hybrid score.

BM25 Score: Based on a sparse retrieval model, it evaluates the frequency and weight of query terms in the document.
Dense Score: Using a dense retrieval model, it evaluates the alignment between queries and documents through similarity in the embedding vector space.
Hybrid Score: Combines both BM25 and dense scores, optimizing the final score by adjusting the parameter α.

Optimization Process

During the optimization process, the QOQA method updates the query template, including the original query, retrieved documents, and the top K rephrased queries. In each iteration, LLMs generate new rephrased queries based on these scores and add them to the query bucket. Through multiple iterations, the system continuously optimizes the queries to ensure they outperform the original query.

Evaluation

The experimental results showed that the QOQA method significantly improved document retrieval performance compared to baseline models. The performance of various document retrieval models across these datasets is summarized in Figure 3.

Figure 3: Results of document retrieval task. All scores denote nDCG@10. Bold indicates the best result across all models, and the second best is underlined. Source: https://arxiv.org/pdf/2407.12325.

Additionally, case analyses demonstrated that queries generated using QOQA were more precise and concrete than the original queries, leading to the retrieval of more relevant documents. Examples from the SciFact and FiQA datasets are presented in Figure 4.

Figure 4: Examples from SciFact, and FiQA dataset. Blue texts are overlapping keywords between answer document and rephrased query. Source: https://arxiv.org/pdf/2407.12325.

Conclusion

This article introduces a method to mitigate the hallucination problem in Retrieval-Augmented Generation (RAG) systems by optimizing query generation.

Utilizing a top-k averaged query-document alignment score, QOQA refined queries using LLMs to improve the precision and computational efficiency of document retrieval. This study highlights the importance of precise query generation in enhancing the reliability and effectiveness of RAG systems.

AI Exploration Journey

Discussion about this post