Graph RAG: An Approach to Answering Global Queries

Jun 10, 2024

Traditional RAG struggles to answer global queries directed at an entire text corpus, such as determining the main theme of a dataset. This is essentially a Query-Focused Summarization (QFS) task rather than a straightforward retrieval task.

Graph RAG addresses this by using an LLM to construct a graph-based text index in two stages:

Initially, it derives a knowledge graph from the source documents.
Subsequently, it generates community summaries for all closely connected entity groups.

Given a query, each community summary contributes to a partial response. These partial responses are then aggregated to form the final answer.

Overview

Figure 1 shows the pipeline of Graph RAG. The purple box signifies the indexing operations, while the green box indicates the query operations.

Figure 1: The pipeline of Graph RAG. Source: Graph RAG.

Graph RAG employs LLM prompts specific to the dataset's domain to detect, extract, and summarize nodes (like entities), edges (like relations), and covariates (like claims).

Community detection is used to divide the graph into groups of elements (nodes, edges, covariates) that the LLM can summarize at both indexing time and query time.

The "global answer" for a particular query is produced by conducting a final round of query-focused summarization on all community summaries associated with that query.

The implementation of each step in Figure 1 is explained below. Note that as of June 10, 2024, Graph RAG is not open source, so the discussion cannot refer to its source code.

Step 1: Source Documents → Text Chunks

Choosing the chunk size involves a longstanding trade-off.

Longer chunks mean fewer LLM calls. However, the LLM finds it harder to pick up everything in a long context window, so the recall rate tends to decline as chunks grow.

Figure 2: How the entity references detected in the HotPotQA dataset varies with chunk size and gleanings for the generic entity extraction prompt with gpt-4-turbo. Source: Graph RAG.

As illustrated in Figure 2, for the HotPotQA dataset, a chunk size of 600 tokens yields roughly twice as many entity references as a chunk size of 2400 tokens.
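To make the trade-off concrete, here is a minimal sketch of a chunker. It is not Graph RAG's implementation: whitespace-separated words stand in for model tokens, and the function name `chunk_tokens` and the `overlap` parameter are assumptions introduced for illustration.

```python
def chunk_tokens(text: str, chunk_size: int = 600, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most chunk_size tokens, sharing overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    # Assumption: whitespace-separated words approximate tokens.
    tokens = text.split()
    step = chunk_size - overlap

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With these defaults, a 1500-word document produces three chunks and therefore three extraction calls. Setting `chunk_size=2400` would need only one call, at the cost of the recall loss described above.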

Step 2: Text Chunks → Element Instances (Entities and Relationships)

The method involves constructing a graph by extracting entities and their relationships from each chunk. This is achieved through the combination of LLMs and prompt engineering.

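In addition, Graph RAG uses a multi-round iterative process (the "gleanings" in Figure 2). After each round, the LLM is asked whether all entities have been extracted, which is effectively a binary (yes/no) classification problem. If the answer is no, another round tries to pick up the entities that were missed.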
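The check-and-continue loop can be sketched roughly as follows. The `llm` callable, the prompt wording, and the `max_gleanings` cap are all hypothetical stand-ins, since the actual prompts are not available here.

```python
from typing import Callable


def extract_with_gleanings(
    chunk: str,
    llm: Callable[[str], str],
    max_gleanings: int = 1,
) -> list[str]:
    """Run one extraction pass, then re-prompt while the LLM reports misses."""
    # Hypothetical prompt text; a real prompt would be tailored to the domain.
    outputs = [llm(f"Extract all entities and relationships.\n\nText: {chunk}")]

    for _ in range(max_gleanings):
        # Binary check: did the previous rounds capture everything?
        verdict = llm(
            "Were all entities extracted? Answer YES or NO.\n\n"
            f"Text: {chunk}\n\nExtracted so far:\n" + "\n".join(outputs)
        )
        if verdict.strip().upper().startswith("YES"):
            break
        # Continuation round: ask only for what was missed.
        outputs.append(
            llm(
                "Some entities were missed. Extract only the missing ones.\n\n"
                f"Text: {chunk}\n\nExtracted so far:\n" + "\n".join(outputs)
            )
        )
    return outputs
```

Each gleaning round costs two extra LLM calls per chunk (one check, one continuation), so the cap trades indexing cost against recall.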

Step 3: Element Instances → Element Summaries → Graph Communities → Community Summaries

The extraction of entities, relationships, and claims in the previous step is already a form of abstractive summarization.

However, Graph RAG considers this insufficient and uses LLMs to further summarize these "elements".

A potential concern is that LLMs may not always extract references to the same entity in the same text format. This could lead to duplicate entity elements, consequently generating duplicate nodes in the graph.

Community detection, described next, largely resolves this concern.

Figure 3: Graph communities detected using the Leiden algorithm on the MultiHop-RAG dataset. Circles in the graph are entity nodes, and their sizes are proportional to their degrees. The colors of the nodes represent different entity communities, shown at two levels of hierarchical clustering: (a) Level 0, which corresponds to the hierarchical partition with maximum modularity, and (b) Level 1, revealing the internal structure within these root-level communities. Source: Graph RAG.

Graph RAG employs community detection algorithms to identify community structures within the graph, placing closely linked entities in the same community. Figure 3 presents the graph communities identified in the MultiHop-RAG dataset using the Leiden algorithm.

In this scenario, even if the LLM does not name an entity consistently during extraction, community detection can still connect the variants, provided they link to a shared set of closely related entities. Once the variants fall into the same community, the LLM summarizing that community can treat them as different expressions or synonyms of the same entity. This is akin to entity disambiguation in the field of knowledge graphs.
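The caption of Figure 3 mentions the partition with maximum modularity. As a small illustration of what that score measures, the sketch below computes modularity for an undirected, unweighted graph without self-loops. It is not the Leiden algorithm itself, only the quality score that modularity-based methods try to improve, and the function and argument names are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges: list[tuple[str, str]], community_of: dict[str, str]) -> float:
    """Modularity Q = sum over communities of L_c / m - (d_c / (2m))^2.

    m is the number of edges, L_c the number of edges inside community c,
    and d_c the total degree of the nodes in community c.
    """
    m = len(edges)
    if m == 0:
        return 0.0

    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: summed node degrees in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)
```

For two triangles joined by a single edge, assigning each triangle to its own community gives Q = 5/14, while putting all six nodes in one community gives Q = 0. The higher score reflects that the two-community split captures the dense groups better.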

Figure 4: The generation method of community summary. Source: Graph RAG.

After identifying the communities, we can generate report-like summaries for each community within the Leiden hierarchy. These summaries are independently useful for understanding the global structure and semantics of the dataset, and they can be used to make sense of the corpus even when no specific question is given.

Figure 4 shows how community summaries are generated.
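As I read the Graph RAG paper, the input for a leaf-level community summary is built by ranking the community's edges by the combined degree of their two endpoints and adding the related descriptions until a token limit is reached. The sketch below illustrates that idea under simplifying assumptions: covariates are omitted, whitespace words stand in for tokens, and the function name and data layout are hypothetical.

```python
def build_leaf_context(
    edges: list[tuple[str, str]],
    node_desc: dict[str, str],
    edge_desc: dict[tuple[str, str], str],
    token_limit: int,
) -> list[str]:
    """Collect node and edge descriptions, most prominent edges first."""
    # Node degree within the community.
    degree: dict[str, int] = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Rank edges by combined endpoint degree, highest first (stable for ties).
    ranked = sorted(edges, key=lambda e: degree[e[0]] + degree[e[1]], reverse=True)

    context: list[str] = []
    seen = set()
    used = 0
    for u, v in ranked:
        items = ((u, node_desc[u]), (v, node_desc[v]), ((u, v), edge_desc[(u, v)]))
        for key, text in items:
            if key in seen:
                continue  # each description is added at most once
            cost = len(text.split())  # assumption: words approximate tokens
            if used + cost > token_limit:
                return context
            context.append(text)
            seen.add(key)
            used += cost
    return context
```

The returned list would then be passed to the LLM with a summarization prompt to produce the community report.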

Step 4: Community Summaries → Community Answers → Global Answer

We've now reached the final step: generating the final answer based on the community summary from the previous step.

Due to the hierarchical nature of community structure, summaries from different levels can answer various questions.

However, this brings us to another question: with multiple levels of community summaries available, which level can strike a balance between detail and coverage?

The Graph RAG paper addresses this through further evaluation (Section 3 of the paper), comparing community levels to find the most suitable level of abstraction.

For a given community level, the global answer to any user query is generated, as shown in Figure 5.
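The process resembles a map-reduce over community summaries, which can be sketched as follows. The details here follow my reading of the paper (shuffle the summaries, pack them into chunks, score each partial answer for helpfulness, drop zero scores, then fill the final context in descending score order). The two LLM callables, the budget parameters, and the word-based token count are assumptions for illustration, and the map step runs sequentially here for simplicity.

```python
import random
from typing import Callable


def global_answer(
    query: str,
    summaries: list[str],
    map_llm: Callable[[str, str], tuple[str, int]],  # returns (answer, score 0-100)
    reduce_llm: Callable[[str, list[str]], str],
    chunk_budget: int = 50,
    context_budget: int = 100,
    seed: int = 0,
) -> str:
    """Answer a global query from community summaries in map and reduce steps."""
    def n_tokens(s: str) -> int:
        return len(s.split())  # assumption: words approximate tokens

    # Prepare: shuffle the summaries and pack them into chunks.
    shuffled = summaries[:]
    random.Random(seed).shuffle(shuffled)
    chunks, current, used = [], [], 0
    for s in shuffled:
        cost = n_tokens(s)
        if current and used + cost > chunk_budget:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(s)
        used += cost
    if current:
        chunks.append("\n".join(current))

    # Map: one partial answer per chunk, each with a helpfulness score.
    scored = [map_llm(query, chunk) for chunk in chunks]
    useful = sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)

    # Reduce: add the most helpful answers until the context budget is full.
    context, used = [], 0
    for answer, _score in useful:
        cost = n_tokens(answer)
        if used + cost > context_budget:
            break
        context.append(answer)
        used += cost
    return reduce_llm(query, context)
```

Filtering out zero-score answers keeps irrelevant communities from taking up space in the final context window.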

Figure 5: The process of generating the global answer for a given community level. Image by author.

Conclusion

This article presents Graph RAG. This method merges knowledge graph generation, RAG, and Query-Focused Summarization (QFS) to facilitate comprehensive understanding of the entire text corpus.
