Will Long-Context LLMs Cause the Extinction of RAG?

Intuitive Perspective, Academic Research and Insights

Florian
Aug 02, 2024

In 2023, LLM context windows were generally around 4K-8K tokens. As of July 2024, however, LLMs with context windows exceeding 128K tokens are common.

For example, Claude 2 has a 100K context window, Gemini 1.5 claims a 2M-token context, and LongRoPE claims to extend the LLM context window beyond 2 million tokens. Additionally, Llama-3-8B-Instruct-Gradient-4194k has a context length of 4194K tokens. It seems that context window size is no longer a concern when using LLMs.

This naturally raises a question: if an LLM can take in all of the data at once, why bother building a RAG system? The sketch below illustrates the difference between the two approaches.
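To make the trade-off concrete, here is a minimal Python sketch of the two approaches. It is illustrative only: `call_llm` is a hypothetical placeholder for any chat-completion API, and the keyword-overlap retriever is a toy stand-in for the embedding-based retrieval a real RAG system would use.

```python
# A toy contrast between the two approaches, not the article's implementation.
# `call_llm` is a hypothetical placeholder for any chat-completion API, and the
# keyword-overlap retriever stands in for a real embedding-based vector store.

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM API call."""
    return f"<answer generated from a {len(prompt)}-character prompt>"

documents = [
    "RAG retrieves the most relevant chunks before generation.",
    "Long-context LLMs can accept hundreds of thousands of tokens.",
    "Claude 2 supports a 100K-token context window.",
]
question = "How does RAG differ from long-context prompting?"

# Approach 1: long-context LLM -- stuff every document into one prompt.
long_context_prompt = "\n".join(documents) + f"\n\nQuestion: {question}"
print(call_llm(long_context_prompt))

# Approach 2: RAG -- retrieve only the top-k relevant chunks, then prompt.
def relevance(doc: str, query: str) -> int:
    """Toy relevance score: number of words the doc shares with the query."""
    return len(set(doc.lower().split()) & set(query.lower().split()))

top_k = sorted(documents, key=lambda d: relevance(d, question), reverse=True)[:2]
rag_prompt = "\n".join(top_k) + f"\n\nQuestion: {question}"
print(call_llm(rag_prompt))
```

The long-context route sends everything and pays for every token on every query; the RAG route spends effort on retrieval so the prompt stays small. Whether that retrieval step is still worth it is exactly the question this article examines.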

As a result, some researchers have declared that "RAG is dead", while others insist that long-context LLMs will not spell the end of RAG, and that RAG can still be revitalized.

This article focuses on this interesting topic: will long-context LLMs cause the extinction of Retrieval-Augmented Generation (RAG)?

Figure 1: RAG vs Long-Context LLMs. Image by author.

First, this article compares RAG and long-context LLMs from an intuitive perspective. Then, it examines several recent academic papers on this topic. Finally, it shares my thoughts and insights.
