Will Long-Context LLMs Cause the Extinction of RAG?
Intuitive Perspective, Academic Research and Insights
In 2023, LLM context windows were generally around 4K-8K tokens. However, as of July 2024, LLMs with context windows exceeding 128K tokens are common.
For example, Claude 2 has a 100K context window. Gemini 1.5 claims a 2M context window, and later LongRoPE claims to extend the LLM context window beyond 2 million tokens. Additionally, Llama-3-8B-Instruct-Gradient-4194k has a context length of 4194K tokens. It seems that the size of the context window is no longer a concern when using LLMs.
So people naturally ask: if an LLM can handle all the data at once, why bother building a RAG system?
As a result, some researchers claim that “RAG is dead”. Others insist that long-context LLMs will not lead to the demise of RAG, and that RAG can still be revitalized.
This article focuses on an interesting question: Will long-context LLMs cause the extinction of Retrieval-Augmented Generation (RAG)?
First, this article compares RAG and long-context LLMs from an intuitive perspective. Then, it examines several recent academic papers on this topic. Finally, it shares my thoughts and insights.


