Most existing Retrieval-Augmented Generation (RAG) methods retrieve only short, contiguous chunks from a corpus, which limits how well the model can understand the document as a whole.
Recursive Abstractive Processing for Tree-Organized Retrieval (RAPTOR) introduces a tree-based retrieval system that recursively embeds, clusters, and summarizes text chunks. It builds the tree from the bottom up, so that each higher level holds more abstract summaries of the level beneath it. During inference, RAPTOR retrieves from this tree, integrating information from across long documents at different levels of abstraction.
This article introduces the principles behind RAPTOR, walks through its code, and shares some insights.
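To make the embed, cluster, summarize loop described above more concrete, here is a minimal, dependency-free sketch. It is not RAPTOR's actual code: `embed`, `cluster`, `summarize`, `build_tree`, and `retrieve` are hypothetical names, and each is a toy stand-in (bag-of-words vectors, greedy pairing, and word truncation) for the real embedding model, clustering algorithm, and LLM summarizer a production system would use. The flat search over all levels in `retrieve` is likewise just one possible way to query the tree.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    level: int                      # 0 = leaf chunk, higher = more abstract
    children: list = field(default_factory=list)


def embed(text):
    """Toy embedding (assumption): bag-of-words counts instead of a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(nodes, size=2):
    """Toy clustering (assumption): pair each seed with its most similar neighbours."""
    remaining = list(nodes)
    clusters = []
    while remaining:
        seed = remaining.pop(0)
        seed_vec = embed(seed.text)
        remaining.sort(key=lambda n: cosine(seed_vec, embed(n.text)), reverse=True)
        clusters.append([seed] + remaining[:size - 1])
        remaining = remaining[size - 1:]
    return clusters


def summarize(texts, max_words=12):
    """Toy summarizer (assumption): keep the first few words of each child text."""
    per_text = max(1, max_words // len(texts))
    return " ".join(" ".join(t.split()[:per_text]) for t in texts)


def build_tree(chunks):
    """Build the tree bottom-up and return its levels, leaves first."""
    level = [Node(text=c, level=0) for c in chunks]
    levels = [level]
    while len(level) > 1:
        parents = [
            Node(summarize([n.text for n in group]), group[0].level + 1, group)
            for group in cluster(level)
        ]
        levels.append(parents)
        level = parents
    return levels


def retrieve(levels, query, k=2):
    """Rank nodes from every level against the query and return the top k."""
    q = embed(query)
    nodes = [n for lvl in levels for n in lvl]
    return sorted(nodes, key=lambda n: cosine(q, embed(n.text)), reverse=True)[:k]


if __name__ == "__main__":
    chunks = [
        "Cats purr when they are content.",
        "Cats sleep for most of the day.",
        "Dogs bark to alert their owners.",
        "Dogs enjoy long walks outside.",
    ]
    levels = build_tree(chunks)
    for depth, lvl in enumerate(levels):
        print(depth, [n.text for n in lvl])
    for hit in retrieve(levels, "why do cats purr"):
        print(hit.level, hit.text)
```

Because leaf chunks and summaries are scored in the same pool, a query can be answered by a detailed leaf, a higher-level summary, or a mix of both, which is the property the tree is meant to provide.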