Chunking involves dividing a long text or document into smaller, logically coherent segments or “chunks.” Each chunk usually contains one or more sentences, with the segmentation based on the text’s structure or meaning. Once divided, each chunk can be processed independently or used in subsequent tasks, such as retrieval or generation.
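As a minimal illustration of the idea, the sketch below groups consecutive sentences into fixed-size chunks. The sentence splitter and the `max_sentences` parameter are illustrative assumptions, not part of any particular RAG framework:

```python
import re

def chunk_sentences(text: str, max_sentences: int = 3) -> list[str]:
    """Split text into sentences, then group consecutive sentences into chunks."""
    # Naive sentence split on ., !, ? followed by whitespace
    # (assumption: English prose with standard punctuation).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]

doc = ("RAG retrieves context for generation. Chunking splits the document first. "
       "Each chunk is embedded separately. Queries are matched against chunks.")
print(chunk_sentences(doc, max_sentences=2))
# Two chunks of two sentences each
```

Each resulting chunk can then be embedded, indexed, and retrieved independently.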
The role of chunking in the mainstream RAG pipeline is shown in Figure 1.
In the previous article, we explored various methods of semantic chunking, explaining their underlying principles and practical applications. These methods included:
Embedding-based methods: When the similarity between consecutive sentences drops below a certain threshold, a chunk boundary is introduced.
Model-based methods: Use deep learning models, such as BERT, to predict chunk boundaries within a document.
LLM-based methods: Use LLMs to construct propositions, achieving more refined chunks.
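The embedding-based method from the list above can be sketched as follows. To keep the example self-contained, a toy bag-of-words function stands in for a real embedding model, and the `threshold` value is an arbitrary assumption; in practice you would substitute a sentence-embedding model and tune the threshold:

```python
import math
import re

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def embedding_chunk(sentences, embed, threshold=0.5):
    """Start a new chunk whenever the similarity between consecutive
    sentence embeddings drops below `threshold`."""
    chunks, current = [], [sentences[0]]
    prev_vec = embed(sentences[0])
    for sent in sentences[1:]:
        vec = embed(sent)
        if cosine(prev_vec, vec) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
        prev_vec = vec
    chunks.append(" ".join(current))
    return chunks

# Toy bag-of-words "embedding" -- a stand-in for a real sentence encoder.
def toy_embed(sentence: str) -> list[float]:
    vocab = ["cat", "dog", "stock", "bond", "pet", "market"]
    words = re.findall(r"[a-z]+", sentence.lower())
    return [float(words.count(w)) for w in vocab]

sents = [
    "The cat chased the dog.",
    "A pet cat sleeps.",
    "Stock and bond prices rose.",
    "The stock market rallied.",
]
print(embedding_chunk(sents, toy_embed, threshold=0.1))
# Topic shift between sentences 2 and 3 produces two chunks
```

The boundary falls where the topic shifts from pets to finance, which is exactly the drop in consecutive-sentence similarity the method looks for.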
However, since the previous article was published on February 28, 2024, there have been significant advancements in chunking over the past few months. Therefore, this article presents some of the latest developments in chunking within the RAG pipeline, focusing primarily on the following topics: