AI Exploration Journey

BookRAG: A Document = One Tree + One Graph + One Agent — AI Innovations and Insights 95

Dec 09, 2025

In real-world enterprise environments, knowledge rarely lives in a tidy FAQ. More often, it’s buried in dense technical manuals, API references, SOPs, and research papers—long documents that look and behave more like books. They come with chapters and sub-sections, embedded tables and formulas, and a clear but complex hierarchical layout.

But existing RAG systems, including text-based graph methods and layout-segmented approaches, tend to break down on these documents for two reasons: they treat a document's structure and its semantics as disconnected, and they rely on static workflows.

This post might offer a useful perspective.

Why Most RAG Systems Struggle with “Book-like” Documents

Two Traditional Approaches (and Their Limitations)

There are two mainstream paradigms people use to process these kinds of documents.

  1. Text-first approach: This method flattens everything into plain text, relying primarily on OCR. It then applies retrieval techniques such as BM25, classic chunk-based RAG, or structure-building methods like GraphRAG (graph) and RAPTOR (tree).

    • GraphRAG builds a knowledge graph from the text and applies community detection to form hierarchical clusters with summaries.

    • RAPTOR recursively clusters and summarizes chunks to form a tree-like structure.

  2. Layout-first approach: This one preserves the original document layout. It segments content into structured blocks (paragraphs, tables, figures, equations) and uses multimodal retrieval or LLM-based processing pipelines (like DocETL) to handle relevant chunks.
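To make the contrast between the two paradigms concrete, here is a minimal sketch of what each one keeps from the same source. Everything in it is an assumption for illustration: the toy `DOC` content, the `Block` type, the function names, and the 60-character chunk size are hypothetical and do not come from GraphRAG, RAPTOR, or DocETL.

```python
from dataclasses import dataclass

# Hypothetical toy document: (block_type, section_path, content) triples.
DOC = [
    ("heading", ("1 Install",), "1 Install"),
    ("paragraph", ("1 Install",), "Run the installer and accept the defaults."),
    ("table", ("1 Install",), "OS | Min RAM\nLinux | 4 GB\nmacOS | 8 GB"),
    ("heading", ("2 Configure",), "2 Configure"),
    ("paragraph", ("2 Configure",), "Set the port in config.yaml before first start."),
]


def flatten(doc):
    """Text-first step 1: collapse every block into one plain string."""
    return " ".join(content.replace("\n", " ") for _, _, content in doc)


def text_first_chunks(doc, chunk_size=60):
    """Text-first step 2: cut the flat string into fixed-size character chunks."""
    flat = flatten(doc)
    return [flat[i:i + chunk_size] for i in range(0, len(flat), chunk_size)]


@dataclass
class Block:
    kind: str        # "heading", "paragraph", "table", ...
    section: tuple   # path of enclosing section titles
    content: str     # raw content, line breaks preserved


def layout_first_blocks(doc):
    """Layout-first: keep each block's type and its place in the hierarchy."""
    return [Block(kind, section, content) for kind, section, content in doc]


if __name__ == "__main__":
    for chunk in text_first_chunks(DOC):
        print(repr(chunk))
    for block in layout_first_blocks(DOC):
        print(block.kind, block.section)
```

In the text-first output, chunk boundaries fall wherever the character count dictates, so a table row can be split across chunks and nothing records which section a chunk came from. The layout-first blocks keep that information, but turning them into good retrieval results still depends on the pipeline built on top.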

© 2025 Florian June