AI Exploration Journey
PDF Parsing and Document Intelligence
HunyuanOCR: Unifying Multi-Stage OCR Pipelines into an End-to-End 1B VLM — AI Innovations and Insights 91
Real-world OCR isn’t just about reading crisp PDFs.
22 hrs ago • Florian
MonkeyOCR v1.5: Making Complex PDFs Parseable — AI Innovations and Insights 90
If you’ve ever worked with real scanned documents or PDFs, you’ve likely run into this mess: a table nested inside another table, split awkwardly across…
Nov 22 • Florian
From Résumés(PDFs) to Clean Data: Layout-Aware Parsing with Tiny LLMs — AI Innovations and Insights 88
Have you ever wondered how to parse a résumé, or had to work with résumés on the job?
Nov 11 • Florian
Making Documents Talk with Doc-Researcher: Document Parsing + Hybrid Retrieval for Multi-Agent Research — AI Innovations and Insights 86
Did you know that document parsing can be integrated with Deep Research?
Nov 4 • Florian
Hybrid OCR-LLM: Not a Bigger Model, but a Smarter Pipeline — AI Innovations and Insights 84
Have you ever encountered documents like forms, certificates, or reports during document parsing?
Oct 30 • Florian
DeepSeek-OCR: See Less, Remember More — AI Innovations and Insights 83
Can document parsing cut token usage without sacrificing OCR accuracy?
Oct 26 • Florian
From Big Picture to Details: MinerU 2.5 Redefines Document Parsing — AI Innovations and Insights 77
In a previous post, I introduced MinerU, a popular open-source framework for document parsing built around a pipeline-based design (AI Innovations and…
Oct 5 • Florian
Taming Chaotic Layouts: SFT + Layout-Centric RL for Document Understanding — AI Innovations and Insights 75
Complex layouts and reading order have always been among the trickiest parts of document understanding.
Sep 28 • Florian
DianJin-OCR-R1: a Smarter OCR Pipeline for LVLMs That Think Before Response — AI Innovations and Insights 74
How do we turn a "story-telling" LVLM into an engineer that sees clearly and speaks accurately, and bring OCR hallucinations down to earth?
Sep 21 • Florian
Rethinking Scanned Document Parsing with Layout-Aware RL — AI Innovations and Insights 67
Welcome back, let's dive into Chapter 67 of this insightful series! AI Exploration Journey is a reader-supported publication.
Aug 22 • Florian
PMA: Adaptive and Parallel Documents Understanding in Multi-Agent Systems — AI Innovations and Insights 65
Welcome back, let's dive into Chapter 65 of this insightful series!
Aug 12 • Florian
4 AI Breakthroughs You Can't Miss — AI Innovations and Insights 63
Welcome back, let's dive into Chapter 63 of this insightful series!
Aug 6 • Florian