dots-ocr.py · uv-scripts/ocr at main
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Some models can also predict bounding boxes for images, charts, etc. You would still need to do the extra step of grabbing the images from the bounding boxes, but it can work quite well. dots.ocr is the one I've used most for this, i.e. this mode huggingface.co/datasets/uv-...
05.03.2026 15:48
👍 0
🔁 0
💬 1
📌 0
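The "extra step" of grabbing images from predicted bounding boxes, mentioned in the post above, could look roughly like the sketch below. It assumes the model returns a list of dicts with a `bbox` of `[x1, y1, x2, y2]` pixel coordinates and a `category` label; that layout format, the `"Picture"` label, and the function name `figure_boxes` are illustrative assumptions, so check them against the actual output of whichever model you run.

```python
# Assumed layout format: a list of dicts with "bbox" ([x1, y1, x2, y2] in pixels)
# and "category" keys. This is an illustration, not a documented schema.
FIGURE_CATEGORIES = {"Picture"}  # hypothetical label set for figure-like regions


def figure_boxes(layout, width, height, categories=FIGURE_CATEGORIES):
    """Return (left, top, right, bottom) crop boxes clamped to the page size."""
    boxes = []
    for item in layout:
        if item.get("category") not in categories:
            continue
        x1, y1, x2, y2 = item["bbox"]
        # Tolerate swapped corners, then clamp to the page bounds.
        left, right = sorted((int(x1), int(x2)))
        top, bottom = sorted((int(y1), int(y2)))
        left, top = max(0, left), max(0, top)
        right, bottom = min(width, right), min(height, bottom)
        # Skip boxes that collapse to nothing after clamping.
        if right > left and bottom > top:
            boxes.append((left, top, right, bottom))
    return boxes


# Made-up example predictions for a 640x960 page.
layout = [
    {"bbox": [40, 60, 300, 420], "category": "Picture"},
    {"bbox": [40, 440, 600, 900], "category": "Text"},
    {"bbox": [500, -10, 700, 200], "category": "Picture"},
]
boxes = figure_boxes(layout, width=640, height=960)
print(boxes)  # [(40, 60, 300, 420), (500, 0, 640, 200)]
```

Each tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so with Pillow installed the crops themselves would be `[page.crop(box) for box in boxes]` for a `page` opened with `Image.open`.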
Still an early experiment. Would love feedback on whether something like this would be useful for your work!
05.03.2026 14:48
👍 0
🔁 0
💬 1
📌 0
GitHub - davanstrien/ocr-bench: Per-collection OCR leaderboards using VLM-as-judge
Point it at any Hugging Face dataset and it launches OCR models, compares their outputs pairwise using a VLM judge, and publishes an interactive leaderboard.
Inspired by datalab's benchmarks approach, but open source so you can run it on your own collections.
github.com/davanstrien/...
05.03.2026 14:48
👍 2
🔁 0
💬 1
📌 0
Screenshot of a plot showing ELO vs. parameter count for different OCR models
There is no best VLM OCR model - rankings can flip completely by document type.
I built ocr-bench: run open OCR models on YOUR documents, get a per-collection leaderboard.
VLM-as-judge with Bradley-Terry ELO, all running on @hf.co. No local GPU needed.
05.03.2026 14:48
👍 48
🔁 10
💬 1
📌 1
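For readers unfamiliar with the "Bradley-Terry ELO" mentioned in the post above, here is a minimal sketch of how pairwise judge verdicts can be turned into ratings. It is not taken from ocr-bench; the function names, the model names, and the 1500/400 Elo-style scaling are assumptions for illustration. It fits Bradley-Terry strengths with the standard iterative (minorization-maximization) update and then maps them to a log scale.

```python
import math
from collections import defaultdict


def bradley_terry(results, iters=200):
    """Fit Bradley-Terry strengths from a list of (winner, loser) pairs."""
    wins = defaultdict(float)
    games = defaultdict(float)  # comparisons per unordered pair of models
    for winner, loser in results:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1
    models = sorted({m for pair in results for m in pair})
    p = {m: 1.0 / len(models) for m in models}
    for _ in range(iters):
        new_p = {}
        for i in models:
            denom = sum(
                games.get(frozenset((i, j)), 0.0) / (p[i] + p[j])
                for j in models
                if j != i
            )
            new_p[i] = wins[i] / denom if denom else p[i]
        total = sum(new_p.values())
        p = {m: v / total for m, v in new_p.items()}
    return p


def to_elo(strengths, center=1500.0, scale=400.0):
    """Map strengths to an Elo-like scale centered on `center` (assumed values)."""
    # Note: a model with zero wins has strength 0 and would need smoothing first.
    log_mean = sum(math.log10(s) for s in strengths.values()) / len(strengths)
    return {m: center + scale * (math.log10(s) - log_mean) for m, s in strengths.items()}


# Hypothetical judge verdicts: (winner, loser) for each pairwise comparison.
results = (
    [("model-a", "model-b")] * 3 + [("model-b", "model-a")]
    + [("model-b", "model-c")] * 3 + [("model-c", "model-b")]
    + [("model-a", "model-c")] * 3 + [("model-c", "model-a")]
)
ratings = to_elo(bradley_terry(results))
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.0f}")
```

Because the ratings come only from comparisons on one collection, rerunning the fit on verdicts from a different document type can produce a different ordering, which is the point the post makes.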
This sounds great!
02.03.2026 17:58
👍 1
🔁 0
💬 0
📌 0
Screenshot of a search UI showing a text box with search results showing index cards next to the ocr for the card
Is it worth re-OCR'ing old library index cards?
Re-OCR'd 453,000 cards from @bpl.boston.gov's rare books catalogue.
~$50 compute using @huggingface Jobs
BPL's own guide calls their search "extremely unreliable." Does better OCR + semantic search help fix it?
Demo space link below
27.02.2026 17:09
👍 41
🔁 8
💬 1
📌 0
Ran the same OCR models on 68 pages of historic newspaper. Every model hallucinated or looped.
DeepSeek-OCR-2, LightOnOCR-2, GLM-OCR – all melt down on dense newspaper columns.
You can try it yourself using this @hf.co dataset: huggingface.co/datasets/dav...
23.02.2026 14:07
👍 20
🔁 3
💬 4
📌 0
Great to hear of some fresh eyes on this task! Think there is a lot that wasn't possible a few years ago that is now.
23.02.2026 17:26
👍 2
🔁 0
💬 0
📌 0
Looking forward to reading it! Looking forward to it even more if it comes with data 😛
23.02.2026 14:26
👍 1
🔁 0
💬 0
📌 0
llama.cpp logo + Hugging Face logo
Llama.cpp joins Hugging Face
github.com/ggml-org/lla...
20.02.2026 14:04
👍 54
🔁 7
💬 2
📌 1
Nice!
20.02.2026 21:27
👍 0
🔁 0
💬 0
📌 0
@willwhim.com bsky.app/profile/dani...
19.02.2026 14:28
👍 5
🔁 0
💬 0
📌 0
Yeah, quality is very mixed by language. I have a vague recollection of someone working a lot on Sanskrit OCR using open models on the Hub. Will post if I remember where that was!
19.02.2026 13:17
👍 3
🔁 0
💬 0
📌 0
Will try to write something a bit more detailed for this!
19.02.2026 12:58
👍 2
🔁 0
💬 1
📌 0
Screenshot of old vs. new OCR. The old OCR text is garbled; the new OCR is much cleaner.
Re-OCR'd the complete 1771 Encyclopaedia Britannica (2,724 pages) with a single command on @hf.co Jobs.
- 0.9B model (GLM-OCR)
- ~$0.002/page
- ~$5 total on an L4 GPU
Before (old Tesseract OCR) → After
19.02.2026 11:29
👍 96
🔁 16
💬 5
📌 6
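The cost figures in the Britannica post above can be sanity-checked with a couple of lines. The page count and per-page cost are the approximate values quoted in the post; the function name `estimate_cost` is just for illustration.

```python
def estimate_cost(pages, cost_per_page):
    """Rough total compute cost in dollars for an OCR run."""
    return pages * cost_per_page


# Figures quoted in the post (both approximate).
total = estimate_cost(pages=2724, cost_per_page=0.002)
print(f"~${total:.2f}")  # ~$5.45
```

That lands close to the "~$5 total" in the post, which is about as precise as the rounded per-page figure allows.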
table of contents showing ocr models supported in the repo
The uv-scripts/ocr collection now includes 13 models, including GLM-OCR, a 0.9B model that scores 94.6% on OmniDocBench.
One command to run any of them on your dataset via @hf.co Jobs.
huggingface.co/datasets/uv-...
17.02.2026 15:06
👍 10
🔁 1
💬 0
📌 0
Spaces Configuration Reference
If you have the model ID in the Space's demo code, it will often get picked up automatically. Otherwise, you can specify the model in the `models` field in the Space's YAML metadata; see huggingface.co/docs/hub/en/...
16.02.2026 10:10
👍 1
🔁 0
💬 1
📌 0
Join us tomorrow for a demo of IIIF Illustration Detector!
Zoom link: iiif.io/community
10.02.2026 17:22
👍 3
🔁 3
💬 0
📌 0
Datasets and benchmarks drive AI progress, but finding papers that introduce new ones means digging through thousands of arXiv abstracts.
Updated the Dataset Papers on ArXiv app to surface them: 52K+ papers classified as introducing new datasets from 212K CS papers.
09.02.2026 10:13
👍 9
🔁 1
💬 1
📌 0