Show HN: OCR Workbench: AI OCR for hard documents

2 viking2917 3 5/13/2025, 7:06:11 PM github.com ↗

OCR on old documents is hard. OCR Workbench uses AI for OCR and provides an editing environment to clean things up, as is inevitably required.

Inspired by this Hacker News post: https://news.ycombinator.com/item?id=43048698

Backstory: I was having trouble producing transcriptions of Colonial American documents, which have their own unique challenges for OCR, and things like Tesseract fail miserably. So I built something. It uses Gemini and seems to work pretty well (disclaimer: you need your own API key). I didn't build in Claude support, but I expect it would work similarly well.

FWIW: largely vibe coded, with human review and intervention as required.

Comments (3)

keepsweet · 21h ago

Interesting concept. I tried it with a text written in Church Slavonic, didn't work. I guess the documents don't have to be THAT old. It would also be nice if you could upload images individually instead of selecting everything from a folder. Either way, nice work.

viking2917 · 11h ago

Thanks! I have not tried anything other than English, not sure how good the LLMs are for that. Did you use Tesseract or Gemini?

Once the page structure is set up from the images (via the directory upload), you can upload new images for each page, but I didn't include an option to just create all the pages manually. It's a good idea. Going to add that...

viking2917 · 9h ago

I updated the app to allow manual add/delete of pages. Then tap the page and do New Image to upload per page. Thanks for the feedback!

