Show HN: OCR Workbench: AI OCR for hard documents

2 viking2917 3 5/13/2025, 7:06:11 PM github.com ↗

OCR on old documents is hard. OCR Workbench uses AI for OCR and provides an editing environment to clean things up, as is inevitably required.

Inspired by this Hacker News post: https://news.ycombinator.com/item?id=43048698

Backstory: I was having trouble producing transcriptions of Colonial American documents, which have their own unique challenges for OCR, and things like Tesseract fail miserably. So I built something. It uses Gemini and seems to work pretty well (disclaimer: you need your own API key). I didn't build in Claude support, but I expect it would work similarly well.

FWIW: largely vibe coded, with human review and intervention as required.

Comments (3)

keepsweet · 18h ago

Interesting concept. I tried it with a text written in Church Slavonic, didn't work. I guess the documents don't have to be THAT old. It would also be nice if you could upload images individually instead of selecting everything from a folder. Either way, nice work.

viking2917 · 8h ago

Thanks! I have not tried anything other than English, not sure how good the LLMs are for that. Did you use Tesseract or Gemini?

Once the page structure is set up from the images (via the directory upload), you can upload new images for each page, but I didn't include an option to just create all the pages manually. It's a good idea. Going to add that...

viking2917 · 6h ago

I updated the app to allow manual add/delete of pages. Then tap the page and do New Image to upload per page. Thanks for the feedback!

