Show HN: ClearDoc – Extract fields from any document using OCR and LLM

Mignet · 7/28/2025, 12:50:01 PM · cleardoc.v5ent.com
Hi HN!

I recently launched a prototype of *ClearDoc*, an AI-powered tool to extract structured data from unstructured documents like invoices, bills of lading, certificates, etc.

It uses *OCR (PaddleOCR)* and *LLMs* to detect and align key fields — even for complex documents with tables, nested fields, or in different languages.
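
To make the "align" step concrete, here is a minimal sketch of one way extracted field values could be mapped back to OCR boxes. It is an illustration under assumptions, not ClearDoc's actual code: `OcrLine`, `match_score`, `align_fields`, the 0.6 threshold, and the sample data are all hypothetical, and the OCR output and LLM-extracted fields are hard-coded stand-ins rather than real PaddleOCR or model calls.

```python
from __future__ import annotations

from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class OcrLine:
    """One recognized text line plus its bounding box (hypothetical shape)."""
    text: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def match_score(value: str, line_text: str) -> float:
    """Score how well a field value matches an OCR line, from 0.0 to 1.0."""
    v, t = value.strip().lower(), line_text.strip().lower()
    if not v or not t:
        return 0.0
    if v in t:  # value appears verbatim inside the line
        return 1.0
    return SequenceMatcher(None, v, t).ratio()  # fuzzy fallback for OCR noise


def align_fields(fields: dict, ocr_lines: list, threshold: float = 0.6) -> dict:
    """Map each field name to the box of its best-matching OCR line, or None."""
    aligned = {}
    for name, value in fields.items():
        best_box, best_score = None, 0.0
        for line in ocr_lines:
            score = match_score(value, line.text)
            if score > best_score:
                best_box, best_score = line.box, score
        aligned[name] = best_box if best_score >= threshold else None
    return aligned


# Stand-in for OCR output (assumed, not produced by a real OCR run).
ocr_lines = [
    OcrLine("INVOICE", (40, 20, 200, 50)),
    OcrLine("Invoice No: INV-0042", (40, 80, 320, 105)),
    OcrLine("Total due: 1,250.00 USD", (40, 400, 360, 425)),
]

# Stand-in for what an LLM might return as structured fields (assumed).
fields = {"invoice_number": "INV-0042", "total": "1,250.00", "po_number": "PO-7781"}

aligned = align_fields(fields, ocr_lines)
for name, box in aligned.items():
    print(name, box)
```

Returning `None` for a field with no sufficiently close OCR line keeps values that the model may have hallucinated from being drawn on the document at an arbitrary location.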

It doesn't require templates and can be *self-hosted* (demo runs on my own GPU).

Live demo (no sign-up): http://cleardoc.v5ent.com/
Demo video: https://www.youtube.com/watch?v=u83T6iewfNs

Right now:
- Fields are auto-aligned visually on the document
- Works with PDFs, images, scans
- No custom field design/editing in the demo yet

Would love feedback on:
- Which use cases matter most to you?
- What would make this valuable enough to adopt?

Thanks!

Comments (1)

Mignet · 6h ago
Please feel free to report any issues.