Show HN: AI-Powered Receipt and Invoice Generator (LLM-Agnostic, Prompt-Based)
We’ve open-sourced a small tool to generate synthetic receipts and invoices using LLMs — no templates, no HTML, just prompts.
Why? We needed large, diverse, semi-realistic documents to evaluate our document extraction pipeline (OCR + AI). Most generators were either too rigid (static templates) or tied to a single provider. So we built our own:
Features:
LLM-agnostic (OpenAI, local models, etc.)
Prompt-driven JSON output (not rendered docs)
Optional Faker-based fallback for runs that don't use a model
Configurable schemas, locales, and output count
Lightweight and hackable — runs with a single config
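For anyone wondering how the LLM-agnostic part and the fallback can fit together, here is a minimal sketch. It is not the tool's actual code: the `Backend` type, `generate_receipt`, `fallback_receipt`, and the JSON keys are hypothetical names made up for illustration, and the fallback uses the stdlib `random` module as a stand-in for Faker so the snippet has no dependencies.

```python
import json
import random
from typing import Callable, Optional

# Hypothetical backend type: any function that takes a prompt and returns text.
# An OpenAI client, a local model, or a test stub can all be wrapped this way.
Backend = Callable[[str], str]


def fallback_receipt(rng: random.Random) -> dict:
    """Stand-in for the Faker-based fallback, using only the stdlib.

    Amounts are integers in minor units (e.g. cents) to avoid float rounding.
    """
    items = [
        {"name": f"item-{i}", "qty": rng.randint(1, 3), "unit_price": rng.randint(100, 2000)}
        for i in range(rng.randint(1, 4))
    ]
    return {
        "merchant": f"Shop {rng.randint(1, 999)}",
        "currency": rng.choice(["USD", "EUR", "VND"]),
        "items": items,
        "total": sum(i["qty"] * i["unit_price"] for i in items),
    }


def generate_receipt(
    prompt: str,
    backend: Optional[Backend] = None,
    seed: Optional[int] = None,
) -> dict:
    """Ask the backend for a JSON object; use the fallback if that is not possible."""
    rng = random.Random(seed)
    if backend is None:
        return fallback_receipt(rng)
    try:
        doc = json.loads(backend(prompt))
    except (json.JSONDecodeError, TypeError):
        # The backend returned something that is not parseable JSON text.
        return fallback_receipt(rng)
    # Only the top-level shape is checked, so deliberately odd documents pass through.
    return doc if isinstance(doc, dict) else fallback_receipt(rng)
```

Swapping providers then means passing a different function as `backend`; nothing else in the pipeline needs to know which model produced the JSON.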
This makes it easy to:
Generate test data at scale
Create edge cases (missing fields, odd currencies, broken totals)
Evaluate or fine-tune document understanding models
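The edge cases listed above can also be produced by mutating a clean document after generation. The sketch below shows one way to do that; the function names and the receipt shape (`merchant`, `currency`, `items`, `total`) are assumptions for illustration, not the tool's schema.

```python
import copy
import random


def drop_field(doc: dict, field: str) -> dict:
    """Return a copy of the document with one top-level field removed."""
    out = copy.deepcopy(doc)
    out.pop(field, None)
    return out


def break_total(doc: dict, offset: int = 1) -> dict:
    """Return a copy whose total is shifted by a nonzero offset."""
    if offset == 0:
        raise ValueError("offset must be nonzero to break the total")
    out = copy.deepcopy(doc)
    out["total"] = out["total"] + offset
    return out


def swap_currency(doc: dict, rng: random.Random) -> dict:
    """Return a copy with a less common ISO 4217 currency code."""
    out = copy.deepcopy(doc)
    out["currency"] = rng.choice(["ISK", "KWD", "XOF", "CLF"])
    return out


def line_items_sum(doc: dict) -> int:
    """Sum of qty * unit_price over all line items."""
    return sum(i["qty"] * i["unit_price"] for i in doc["items"])


# Amounts are integers in minor units (cents here).
clean = {
    "merchant": "Corner Cafe",
    "currency": "USD",
    "items": [{"name": "latte", "qty": 2, "unit_price": 450}],
    "total": 900,
}
broken = break_total(clean)
no_merchant = drop_field(clean, "merchant")
```

Because each mutator returns a copy, the clean document stays available as ground truth when scoring an extraction pipeline against the broken variant.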
Example use case: You can ask the model to generate a receipt from a Vietnamese coffee shop, with broken math and a typo in the merchant name — and get structured JSON output that mimics what real OCR might return.
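A rough sketch of that use case in Python follows. `build_prompt` and `total_matches` are hypothetical helpers, the JSON keys are an assumed schema, and the sample response is hand-written to show the shape of the output; it is not real model output.

```python
import json


def build_prompt(merchant_type: str, country: str, defects: list[str]) -> str:
    """Assemble a plain-text instruction for whichever model is configured."""
    lines = [
        f"Generate one receipt from a {merchant_type} in {country}.",
        "Return only a JSON object with keys: merchant, currency, items, total.",
        "Each item has: name, qty, unit_price.",
    ]
    if defects:
        lines.append("Deliberately include these defects: " + "; ".join(defects) + ".")
    return "\n".join(lines)


def total_matches(doc: dict) -> bool:
    """True when the stated total equals the sum of qty * unit_price."""
    return doc["total"] == sum(i["qty"] * i["unit_price"] for i in doc["items"])


prompt = build_prompt(
    "coffee shop",
    "Vietnam",
    ["line items that do not add up to the total", "a typo in the merchant name"],
)

# Hand-written illustration of a possible response, not real model output.
sample_response = """
{"merchant": "Sai Gon Cofee House", "currency": "VND",
 "items": [{"name": "Ca phe sua da", "qty": 2, "unit_price": 35000},
           {"name": "Banh mi", "qty": 1, "unit_price": 30000}],
 "total": 95000}
"""
doc = json.loads(sample_response)
print(total_matches(doc))  # False: the items sum to 100000, the total says 95000
```

A check like `total_matches` is useful on the consuming side too: it confirms the requested defect actually made it into the generated document before the document is used as a test case.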
Would love feedback from folks working on:
LLM eval tooling
Synthetic data generation
OCR / document AI
AI agents that touch financial data
Happy to support anyone who wants to extend it or wire up other backends (Claude, Mistral, LM Studio, etc.).