Show HN: SwellDB – Query AI-generated tables with SQL

1 giannakouris 0 7/22/2025, 4:22:51 PM github.com ↗
I'm building a data system called SwellDB that uses LLMs to generate its tables on the fly.

Traditional databases only work over data that has already been loaded and cleaned. In the real world, though, data lives everywhere: in files, PDFs, web pages, and APIs. To query it, we usually need custom ETL pipelines (extract, clean, transform, load), which are slow, brittle, and different every time.

SwellDB flips that model: you define a table (a schema plus a natural-language prompt describing it), and SwellDB generates the table just in time, using LLMs together with your schema and prompt, on top of the connected data sources (files, databases, LLMs, web). Think of querying a DataFrame that materializes itself from raw input, without you writing the ingestion logic.
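
To make the "schema + prompt, generated just in time" idea concrete, here is a rough sketch in plain Python. It is not SwellDB's actual API: `TableSpec`, `materialize`, and `fake_llm` are hypothetical names, the rows are made-up sample data, and the LLM call is replaced by a stub that returns loosely typed rows.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout: this is NOT SwellDB's actual API.


@dataclass
class TableSpec:
    name: str
    prompt: str              # natural-language description of the table
    schema: dict[str, type]  # column name -> Python type


def materialize(
    spec: TableSpec, generate: Callable[[TableSpec], list[dict]]
) -> list[dict]:
    """Build the table on demand and coerce each row to the declared schema."""
    rows = []
    for raw in generate(spec):
        # Keep only declared columns and cast them to the declared type;
        # missing or null values become None.
        row = {
            col: (typ(raw[col]) if raw.get(col) is not None else None)
            for col, typ in spec.schema.items()
        }
        rows.append(row)
    return rows


def fake_llm(spec: TableSpec) -> list[dict]:
    """Stand-in for an LLM or data-source call (made-up sample rows)."""
    return [
        {"title": "Widget A", "price": "9.50", "extra": "ignored"},
        {"title": "Widget B"},
    ]


products = TableSpec(
    name="products",
    prompt="Products and their prices",
    schema={"title": str, "price": float},
)
table = materialize(products, fake_llm)
print(table)
```

The point of the sketch is only the shape of the workflow: the user declares what the table should look like, and the rows are produced and normalized at query time rather than by a hand-written ingestion pipeline.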

It supports:

- Structured and unstructured sources: CSV, SQL, and web search results (PDF support to be added soon)

- Declarative table definitions in Python

- Output compatible with any SQL query engine (e.g., DuckDB, Apache DataFusion) or ingestible into any database
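
As an illustration of the last point, here is a minimal sketch of loading a generated table into a database and querying it with SQL. It uses Python's built-in `sqlite3` as a stand-in database (not DuckDB or DataFusion), and the `rows` list is made-up sample data standing in for whatever the system would generate.

```python
import sqlite3

# Hypothetical stand-in for a generated table (made-up sample rows).
rows = [
    {"title": "Widget A", "price": 9.5},
    {"title": "Widget B", "price": 20.0},
    {"title": "Widget C", "price": 4.25},
]

# Ingest the rows into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (title TEXT, price REAL)")
conn.executemany("INSERT INTO products VALUES (:title, :price)", rows)

# Once loaded, the table can be queried like any other SQL table.
cheap = conn.execute(
    "SELECT title FROM products WHERE price < 10 ORDER BY price"
).fetchall()
print(cheap)
```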

Repo: https://github.com/SwellDB/SwellDB

Short paper (4 pages): https://github.com/gsvic/gsvic.github.io/blob/gh-pages/paper...

Would love feedback if you get a chance to try it out, especially from folks dealing with hybrid or messy data sources.
