Show HN: DataKit – Complete data analysis platform now self-hostable

3 aminkhorrami 2 6/2/2025, 3:47:22 PM datakit.page ↗
DataKit started as my solution to Excel crashing on large files, but it's grown into a full browser-based data platform that handles CSV/Parquet/XLSX/JSON files up to 20GB+ entirely client-side.

What it does:

- Drag files → instant SQL querying (joins, aggregations, everything)
- Automatic data profiling (quality issues, null values, duplicates)
- Smart visualizations for every column type
- Export transformed/filtered results
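To make the profiling item concrete, here is a minimal sketch of what counting null values and duplicate rows in a CSV can look like. It uses only the Python standard library and is not DataKit's implementation (which, per the post, runs on DuckDB-WASM in the browser); the function name `profile_csv` and the sample data are made up for illustration.

```python
import csv
import io
from collections import Counter


def profile_csv(text: str) -> dict:
    """Return row count, per-column null counts, and duplicate row count.

    Hypothetical helper for illustration only; a cell counts as null
    when it is missing or contains only whitespace.
    """
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    null_counts = {name: 0 for name in columns}
    row_counts = Counter()
    total = 0

    for row in reader:
        total += 1
        for name in columns:
            value = row.get(name)
            if value is None or value.strip() == "":
                null_counts[name] += 1
        # Use the full row as a key so identical rows are counted together.
        row_counts[tuple(row.get(name) for name in columns)] += 1

    # Every occurrence beyond the first counts as a duplicate.
    duplicates = sum(n - 1 for n in row_counts.values() if n > 1)
    return {"rows": total, "nulls": null_counts, "duplicate_rows": duplicates}


# Made-up sample: one empty city and one repeated row.
sample = "id,city\n1,Berlin\n2,\n1,Berlin\n"
report = profile_csv(sample)
print(report)
```

A real profiler would also look at types, value ranges, and distributions; this only covers the two checks named in the list above.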

Now self-hostable: After requests from teams needing this behind firewalls, the entire platform can run on your infrastructure via pip/Docker/brew/NPM.

Technical details: Built on DuckDB-WASM with heavy performance optimizations. All processing happens in-browser: your data never leaves your environment, whether you use the hosted version or a self-hosted setup.

Live demo: https://datakit.page

Self-hosting docs: https://docs.datakit.page

Previous discussion: https://www.reddit.com/r/dataengineering/comments/1l1i3ry

Built this because I was tired of the choose-two problem: fast analysis, large files, or keeping data local. Now you can have all three.

Feedback welcome: what data analysis pain points should I tackle next? (I'd be happy to chat on Discord: https://discord.gg/grKvFZHh)

Comments (2)

simlevesque · 9h ago
Nice product! I'm going to hop on Discord.
aminkhorrami · 8h ago
Great! Let's talk!