Show HN: DataKit – Complete data analysis platform now self-hostable
- Drag files → instant SQL querying (joins, aggregations, everything)
- Automatic data profiling (quality issues, null values, duplicates)
- Smart visualizations for every column type
- Export transformed/filtered results
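
To give a sense of what null and duplicate checks look like when expressed in SQL, here is a minimal sketch. It is illustrative only and not DataKit's implementation: it uses Python's built-in sqlite3 module rather than DuckDB-WASM so it runs without extra dependencies, and the `orders` table, its columns, the sample rows, and the `profile` helper are all made up for the example.

```python
import sqlite3

# Hypothetical sample data standing in for a dropped CSV file.
rows = [
    (1, "alice", 30.0),
    (2, "bob", None),
    (2, "bob", None),   # exact duplicate of the previous row
    (3, None, 12.5),
]


def profile(rows):
    """Return row count, per-column null counts, and duplicate row count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

    # `col IS NULL` evaluates to 0 or 1, so SUM counts the nulls per column.
    total, null_customer, null_amount = con.execute(
        "SELECT COUNT(*), SUM(customer IS NULL), SUM(amount IS NULL) FROM orders"
    ).fetchone()

    # Group identical rows; every copy beyond the first counts as a duplicate.
    duplicate_rows = con.execute(
        "SELECT COALESCE(SUM(n - 1), 0) FROM ("
        "  SELECT COUNT(*) AS n FROM orders"
        "  GROUP BY id, customer, amount HAVING COUNT(*) > 1"
        ")"
    ).fetchone()[0]

    con.close()
    return {
        "rows": total,
        "null_customer": null_customer,
        "null_amount": null_amount,
        "duplicate_rows": duplicate_rows,
    }


if __name__ == "__main__":
    print(profile(rows))
```

The same two queries should carry over to DuckDB with little change, though the exact SQL DataKit runs internally may differ.
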
Now self-hostable: after requests from teams that need this behind firewalls, the entire platform can run on your own infrastructure via pip, Docker, brew, or NPM.

Technical details: built on DuckDB-WASM with heavy performance optimizations. All processing happens in-browser, so your data never leaves your environment, whether you use the hosted version or a self-hosted setup.

Live demo: https://datakit.page
Self-hosting docs: https://docs.datakit.page
Previous discussion: https://www.reddit.com/r/dataengineering/comments/1l1i3ry

I built this because I was tired of the choose-two problem: fast analysis, large files, or keeping data local. Now you can have all three.

Feedback welcome: what data analysis pain points should I tackle next? (I'd be happy to chat on Discord: https://discord.gg/grKvFZHh)