GizmoSQL completed the 1T row challenge
Instance: r8gd.metal-48xl (Graviton4) on AWS
192 vCPUs, 1536 GiB RAM
11.4 TB RAID 0 NVMe
$14.11/hr on-demand, $2.82/hr spot
Dataset: 2.3 TB of Parquet files (100,000 files with 10,000,000 rows each!)
Copy from S3: 11m 24s
Challenge query (cold start): 2m 22s
Challenge query (warm start): 2m 09s
Total cost: ~$0.11 per query (spot)
And yes...
SELECT COUNT(*) on 1 trillion rows? Just 21.8 seconds.
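For anyone who wants to check the arithmetic behind those numbers, here is a minimal sketch in Python. It assumes the per-query cost is simply the hourly instance price prorated over the query's wall-clock time (it ignores the S3 copy and any idle time), and the helper names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption: cost per query = hourly price * elapsed seconds / 3600.

SPOT_USD_PER_HOUR = 2.82
ON_DEMAND_USD_PER_HOUR = 14.11
TOTAL_ROWS = 100_000 * 10_000_000  # files * rows per file = 1 trillion


def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a 'Xm Ys' timing into seconds."""
    return minutes * 60 + seconds


def query_cost_usd(usd_per_hour: float, elapsed_s: float) -> float:
    """Prorate an hourly instance price over one query's runtime."""
    return usd_per_hour * elapsed_s / 3600


def rows_per_second(rows: int, elapsed_s: float) -> float:
    """Average scan throughput for a query."""
    return rows / elapsed_s


cold_s = to_seconds(2, 22)   # challenge query, cold start
warm_s = to_seconds(2, 9)    # challenge query, warm start
count_s = 21.8               # SELECT COUNT(*)

print(f"cold query, spot:      ${query_cost_usd(SPOT_USD_PER_HOUR, cold_s):.2f}")
print(f"cold query, on-demand: ${query_cost_usd(ON_DEMAND_USD_PER_HOUR, cold_s):.2f}")
print(f"challenge throughput:  {rows_per_second(TOTAL_ROWS, cold_s):.2e} rows/s")
print(f"COUNT(*) throughput:   {rows_per_second(TOTAL_ROWS, count_s):.2e} rows/s")
```

Under that assumption the cold-start query works out to roughly $0.11 on spot pricing, which matches the figure above.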
Why it matters: GizmoSQL lets you query massive Parquet datasets interactively, directly, and efficiently – using the DuckDB SQL engine, and the Arrow Flight SQL protocol for blazing-fast client connections.
No complex Spark or distributed compute cluster setup. No hidden egress fees. Just pure, fast SQL on massive data.
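To give a feel for what "pure SQL over Arrow Flight SQL" might look like from a client, here is a rough Python sketch using the ADBC Flight SQL driver (the adbc-driver-flightsql package). Treat the details as assumptions: the URI, port, credentials, Parquet path, and the column names station and measure are placeholders, and the query is a 1BRC-style min/mean/max aggregation rather than the exact challenge query. The linked 1TRC issue has the real scripts.

```python
# Sketch only: the connection details and column names below are assumptions.

def challenge_style_sql(parquet_glob: str) -> str:
    """Build a min/mean/max-per-station query over a set of Parquet files.

    `station` and `measure` are assumed column names.
    """
    return (
        "SELECT station, "
        "MIN(measure) AS min_measure, "
        "AVG(measure) AS mean_measure, "
        "MAX(measure) AS max_measure "
        f"FROM read_parquet('{parquet_glob}') "
        "GROUP BY station ORDER BY station"
    )


def run_over_flight_sql(uri: str, username: str, password: str, sql: str):
    """Run a query against a Flight SQL server and return an Arrow table."""
    # Imported lazily so the SQL helper above works without the driver installed.
    from adbc_driver_flightsql import dbapi as flight_sql

    with flight_sql.connect(
        uri, db_kwargs={"username": username, "password": password}
    ) as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetch_arrow_table()


if __name__ == "__main__":
    sql = challenge_style_sql("/data/1trc/*.parquet")  # hypothetical local path
    print(sql)
    # Uncomment with a running server; the URI and credentials are placeholders.
    # table = run_over_flight_sql("grpc://localhost:31337", "user", "secret", sql)
    # print(table.num_rows)
```

Results come back as Arrow data, so they can go straight into pandas, Polars, or DuckDB on the client without a row-by-row conversion step.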
Check out open-source GizmoSQL here: https://github.com/gizmodata/gizmosql and https://gizmosql.com
GizmoSQL is free for self-hosting!
Full 1TRC details, scripts, and results: https://github.com/coiled/1trc/issues/7