Weiser: A Lightweight, OSS, AI-Friendly Data Quality Framework

Comments (1)

pacofvf · 18h ago

After becoming frustrated with the difficulty of implementing reliable and transparent data quality checks, I developed a new framework called Weiser. It’s inspired by tools like Soda and Great Expectations, but built with a different philosophy: simplicity, openness, and zero lock-in.

If you’ve tried Soda, you’ve probably noticed that many of the more useful checks (change over time, anomaly detection, etc.) are hidden behind their cloud product. Great Expectations, while powerful, can feel overly complex and brittle for modern analytics workflows. I wanted something in between: lightweight, expressive, and flexible enough to integrate into any analytics stack.

Weiser is config-based; you define checks in YAML, and it runs them as SQL against your data warehouse. There’s no SaaS platform, no telemetry, no signup: just a CLI tool and some opinionated YAML.

Some examples of built-in checks:

1. Row count drops compared to a historical window

2. Unexpected nulls or category values

3. Distribution shifts

4. Anomaly detection

5. Cardinality changes
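
To make the first check in the list concrete, here is a minimal Python sketch of a "row count drop versus a historical window" check. This is not Weiser's config format or its implementation: the function name `check_row_count_drop`, its parameters, the `events` table, and the use of SQLite as a stand-in for a warehouse are all assumptions made for illustration. It only shows the general idea of running a check as SQL and comparing today's count to a recent baseline.

```python
import sqlite3
from datetime import date, timedelta


def check_row_count_drop(conn, table, day_col, today, window_days=7, max_drop=0.5):
    """Hypothetical check: fail if today's row count falls more than
    max_drop (as a fraction) below the average of the preceding window.

    table and day_col are interpolated into the SQL, so they must be
    trusted identifiers; day_col is assumed to hold ISO date strings.
    """
    start = today - timedelta(days=window_days)
    sql = (
        f"SELECT {day_col}, COUNT(*) FROM {table} "
        f"WHERE {day_col} >= ? AND {day_col} <= ? GROUP BY {day_col}"
    )
    rows = conn.execute(sql, (start.isoformat(), today.isoformat())).fetchall()
    counts = dict(rows)
    today_count = counts.pop(today.isoformat(), 0)
    if not counts:
        # No history in the window: nothing to compare against.
        return {"passed": True, "today": today_count, "baseline": None, "drop": None}
    # Baseline averages only over days that have rows; days with zero
    # rows do not appear in the GROUP BY result.
    baseline = sum(counts.values()) / len(counts)
    drop = 1 - today_count / baseline
    return {"passed": drop <= max_drop, "today": today_count,
            "baseline": baseline, "drop": drop}


def demo():
    """Build a toy table: 100 rows/day for 7 days, then 30 rows today."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (day TEXT)")
    today = date(2025, 1, 8)
    for offset in range(1, 8):
        d = (today - timedelta(days=offset)).isoformat()
        conn.executemany("INSERT INTO events VALUES (?)", [(d,)] * 100)
    conn.executemany("INSERT INTO events VALUES (?)", [(today.isoformat(),)] * 30)
    return check_row_count_drop(conn, "events", "day", today)


if __name__ == "__main__":
    print(demo())
```

In the toy data, today's count is 70% below the seven-day average, so the check fails with the default 50% threshold. A real setup would point the same kind of query at the warehouse and store the result somewhere queryable.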

The framework is fully open-source (MIT license), and its goal is to be both human- and machine-readable. I’ve been using LLMs to help generate and refine Weiser configs, which works surprisingly well, far better than trying to wrangle pandas or SQL directly via prompt. I already have an MCP server that works well, but it's a pain in the ass to install in Claude Desktop, and I don't want you to waste time doing that. Once Anthropic fixes their dxt format, I will release an MCP tool for Claude Desktop.

Currently it supports only PostgreSQL and Cube as data sources, and PostgreSQL and DuckDB (S3) as destinations for check results. I will add Snowflake and Databricks as data sources in the next few days. It doesn’t do orchestration; you can run it via cron, Airflow, GitHub Actions, or whatever you want.

If you’ve ever duct-taped together dbt tests, SQL scripts, or ad hoc dashboards to catch data quality issues, Weiser might be helpful. I would love any feedback or ideas. It’s early days, but I’m trying to keep it clean and useful for both analysts and engineers. I'm also working on a better GUI.

GitHub: https://github.com/weiser-ai/weiser

Docs: https://weiser.ai/docs/tutorial/getting-started

I'm happy to answer questions or hear about what other folks are doing to address this problem.
