Show HN: Okapi – a metrics engine based on open data formats
Most metrics engines today store data in proprietary formats and don’t disaggregate storage and compute. Okapi changes that by building on open data formats and integrating with existing data lakes. This makes it possible to use standard OLAP tools like Snowflake, Databricks, DuckDB, or even Jupyter / Polars to run analysis workflows (such as anomaly detection), and it avoids vendor lock-in in two ways - you can bring your own workflows and you can swap out the compute engine. Disaggregation also reduces the ops burden of maintaining your own storage, and compute can be scaled up and down on demand. A rough example of what "bring your own workflow" looks like is sketched below.
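For instance, once metrics have been exported to Parquet, an ad-hoc anomaly check could look roughly like this with DuckDB (the file layout, metric name, and column names here are made up for illustration, not Okapi's actual schema):

    import duckdb

    # Assumed layout: Okapi-exported Parquet files under ./metrics/
    # with columns (name, ts, value) - illustrative only.
    hourly = duckdb.sql("""
        SELECT name,
               date_trunc('hour', ts) AS hour,
               avg(value)             AS avg_value
        FROM read_parquet('metrics/*.parquet')
        WHERE name = 'http_request_latency_ms'
        GROUP BY 1, 2
        ORDER BY 2
    """).df()

    # Toy anomaly check: flag hours far from the overall mean.
    mean, std = hourly.avg_value.mean(), hourly.avg_value.std()
    print(hourly[(hourly.avg_value - mean).abs() > 3 * std])

The same files can be scanned with Polars (pl.scan_parquet), Spark, or any other Parquet-aware engine - that's the swappable-compute part.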
Not all data can live in a data lake / object store though - that doesn’t work for recent data. To serve real-time queries, Okapi first writes all incoming metrics to an in-memory store, and reads on recent data are served from there. Metrics are rolled up as they arrive, which eases memory pressure. They are held in memory for a configurable retention period, after which they are shipped out to object storage / the data lake (currently only Parquet export is supported). This gives fast reads on recent data while offloading query processing for older data. In benchmarks, queries on in-memory data finish in under a millisecond, with write throughput of ~280k samples per second. In a real deployment there would be network delays, so YMMV.
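To give a feel for the rollup idea (a minimal sketch of the general technique, not Okapi's implementation; the window size and the count/sum/min/max aggregate are assumptions), incoming samples get folded into one small per-window aggregate instead of keeping every raw point in memory:

    from collections import defaultdict
    from dataclasses import dataclass

    ROLLUP_SECONDS = 10  # assumed window size, purely illustrative

    @dataclass
    class Rollup:
        count: int = 0
        total: float = 0.0
        minimum: float = float("inf")
        maximum: float = float("-inf")

    # (metric name, window start) -> one small aggregate per window,
    # rather than every raw sample.
    buckets = defaultdict(Rollup)

    def ingest(name: str, ts: float, value: float) -> None:
        window = int(ts) - int(ts) % ROLLUP_SECONDS
        b = buckets[(name, window)]
        b.count += 1
        b.total += value
        b.minimum = min(b.minimum, value)
        b.maximum = max(b.maximum, value)

Keeping only a fixed-size aggregate per window is what lets the in-memory store stay small even at high ingest rates.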
Okapi is still early - feedback, critiques, and contributions welcome. Cheers!