Does OLAP Need an ORM

59 craneca0 36 8/17/2025, 4:03:21 PM clickhouse.com ↗

Comments (36)

barrkel · 2h ago
If you're doing OLAP, you probably want dimensions, measures and operators that operate on time aggregations and shifts. You want rollups and drill downs along multiple axes, with subtotals and probably pivots.

SQL isn't wholly adequate for this. It's hard work to get the SQL right, and if there are joins involved it's not hard to accidentally fan out and start double counting.
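To make the fan-out concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` and `shipments` tables and their values are made up for illustration; the point is only that joining to a many-side table before summing repeats rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
INSERT INTO orders VALUES (1, 100), (2, 50);
-- order 1 shipped in two parcels, so a join repeats its row
INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Correct total: one row per order.
true_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Fan-out: order 1 appears twice after the join, so its amount is counted twice.
fanned_out_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.id
""").fetchone()[0]

# One common fix: aggregate the many-side to one row per order before joining.
safe_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (SELECT order_id, COUNT(*) AS parcels
          FROM shipments GROUP BY order_id) s
      ON s.order_id = o.id
""").fetchone()[0]

print(true_total, fanned_out_total, safe_total)
```

The second query is syntactically fine and returns a plausible number, which is why this kind of bug tends to go unnoticed.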

If you ask me, you want an analytic model of the data that is designed around measures, dimensions, with an anointed time dimension, and a way of expressing higher level queries such that it automatically aggregates depending on which dimensions you leave out, and gives you options to sort, pivot, filter etc. dynamically.

This doesn't look like entities, really, but it is a model between you and the SQL.
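As a rough sketch of what such a model could look like (not Moose's API or any existing library's; the `DIMENSIONS`, `MEASURES`, `build_query` names and the `sales` table are all hypothetical), the idea is that you name measures and dimensions and the grouping follows from whichever dimensions you include:

```python
import sqlite3

# Hypothetical model: names map to SQL expressions over a single fact table.
DIMENSIONS = {"region": "region", "month": "strftime('%Y-%m', sold_at)"}
MEASURES = {"revenue": "SUM(amount)", "orders": "COUNT(*)"}

def build_query(measures, dimensions=()):
    """Build an aggregate query; dimensions left out are aggregated away."""
    select = [f"{DIMENSIONS[d]} AS {d}" for d in dimensions]
    select += [f"{MEASURES[m]} AS {m}" for m in measures]
    sql = f"SELECT {', '.join(select)} FROM sales"
    if dimensions:
        group_by = ", ".join(DIMENSIONS[d] for d in dimensions)
        sql += f" GROUP BY {group_by} ORDER BY {group_by}"
    return sql

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sold_at TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EU", "2025-01-05", 100), ("EU", "2025-02-10", 50), ("US", "2025-01-20", 70)],
)

by_region = conn.execute(build_query(["revenue"], ["region"])).fetchall()
by_month = conn.execute(build_query(["revenue", "orders"], ["month"])).fetchall()
grand_total = conn.execute(build_query(["revenue"])).fetchall()
```

The caller never writes a GROUP BY. Dropping `region` from the request rolls the result up, and adding `month` drills down along the time dimension.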

From my quick, not detailed, reading of the article, Moose looks too low level to be a useful abstraction sitting in the same logical place that ORMs occupy for OLTP databases.

timgdelisle · 1h ago
Very much agree with you. At this point the abstraction is too low-level to be considered a proper ORM (or whatever the acronym should be for OLAP), and we're progressively working our way up to the right level. I love the idea of operating at the dimensions/measures level. Hoping we address this concern in the next couple of releases! Really appreciate the feedback.
bob1029 · 7h ago
> If you’ve got your OLAP schemas as objects in your application code

I guess I have a wildly different interpretation of typical OLAP scenarios. To me this acronym mostly means "reporting". And in 99% of cases where the business desires a new report, the ideal views or type systems have not been anticipated. In these cases (most of them), I can't imagine a faster way to give the business an answer than just writing some sql.

ElatedOwl · 7h ago
I agree with that being fastest, but not cheapest.

In my experience these one off reports are very brittle. The app ends up making schema changes that are breaking to these one off reports, and you usually don’t find out until it goes to production.

I’ve dealt with the maintenance nightmare before. At current gig we’re exploring solutions, curious what a robust pipeline looks like in 2025.

The ORM piece is interesting — we use ActiveRecord and Ruby, and accidentally breaking schema changes within app will get caught by the unit test suite. I would love for a way to bring OLAP reports in similarly to test at CI time.

lpapez · 7h ago
Why not test the OLAP reports?

Surely there is a way to run a raw query in Rails/ActiveRecord and use it in a smoke test?
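A minimal sketch of that idea, in Python with sqlite3 rather than Rails/ActiveRecord (a deliberate swap to keep it self-contained): build the current schema in a throwaway database and check that each report query still prepares and runs. `SCHEMA`, `REPORT_SQL` and `report_runs` are hypothetical names, and a real setup would load the schema from migrations instead of a string.

```python
import sqlite3

# Hypothetical application schema and one ad hoc report written against it.
SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, plan TEXT, created_at TEXT);"
REPORT_SQL = "SELECT plan, COUNT(*) AS signups FROM users GROUP BY plan"

def report_runs(schema_sql, report_sql):
    """Return True if the report executes against an empty copy of the schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        conn.execute(report_sql).fetchall()
        return True
    except sqlite3.OperationalError:
        # Raised for problems such as a missing table or column.
        return False
    finally:
        conn.close()

# Passes today; fails once the app renames the column the report depends on.
ok_now = report_runs(SCHEMA, REPORT_SQL)
ok_after_rename = report_runs(SCHEMA.replace("plan", "tier"), REPORT_SQL)
```

This only catches breakage that makes the query invalid. A change that keeps the query valid but alters its meaning would need assertions on actual results.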

wredcoll · 7h ago
I mean, if you're relying on tests to catch schema changes... then test your sql reports? This doesn't seem like an amazingly cool solution but if that's the one you're already using...
timgdelisle · 7h ago
I'm one of the Moose maintainers, and yes, most OLAP use cases fall into data warehousing categories where exposing the database to analysts and letting them run loose with SQL is viable. We're seeing more and more that OLAP is becoming a core part of the application stack, for user and agent-facing analytics. There, we see a lot more appetite for building on the analytical stack the way we build on the transactional one.
sdairs · 7h ago
Without doubt, the majority of the market for OLAP today is still internal warehousing & BI. But the market for using OLAP behind features inside user-facing B2C/B2B apps has been kicking off for quite a few years now. Big consumer apps like Stripe, Uber, Shopify...pretty much every B2B SaaS with a usage/metrics dashboard...they're usually punting queries off to an OLAP to populate those stats/charts. That's where something like this might come in handy, I can't imagine it being used for general internal reporting (in the current form, anyway.)
michaelmarkell · 7h ago
The way my company uses Clickhouse is basically that we have one giant flat table, and have written our own abstraction layer on top of it based around "entities" which are functions of data in the underlying table, potentially adding in some window functions or joins. Pretty much every query we write with Clickhouse tacks on a big "Group By All" at the end of it, because we are always trying to squash down the number of rows and aggregate as aggressively as possible.

I imagine we're not alone in this type of abstraction layer, and some type-safety would be very welcome there. I tried to build our system on top of Kysely (https://kysely.dev/) but the Clickhouse extension was not far along enough to make sense for our use-case. As such, we basically had to build our own parser that compiles down to sql, but there are many type-error edge cases, especially when we're joining in against data from S3 that could be CSV, Parquet, etc.

Side note: One of the things I love most about Clickhouse is how easy it is to combine data from multiple sources other than just the source database at query time. I imagine this makes the problem of building an ORM much harder as well, since you could need to build type-checking / ORM against sql queries to external databases, rather than to the source table itself

saadatq · 5h ago
> Borrow the best core concepts:

Schemas as application code means you get version control, PR review and type‑safe changes. A query builder that feels like SQL and lets you write “real” ClickHouse queries with IDE autocompletion and compile‑time checking. Local development and CI should mirror production so you can preview schema changes before they apply to prod.

>>>

I believe this is what dbt set out to accomplish. They came at the problem from the point of view of a data transformation language that is essentially a pseudo type checked SQL for analytical engines with some additional batteries included (i.e. macros), but the motivation was similar. I’ve always felt that what has held dbt back from more mainstream adoption by the dev community is that they’ve prioritized data transformation over data access to the application layer, i.e. business intelligence tools over a web app.

Moosestack looks interesting; will definitely check it out.

reactordev · 6h ago
The case for "yes, it would help" is in building data tools for people. So you load up your Parquet files with data, ingest it into your platform, it uses clickhouse (or some OLAP) for tabulation of data, and the platform presents a UI that allows the data engineer to select which fields, etc.

This can only be achieved by utilizing some sort of type system. Whether it's reflecting on the tables, codegen on the fly, or having to write custom adapters for each structure. All of which can be greatly simplified with an ORM.

It's not going to help much with bespoke report asks from the business though.

oatsandsugar · 3h ago
Yeah, this is mainly aimed at applications that need an OLAP backend (think user facing analytics, or a database that backs chat applications)
RobinL · 3h ago
When I read the title my brain immediately jumped to a slightly different idea. With olap, I often find it annoying to figure out the joins from the fk/pk relationships, so I was imagining a tool that kind of automatically followed the links for you. A bit like how an ORM gives you auto complete, but without the user having to manually enter the schema.

And I wanted it to emit the raw SQL because that's generally what I want for olap.

So I had a go at building it. If anyone's interested a very rough demo/prototype is here: https://www.robinlinacre.com/vite_live_pg_orm/

Load in the demo Northwind schema and click some tables/columns to see the generated joins
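For anyone curious how following the links could work in principle, here is a small sketch, not the demo's actual implementation: treat foreign keys as edges in a graph, find the shortest path between two tables with breadth-first search, and emit the joins as raw SQL. The `FOREIGN_KEYS` list is a hypothetical, simplified Northwind-style schema, and `join_path` / `to_sql` are illustrative names.

```python
from collections import deque

# Hypothetical foreign keys: (child_table, child_column, parent_table, parent_column)
FOREIGN_KEYS = [
    ("orders", "employee_id", "employees", "id"),
    ("order_details", "order_id", "orders", "id"),
    ("order_details", "product_id", "products", "id"),
]

def join_path(start, goal):
    """Breadth-first search over FK links, treated as undirected edges."""
    edges = {}
    for child, ccol, parent, pcol in FOREIGN_KEYS:
        condition = f"{child}.{ccol} = {parent}.{pcol}"
        edges.setdefault(child, []).append((parent, condition))
        edges.setdefault(parent, []).append((child, condition))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbour, condition in edges.get(table, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(neighbour, condition)]))
    return None  # the two tables are not connected by foreign keys

def to_sql(start, goal):
    """Emit raw SQL joining every table on the path from start to goal."""
    path = join_path(start, goal)
    if path is None:
        raise ValueError(f"no foreign-key path from {start} to {goal}")
    lines = [f"SELECT * FROM {start}"]
    lines += [f"JOIN {table} ON {condition}" for table, condition in path]
    return "\n".join(lines)

print(to_sql("employees", "products"))
```

A real tool would read the foreign keys from the database catalog and would have to decide what to do when several paths of equal length exist. This sketch just takes the first one it finds.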

reactordev · 3h ago
I like how it joins the inner tables to get to your desired table. Going from employees to products is a walk.
ram_rar · 3h ago
Unpopular opinion: in 2025, nobody should be reaching for an ORM first. They’re an anti-pattern at this point. The “abstraction” it promises rarely delivers—what you actually get is leaky, slow, and a nightmare to operate at scale.

The sane middle ground is libraries that give you nicer ergonomics around SQL without hiding it (like Golang's sqlx https://github.com/jmoiron/sqlx). Engineers should be writing SQL, period.

jelder · 3h ago
Strongly agree! Rust’s sqlx is also insanely great, and I like sqlc for Go as well.

I’ve written a lot about this particular topic: https://www.jacobelder.com/2025/01/31/where-shift-left-fails...

oatsandsugar · 3h ago
> The sane middle ground is libraries that give you nicer ergonomics around SQL without hiding it (like Golang's sqlx https://github.com/jmoiron/sqlx). Engineers should be writing SQL, period.

The blog suggests that an ORM for OLAP would do exactly that

bakugo · 3h ago
I think you're fundamentally misunderstanding the primary purpose of an ORM. It's in the name - Object Relational Mapper. It's meant to ease the mapping from SQL query results into objects in your code, and from objects back to SQL queries. Doing this manually at scale when you have a large number of tables is also a nightmare.

There's no rule saying you can't integrate your own manually written SQL with an ORM, and in fact, any production-ready, feature-complete ORM will allow you to do it, because it's effectively a requirement for any non-trivial use case.

parpfish · 1h ago
the trouble is that even if people embrace that thinking, the ORM encourages them to pull entities out of the db and do a bunch of computation in the server that would be much faster to do in the db.
marcjschmidt · 2h ago
Well said. Please never be silent over this fact. It's important to educate people on what an ORM is, what it means and especially what it doesn't mean. Especially in times where VC-backed companies misinform and manipulate people about that, like Prisma is doing.
cyberax · 1h ago
> Engineers should be writing SQL, period.

Is it a variation of: "I suffered when I was young, so everyone must suffer as I did?"

SQL is terrible, however you slice it. It's a verbose bass-ackwards language. ORMs that remove the need to deal with SQL for 99% of trivial cases are great.

The remaining 1% can stay painful.

cies · 7h ago
My take is...

No one needs an ORM: https://dev.to/cies/the-case-against-orms-5bh4

The article opens with "ORMs have proven to be useful for many developers" -- I believe the opposite is true.

lucisferre · 6h ago
I agree with this as well. I started my career at the height of ORMs. Most software developers were only learning the ORM APIs (which of course all differed significantly) and very few were learning SQL outside of the bare basics.

ORMs, like all abstractions, are leaky. But I would argue that because of the ubiquity and utility of SQL itself they are a very leaky one, where eventually you are going to need to work around them.

After switching to just using SQL in all situations I found my life got a lot simpler. Performance also improved as most ORMs (Rails in particular) are not very well implemented from a performance standpoint, even for very simple use cases.

I can not recommend enough that people skip the ORM entirely.

sdairs · 6h ago
Is the proposition in the OP not pretty much what you're suggesting in your blog? They're currently not using the query builder syntax, instead it's pretty much "improving on SQL as strings" with a bunch of the other ORM-like benefits (type safety, autocomplete, etc.)

Perhaps saying "ORM" is a bit of a misnomer, but they're discussing the DX ergonomics of an ORM and acknowledging the exact challenges you describe

marcjschmidt · 2h ago
Yeah it's funny they even mention ORM while at the same time offering something that has nothing to do with ORMs at all. Yes, many ORM libraries offer additional tools like migrations and a query builder, but that's not the point of an ORM. An ORM maps relational data to your OOP data structures. They completely misused the term, which is kinda surprising.
odie5533 · 6h ago
How do you convert your type-safe native objects to and from the database in a reusable way? If you do anything in a reusable way, you're 95% of the way to an ORM. Or do you just accept that you get back random dictionaries from the database and don't care about type-safety?
wswope · 6h ago
You write INSERT and SELECT statements for the object types you want to persist.

What is your concern re: random types popping up? SQLite springs to mind as a prime offender due to not enforcing column types OOTB, but most dialects have rather strong typing.

If we’re talking about mapping UUIDs and datetimes from their DB representations to types defined by the language stdlib, that’s usually the responsibility of the DB driver, no?

odie5533 · 5h ago
I'm talking about knowing the shape of the response and the shape of the data going in ahead of time, and being consistent. SQL is like a black box in and out. I mainly use Python. For that, it's nice to have things like Dataclasses for DTOs or Pydantic models or some sort of DTO class that has known field names and known types. When you use raw SQL, you lose all that or have to roll it yourself. And at that point, you're most of the way to an ORM or at least the data mapping portion of SQLAlchemy.
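For a sense of what "rolling it yourself" amounts to, here is a minimal sketch with only the standard library (sqlite3 and dataclasses). The `User` class and `fetch_all` helper are hypothetical. The helper checks that the query's column names match the dataclass fields; it does not validate value types at runtime the way Pydantic would.

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class User:
    id: int
    email: str

def fetch_all(conn, cls, sql, params=()):
    """Run raw SQL and map each row to cls, checking column names first."""
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    expected = [f.name for f in fields(cls)]
    if columns != expected:
        raise TypeError(f"query returns {columns}, {cls.__name__} expects {expected}")
    return [cls(*row) for row in cursor]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)

users = fetch_all(conn, User, "SELECT id, email FROM users ORDER BY id")
```

The SQL stays hand-written and the result has a known shape, at the cost of one helper per project. Whether that helper already counts as an ORM is exactly what the replies argue about.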
sgarland · 2h ago
Mapping a DB response to a Pydantic model is hardly an ORM.
marcjschmidt · 2h ago
No, that is exactly what an ORM is, plus mapping it back. Anything around that is additional tooling that an ORM doesn't need in order to be an ORM, but is nonetheless useful.
wredcoll · 7h ago
I think what I just really want is a language that treats sql as a "first class" component in the same way perl treats regexes.

The devil is of course in the details, but it's a nice dream.

weinzierl · 6h ago
Not quite first class citizen, but you might like sqlx. At least it embraces the idea that writing SQL directly is in fact a good idea and helps you to do so safely.

https://docs.rs/sqlx/latest/sqlx/

SoftTalker · 5h ago
> a language that treats sql as a "first class" component

pl/pgSQL (or pl/sql in oracle) and variants.

caspper69 · 6h ago
LINQ? Just throwing it out there; obviously not everybody can or wants to run a C#/.NET stack, but entity framework (core) is about as close as you can get to the perl and regex integration. I think Ruby on Rails gets there too, but I'm not a RoR guy, so I can't comment.
cyberax · 5h ago
This really depends on the quality of the ORM. I used to write Java software, and Hibernate with QueryDSL saved me probably _months_ of typing. And I dare say, produced much nicer-looking code.

And for most of the code, the performance and overheads were negligible. C# with LINQ is even better, it provides strong typecheck for the queries and often has almost zero overhead.

I'm using Go now, and I don't even want to touch any of the available ORMs because they all suck, compared to the state-of-the-art in Java circa 20 years ago.

AmazingGuy · 5h ago
it's called the Semantic Layer.