Datalog in miniKanren

95 deosjr 9 6/15/2025, 4:16:08 PM deosjr.github.io ↗

Comments (9)

deosjr · 16h ago
Seems like interest in Datalog is high this week, so I thought I'd share a write-up of a minimal Datalog implementation I did a while ago.

Runs in the browser using Hoot (https://spritely.institute/hoot/) which compiles Guile Scheme to WebAssembly.

upghost · 9h ago
Datalog is a syntactic subset of Prolog[1], which this is... not.

I think the most misunderstood thing about Prolog (and Datalog, the functor-free subset of pure Prolog) is that the syntax is really, really important.

It's like, the whole gimmick of the language. It is designed to efficiently and elegantly query and transform itself. If you lose the syntax you lose all of intermediate and advanced Prolog (and Datalog).

[1]: https://en.m.wikipedia.org/wiki/Datalog

kragen · 8h ago
Semantics are more important than syntax. Prolog's flexible syntax is a nice-to-have rather than essential when you're in Lisp. And Datalog is purely first-order, so the advanced Prolog you're talking about doesn't exist in it.

However, syntax does matter, and this is not acceptable

    (dl-find 
     (fresh-vars 1 
      (lambda (?id) 
       (dl-findo dl
        ((,?id reachable ,?id)))))))
as a way to ask

    reachable(Id, Id).
I think you could, however, write a bit more Scheme and be able to ask

    (?id reachable ?id)
which would be acceptable.

However, the ordering betrays a deeper semantic difference with orthodox Datalog, which is about distinct N-ary relations, like a relational database, not binary relations. This implementation seems to be specific to binary relations, so it's not really Datalog for reasons that go beyond mere syntax.

On the other hand, this (from the initial goal) would be perfectly fine:

    (dl-rule! dl (reachable ,?x ,?y) :- 
                     (edge ,?x ,?z) (reachable ,?z ,?y))
The orthodox Datalog syntax is:

    reachable(X, Y) :- edge(X, Z), reachable(Z, Y).
deosjr · 1h ago
Thank you for the feedback! I agree with all of the above.

Should've probably been a bit more clear on the dl-find syntax; I find it just as unacceptable as you do. It is the result of laziness: my intended use of this minimal Datalog does not include any querying whatsoever but abuses fixpoint analysis for side-effects (see https://github.com/deosjr/deosjr.github.io/blob/master/dynam... which I intend to go over in a future post). I initially had it working like you described but butchered it for the above and haven't repaired it yet (see https://github.com/deosjr/whistle?tab=readme-ov-file#datalog). This version relied on some monstrous eval-hacking using a homebrew Lisp, which I've mostly cleaned up now in this version (https://github.com/deosjr/whistle/blob/main/datalog/datalog.... is a crime, for example).

The semantics are indeed limited to binary relations atm, which I agree is the main thing that disqualifies this as a proper Datalog. iirc the tutorial on Datalog that I based this implementation on only handled triples as well so I stopped there, but extending to N-ary relations is on my list to look into for sure.

jitl · 7h ago
Shouldn’t lisp macros make it easy to present such a nice syntax? Perhaps the author could easily implement that bit, if not the wide rows. Or is that the point you’re making?

There is a dl-rule here: https://github.com/deosjr/deosjr.github.io/blob/15b5f7e02153...

kragen · 7h ago
I don't think you need Lisp macros for it; you could use just a regular Lisp function. I don't think the standard R5RS macros are powerful enough to grovel over the query expression to make a list of the free variables, but then, standard Scheme also doesn't have records. I think Guile has a procedural macro system that you could use, but I don't think it would be a good idea.

Yes, I think the semantic divergence is more fundamental. Triple stores and graph databases and binary relations are awesome, but they aren't what Datalog is.

j-pb · 7h ago
Most database literature simply uses Datalog to mean the query language fragment of conjunctive queries + recursion/fixpoint-iteration and potentially stratified negation.

Yes it started out as a Prolog subset, but the definition as the fragments it supports has become much more prevalent, mainly to contrast it to non-recursive fragments with arbitrary negation (e.g. SQL).

This usage dates back to database literature of the 80s by Ullman et. al.

fithisux · 14h ago
What scheme is this?
deosjr · 14h ago