Solving LinkedIn Queens with SMT

128 azhenley 38 6/12/2025, 4:18:01 PM buttondown.com ↗

Comments (38)

GZGavinZhao · 1d ago
Formal Methods in general are underrated in the industry. Pretty much no large companies except AWS (thank you Byron Cook!) use them at a large scale.

Edit: maybe there are large companies that use them behind the curtains, but AWS is the only place I know of where they publicly acknowledge how much they appreciate and use formal methods. If you know any of them, please comment and I'd be curious to learn about how they're using it!

Twirrim · 22h ago
Microsoft, Amazon, Oracle, Google, all sorts of large companies use them, and talk about it, publicly. They've all published whitepapers and resources about them. Microsoft even employs Dr. Leslie Lamport who created and maintains TLA+ (among other things).

Just for some quick examples:

Microsoft: https://github.com/Azure/azure-cosmos-tla, https://www.youtube.com/watch?v=kYX6UrY_ooA

Google: https://research.google/pubs/specifying-bgp-using-tla/, https://www.researchgate.net/publication/267120559_Formal_Mo...

Oracle: https://blogs.oracle.com/cloud-infrastructure/post/sleeping-... (note the author is a "Formal Verification Engineer", it's literally his job at Oracle to do this stuff)

Intel: https://dl.acm.org/doi/10.1145/1391469.1391675, https://link.springer.com/chapter/10.1007/978-3-540-69850-0_...

Jetbrains: https://lp.jetbrains.com/research/hott-and-dependent-types/

Arm: https://ieeexplore.ieee.org/document/9974354

dfc · 19h ago
Lamport+Microsoft was the first thing that I thought of when I read the comment. FWIW he retired at the beginning of this year.
IshKebab · 1d ago
I don't think they are underrated. They are heavily used where they work really well and bugs have a very high cost (e.g. hardware design).

For the vast majority of software though they don't really make much sense because formally verifying the software is 10-100x more work than writing it and testing it with normal methods. And formal verification of software generally requires faaaaar more expertise than most people have. (The situation is much better for hardware design because it tends to be way simpler.)

It's a very powerful tool but also extremely difficult to use.

steamrolled · 1d ago
> Formal Methods in general are underrated in the industry. Pretty much no large companies except AWS (thank you Byron Cook!) use them at a large scale.

At least Microsoft and Google poured a lot of money into this by funding well-staffed multi-year research projects. There's plenty of public trail in terms of research papers. It's just that not a whole lot came out of it otherwise.

The problem isn't that the methods are underrated, it's that they aren't compatible with the approach to software engineering in these places (huge monolithic codebases, a variety of evolving languages and frameworks, no rigid constraints on design principles).

fakedang · 5h ago
Can you ELI5 what formal methods are and how not the industry standard apparently? As a complete noob, from what I'm reading online, they're pretty much how you should approach software engineering, or really any sort of programming right?
dwrodri · 5h ago
Formal methods = “this software cannot do things it shouldn’t do”, I have formally proven it ALWAYS EXECUTES THE WAY I CONSTRAINED IT TO.

Contrast with

Testing = “My tests prove these inputs definitely produce these test outputs”

IME Formal methods struggle making contact with reality because you really only get their promise “it always does what it is constrained to do” when every abstraction underneath provides the same guarantee, I wager most CPUs/GPUs aren’t verified down to the gate level these days.

It’s just faster to “trust” tests with most of the benefit, and developing software faster is very important to capturing a market and accruing revenue if you are building your software for business reasons.

EDIT: My gate-level verification remark is a bit extreme, but it applies to higher layers of the stack. The linux kernel isn’t verified. Drivers are sometimes verified, but not often. There is an HN comment somewhere about building a filesystem in Coq, and while the operations at the filesystem layer are provably correct, the kernel interfaces still fail. The firmware still isn’t proven. The CPU itself running on has undisclosed optimizations in its caches and load/store mechanisms which just aren’t proven, but enabled it to beat the competition on benchmarks, driving sales.

tgma · 23h ago
> Pretty much no large companies except AWS (thank you Byron Cook!) use them at a large scale.

I don't think that's true at all. I suppose that depends on what you mean by formal methods and in what context you're concerned about those. Off the top of my head this comes to mind from Microsoft: https://learn.microsoft.com/en-us/windows-hardware/drivers/d...

nhatcher · 1d ago
SAT solvers and the algorithms surrounding them are so much fun. I agree they are very unappreciated.

Shameless plug: I wrote a (admittedly very deriative) introduction with some examples I thought at the time were cool.

https://www.nhatcher.com/post/on-hats-and-sats/

cubancigar11 · 15h ago
Thanks that was quite informative, perfect for me.
sevensor · 1d ago
SMT is so much fun. The Z3 Python api lets you write your problem very directly and then gives you fast answers, even for quite large problems.
Recursing · 9h ago
Note that CVC5 has basically the same Python API ( https://cvc5.github.io/docs/cvc5-1.1.2/api/python/python.htm... ) and is often much faster
ndr · 12h ago
I did write a shockingly similar solution few months ago:

https://gist.github.com/enigma/98ea0392471fa70211251daa16ce8...

doctorpangloss · 22h ago
This post is the programming joke about Python, "import solution; solution()".
sevensor · 18h ago
Barely a joke, this is literally what using the Python Z3 bindings feels like.
cpatuzzo · 1d ago
I tried to write a programming language that compiles to SAT many years ago: https://sentient-lang.org/
superlopuh · 11h ago
I love that language and frequently show it to people. I'm sad to see that my local install doesn't work any more. I actually used it to solve a puzzle in Evoland 2 that I'm relatively sure was added as a joke, and is not solvable in a reasonable time without a solver. I'm actually doing a PhD in compilers right now, and would love to chat about sentient if you have the time. My email is sasha@lopoukhine.com.
hwayne · 1d ago
I remember you showing me this! Wow that was a long time ago.
zero_k · 21h ago
Haha, Marijn Heule who is pushing a lot of limits of SAT solving would love this. If you manage to get him excited, he might spend a few years on this problem :) He's kinda famous for solving the Boolean Pythagorean Triples problem using SAT [1]. He loves puzzles. He also got Knuth excited about a bunch of fun puzzles.

BTW, these puzzles also tend to have a lot of symmetries, which SAT solvers are pretty bad at handling. You can break them, though, using a variety of techniques, e.g. static symmetry breaking [2], or symmetric learning.

[1] https://www.cs.utexas.edu/~marijn/ptn/ [2] https://github.com/markusa4/satsuma

jononor · 1d ago
How good are current LLMs at translating problems given as text into something SMT solvers can operate on? Be it MiniZinc, Z3, Smtlib, Python bindings, etc. Anyone tried it out?
hwayne · 1d ago
Apparently they're getting very good: https://emschwartz.me/new-life-hack-using-llms-to-generate-c...

I try not to use them too much because I want to build the skill of using SMTs directly for now.

Twirrim · 22h ago
I've found them to be bad, for the most part. There aren't enough blog posts and examples of code out there for them to leach from.

Besides which, I would argue the process of writing proof in the language is integral to building the understanding you need to deal with the results. You'll spot bugs as you're creating the model.

Jaxan · 1d ago
I tried it many months ago and it was garbage. But this was trying smtlib directly. Maybe via the python bindings it works better?
stong1 · 1d ago
Reminds me of a small project I did back in undergrad: Minesweeper using a SMT solver. https://github.com/stong/smt-minesweeper
robinhouston · 1d ago
If you want a language for expressing constraint satisfaction problems that's higher-level than SAT, I think MiniZinc is pretty interesting. https://www.minizinc.org/
naet · 1d ago
I actually wrote a backtracking solution to the LinkedIn queens game a while ago (and the tango game).

I know nothing about SMT or SAT and I imagine they might be faster, but the backtracking appears to solve just as instantaneously when you push the button.

Might be cool to learn a bit about SMT or SAT. Not sure how broadly applicable they are but I've seen people use them to solve some difficult advent of code problems if nothing else.

gbacon · 21h ago
Definite broad applicability.

NP-complete are the hardest problems in NP. Cook in 1971 proved SAT to be in NP-complete. In the worst case for any other problem in NP, we can quickly (i.e., in polynomial time) convert instances of that problem into instances of SAT. In other words, we can use SAT to solve any problem in NP.

It turns out there are many problems in NP-complete. The fast conversion applies among them too, so in some sense, problems in NP-complete are all the same because we can use them all to solve instances of each other. However, for some of those problem instances the best known algorithm is to try all possible inputs, which requires exponential time (very, very slow for even modestly large inputs).

Lots of research has been and continues to be poured into SAT because any gains automatically yield improvements to everything else in NP-complete and the rest of NP. Using a SAT solver allows you to hitch a ride more or less for free on the results of all that research. Each incremental improvement to SAT solvers benefits programs that use them.

As the author noted, forming SAT instances by hand can be a pain. SMT or SAT Modulo Theories is sort of a high-level language that “compiles down” to SAT. Expressing problems with SMT is more natural and reduces the burden of converting your problem to SMT and SMT solutions back to your problem domain.

TheBozzCL · 1d ago
Hah, about a month ago I wrote a DLX solver for exact cover problems and LiQueens was one of my first implementations.

Next I want to try to solve the Tango and Zip games.

b0a04gl · 1d ago
you mentioned SMT is slower than SAT and left it there, but that feels incomplete. in problems like this, solve time barely matters unless you’re generating at scale. the real weight is in how fast you can write, refactor, and trust the constraints. SAT might give faster outputs, but SMT usually gets you to correct models quicker with less friction. wondering if you actually benchmarked both and stuck with SAT on numbers, or if it was more of a default comfort pick. felt like a missed moment to shift the lens from solver speed to model dev loop speed
osmarks · 1d ago
I was briefly looking into using SMT for Minecraft autocrafting, but it turns out you can do integer linear programming and the mapping is easier.
spencerflem · 1d ago
Some additional context: Outside of Microsoft, this puzzle is often known as Star Battle.

Terrific little puzzle, highly recommend it!

https://www.puzzles.wiki/wiki/Star_Battle

https://www.puzzle-star-battle.com/?size=5

anArbitraryOne · 20h ago
What about a CP solver?
refulgentis · 22h ago
> Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. Glucose is really, really fast.

I'm new to this area, neither the original article nor the link to Glucose have enough info to tell me order of magnitude here: milliseconds? hours?

OutOfHere · 1d ago
The article fails to even say what SMT is. It also fails to describe and explain it. This article should help:

https://en.wikipedia.org/wiki/Satisfiability_modulo_theories

Jtsummers · 1d ago
In the article:

> "Satisfiability Modulo Theories"

rook37 · 1d ago
Huh, where? I didn't see it going through and opening it in reader mode and ctrl+f-ing any of those words turns up nothing for me still.
Jtsummers · 1d ago
He embeds the footnotes in the web version (it's a proper footnote in the email newsletter version). Find the "..." in this paragraph:

> Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. Go read his post, it's pretty cool. What leapt out to me was that he used CVC5, an SMT solver. (...) SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in Z3 (via the Python API).