A plus for Dafny versus something like TLA+ is that it is an actual programming language, so there is some guarantee that the proofs have been properly translated into executable code, and that further changes still map to the proofs.
Currently it has translation backends for C#, Java, JavaScript, Go and Python: https://dafny.org/latest/DafnyRef/DafnyRef#sec-compilation
A comparison with TLA+ doesn't make much sense, as TLA+ implements a very different sort of logic; besides, the property of being a real programming language is shared by virtually everything in this space.
Lean/Agda are real programming languages, while Coq (Rocq), F*, ATS, and Isabelle/HOL all extract to various other programming languages.
Frankly, it's TLA+ that is the odd one here.
pjmlp · 4h ago
Agreed. My remark came more from the point of view that I don't get why TLA+ keeps being talked about when there is such an impedance mismatch between the mathematical model proving the logic and the actual implementation, where even the language semantics might play a role that wasn't clearly mapped in the TLA+ model.
steego · 2h ago
Isn't TLA+ more like Alloy, insofar as they're thinking tools optimized for the design phase?
I'm more familiar with Alloy, which is a great tool for exploring a specification and looking for counter-examples that violate your specification.
AFAIK, none of the languages you listed above work well in the conceptualization phase. Are any of them capable of synthesizing counter-examples out of the box? (Aside: I feel like Lean's meta capabilities could be leveraged to do this.)
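For readers who haven't used Alloy: the small-scope counterexample search it automates can be sketched by brute force. This is a toy Python analogy (all names here are illustrative, not any tool's real API) testing the classic false conjecture that every symmetric, transitive relation is reflexive, over a two-element domain:

```python
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)

def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def is_reflexive(rel, domain):
    return all((x, x) in rel for x in domain)

def find_counterexample(domain):
    """Alloy-style small-scope search: enumerate every relation on
    `domain` and look for one that refutes the conjecture."""
    pairs = [(a, b) for a in domain for b in domain]
    for rel in map(set, powerset(pairs)):
        if is_symmetric(rel) and is_transitive(rel) and not is_reflexive(rel, domain):
            return rel
    return None

cex = find_counterexample([0, 1])
print(cex)  # the empty relation, set(): vacuously symmetric and transitive, not reflexive
```

Alloy does this over relational logic with a SAT solver rather than naive enumeration, but the "check all small instances, report a violating one" workflow is the same.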
pjmlp · 1h ago
I only listed Dafny, although I do agree with the list in the reply to me.
I've never looked into Alloy; I guess I'll have to get an understanding of it.
How can you validate that the beautiful design phase actually maps to e.g. C code, writing data via ODBC to a SQL database, with stored procedures written in PL/SQL?
Neither the semantics of the toolchains nor the semantics of the commercial products are part of the TLA+ model as such.
Additionally it requires someone to meticulously compare the mathematical model with the implementation code, to validate that what is written actually maps to what was designed.
Although it wouldn't work for my contrived example, at least tools like Dafny have more viability by going through "formal model => generate library for consumption", so that we can get automated model-to-code validation without human intervention.
Jtsummers · 1h ago
> Additionally it requires someone to meticulously compare the mathematical model with the implementation code, to validate that what is written actually maps to what was designed.
This is a deficiency in TLA+ (and many other systems), but it's not a good enough reason to discard or dismiss it. The industry alternative to TLA+ is not something that traces fully and easily from spec to code; it is mostly informal, largely prose documents as the specification (if anything is specified at all). TLA+ is a massive improvement on that even if it's not a perfect tool. The same goes for Alloy and other systems. It's better if people model and specify at least portions of their system formally, even if it still takes effort to verify that the code correctly implements the specification; that effort has to be expended anyway, and with greater difficulty when lacking anything approaching a formal specification.
trashchomper · 17h ago
Having played with Dafny only in a university course, I really enjoyed it as a way of implementing algorithms and being certain they work with much less cruft than unit tests.
I haven't gone looking, but verifier tools compatible with languages people already use (TypeScript/Rust/Go/whatever is the flavour of the month) feel like the way to go.
codebje · 10h ago
The languages people already use are inconsistent (in a logic theory sense) and lack formal semantics. Efforts to prove anything useful about programs written in those languages don't get far: the first fact forces you to restrict everything to a subset of the language, sometimes an unworkable subset, and the second fact is a massive hill to climb, at whose peak sits a note saying, "we have changed the implementation, this hill is no longer relevant."
The only* practical way to approach this is exactly Dafny's: start with a (probably) consistent core language with well-understood semantics, build on top of it a surface language with syntax features that make it more pleasant to use, prove that the surface language has a semantics-preserving translation to the core language, and then generate the desired final language after verification has succeeded.
Dafny's about the best of the bunch for this too, for the set of target languages it supports.
(It's fine and normal for pragmatic languages to be inconsistent: all that means here is you can prove false is true, which means you can prove anything and everything trivially, but it also means you can tell the damn compiler to accept your code you're sure is right even though the compiler tells you it isn't. It looks like type casts, "unsafe", and the like.)
* one could alternatively put time and effort into making consistent and semantically clear languages compile efficiently, such that they're worth using directly.
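To make the core/surface split concrete, here is a toy Python sketch (the AST names are hypothetical and nothing like Dafny's real intermediate form): a surface `repeat n` construct is desugared into a core `while` loop whose semantics are already pinned down, so properties proved about the core carry over to the sugar.

```python
from dataclasses import dataclass
from typing import List, Union

# Toy surface/core ASTs; illustrative names only.
@dataclass
class Assign:            # core: var := expr
    var: str
    expr: str

@dataclass
class While:             # core: while cond { body }
    cond: str
    body: List["Stmt"]

@dataclass
class Repeat:            # surface sugar: repeat n { body }
    n: int
    body: List["Stmt"]

Stmt = Union[Assign, While, Repeat]

def desugar(stmt: Stmt) -> List[Stmt]:
    """Translate one surface statement into core statements, making
    the surface construct's meaning explicit."""
    if isinstance(stmt, Repeat):
        return [
            Assign("_i", "0"),
            While(f"_i < {stmt.n}",
                  stmt.body + [Assign("_i", "_i + 1")]),
        ]
    return [stmt]
```

A real verifier does this for every construct and then proves things only about the small core.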
Xmd5a · 8h ago
Can't I just work on a subset of the language? I don't care if the linked list implementation I use isn't verified. I want to verify my own code, not because it will run on mars, but as an alternative to writing tests. Is this possible?
Jtsummers · 3h ago
Yes. See SPARK for an example of this. It is a subset of Ada plus the SPARK portions to convey details and proofs. You can use it per file, not just on an entire program, which lets you prove properties of critical parts and leave the rest to conventional testing approaches.
4ad · 4h ago
Unless you use a language designed for formal verification (Lean/Idris/Agda/F*/ATS/etc), no, it is not possible.
You can get pretty far in Haskell (with various extensions) and Scala. But for Go/TypeScript/etc, forget about it.
4ad · 5h ago
Dafny is great, and has some advantages compared to its competitors, but unequivocally calling it "the best" is quite bullish. For example, languages using dependent types (F*, ATS, Coq/Agda/Lean) are more expressive. And there are very mature systems using HOL.
Truth is that everything involves a tradeoff, and some systems are better than others at different things. Dafny explores a particular design space. Hoare-style invariants are easier to use than dependent types (as long as your SMT solver is happy, anyway), but F* has those too, and in F* you can fall back to dependent types when automatic refinement proofs become inadequate. And F* and ATS can target low-level code, more so than Dafny.
Probably I would not use ATS for anything, but between F* and Dafny it isn't such a clear-cut choice (I'd most likely use F*).
And if I didn't need (relatively) low-level code, I wouldn't use either.
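For readers unfamiliar with the Hoare-style invariants mentioned above, here is a small Python analogy (the function and names are illustrative): the precondition, loop invariant, and postcondition that Dafny would discharge statically via the SMT solver are checked dynamically with asserts instead.

```python
def int_sqrt(n: int) -> int:
    """Integer square root by linear search, with Hoare-style
    annotations checked at run time rather than proved statically."""
    assert n >= 0                              # precondition ("requires")
    r = 0
    while (r + 1) * (r + 1) <= n:
        assert r * r <= n                      # loop invariant
        r += 1
    assert r * r <= n < (r + 1) * (r + 1)      # postcondition ("ensures")
    return r
```

In Dafny the analogous `requires`, `invariant`, and `ensures` clauses are verified once, for all inputs, at compile time; the runtime checks here only cover the inputs you happen to run.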
Jtsummers · 17h ago
I agree that using tools more in line with your language is better, but I believe the knowledge from learning Dafny ought to transfer well to other systems. And Dafny seems better as a pedagogical system than what I've seen and used for other languages.
I'm exploring it now as a way to ease colleagues into SPARK. A lot of the material appears to transfer over and the book Program Proofs seems better to me than what I found for SPARK. I probably wouldn't have colleagues work through the book themselves so much as run a series of tutorials. We've done this often in the past when trying to bring everyone up to speed on some new skillset or tooling, if someone already knows it (or has the initiative to learn ahead of time) then they run tutorial sessions every week or so for the team.
pjmlp · 9h ago
The only tool I see ever taking off in languages that people already use is design by contract, and it has been a hard ride even making that available in some of them.
Let alone something more complex in formal verification.
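For the unfamiliar, design by contract attaches pre- and postconditions to functions. A minimal runtime sketch in Python (a hypothetical decorator, not any real library; Dafny and SPARK check such contracts statically instead):

```python
from functools import wraps

def contract(pre=None, post=None):
    """Toy design-by-contract decorator: check the precondition on the
    arguments and the postcondition on the result."""
    def deco(f):
        @wraps(f)
        def wrapper(*args):
            if pre is not None:
                assert pre(*args), f"precondition of {f.__name__} violated"
            result = f(*args)
            if post is not None:
                assert post(result, *args), f"postcondition of {f.__name__} violated"
            return result
        return wrapper
    return deco

@contract(pre=lambda a, b: b != 0,
          post=lambda r, a, b: a == b * r[0] + r[1])
def divmod_euclid(a, b):
    return a // b, a % b
```

The contracts document intent and fail loudly at run time; the formal-verification step that mainstream languages lack is proving, before running anything, that no call can ever violate them.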
This only had one previous submission, but I found it interesting. The mentioned book, Program Proofs, is worth checking out if the topic and language interest you.
anonymousDan · 10h ago
Is this applicable to proofs of concurrent code? Or is Dafny not the right tool?
lou1306 · 8h ago
I am not a Dafny expert, but from what I have gathered it uses a deductive procedure underneath, so it's rather geared towards sequential code. To analyse concurrent code, one essentially needs to build a sequential program that _also models the scheduler_ (see e.g., [1]).
This procedure is unsurprisingly called sequentialization and (somewhat more surprisingly) is also a pretty good approach when applied to other techniques, such as bounded model checking [2].
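A toy Python version of the idea (setup and names are illustrative): make the scheduler explicit by enumerating every interleaving of two "threads", each performing a non-atomic `x := x + 1` as a read step followed by a write step. Exploring the resulting sequential program shows that some schedules lose an update:

```python
from itertools import permutations

# Each thread's program, in order: read the shared x, then write back.
A = [("A", "read"), ("A", "write")]
B = [("B", "read"), ("B", "write")]

def run(schedule):
    """Execute one interleaving sequentially."""
    x = 0
    local = {}
    for tid, op in schedule:
        if op == "read":
            local[tid] = x          # thread copies the shared variable
        else:
            x = local[tid] + 1      # thread writes its (possibly stale) copy + 1
    return x

def schedules():
    """All interleavings of A and B that preserve each thread's own order."""
    keep = set()
    for perm in permutations(A + B):
        if ([s for s in perm if s[0] == "A"] == A and
                [s for s in perm if s[0] == "B"] == B):
            keep.add(perm)
    return keep

results = {run(s) for s in schedules()}
print(sorted(results))  # [1, 2]: some interleavings lose an update
```

Tools based on sequentialization do essentially this, but symbolically and with bounds on context switches, instead of naive enumeration.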
[1] https://leino.science/papers/krml260.pdf
[2] https://research.cs.wisc.edu/wpis/papers/cav08.pdf