I like this. It very much falls into the "make bad states unrepresentable" camp.
The issue I see with this approach is when developers stop at this first level of type implementation: everything is a type, nothing works well together, tons of types seem to be subtle permutations of each other, things get hard to reason about, etc.
In systems like that I would actually rather be writing a weakly typed dynamic language like JS, or a strongly typed dynamic language like Elixir. However, if the developers continue pushing logic into type-controlled flows, e.g. move conditional logic into union types with pattern matching, leverage delegation, etc., the experience becomes pleasant again. Just as an example (probably not the actual best solution), the "DewPoint" function could take either type and just work.
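Sketching that in Rust (hypothetical types and names, with the Magnus formula as one common dew-point approximation):

    // The conditional logic moves into a union type plus one match,
    // so dew_point can take either unit and just work.
    enum Temperature {
        Celsius(f64),
        Fahrenheit(f64),
    }

    impl Temperature {
        fn to_celsius(self) -> f64 {
            match self {
                Temperature::Celsius(c) => c,
                Temperature::Fahrenheit(f) => (f - 32.0) * 5.0 / 9.0,
            }
        }
    }

    fn dew_point(temp: Temperature, relative_humidity: f64) -> f64 {
        let t = temp.to_celsius();
        // Magnus approximation
        let gamma = (17.62 * t) / (243.12 + t) + (relative_humidity / 100.0).ln();
        (243.12 * gamma) / (17.62 - gamma)
    }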
josephg · 15h ago
Yep. For this reason, I wish more languages supported bound integers. Eg, rather than saying x: u32, I want to be able to use the type system to constrain x to the range of [0, 10).
This would allow for some nice properties. It would also enable a bunch of small optimisations in our languages that we can't have today. Eg, I could make an integer that must fall within my array bounds. Then I don't need to do bounds checking when I index into my array. It would also allow a lot more peephole optimisations to be made with Option.
Weirdly, rust already kinda supports this within a function thanks to LLVM magic. But it doesn't support it for variables passed between functions.
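For illustration, the within-a-function case looks roughly like this (a sketch; whether the check is actually elided is up to the optimizer):

    // After the comparison, LLVM can usually prove i < arr.len() inside
    // the branch and drop the bounds check on arr[i].
    fn lookup(arr: &[u64; 16], i: usize) -> u64 {
        if i < 10 {
            arr[i] // bounds check typically optimized away here
        } else {
            0
        }
    }

But there's no way to write something like `i: usize in 0..10` in the signature, so the range information is lost at every function boundary.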
mcculley · 14h ago
Ada has this ability to define ranges for subtypes. I wish language designers would look at Ada more often.
fny · 1h ago
Range checks in Ada are basically assignment guards with some cute arithmetic attached. Ada still does most of the useful checking at runtime, so you're really just introducing more "index out of bounds". Consider this example:
    procedure Sum_Demo is
       subtype Index is Integer range 0 .. 10;
       subtype Small is Integer range 0 .. 10;

       Arr : array(Index) of Integer := (others => 0);
       X : Small := 0;
       I : Integer := Integer'Value(Integer'Image(X)); -- runtime evaluation
    begin
       for J in 1 .. 11 loop
          I := I + 1;
       end loop;

       Arr(I) := 42; -- possible out-of-bounds access if I = 11
    end Sum_Demo;
This compiles, and the compiler will tell you: "warning: Constraint_Error will be raised at run time".
It's a stupid example for sure.
And it's not just a warning. If you run it:

    raised CONSTRAINT_ERROR : sum_demo.adb:13 index check failed
It's a cute feature, but it's useless for anything complex.
jjmarr · 3h ago
VHDL has this feature too, being based on Ada.
tylerhou · 10h ago
Academic language designers do! But it takes a while for academic features to trickle down to practical languages—especially because expressive-enough refinement typing on even the integers leads to an undecidable theory.
ninalanyon · 1h ago
Pascal had this many decades ago; how long do we have to wait?
idbehold · 10h ago
>But it takes a while
*Checks watch*
We're going on 45 years now.
drpixie · 3h ago
Naaa ... most "new" languages are just reinventions of stuff that's been around for ... 45 years, by people who should know better.
geysersam · 9h ago
Aren't most type systems in widely used languages Turing-complete and (consequently) undecidable? TypeScript and Python are two examples that come to mind.
But yeah, maybe expressive-enough refinement typing leads to hard-to-write and slow type-inference engines.
spookie · 10h ago
Well, ada is practical
voidhorse · 4h ago
Eh, idk.
I think the reasons are predominantly social, not theoretical.
For every engineer out there that gets excited when I say the words "refinement types" there are twenty that either give me a blank stare or scoff at the thought, since they a priori consider any idea that isn't already in their favorite (primitivistic) language either too complicated or too useless.
Then they go and reinvent it as a static analysis layer on top of the language and give it their own name and pat themselves on the back for "inventing" such a great check. They don't read computer science papers.
nikeee · 13h ago
I proposed a primitive for this in TypeScript a couple of years ago [1].
While I'm not entirely convinced myself whether it is worth the effort, it offers the ability to express "a number greater than 0". Using type narrowing and intersection types, open/closed intervals emerge naturally from that. Just check `if (a > 0 && a < 1)` and its type becomes `(>0)&(<1)`, so the interval (0, 1).
My specific use case is pattern matching http status codes to an expected response type, and today I'm able to work around it with this kind of construct https://github.com/mnahkies/openapi-code-generator/blob/main... - but it's esoteric, and feels likely to be less efficient to check than what you propose / a range type.
There's runtime checking as well in my implementation, but it's a priority for me to provide good errors at build time
archargelod · 3h ago
Nim[0] supports subrange types:
    type
      Foo = range[1 .. 10]
      Bar = range[0.0 .. 1.0] # float works too

    var f: Foo = 42     # Error: cannot convert 42 to Foo = range 1..10(int)
    var p = Positive 22 # Positive and Natural types are pre-defined
But wouldn't that also require code execution? For example, even though the compiler already knows the size of an array and could do a bounds check on direct assignment (arr[1] = 1), in some wild nested loop you could exceed the bounds in ways the compiler can't see.
Otherwise you could have type-level asserts more generally. Why stop at a range check when you could check a regex too? This makes the difficulty more clear.
For the simplest range case (pure assignment) you could just use an enum?
steveklabnik · 15h ago
In my understanding Rust may gain this feature via “pattern types.”
ijustlovemath · 3h ago
Where can I sign?
0_gravitas · 1h ago
Clojure does bounded values/regex natively in Clojure Spec, and also the Malli library from Metosin
librasteve · 11h ago
in raku, that’s spelled
    subset OneToTen of Int where 1..10;
yoz-y · 7h ago
Off topic: do you use raku in day to day life? I tried learning it but perl5 remains my go-to when I just need to whip something up
librasteve · 25m ago
yes … I find it as useful as I used to find perl (spent 4 years as a perl coder back in the day) and use for ad hoc scripts … data migration, PDF/CSV scraping, some LLM prompt engineering in a business setting, wrote a module to manage/install Wordpress on EC2 for example
Pascal had range types such as 0..9 (as of 1970). Subranges could also be defined for any scalar type. Further, array index types were such ranges.
someone_19 · 10h ago
You can do this quite easily in Rust, but you have to overload operators to make your type make sense. That's also possible; you just need to define what type you get after dividing your type by a regular number (and vice versa, a regular number by your type), or what should happen if, when adding two of your types, the sum is higher than the maximum value. This is quite verbose, though it can be done with generics or macros.
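A sketch of the kind of thing involved (Percent is a made-up example type; here addition saturates at the maximum, but returning Option would be just as valid):

    use std::ops::Add;

    // A value known to be in 0..=100.
    #[derive(Clone, Copy, Debug)]
    struct Percent(u8);

    impl Percent {
        fn new(v: u8) -> Option<Percent> {
            (v <= 100).then(|| Percent(v))
        }
    }

    // You must define what happens when the sum would exceed the maximum.
    impl Add for Percent {
        type Output = Percent;
        fn add(self, rhs: Percent) -> Percent {
            Percent((self.0 + rhs.0).min(100))
        }
    }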
josephg · 7h ago
You can do it at runtime quite easily in rust. But the rust compiler doesn't understand what you're doing - so it can't make use of that information for peephole optimisations or to elide array bounds checks when using your custom type. And you get runtime errors instead of compile time errors if you try to assign the wrong value into your type.
someone_19 · 1h ago
Here is an example of a compile-time error with a wrong newtype argument:
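Something like this (Meters and Seconds are made-up newtypes):

    struct Meters(f64);
    struct Seconds(f64);

    fn wait_for(_t: Seconds) {}

    fn main() {
        wait_for(Meters(3.0)); // error[E0308]: mismatched types,
                               // expected `Seconds`, found `Meters`
    }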
The generic magic for this is called "dependent types" I believe - generics that can take values as well as types as parameters. Idris supports these.
ChadNauseam · 5h ago
What the GP described could be achieved with dependent types, but could also be achieved with a less powerful type system, and the reduced power can sometimes lead to enormous benefits in terms of how pleasant it actually is to use. Check out "refinement types" (implemented in Liquid Haskell for example). Many constraints can be encoded in the type system, and an SMT solver runs at compile time to check if these constraints are guaranteed to be satisfied by your code. The result is that you can start with a number that's known to be in [0..10), then double it and add five, and then you can pass that to a function that expects a number in [5..25). Dependent types would typically require some annoying boilerplate to prove that your argument to the function would fall within that range, but an SMT solver can chew through that without any problem.
ameliaquining · 12h ago
The full-blown version that guarantees no bounds-check errors at runtime requires dependent types (and consequently requires programmers to work with a proof assistant, which is why it's not very popular). You could have a more lightweight version that instead just crashes the program at runtime if an out-of-range assignment is attempted, and optionally requires such fallible assignments to be marked as such in the code. Rust can do this today with const generics, though it's rather clunky as there's very little syntactic sugar and no implicit widening.
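A rough sketch of that lightweight version with const generics (Ranged is a made-up type; the range lives in the type, but the check happens at runtime via fallible construction):

    #[derive(Clone, Copy, Debug)]
    struct Ranged<const MIN: i64, const MAX: i64>(i64);

    impl<const MIN: i64, const MAX: i64> Ranged<MIN, MAX> {
        // The fallible assignment, marked as such in the code.
        fn new(v: i64) -> Option<Self> {
            (MIN <= v && v < MAX).then(|| Self(v))
        }
    }

    type Digit = Ranged<0, 10>;

And there's indeed no implicit widening: turning a Ranged<0, 10> into a Ranged<0, 100> takes another explicit conversion, even though it can never fail.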
tialaramex · 8h ago
AIUI WUFFS doesn't need a full blown proof assistant because instead of attempting the difficult problem "Can we prove this code is safe?" it has the programmer provide elements of such a proof as they write their program so it can merely ask "Is this a proof that the program is safe?" instead.
ameliaquining · 7h ago
This is also approximately true of Idris. The thing that really helps Wuffs is that it's a pretty simple language without a lot of language features (e.g., no memory allocation and only very limited pointers) that complicate proofs. Also, nobody is particularly tempted to use it and then finds it unexpectedly forbidding, because most programmers don't ever have to write high-performance codecs; Wuffs's audience is people who are already experts.
vmchale · 13h ago
ATS does this. Works quite well since multiplication by known factors and addition of type variables + inequalities is decidable (and in fact quadratic).
ninetyninenine · 14h ago
This can be done in TypeScript. It's not super well known because of TypeScript's association with frontend work and JavaScript. But TypeScript is a language with one of the most powerful type systems ever.
Among popular languages like Go, Rust, or Python, TypeScript has the most powerful type system.
How about a type with a number constrained between 0 and 10? You can already do this in typescript.
You can even programmatically define functions at the type level. So you can create a function that outputs a type between 0 to N.
    type Range<N extends number, A extends number[] = []> =
      A['length'] extends N ? A[number] : Range<N, [...A, A['length']]>;
The issue here is that it's a bit awkward: you want these types to compose, right? If I add two constrained numbers, say one with a max value of 3 and another with a max value of 2, the result should have a max value of 5. TypeScript doesn't support this by default with ordinary addition, but you can create a function that does this.
    // Build a tuple of length L
    type BuildTuple<L extends number, T extends unknown[] = []> =
      T['length'] extends L ? T : BuildTuple<L, [...T, unknown]>;

    // Add two numbers by concatenating their tuples
    type Add<A extends number, B extends number> =
      [...BuildTuple<A>, ...BuildTuple<B>]['length'];

    // Create a union: 0 | 1 | 2 | ... | N-1
    type Range<N extends number, A extends number[] = []> =
      A['length'] extends N ? A[number] : Range<N, [...A, A['length']]>;

    function addRanges<
      A extends number,
      B extends number
    >(
      a: Range<A>,
      b: Range<B>
    ): Range<Add<A, B>> {
      return (a + b) as Range<Add<A, B>>;
    }
The issue is that to create these functions you have to use tuples to do addition at the type level, and you need recursion as well. TypeScript recursion stops at 100, so there are limits.
Additionally, it's not intrinsic to the type system. You'd need Peano numbers built into the number system, and built in by default across the entire language, for this to work perfectly. That means the code inside the function is not type checked, but if you assume that code is correct, then this function type checks when composed with other primitives of your program.
lucianbr · 13h ago
Complexity is bad in software. I think this kind of thing does more harm than good.
I get an error that I can't assign something that seems to me assignable, and to figure out why I need to study functions at type level using tuples and recursion. The cure is worse than the disease.
ninetyninenine · 11h ago
It can work. It depends on context. Like let's say these types are from a well renowned library or one that's been used by the codebase for a long time.
If you trust the type, then it's fine. The code is safer. In the world of the code itself, things are easier.
Of course like what you're complaining about, this opens up the possibility of more bugs in the world of types, and debugging that can be a pain. Trade offs.
In practice people usually don't go crazy with type-level functions. They do small stuff, but usually nothing super crazy. So TypeScript by design sort of fits the complexity dynamic you're looking for. Yes, you can write type-level functions that are super complex, but the language is not designed around it and doesn't promote that style either. But you CAN go a little deeper with types than, say, a language with less power in its type system, like Rust.
giraffe_lady · 12h ago
Typescript's type system is turing complete, so you can do basically anything with it if this sort of thing is fun to you. Which is pretty much my problem with it: this sort of thing can be fun, feels intellectually stimulating. But the added power doesn't make coding easier or make the code more sound. I've heard this sort of thing called the "type puzzle trap" and I agree with that.
I'll take a modern hindley milner variant any day. Sophisticated enough to model nearly any type information you'll have need of, without blurring the lines or admitting the temptation of encoding complex logic in it.
>Which is pretty much my problem with it: this sort of thing can be fun, feels intellectually stimulating. But the added power doesn't make coding easier or make the code more sound.
In practice nobody goes too crazy with it. You have a problem with a feature almost nobody uses. It's there, and Range<N> is like the upper bound of complexity I've seen in production, but even that is extremely rare.
There is no "temptation" to encode complex logic in it at all, as the language doesn't promote these features; they're just available if needed. It's not well known, but TypeScript types can easily be used one-to-one with any Hindley-Milner variant. It's the reputational baggage of JS and frontend that keeps this fact from being well known.
In short: TypeScript is more powerful than Hindley-Milner, a subset of it has one-to-one parity with it, and the parts that are more powerful than Hindley-Milner aren't popular or widely used, nor does the flow of the language itself promote their usage. The features are just there if you need them.
If you want a language where you do this stuff in practice take a look at Idris. That language has these features built into the language AND it's an ML style language like haskell.
giraffe_lady · 9h ago
I have definitely worked in TS code bases with overly gnarly types, seen more experienced devs spend an entire workday "refactoring" a set of interrelated types and producing an even gnarlier one that more closely modeled some real world system but was in no way easier to reason about or work with in code. The advantage of HM is the inference means there is no incentive to do this, it feels foolish from the beginning.
arrowsmith · 15h ago
FYI: Ruby is strongly typed, not loosely.
    > 1 + "1"
    (irb):1:in 'Integer#+': String can't be coerced into Integer (TypeError)
        from (irb):1:in '<main>'
        from <internal:kernel>:168:in 'Kernel#loop'
        from /Users/george/.rvm/rubies/ruby-3.4.2/lib/ruby/gems/3.4.0/gems/irb-1.14.3/exe/irb:9:in '<top (required)>'
        from /Users/george/.rvm/rubies/ruby-3.4.2/bin/irb:25:in 'Kernel#load'
        from /Users/george/.rvm/rubies/ruby-3.4.2/bin/irb:25:in '<main>'
kitten_mittens_ · 15h ago
There seem to be two competing nomenclatures around strong/weak typing where people mean static/dynamic instead.
josephg · 15h ago
Some people mistakenly call dynamic typing "weak typing" because they don't know what those words mean. PSA:
Static typing / dynamic typing refers to whether types are checked at compile time or runtime. "Static" = compile time (eg C, C++, Rust). "Dynamic" = runtime (eg Javascript, Ruby, Excel)
Strong / weak typing refers to how "wibbly wobbly" the type system is. x86 assembly language is "weakly typed" because registers don't have types. You can do (more or less) any operation with the value in any register. Like, you can treat a register value as a float in one instruction and then as a pointer during the next instruction.
Ruby is strongly typed because all values in the system have types. Types affect what you can do. If you treat a number like it's an array in Ruby, you get an error. (But the error happens at runtime because Ruby is dynamically typed - thus typechecking only happens at runtime!)
0x457 · 10h ago
It's strongly typed, but it's also duck typed. Also, in ruby everything is an object, even the class itself, so type checking there is weird.
Sure it stops you from running into "'1' + 2" issues, but won't stop you from yeeting VeryRawUnvalidatedResponseThatMightNotBeAuthorized to a function that takes TotalValidatedRequestCanUseDownstream. You won't even notice an issue until:
- you manually validate
- you call a method that is unavailable on the wrong object.
ameliaquining · 11h ago
I recall a type theorist once defined the terms as follows (can't find the source): "A strongly typed language is one whose type system the speaker likes. A weakly typed language is one whose type system the speaker dislikes."
So yeah I think we should just give up these terms as a bad job. If people mean "static" or "dynamic" then they can say that, those terms have basically agreed-upon meanings, and if they mean things like "the type system prohibits [specific runtime behavior]" or "the type system allows [specific kind of coercion]" then it's best to say those things explicitly with the details filled in.
We've been reading comments like that since the internet was created (and no doubt in books before that). Why give up now?
johnfn · 15h ago
    irb(main):001:0> a = 1
    => 1
    irb(main):002:0> a = '1'
    => "1"
It doesn't seem that strong to me.
folkrav · 11h ago
It would be weak if that was actually mutating the first “a”. That second declaration creates a new variable using the existing name “a”. Rust lets you do the same[1].
Rust lets you do the same because the static typing keeps you safe. In Rust, treating the second 'a' like a number would be an error. In ruby, it would crash.
folkrav · 4h ago
Let’s rephrase: is naming a variable typing? As for runtime vs. compile errors - isn’t this just a trade off of interpreted languages?
0x457 · 10h ago
These are two entirely different a's; you're storing a reference to them in the same variable. You can do the same in Rust (we agree it's statically and strongly typed, right?):
    let a = 1;
    let a = '1';
Strong typing is about whether I can do 1 + '1'; variable names have nothing to do with it being strongly typed.
dgb23 · 15h ago
In the dynamic world being able to redefine variables is a feature not a bug (unfortunately JS has broken this), even if they are strongly typed. The point of strong typing is that the language doesn't do implicit conversions and other shenanigans.
9rx · 11h ago
The types are strong. The variables are weak.
Spivak · 12h ago
Well yeah, because variables in what you consider to be a strongly typed language are allocating the storage for those variables. When you say int x you're asking the compiler to give you an int-shaped box. When you say x = 1 in Ruby, all you're doing is saying that in this scope the name x now refers to the box holding a 1. You can't actually store a string in the int box; you can only say that from now on the name x refers to the string box.
jevndev · 14h ago
The "stop at first level of type implementation" is where I see codebases fail at this. The example of "I'll wrap this int as a struct and call it a UUID" is a really good start, and pretty much always the place to start, but inevitably someone will circumvent the safety. They'll see a function that takes a UUID and they have an int; so they blindly wrap their int in UUID and move on. There's nothing stopping that UUID from not being actually universally unique, so suddenly code which relies on that assumption breaks.
This is where the concept of “Correct by construction” comes in. If any of your code has a precondition that a UUID is actually unique then it should be as hard as possible to make one that isn’t. Be it by constructors throwing exceptions, inits returning Err or whatever the idiom is in your language of choice, the only way someone should be able to get a UUID without that invariant being proven is if they really *really* know what they’re doing.
(Sub UUID and the uniqueness invariant for whatever type/invariants you want, it still holds)
munificent · 14h ago
> This is where the concept of “Correct by construction” comes in.
This is one of the basic features of object-oriented programming that a lot of people tend to overlook these days in their repetitive rants about how horrible OOP is.
One of the key things OO gives you is constructors. You can't get an instance of a class without having gone through a constructor that the class itself defines. That gives you a way to bundle up some data and wrap it in a layer of validation that can't be circumvented. If you have an instance of Foo, you have a firm guarantee that the author of Foo was able to ensure the Foo you have is a meaningful one.
Of course, writing good constructors is hard because data validation is hard. And there are plenty of classes out there with shitty constructors that let you get your hands on broken objects.
But the language itself gives you direct mechanism to do a good job here if you care to take advantage of it.
Functional languages can do this too, of course, using some combination of abstract types, the module system, and factory functions as convention. But it's a pattern in those languages where it's a language feature in OO languages. (And as any functional programmer will happily tell you, a design pattern is just a sign of a missing language feature.)
lock1 · 13h ago
I find regular OOP language constructors too restrictive. You can't return something like Result<CorrectObject, ConstructorError> to handle the error gracefully, or return a specific subtype; you need a static factory method to do anything more than guaranteed-successful construction without exceptions.
Does this count as a missing language feature by requiring a "factory pattern" to achieve that?
Jensson · 5h ago
> You can't return something like Result<CorrectObject,ConstructorError> to handle the error gracefully
Throwing an error is doing exactly that though; it's exactly the same thing in theory.
What you are asking for is just more syntactic sugar around error handling, otherwise all of that already exists in most languages. If you are talking about performance that can easily be optimized at compile time for those short throw catch syntactic sugar blocks.
Java even forces you to handle those errors in code, so don't say that these are silent; there is no reason they need to be.
henry700 · 13h ago
The natural solution for this is a private constructor with public static factory methods, so that the user can only obtain an instance (or the error result) by calling the factory methods. Constructors need to be constrained to return an instance of the class, otherwise they would just be normal methods.
Convention in OOP languages is (un?)fortunately to just throw an exception though.
Conscat · 10h ago
In languages with generic types such as C++, you generally need free factory functions rather than static member functions so that type deduction can work.
0x457 · 11h ago
This is why constructors are dumb IMO, and the Rust way is the right way.
Nothing stops you from returning Result<CorrectObject, ConstructorError> from a CorrectObject::new(..) function, because it's just a regular function; struct field visibility takes care of you not being able to construct an incorrect CorrectObject.
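A minimal sketch of that pattern (the field and validation rule are made up):

    mod safe {
        pub struct CorrectObject {
            field: u32, // private: can't be built by hand outside this module
        }

        pub enum ConstructorError {
            OutOfRange,
        }

        impl CorrectObject {
            // The "constructor" is just a function, so it can return Result.
            pub fn new(field: u32) -> Result<CorrectObject, ConstructorError> {
                if field < 100 {
                    Ok(CorrectObject { field })
                } else {
                    Err(ConstructorError::OutOfRange)
                }
            }
        }
    }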
hombre_fatal · 12h ago
I don't see this having much to do with OOP vs FP, but more with the ease with which a language lets you create nominal types and functions that can nicely fail.
What sucks about OOP is that it also holds your hand into antipatterns you don't necessarily want, like adding behavior to what you really just wanted to be a simple data type because a class is an obvious junk drawer to put things.
And, like your example of a problem in FP, you have to be eternally vigilant with your own patterns to avoid antipatterns like when you accidentally create a system where you have to instantiate and collaborate multiple classes to do what would otherwise be a simple `transform(a: ThingA, b: ThingB, c: ThingC): ThingZ`.
Finally, as "correct by construction" goes, doesn't it all boil down to `createUUID(string): Maybe<UUID>`? Even in an OOP language you probably want `UUID.from(string): Maybe<UUID>`, not `new UUID(string)` that throws.
munificent · 11h ago
> Even in an OOP language you probably want `UUID.from(string): Maybe<UUID>`, not `new UUID(string)` that throws.
One way to think about exceptions is that they are a pattern matching feature that privileges one arm of the sum type with regards to control flow and the type system (with both pros and cons to that choice). In that sense, every constructor is `UUID.from(string): MaybeWithThrownNone<UUID>`.
9rx · 10h ago
The best way to think about exceptions is to consider the term literally (as in: unusual; not typical) while remembering that programmers have an incredibly overinflated sense of ability.
In other words, exceptions are for cases where the programmer screwed up. While programmers screwing up isn't unusual at all, programmers like to think that they don't make mistakes, and thus in their eye it is unusual. That is what sets it apart from environmental failures, which are par for the course.
To put it another way, it is for signalling at runtime what would have been a compiler error if you had a more advanced compiler.
SAI_Peregrinus · 4h ago
Unfortunately many languages treat exceptions as a primary control flow mechanism. That's part of why Rust calls its exceptions "panics" and provides the "panic=abort" compile-time option which aborts the program instead of unwinding the stack with the possibility of catching the unwind. As a library author you can never guarantee that `catch_unwind` will ever get used, so its main purpose of preventing unwinding across an FFI boundary is all it tends to get used for.
9rx · 4h ago
> Unfortunately many languages
Just Java (and Javascript by extension, as it was trying to copy Java at the time), really. You do have a point that Java programmers have infected other languages with their bad habits. For example, Ruby was staunchly in the "return errors as values and leave exception handling for exceptions" before Rails started attracting Java developers, but these days all bets are off. But the "purists" don't advocate for it.
MoreQARespect · 15h ago
I've recently been following red-green-refactor but instead of with a failing test, I tighten the screws on the type system to make a production-reported bug cause the type checker to fail before making it green by fixing the bug.
I still follow TDD-with-a-test for all new features, all edge cases and all bugs that I can't trigger failure by changing the type system for.
However, red-green-refactor-with-the-type-system is usually quick and can be used to provide hard guarantees against entire classes of bug.
pclowes · 15h ago
I like this approach; there are often calls for increased testing on big systems, and what they really mean is increased rigor. Don't waste time testing what you can move into the compiler.
It is always great when something is so elegantly typed that I struggle to think of how to write a failing test.
What drives me nuts is when there are tests left around that are basically testing the compiler, tests that never were "red" then "greened". It makes me wonder if there is some subtle edge case I am missing.
eyelidlessness · 13h ago
As you move more testing responsibilities to the compiler, it can be valuable to test the compiler’s responsibilities for those invariants though. Otherwise it can be very hard to notice when something previously guaranteed statically ceases to be.
eyelidlessness · 13h ago
I found myself following a similar trajectory, without realizing that’s what I was doing. For a while it felt like I was bypassing the discipline of TDD that I’d previously found really valuable, until I realized that I was getting a lot of the test-first benefits before writing or running any code at all.
Now I just think of types as the test suite’s first line of defense. Other commenters who mention the power of types for documentation and refactoring aren’t wrong, but I think that’s because types are tests… and good tests, at almost any level, enable those same powers.
MoreQARespect · 12h ago
I don't think tests and types are the same "thing" per se - they work vastly better in conjunction with each other than alone, and are weirdly symmetrical in the way that they're bad substitutes for each other.
However, I'm convinced that they're both part of the same class of thing, and that "TDD" or red/green/refactor or whatever you call it works on that class, not specifically just on tests.
Documentation is a funny one too - I use my types to generate API and other sorts of reference docs and tests to generate how-to docs. There is a seemingly inextricable connection between types and reference docs, tests and how-to docs.
eyelidlessness · 11h ago
Types are a kind of test. Specifically they’re a way to assert certain characteristics about the interactions between different parts of the code. They’re frequently assertions you’d want to make another way, if you didn’t have the benefit of a compiler to run that set of assertions for you. And like all tests, they’re a means to gain or reinforce confidence in claims you could make about the code’s behavior. (Which is their symmetry with documentation.)
tossandthrow · 9h ago
This can usually be alleviated by structural types instead of nominal types.
You can always enforce nominal types if you really need it.
reactordev · 13h ago
Union types!! If everything’s a type and nothing works together, start wrapping them in interfaces and define an über type that unions everything everywhere all at once.
Welcome to typescript. Where generics are at the heart of our generic generics that throw generics of some generic generic geriatric generic that Bob wrote 8 years ago.
Because they can’t reason with the architecture they built, they throw it at the type system to keep them in line. It works most of the time. Rust’s is beautiful at barking at you that you’re wrong. Ultimately it’s us failing to design flexibility amongst ever increasing complexity.
Remember when "Components" were "Controls" and you only had like a dozen of them?
Remember when a NN was only a few hundred thousand parameters?
As complexity increases with computing power, so must our understanding of it in our mental model.
However you need to keep that mental model in check, use it. If it’s typing, do it. If it’s rigorous testing, write your tests. If it’s simulation, run it my friend. Ultimately, we all want better quality software that doesn’t break in unexpected ways.
valenterry · 2h ago
Union types are great, but alone they are not sufficient for many cases. For example, try to define a data structure that captures a classical evaluation tree.
You might go with:
    type Expression = Value | Plus | Minus | Multiply | Divide;

    interface Value { type: "value"; value: number; }
    interface Plus { type: "plus"; left: Expression; right: Expression; }
    interface Minus { type: "minus"; left: Expression; right: Expression; }
    interface Multiply { type: "multiply"; left: Expression; right: Expression; }
    interface Divide { type: "divide"; left: Expression; right: Expression; }
And so on.
That looks nice, but when you try to pattern match on it and have your pattern matching return the types that are associated with the specific operation, it won't work. The reason is that Typescript does not natively support GADTs. Libs like ts-pattern use some tricks to get closish at least.
And while this might not be very important for most application developers, it is very important for library authors, especially to make libraries interoperable with each other and extend them safely and typesafe.
kazinator · 15h ago
Also known as "make bad state unexperimentable".
Ey7NFZ3P0nzAe · 20m ago
Shoutout to the absolutely awesome beartype for python. With simple decorators it adds runtime type checking with virtually no performance cost!!
An adjacent point is to use checked exceptions and to handle them appropriately to their type. I don't get why Java checked exceptions were so maligned. They saved me so many headaches on a project where I forced their use as I was the tech lead for it.
Everyone hated me for a while because it forced them to deal with more than just the happy path, but they loved it once they got into the rhythm of thinking about all the exceptional cases in the code flow. And the project was extremely robust even though we were not particularly disciplined about unit testing.
bcrosby95 · 14h ago
I think most complaints about checked exceptions in Java ultimately boil down to how verbose handling exceptions in Java is. Every time the language forces you to handle an exception when you don't really need to, you hate it a bit more.
First, the library author cannot reasonably define what is and isn't a checked exception in their public API. That really is up to the decision of the client. This wouldn't be such a big deal if it weren't so verbose to handle exceptions though: if you could trivially convert an exception to another type, or even declare it as runtime, maybe at the module or application level, you wouldn't be forced to handle them in these ways.
Second, as to signature brittleness: standard advice is to create domain-specific exceptions anyway. Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose... see above.
Ultimately, I love checked exceptions. I just hate the ergonomics around exceptions in Java. I wish designers focused more on fixing that than throwing the baby out with the bathwater.
vips7L · 4h ago
That's the same conclusion I've come to. I've commented on it a little bit here:
> Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose
The problem just compounds too. People start checking things that they can't handle from the functions they're calling. The callers upstream can't possibly handle an error from the code you're calling; they have no idea why it's being called.
I also hate IOException. It's so extremely unspecific; it's the worst way to do exceptions. Did the entire disk die, was the file just not found, or do I not have permissions to write to it? IOException has no meaning.
Part of me secretly hopes Swift takes over because I really like its error handling.
lock1 · 13h ago
If only Java also provided an Either<L,R>-like type in the standard library...
Personally I use checked exceptions whenever I can't use Either<> and avoid unchecked like a plague.
Yeah, it's pretty sad Java language designer just completely deserted exception handling. I don't think there's any kind of improvement related to exceptions between Java 8 and 24.
alex_smart · 10h ago
Ok please help me understand, what is the difference between
- R method() throws L, and
- Either<L, R> method()
To me they seem completely isomorphic?
lock1 · 3h ago
That's what I thought at first too. At first glance they look equivalent, telling API users what the expected result of a method call is. In that sense, both are equivalent.
But after experimenting a bit with checked exceptions, I realized how neglected exceptions are in Java.
- There's no way to handle checked exceptions other than a try-catch block
- They play very badly with APIs that use functional interfaces. Many APIs don't provide a checked-throws variant
- catch blocks can't use generic / parameterized types; you need to catch Exception or Throwable and then operate on it at runtime
After rolling my own Either<L,R>, it felt like a customizable typesafe macro for exception handling. It addresses all the annoyances I had with checked exception handling, and it plays nicely with exhaustive pattern matching using `sealed`.
Granted, it has the drawback that sometimes I have to explicitly spell out types due to local type inference failing to do so. But so far it has been a pleasant experience of handling error gracefully.
cloogshicer · 9h ago
There is a major difference at the call site.
try/catch has significantly more complex call sites because it affects control flow.
Jensson · 5h ago
That is just a syntactic-sugar difference; a language could give you exactly the same call-site structure if it wanted to.
worldsayshi · 10h ago
Don't you mean "isosemantic"? Since the same concept is represented with different syntax.
default-kramer · 15h ago
I think checked exceptions were maligned because they were overused. I like that Java supports both checked and unchecked exceptions. But IMO checked exceptions should only be used for what Eric Lippert calls "exogenous" exceptions [1]; and even then most of them should probably be converted to an unchecked exception once they leave the library code that throws them. For example, it's always possible that your DB could go offline at any time, but you probably don't want "throws SQLException" polluting the type signature all the way up the call stack. You'd rather have code assuming all SQL statements are going to succeed, and if they don't your top-level catch-all can log it and return HTTP 500.
Put another way: errors tend to either be handled "close by" or "far away", but rarely "in the middle".
So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).
Jensson · 5h ago
> So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).
It doesn't, you can just declare that the function throws these as well, you don't have to handle it directly.
wavemode · 3h ago
It pollutes type signatures. If some method deep down the call stack changes its implementation details from throwing exception A you don't care about to throwing exception B you also don't care about, you also have to change the `throws` annotation on your method.
This is annoying enough to deal with in concrete code, but interfaces make it a nightmare.
abraxas · 14h ago
It's fine to let exceptions percolate to the top of the call stack but even then you likely want to inform the user or at least log it in your backend why the request was unsuccessful. Checked exceptions force both the handling of exceptions and the type checking if they are used as intended. It's not a problem if somewhere along the call chain an SQLException gets converted to "user not permitted to insert this data" exception. This is how it was always meant to work. What I don't recommend is defaulting to RuntimeException and derivatives for those business level exceptions. They should still be checked and have their own types which at least encourages some discipline when handling and logging them up the call stack.
yardstick · 11h ago
In my experience, the top level exception handler will catch all incl Throwable, and then inspect the exception class and any nested exception classes for things like SQL error or MyPermissionsException etc and return the politically correct error to the end user. And if the exception isn’t in a whitelist of ones we don’t need to log, we log it to our application log.
codr7 · 14h ago
Sometimes I feel like I actually wouldn't mind having any function touching the database tagged as such. But checked exceptions are such a pita to deal with that I tend to not bother.
alex_smart · 10h ago
>you probably don't want "throws SQLException" polluting the type signature all the way up the call stack
A problem easily solved by writing business logic in pure java code without any IO and handling the exceptions gracefully at the boundary.
Jtsummers · 15h ago
Setting aside the objections some have to exceptions generally: checked exceptions, in contrast to unchecked, mean that if a function/method deep in your call stack is changed to throw an exception, you may have to change many functions (to at least denote that they will throw that exception or some exception) between the handler and the thrower. It's an objection to the ergonomics around modifying systems.
Think of the complaints around function coloring with async, how it's "contagious". Checked exceptions have the same function color problem. You either call the potential thrower from inside a try/catch or you declare that the caller will throw an exception.
gpderetta · 13h ago
And as with async, the issue is a) the lack of the ability to write generic code that can abstract over the async-ness or throw signature of a function and b) the ability to type erase asyncness (by wrapping with stackful coroutines) or throw signature (by converting to unchecked exceptions).
Incidentally, for exceptions, Java had (b), but for a long time didn't have (a) (although I think this changed?), leading to (b) being abused.
abraxas · 14h ago
That's a valid point but it's somewhere on a spectrum of "quick to write/change" vs "safe and validated" debate of strictly vs loosely typed systems. Strictly typed systems are almost by definition much more "brittle" when it comes to code editing. But the strictness also ensures that refactoring is usually less perilous than in loosely typed code.
bigstrat2003 · 5h ago
> Checked exceptions, in contrast to unchecked, means that if a function/method deep in your call stack is changed to throw an exception, you may have to change many function (to at least denote that they will throw that exception or some exception) between the handler and the thrower.
That's the point! The whole reason for checked exceptions is to gain the benefit of knowing if a function starts throwing an exception that it didn't before, so you can decide how to handle it. It's a good thing, not a bad thing! It's no different from having a type system which can tell you if the arguments to a function change, or if its return type does.
Jtsummers · 4h ago
Why are you screaming? All those wasted exclamation marks, you could have written something I didn't know. I didn't say it wasn't the point or that it was a bad thing.
someone_19 · 10h ago
The unhappy path is part of the contract. So yes, that is what I want. If a function couldn't fail before but can after the update, I want to know about it.
In fact, at each layer, if you want to propagate an error, you have to convert it to one specific to that layer.
jayd16 · 7h ago
C# went with properly typed but unchecked exceptions. IMO it gives you clean error stacks without too much of an issue.
I also think it's a bit cleaner to have nicely pattern-matched handler blocks than bespoke handling at every level. That said, if unwrapped error results have a robust layout then it's probably pretty equivalent.
dherls · 15h ago
With Java, there are a lot of usability issues with checked exceptions. For example, streams to process data really don't play nicely if your map or filter function throws a checked exception. Also, if you are calling a number of different services that each have their own checked exception, either you resort to just catching generic Exception or you end up with a comically large list of exceptions.
hiddew · 14h ago
That is why I am happy that rich errors (https://xuanlocle.medium.com/kotlin-2-4-introduces-rich-erro...) are coming to Kotlin. This expresses the possible error states very well, while programming for the happy path and with some syntactic sugar for destucturing the errors.
wvenable · 12h ago
I rarely have more than handful of try..catch blocks in any application. These either wrap around an operation that can be retried in the case of temporary failure or abort the current operation with a logged error message.
Checked exceptions feel like a bad mix of error returns and colored functions to me.
Hackbraten · 13h ago
For anyone who dislikes checked exceptions due to how clunky they feel: modern Java allows you to construct custom Result-like types using sealed interfaces.
paulddraper · 5h ago
There are lots of reasons.
But for one, Java checked exceptions don't work with generics.
socalgal2 · 8h ago
My team recently did this to some C++ code that was using mixed numeric values. It started off as finding a bug. The bug was fixed but the fixer wanted to add safer types to avoid future bugs. They added them, found 3 more bugs where the wrong values were being used unintentionally.
recursivedoubts · 15h ago
Type systems, like any other tool in the toolbox, have an 80/20 rule associated with them. It is quite easy to overdo types and make working with a library extremely burdensome for little to no to negative benefit.
I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Maybe an elaborate type system is worth it, but maybe not (especially if there are good tests).
> I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Presumably you need to know what an Account and a User are to use that software in the first place. I can't imagine a reasonable person easily understanding a getAccountById function which takes one argument of type UUID, but having trouble understanding a getAccountById function which takes one argument of type AccountId.
kjksf · 12h ago
UserID and AccountID could just as well be integers.
What he means is that by introducing a layer of indirection via a new type you hide the physical reality of the implementation (int vs. string).
The physical type matters if you want to log it, save to a file etc.
So now for every such type you add a burden of having to undo that indirection.
At which point "is it worth it?" is a valid question.
You made some (but not all) mistakes impossible but you've also introduced that indirection that hides things and needs to be undone by the programmer.
> There is a UI for memorialising users, but I assured her that the pros simply ran a bit of code in the PHP debugger. There’s a function that takes two parameters: one the ID of the person being memorialised, the other the ID of the person doing the memorialising. I gave her a demo to show her how easy it was....And that’s when I entered Clowntown....I first realised something was wrong when I went back to farting around on Facebook and got prompted to login....So in case you haven’t guessed what I got wrong yet, I managed to get the arguments the wrong way round.
Instead of me memorialising my test user, my test user memorialised me.
paulddraper · 1h ago
I recommend adding a serialization method to your types, namely to text, but optionally to JSON as well.
buerkle · 10h ago
    foo(UUID, UUID);
    foo(AccountId, UserId);
I'd much rather deal with the 2nd version than the first. It's self-documenting and prevents errors like calling "foo(userId, accountId)" letting the compiler test for those cases. It also helps with more complex data structures without needing to create another type.
I now know I never know whether "a UUID" is stored or represented as a GUIDv1 or a UUIDv4/UUIDv7.
I know it's supposed to be "just 128 bits", but somehow I had a bunch of issues running an old Java servlets + old Java persistence + old MS SQL stack that insisted, when "converting" between java.util.UUID and MS SQL Transact-SQL uniqueidentifier, every now and then, that it would be "smart" if it flipped the endianness of said UUID/GUID to "help me". It got to a point where the endpoints had to manually "fix" the endianness and insert/select/update/delete for both the "original" and the "fixed" versions of the identifiers to get the expected results back.
(My educated guess it's somewhat similar to those problems that happens when your persistence stack is "too smart" and tries to "fix timezones" of timestamps you're storing in a database for you, but does that wrong, some of the time.)
paulddraper · 1h ago
UUIDs all have the same storage+representation.
They are generated with different algorithms, if you find these distinctions to be semantically useful to operations, carry that distinction into the type.
Seems like 98% of the time it wouldn’t matter.
3836293648 · 15h ago
To be fair, you probably needed to know that anyway? Or else you would've just passed invalid data into functions.
recursivedoubts · 15h ago
I cannot recall ever passing an invalid UUID (or long id) into a function due to statically-knowable circumstances.
happytoexplain · 15h ago
The point is that you might pass a semantically invalid user ID. Not that you might pass an invalid UUID.
I generally agree that it's easy to over-do, but can be great if you have a terse, dense, clear language/framework/docs, so you can instantly learn about UserID.
ThunderSizzle · 15h ago
More specifically, if all entities have a GUID, it's not impossible to accidentally map entity A's ID to entity B's ID, especially when working with relationships. Moving the issue to the compiler is nicer than the query returning 0 results and the developer staring endlessly at the subtle issue.
dgb23 · 14h ago
I think the example is just not very useful, because it illustrates a domain separation instead of a computational one, which is almost always the wrong approach.
It is however useful to return a UUID type, instead of a [16]byte, or a HTMLNode instead of a string etc. These discriminate real, computational differences. For example the method that gives you a string representation of an UUID doesn't care about the surrounding domain it is used in.
Distinguishing a UUID from an AccountID or UserID is contextual, so I'd rather communicate that in the aggregate. Same for Celsius and Fahrenheit. We also wouldn't use a specialized type for date-times in every time zone.
petesergeant · 14h ago
> I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Yes, that’s exactly the point. If you don’t know how to acquire an AccountID you shouldn’t just be passing a random string or UUID into a function that accepts an AccountID hoping it’ll work, you should have acquired it from a source that gives out AccountIDs!
recursivedoubts · 14h ago
And that's my point: I'm usually getting AccountIDs from strings (passed in via HTTP requests) so the whole thing becomes a pointless exercise.
lmm · 5h ago
If your system is full of stringly typed network interfaces then yes there is no point in trying to make it good. You can make things a bit better by using a structured RPC protocol like gRPC, but the only real solution is to not do that.
Kranar · 9h ago
You just accept raw strings without doing any kind of validation? The step that performs validation should encode that step in the form of a type.
recursivedoubts · 8h ago
i pride myself in never doing any validation ever
never escape anything, either
just hand my users a raw SQL connection
petesergeant · 14h ago
Do you validate them? I assume you do. Feels like a great time to cast them too
Hard not to agree with the general idea. But also hard to ignore all of the terrible experiences I've had with systems where everything was a unique type.
In general, I think this largely falls when you have code that wants to just move bytes around intermixed with code that wants to do some fairly domain specific calculations. I don't have a better way of phrasing that, at the moment. :(
hombre_fatal · 12h ago
Maybe I know what you mean.
There are cases where you have the data in hand but now you have to look for how to create or instantiate the types before you can do anything with it, and it can feel like a scavenger hunt in the docs unless there's a cookbook/cheatsheet section.
One example is where you might have to use createVector(x, y, z): Vector when you already have { x, y, z }. And only then can you createFace(vertices: Vector[]): Face even though Face is just { vertices }. And all that because Face has a method to flip the normal or something.
Another example is a library like Java's BouncyCastle where you have the byte arrays you need, but you have to instantiate like 8 different types and use their methods on each other just to create the type that lets you do what you wish was just `hash(data, "sha256")`.
chriswarbo · 6h ago
"Phantom types" are useful for what you describe: that's where we add a parameter to a type (i.e. making it generic), but we don't actually use that parameter anywhere. I used this when dealing with cryptography in Scala, where everything is just an array of bytes, but phantom types prevented me getting them mixed up. https://news.ycombinator.com/item?id=28059019
stellalo · 12h ago
Ideally though, the compiler lowers all domain specific logic into simple byte-moving, just after having checked that types add up. Or maybe I misunderstood what you meant?
tyleo · 15h ago
In C#, I often use a type like:
    readonly struct Id32<M> {
        public int Value { get; }
        public Id32(int value) => Value = value; // the only way to set Value
    }

Then you can do:

    public sealed class MFoo { }
    public sealed class MBar { }

And:

    Id32<MFoo> x;
    Id32<MBar> y;
This gives you integer ids that can’t be confused with each other. It can be extended to IdGuid and IdString and supports new unique use cases simply by creating new M-prefixed “marker” types which is done in a single line.
I’ve also done variations of this in TypeScript and Rust.
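The Rust variation needs a PhantomData field, since the marker parameter is otherwise unused; roughly (a sketch):

    use std::marker::PhantomData;

    struct Id32<M> {
        value: i32,
        marker: PhantomData<M>, // zero-sized; ties the id to its marker type
    }

    impl<M> Id32<M> {
        fn new(value: i32) -> Self {
            Id32 { value, marker: PhantomData }
        }
    }

    struct MFoo;
    struct MBar;

    fn take_foo(_id: Id32<MFoo>) {}

    fn main() {
        let foo = Id32::<MFoo>::new(1);
        let bar = Id32::<MBar>::new(2);
        take_foo(foo);
        // take_foo(bar); // error[E0308]: mismatched types
        let _ = bar.value;
    }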
default-kramer · 14h ago
I've done something like that too. I also noticed that enums are even lower-friction (or were, back in 2014) if your IDs are integers, but I never put this pattern into real code because I figured it might be too confusing: https://softwareengineering.stackexchange.com/questions/3090...
gpderetta · 13h ago
FWIW, I extensively use strong enums in C++[1] for exactly this reason and they are a cheap simple way to add strongly typed ids.
[1] enum class from C++11, classic enums have too many implicit conversions to be of any use.
TuxSH · 11h ago
> classic enums have too many implicit conversions
They're fairly useful still (and since C++11 you can specify their underlying type), you can use them as namespaced macro definitions
Kinda hard to do "bitfield enums" with enum class
yawaramin · 7h ago
This technique is called 'phantom type' because no values of MFoo or MBar exist at runtime.
The name means "Value Object Generator" as it uses Source generation to generate the "Value object" types.
That readme has links to similar libraries and further reading.
tyleo · 11h ago
This seems like overkill. I’d prefer the few lines of code above to a whole library.
SideburnsOfDoom · 19m ago
Is it "overkill" if it's already written and tested?
Once you have several of these types, and they have validation and other concerns then the cost-benefit might flip.
FYI, in modern C# you could try using "readonly record struct" in order to get lots of equality and other concerns generated for you. It's like a "whole library", but it's a compiler feature.
rjbwork · 14h ago
Have you used this in production? It seems appealing but also antithetical to the common sorts of engineering cultures I've seen, where this sort of rigorous thinking does not exactly abound.
vborovikov · 10h ago
Source generators hide too many details from the user.
I prefer the generated code to be part of the code repo. That's why I use code templates instead of source generators. But a properly constructed ID type has a non-trivial amount of code: https://github.com/vborovikov/pwsh/blob/main/Templates/ItemT...
SideburnsOfDoom · 23m ago
> a properly constructed ID type has a non-trivial amount of code
That is correct, I've looked at the generated code and it's non-trivial, especially when validation, serialisation and casting concerns are present. That's why I'd want it to be common, tested code.
SideburnsOfDoom · 14h ago
Sadly I have not. I have played with it and it seems to hold up quite well.
I want it for a case where it seems very well suited - all customer ids are strings, but only very specific strings are customer ids. And there are other string ids around as well.
IMHO Migration won't be hard - you could allow casts to/from the primitive type while you change code. Temporarily disallowing these casts will show you where you need to make changes.
I don't know yet how "close to the edges" you would have to go back to the primitive types in order for json and db serialisation to work.
But it would be easier to get in place in a new "green field" codebase.
I pitched it as a refactoring, but the other people were, well... "antithetical" is a good word.
kwon-young · 11h ago
This reminds me of the mp-units [1] library which aims to solve this problem focusing on the physical quantities.
The use of strong quantities means that you can have both safety and complex conversion logic handled automatically, while having generic code not tied to a single set of units.
I have tried to bring that to the prolog world [2] but I don't think my fellow prolog programmers are very receptive to the idea ^^.
I remember a long, long time ago, working on a project that handled lots of different types of physical quantities: distance, speed, temperature, pressure, area, volume, and so on. But they were all just passed around as "float" so you'd every so often run into bugs where a distance was passed where a speed was expected, and it would compile fine but have subtle or obvious runtime defects. Or the API required speed in km/h, but you passed it miles/h, with the same result. I always wanted to harden it up with distinct types so we could catch these problems during development rather than testing, but I was a junior guy and could never articulate it well and justify the engineering effort, and nobody wanted to go through the effort of explicitly converting to/from primitive types to operate on the numbers.
I had kind of written off using types because of the complexity of physical units, so I will be having a look at that!
My biggest problem has been people not specifying their units. On our own code end I'm constantly getting people to suffix variables with the units. But there's still data from clients, standard library functions, etc. where the units aren't specified!
bbkane · 15h ago
I was doing this and used it for a year in https://github.com/bbkane/warg/, but ripped it out since Go auto-casts underlying types to derived types in function calls:
type userID int64
func Work(u userID) {...}
Work(1) // Go accepts this
I think I recalled that correctly. Since things like that were most of what I was doing I didn't feel the safety benefit in many places, but had to remember to cast the type in others (iirc, saving to a struct field manually).
drpixie · 3h ago
Arrggg- that's the best reason, so far, to avoid Go.
Almost nothing is a number. A length is not a number, an age is not a number, a phone number is not a number - sin(2inches) is meaningless, 30years^2 is meaningless, phone#*2 is meaningless, and 2inches+30years is certainly meaningless - but most of our languages permit us to construct, and use, and confuse these meaningless things.
alphazard · 14h ago
This is a little misleading. Go will automatically convert a numeric literal (which is a compile time idea not represented at runtime) into the type of the variable it is being assigned to.
Go will not automatically cast a variable of one type to another. That still has to be done explicitly.
func main() {
var x int64 = 1
Func(SpecialInt64(x)) // this will work
Func(x) // this will not work
}
type SpecialInt64 int64
func Func(x SpecialInt64) {
}
Yep in the same way it would allow `var u userID = 1` it allows `Work(1)` rather than insisting on `var u userID = userID(1)` and `Work(userID(1))`.
I teach Go a few times a year, and this comes up a few times a year. I've not got a good answer why this is consistent with such an otherwise-explicit language.
skybrian · 14h ago
This only happens for literal values. Mixing up variables of different types will result in a type error.
When you write 42 in Go, it’s not an int32 or int64 or some more specific type. It’s automatically inferred to have the correct type. This applies even for user-defined numeric types.
frankus · 12h ago
Swift has a typealias keyword but it's not really useful for this since two distinct aliased types with the same underlying type can be freely interchanged. Wrong code may look wrong but it will still compile.
Wrapper structs are the idiomatic way to achieve this, and with ExpressibleByStringLiteral are pretty ergonomic, but I wonder if there's a case for something like a "strong" typealias ("typecopy"?) that indicates e.g. "this is just a String but it's a particular kind of String and shouldn't be mixed with other Strings".
titanomachy · 12h ago
Yeah, most languages I've used are like this. E.g. rust/c/c++.
I guess the examples in TFA are golang? It's kind of nice that you don't have to define those wrapper types, they do make things a bit more annoying.
In C++ you have to be extra careful even with wrapper classes, because types are allowed to implicitly convert by default. So if Foo has a constructor that takes a single int argument, then you can pass an int anywhere Foo is expected. Fine as long as you remember to mark your constructors as explicit.
ghosty141 · 12h ago
Rust has the newtype idiom, which works as a proper type alias most of the time.
ameliaquining · 12h ago
Both clang-tidy and cpplint can be configured to require all single-argument constructors (except move, copy, and initializer-list constructors) to be marked explicit, in order to avoid this pitfall.
dataflow · 12h ago
This sounds elegant in theory but very thorny in practice even with a standards change, at least in C++ (though I don't believe the issues are that particular to the language). Like how do you want the equivalent of std::cout << your_different_str to behave? What about with third-party functions and extension points that previously took strings?
jandrewrogers · 10h ago
Isn't that where C++20 concepts come in?
ameliaquining · 12h ago
In what precise way are you envisioning that this would be different from a wrapper struct?
frankus · 9h ago
Pretty much only less boilerplate. Definitely questionable if it's worth the added complexity. And also it could probably be a macro.
qcnguy · 11h ago
Haskell has this, it's called newtype.
In OOP languages as long as the type you want to specialize isn't final you can just create a subclass. It's cheap (no additional wrappers or boxes), easy, and you can specialize behavior if you want to.
Unfortunately for various good reasons Java makes String final, and String is one of the most useful types to specialize on.
buerkle · 10h ago
But then you are representing two distinct types as the same underlying type, String.
Leading to the original problem. I don't want to represent MyType as a String because it's not.
qcnguy · 10h ago
It has to work that way or else you can't use the standard library. What you want to block is not:
StringUtils.trim(String foo);
but
myApp.doSomething(AnotherMyType amt);
The latter is saying "I need not any string but a specific kind of string".
buerkle · 5h ago
No I want to block both. I don't want to give devs the option of creating a function doSomething(String) that happens to accept MyType. If I need to call trim then I'll do
StringUtils.trim(MyType.toString());
peterldowns · 15h ago
My friend Lukas has written about this before in more detail, and describes the general technique as "Safety Through Incompatibility". I use this approach in all of my golang codebases now and find it invaluable — it makes it really easy to do the right thing and really hard to accidentally pass the wrong kinds of IDs around.
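For context, a minimal sketch of the pattern in Go (the names here are mine, not from the post):

package main

type UserID string
type AccountID string

func loadUser(id UserID) { /* look up the user */ }

func main() {
	u := UserID("u_123")
	a := AccountID("a_456")
	loadUser(u)
	// loadUser(a)       // compile error: AccountID is not UserID
	// loadUser("u_123") // still compiles: untyped string constants convert
	_ = a
}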
I'm on the opposite extreme here in that I believe typing obsession is the root of much of our problems as an industry.
I think Rich Hickey was completely right: this is all information, and we just need to get better at managing information like we are supposed to.
The downside of this approach is that these systems are tremendously brittle, as changing requirements make you contort your original data model to fit the new requirements.
Most OOP devs have seen at least one library with over 1000 classes. Rust doesn't solve this problem no matter how much I love it. It's the same problem: comparing two things that are the same but have different types requires a bunch of glue code, which can itself lead to new bugs.
Data as code seems to be the right abstraction. Schemas give validation à la carte while still allowing information to be passed, merged, and managed using generic tools, rather than needing to build a whole API for every new type you define in your mega monolith.
dajonker · 10h ago
A lot of us programmer folk are indefinitely in search of that one thing that will finally let us write the perfect, bug-free, high performance software. We take these concepts to the extreme and convince ourselves that it will absolutely work as long as we strictly do it the Right Way and only the Right Way. Then we try to convince our fellow programmers that the Right Way will solve all of our problems and that it is the Only Way.
It will be great, it will be grand, it will be amazing.
rmunn · 4h ago
A wise person once told me that if you ever find yourself saying "if only everyone would just do X...", then you should stop right there. Never, ever, in the history of the world has everyone done X. No matter how good an idea X is, there will always be some people who say "No, I'm going to do Y instead." Maybe they're stupid, maybe they're evil, maybe they're just ignorant... or maybe, just maybe, X was not the best thing for their particular needs and Y was actually better for them.
This is an important concept to keep in mind. It applies to programming, it applies to politics, it applies to nearly every situation you can think of. Any time you find yourself wishing that everyone would just do X and the world would be a better place, realize that that is never going to happen, and that some people will choose to do Y — and some of them will even be right to do so, because you do not (and cannot) know the specific needs of every human being on the planet, so X will not actually be right for some of them.
jhhh · 4h ago
The problem with this post is that the author is conflating two different things. Using a type system to capture the units of a measurement or metric is straightforwardly better than having them be implied; stripping a numeric value and unit down to just a value involves an obvious loss of information. That situation is wholly different from just wrapping your UUID in some bespoke type which doesn't provide any extra information. They just look the same because you're mechanically doing something similar (i.e. wrapping some primitive or standard type in something else). Not to mention that unless you want to make your wrappers monads, you're going to have to unwrap them at some point anyway, at which point you can still transpose the actual type when you have to call any external library/function. I would love to know what the test suites of these 'many bugs in real systems' projects looked like. I suspect the test suite coverage wasn't very good.
jerf · 15h ago
This technique makes me sad.
Not because it's a bad idea. Quite the contrary. I've sung the praises of it myself.
But because it's like the most basic way you can use a type system to prevent bugs. In both the sense used in the article, and in the sense that it is something you have to do before the even more powerful tools that type systems offer can be brought to bear on the problem.
And yet, in the real world, I am constantly explaining this to people and constantly fighting uphill battles to get people to do it, and not bypass it by using primitives as much as possible then bashing it into the strict type at the last moment, or even just trying to remove the types.
Here on HN we debate the finer points of whether we should be using dependent typing, and in the real world I'm just trying to get people to use a Username type instead of a string type.
Not always. There are some exceptions. And considered over my entire career, the trend is positive overall. But there's still a lot of basic explanations about this I have to give.
I wonder what the trend of LLM-based programming will result in after another few years. Will the LLMs use this technique themselves, or will people lean on LLMs to "just" fix the problems from using primitive types everywhere?
wvenable · 12h ago
I think if it were better supported in the majority of strictly typed programming languages then it would be used more. Most languages make it a big hassle.
gpderetta · 13h ago
Code review time:
+ int doTheThing(bool,bool,int,int);
Die a little bit inside.
jjice · 15h ago
Does anyone know the term for this? I had "Type Driven Development" in my head, but I don't know if that's a broadly used term for this.
It's a step past normal "strong typing", but I've loved this concept for a while and I'd love to have a name to refer to it by so I can help refer others to it.
jshxr · 15h ago
I've seen it being referred to as "New Type Pattern" or "New Type Idiom" in quite some places. For example in the rust-by-example book [1].
The overall idea of using your type system to enforce invariants is called typeful programming [1]. The first few sentences of that paper are:
"There exists an identifiable programming style based on the widespread use of type information handled through mechanical typechecking techniques. This typeful programming style is in a sense independent of the language it is embedded in; it adapts equally well to functional, imperative, object-oriented, and algebraic programming, and it is not incompatible with relational and concurrent programming."
> The strongly typed identifier commonly wraps the data type used as the primary key in the database, such as a string, an integer or universally unique identifier (UUID).
bcrosby95 · 15h ago
Using basic types for domain concepts is called 'primitive obsession'. It's been considered code smell for at least 25 years. So this would be... not being primitive obsessed. It isn't anything driven development.
Different people draw the line in different places for this. I've never tried writing code that takes every domain concept, no matter how small, and made a type out of it. It's always been on my bucket list though to see how it works out. I just never had the time or in-the-moment inclination to go that far.
Romario77 · 15h ago
I think oftentimes it's enough to have enums for known ints, for example, and some parameter checking for ranges when known.
Some languages like C++ have a contracts concept where you can make these checks more formal.
As some people indicated, the auto-casting in many languages can make the implementation of these primitive-based types complicated and fragile, and provide more nuisance than value.
bcrosby95 · 15h ago
Yep! I recently started playing with Ada and they make tightly specifying your types based upon primitives pretty easy. You also have some control over auto conversion based upon the specifics of how you declare them.
marcosdumay · 15h ago
"Type driven development" is usually meant to say you will specify your system behavior in the types. Often by writing the types first and having the actual program determined by them. Some times so completely determined that you can use some software (not an LLM) to write it. (The name is a joke about the other TDD.)
Izkata · 14h ago
It's taking the original idea behind Hungarian notation (now called "Apps Hungarian notation" to distinguish from "Systems Hungarian notation" which uses datatype) and moving it into the type system.
To keep building on history, I'd suggest Hungarian types.
b450 · 15h ago
The method in the article is very close to the idea of a "branded type". Though maybe there's a distinction someone can point out to me.
frou_dh · 15h ago
"newtype", or kind of (but not exactly) the opposite of "Primitive Obsession"
SideburnsOfDoom · 14h ago
You're correct that this is far from a new idea.
Relevant terms are "Value object" (1) and avoiding "Primitive obsession" where everything is "stringly typed".
Strongly typed ids should be Value Objects, but not all value objects are ids. e.g. I might have a value object that represents an x-y co-ordinate, as I would expect an object with value (2,3) to be equal to a different object with the same value.
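For what it's worth, that x-y example needs no extra machinery in Go (the language of TFA): structs whose fields are comparable get value equality for free.

package main

import "fmt"

// Point is a value object: its identity is its value, not its address.
type Point struct{ X, Y int }

func main() {
	fmt.Println(Point{2, 3} == Point{2, 3}) // true: compared field by field
}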
I've used the approach described for uuids on a project and I liked it. We were using typescript so we went further using template literal types [1]
type uuid = string; // assuming a plain string alias; the real type isn't shown
type UserId = `user:${uuid}`;
type OrgId = `org:${uuid}`;
This had the benefit that we could add validation (basic begins with kind of logic) and it was obvious upon visual inspection (e.g. in logs/debugging).
If writing the check is too tricky, sometimes it can just be easier to track the type of a value with the value (if you can be told the type externally) with tagged unions (AKA: Discriminated unions). See: https://www.typescriptlang.org/docs/handbook/typescript-in-5...
You can combine `unique symbol` with tagged unions and type predicates to make it easier to tell them apart.
oehpr · 8h ago
I assume you used these against a relational database? Did you commit those ids with the prefix still attached? or did you `.split()[1]` or something?
I think it's a pretty good idea. I'm just wondering how this translated to other systems.
stillpointlab · 7h ago
We were using Mongo and stored the ids with the prefix in the DB as the primary key. Pretty much everywhere we were passing them around as strings, never as 128 bit int, so there was no integrity checking outside of the app layer.
The only drawback was marshalling the types when they come out of the db layer. Since the db library types were string we had to hard cast them to the correct types, really my only pain. That isn't such a big deal, but it means some object creation and memory waste, like:
We normally didn't do it, but it would be at that point that you could have some `function isObjectId(id: string): id is ObjectId { return id.startsWith("object:"); }` wrapper for formal verification (and maybe throw exceptions on bad keys). And we were probably doing some type conversions anyway (e.g. `new Date(result.createdAt)`).
If we were reading stuff from the client or network, we would often do the verification step with proper error handling.
mcflubbins · 15h ago
I've actually seen this before and didn't realize this is exactly what the goal was. I just thought it was noise. In fact, just today I wrote a function that accepted three string arguments and was trying to decide if I should force the caller to parse them into some specific types, or do so in the function body and throw an error, or just live with it. This is exactly the solution I needed (because I actually don't NEED the parsed values.)
This is going to have the biggest impact on my coding style this year.
bern4444 · 5h ago
This seems like a conclusion derived from the ideas in parse don't validate[1].
The goal is to encode the information you learn while parsing your data into your type system. This unlocks so many capabilities: better error handling, making illegal states unrepresentable, better compiler checking, better autocompletion etc.
This is also an incredibly useful technique with LLMs. If you alias types (e.g. str to DateStr) the LLM can better infer which functions to select and how to compose them
m_a_u · 1h ago
How would you do this? Don't you have to define your types using JSON schema which supports only a limited set?
fny · 1h ago
For pure LLM use, not MCPs.
zzo38computer · 10h ago
There are benefits of such things, especially if it can be handled by the compiler so that it does not make the code inefficient. In some cases it might even automatically convert the type, but often it is better to not do so. Furthermore, there may be an operator to ignore the type and use the representation directly, which must be specified explicitly (in order to avoid bugs in the software involving doing it by mistake).
In the example, they are (it seems) converting between Celsius and Fahrenheit, using floating point. There is the possibility of minor rounding errors, although if you are converting between Celsius and Kelvin with integers only then these rounding errors do not occur.
In some cases, a function might be able to work with any units as long as the units match.
> Public and even private functions should often avoid dealing in floats or integers alone
In some cases it makes sense to use those types directly, e.g. many kind of purely mathematical functions (such as checking if a number is prime). When dealing with physical measurements, bit fields, ID numbers, etc, it does make sense to have types specifically for those things, although the compiler should allow to override the requirement of the more specific type in specific cases by an explicit operator.
There is another article about string types, but I think there is the problem of using text-based formats, that will lead to many of these problems, including needing escaping, etc.
recursivedoubts · 12h ago
they constantly try to escape
from the darkness outside & within
by dreaming of type systems so perfect
that no one will need to be good
but the strings that are will shadow
the abstract datatype that pretends to be
mk_chan · 15h ago
I’ve been using hacks to do this for a long time. I wish it was simpler in C++. I love C++ typing but hate the syntax and defaults. It’s so complicated to get started with.
There is no duck, just primitive types organized duck-wise.
The sooner you embrace the truth of mereological nihilism the better your abstractions will be.
Almost everything at every layer of abstraction is structure.
Understanding this will allow you to still use types, just not abuse them because you think they are "real".
Splizard · 9h ago
Go is a great language because it has distinct types by default. It's not about "making invalid states unrepresentable"; it's about recording relationships about a particular type of value and where it can be used. It doesn't matter that UserID is just a string; what matters is that now you can see which string values are UserIDs without making assumptions based on naming conventions.
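A tiny sketch of what that recording looks like in a signature (illustrative names):

package main

type UserID string

// Before: only the parameter name hints at what kind of string this is.
func suspendRaw(id string) {}

// After: every call site now records that this particular string is a UserID.
func suspend(id UserID) {}

func main() {
	raw := "u_42"
	suspend(UserID(raw)) // the explicit conversion documents the assumption
	suspendRaw(raw)
}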
beders · 11h ago
It is tempting, maybe a good first step, but often not expressive enough.
Particularly for attributes/fields/properties in an enterprise solution.
You want to associate various metadata - including at runtime - with a _value_ and use that as an attribute/field/property in a container.
You want to be able to transport and combine these values in different ways, especially if your business domain is subject to many changes.
If you are tempted to use "classes" for this, you will sign up for significant pain later down the road.
jshxr · 15h ago
Unfortunately, this can be somewhat awkward to implement in certain structural typed languages like TypeScript. I often find myself writing something along the lines of
type UserID = string & { readonly __tag: unique symbol }
which always feels a bit hacky.
paldepind2 · 12h ago
I never understood why people are so keen to do that in TypeScript. With that definition a `UserID` can still be silently "coerced" to a `string` everywhere. So you only get halfway there to an encapsulated type.
I think it's a much better idea to do:
type UserID = { readonly __tag: unique symbol }
Now clients of `UserID` no longer know anything about the representation. Like with the original approach you need a bit of casting, but that can be neatly encapsulated, as it would be in the original approach anyway.
Izkata · 12h ago
There was a post a decade or more ago, I think written with Java, that used variables like "firstname", "lastname", "fullname", and "nickname" in its example, including some functions to convert between them. Does this sound familiar to anyone?
The examples were a bit less contrived than this, encoding business rules where you'd want nickname for most UI but real name for official notifications, and the type system prevented future devs from using the wrong one when adding new UI or emails.
abujazar · 9h ago
A separate type for each model id is an extremely tedious way of avoiding bugs that can easily be prevented by a single test.
dilap · 8h ago
Personally I like it, and it catches bugs right away, especially when there are multiple possible ids, e.g.
func AddMessage(u UserId, m MessageId)
If it's just
func AddMessage(userId, messageId string)
it's very easy to accidentally call as
AddMessage(messageId, userId)
and then best-case you are wasting time figuring out a test failure, and worst case trying to figure out the bug IRL.
Versus an instant compile error.
I have seen errors like this many times, both written by myself and others. I think it's great to use the type system to eliminate this class of error!
(Especially in languages like Go that make it very low-friction to define the newtype.)
Another benefit if you're working with any sort of static data system is it makes it very easy to validate the data -- e.g. just recursively scan for instances of FooId and make sure they are actually foo, instead of having to write custom logic or schema for everywhere a FooId might occur.
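A rough sketch of that recursive scan in Go using reflection; FooID and the data shapes are assumptions for illustration, not the parent's actual code:

package staticdata

import "reflect"

type FooID string

// collectFooIDs walks an arbitrary value and gathers every FooID inside it,
// so the collected set can be checked against the real foo table afterwards.
func collectFooIDs(v reflect.Value, out *[]FooID) {
	switch v.Kind() {
	case reflect.Ptr, reflect.Interface:
		if !v.IsNil() {
			collectFooIDs(v.Elem(), out)
		}
	case reflect.Struct:
		for i := 0; i < v.NumField(); i++ {
			collectFooIDs(v.Field(i), out)
		}
	case reflect.Slice, reflect.Array:
		for i := 0; i < v.Len(); i++ {
			collectFooIDs(v.Index(i), out)
		}
	case reflect.Map:
		for _, k := range v.MapKeys() {
			collectFooIDs(v.MapIndex(k), out)
		}
	default:
		if v.Type() == reflect.TypeOf(FooID("")) {
			*out = append(*out, FooID(v.String()))
		}
	}
}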
mcapodici · 9h ago
There are other benefits over a test.
The compiler tests the type is correct wherever you use it. It is also documentation.
Still have tests! But types are great.
But sadly, in practice I don't often use a type per ID because it is not idiomatic in the code bases I work on. It's a project of its own to move a code base over to that style if it wasn't built that way from the outset. Also, most programming languages don't make it ergonomic.
somethingsome · 11h ago
I'm curious about what you think about something,
Suppose you make two simple types, one for Kelvin K and the other for Fahrenheit F or degrees D.
And you implement the conversions between them in the types.
But then you have something like
d: D = 10;
For i=1...100000:
k=f_Take_D_Return_K(d)
d=g_Take_K_Return_D(k)
end
Then you will implicitly have many many automatic conversions that are not useful.
How to handle this? Is it easily caught by the compiler when the functions are way more complex?
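One way to handle it, sketched in Go: do the conversion explicitly at the boundary and keep the hot loop in a single unit, so nothing converts per iteration. All names here are illustrative, and D is assumed Celsius-like for the constant:

package main

type D float64 // degrees
type K float64 // kelvin

func (d D) ToK() K { return K(d) + 273.15 }
func (k K) ToD() D { return D(k - 273.15) }

func step(k K) K { return k } // stand-in for the real per-iteration math

func main() {
	d := D(10)
	k := d.ToK() // convert once, before the loop
	for i := 0; i < 100000; i++ {
		k = step(k) // stay in K inside the hot loop
	}
	d = k.ToD() // convert back once
	_ = d
}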
tomtom1337 · 11h ago
I interpret your question as «given that I am doing many conversions between temperature, because that makes it easier to write correct code, then I worry that my code will be slow because I am doing many conversions».
My response is: these conversions are unlikely to be the slow step in your code, don’t worry about it.
I do agree though, that it would be nice if the compiler could simplify the math to remove the conversions between units. I don’t know of any languages that can do that.
somethingsome · 10h ago
That's exactly the problem: in the software I have in mind, the conversions are actually very slow, and I can't easily change the content of the functions that process the data; they are very mathematical, and it would take a lot of time to rewrite everything.
For example, it's not my case but it's like having to convert between two image representations (matrix multiply each pixel) every time.
I'm scared that this kind of 'automatic conversion' slowness will be extremely difficult to debug and to monitor.
tomtom1337 · 10h ago
Why would it be difficult to monitor the slowness? Wouldn’t a million function calls to the from_F_to_K function be very noticeable when profiling?
On your case about swapping between image representations: let’s say you’re doing a FFT to transform between real and reciprocal representations of an image - you probably have to do that transformation in order to do the the work you need doing on reciprocal space. There’s no getting around it. Or am I misunderstanding?
Please don’t take my response as criticism, I’m genuinely interested here, and enjoying the discussion.
somethingsome · 10h ago
I have many functions written by many scientists in a single piece of software over many years; some expect one data format, the others another. It's not always the same function that is called, but all the functions could have been written using a single data format. However, they chose the data format when writing the functions based on the application at hand at that moment and the possible acceleration of their algorithms with the selected data structure.
When I tried to refactor using types, this kind of problems became obvious. And forced more conversions than intended.
So I'm really curious because, apart from rewriting everything, I don't see how to avoid this problem. It's more natural for some applications to have data format 1 and for others data format 2. And forcing one over the other would make the application slow.
The problem arises only in 'hybrid' pipelines, when new scientists need to use some existing functions, some of them in the first data format and the others in the second.
As a simple example, you can write rotations in a software in many ways, some will use matrix multiply, some Euler angles, some quaternions, some geometric algebra. It depends on the application at hand which one works the best as it maps better with the mental model of the current application. For example geometric algebra is way better to think about a problem, but sometimes Euler angles are output from a physical sensor. So some scientists will use the first, and the others the second. (of course, those kind of conversions are quite trivial and we don't care that much, but suppose each conversion is very expensive for one reason or another)
Came here to say this. This is an old thing. I’m guessing next we’ll rediscover “Stringly Typed”?
That refactoring guru raccoon reminds me of Minix for some reason.
fedeb95 · 15h ago
This works very well and I wish I could convince my team members to use this technique more.
Moreover: you can separate types based on admitted values and perform runtime checks. Percentage, Money, etc.
kccqzy · 15h ago
This pattern is exactly the pattern I recommended two weeks ago in a thread about a nearly catastrophic OpenZFS bug https://news.ycombinator.com/item?id=44531524 in response to someone saying we should use AI to detect this class of bugs. I'm glad there are still people who think alike and opt for simpler, more deterministic solutions such as using the type system.
I think the rule of thumb here is to avoid every kind of runtime check that can be checked at compile time.
But if you have a function that works with different types you should make it more reusable.
It’s a good marker to yourself or to a review agent
tonymet · 13h ago
This is an accidental benefit of golang for those coming from python, perl or php. At first making structs and types is a pain, but within a few hundred lines it's a blessing.
Being forced to think early on types has a payoff at the medium complexity scale
manoDev · 10h ago
> In any nontrivial codebase, this inevitably leads to bugs when, for example, a string representing a user ID gets used as an account ID
Inevitably is a strong word. I can't recall the last time I've seen such a bug in the wild.
> or when a critical function accepts three integer arguments and someone mixes up the correct order when calling it.
Positional arguments suck and we should rely on named/keyword arguments?
I understand the line of reasoning here, but the examples are bad. Those aren't good reasons to introduce new types. If you follow this advice, you'll end up with an insufferable codebase where 80% LoC is type casting.
Types are like database schemas. You should spend a lot of time thinking about semantics, not simply introduce new types because you want to avoid (hypothetical) programmer errors.
"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."
nsm · 15h ago
Highly recommend the Early Access book Data-Oriented Programming with Java by Chris Kiehl as another resource.
goostavos · 11h ago
Hey, I'm that guy! Thanks for the shout out!
cat-whisperer · 11h ago
I've been using this technique in Rust for years, really helps catch bugs early and makes code more readable. Wish more languages had similar type systems.
William_BB · 10h ago
What do you think about this but in C++ (e.g. with explicit constructors)? Has anyone had any experience with it? Did it succeed or fail?
alphazard · 14h ago
I've seen experienced programmers do this a lot. It's the kind of thing that someone thinks is annoying, without realizing that it was preventing them from doing something incorrect.
rhubarbtree · 12h ago
It can be annoying though.
I think Rich Hickey has a point that bugs like this almost certainly get caught by running the program. If they make it into production it usually results in an obscure edge case.
I’m sure there are exceptions but unless you’re designing for the worst case (safety critical etc) rather than average case (web app), types come with a lot of trade offs.
I’ve been on the fence about types for a long time, but having built systems fast at a startup for years, I now believe dynamic typing is superior. Folks I know who have built similar systems and are excellent coders also prefer dynamic typing.
In my current startup we use typescript because the other team members like it. It does help replace comments when none are available, and it stops some bugs, but it also makes the codebase very hard to read and slows down dev.
A high quality test suite beats everything else hands down.
lurking_swe · 11h ago
An engineer getting up to speed on a 10 year old web app that uses dynamic types will likely have a very different opinion.
No types anywhere, so making a change is SCARY! And all the original engineers have usually moved on. Fun times. Types are a form of forced documentation after all, and help catch an entire class of bugs. If you’re really lucky, the project has good unit tests.
I think dynamic typing is wonderful for making software quickly, and it can be a force multiplier for startups. I also enjoy it when creating small services or utilities. But for a large web app, you’ll pay a price eventually. Or more accurately…the poor engineer that inherits your code in 10 years will pay the price. God bless them if they try to do a medium sized refactor without types lol. I’ve been on both ends of the spectrum here.
Pros and cons. There’s _always_ a tradeoff for the business.
rhubarbtree · 21m ago
I think you’re right.
But most startups aren’t building for 10 years out. If you use a lot of typing, you’ll probably die way before then. But yeah if you’re building a code base for the long term then use types unless you’re disciplined enough to write comments and good code.
As for refactoring, that is exactly what test suites are for.
bigstrat2003 · 5h ago
> I think Rich Hickey has a point that bugs like this almost certain get caught by running the program.
That is certainly correct... but that doesn't make it a good thing. One wants to catch bugs before the program is running, not after.
rhubarbtree · 20m ago
Depends which is quicker. If I catch a trivial bug on the first run, I just saved myself writing the type system to find it ahead of time.
Does this apply in Java where adding a type means every ID has to have a class instance in the heap? ChatGPT says I might want to wait for Project Valhalla value types.
chaz6 · 12h ago
The equivalent in Python is:
from typing import NewType
UserId = NewType("UserId", int)
sdeframond · 12h ago
Actually, not really. In this case UserId is still an integer, which means any method that takes an integer can also take a UserId. Which means your co-workers are likely to just use integer out of habit.
Also, you can still do integer things with them, such as
> nonsense = UserId(1) + UserId(2)
lolive · 14h ago
Primitive object types in Java (String, Float, …) are final.
That blocks you from doing such tricks, as far as I understand.
viktorcode · 13h ago
The idea is to just wrap them in a unique type per intended use case, like AccountID, SessionID, etc., and inside they may contain a single field holding the String.
vemv · 15h ago
Most static type systems that I know of disappear at runtime. You literally cannot "use" them once deployed to production.
(Typescript's Zod and Clojure's Malli are counterexamples. Although not official offerings)
Following OP's example, what prevents you from getting an AccountID parsed as a UserID at runtime, in production? In production it's all UUIDs, indistinguishable from one another.
A truly safe approach would use distinct value prefixes – one per object type. Slack does this I believe.
Jtsummers · 14h ago
> Most static type systems that I know of disappear at runtime. You literally cannot "use" them once deployed to production.
That's part of the point of being static. If we can statically determine properties of the system and use that information in the derived machine code (or byte code or whatever), then we may be able to discard that information at runtime (though there are reasons not to discard it).
> Following OP's example, what prevents you from getting a AccountID parsed as a UserID at runtime, in production? In production it's all UUIDs, undistinguishable from one another.
If you're receiving information from the outside and converting it into data in your system you have to parse and validate it. If the UUID does not correspond to a UserID in your database or whatever, then the attempted conversion should fail. You'd have a guard like this:
if user_db.contains(UserID(uuid)) {
return UserID(uuid)
}
// signal an error or return a None, zero value, null, etc.
vemv · 14h ago
There are infinitely many runtime properties that are simply impossible to determine statically.
Static typing is just a tool, aiming to help with a subset of all possible problems you may find. If you think it's an absolute oracle of every possible problem you may find, sorry, that's just not true, and trivially demonstrable.
Your example already is a runtime check that makes no particular use of the type system. It's a simple "set contains" check (value-oriented, not type-oriented) which also is far more expensive than simply verifying the string prefix of a Slack-style object identifier.
Ultimately I'm not even saying that types are bad, or that static typing is bad. If you truly care about correctness, you'd use all layers at your disposition - static and dynamic.
skwee357 · 13h ago
This is one of the reasons I adore Rust. Creating custom types in Rust feels very natural and effortless.
zeroCalories · 16h ago
I generally agree, but I think the real strength in types come from the way in which they act as documentation and help you refactor. If you see a well laid out data model in types you supercharge your ability to understand a complex codebase. Issues like the one in the example should have been caught by a unit test.
sam_lowry_ · 15h ago
Also validation. In Java, you can have almost seamless validation on instantiation of your very objects. That's why having a class for IBAN instead of a String containing an IBAN is the right way to do it.
codr7 · 14h ago
Allocating objects for every single property can turn pretty bad in Java.
A strong enough type system would be a lot more useful.
Warwolt · 10h ago
Isn't this just the newtype pattern?
dajonker · 10h ago
It is. To some it is more fun to reinvent the wheel than to study history
This can solve a lot of problems, but it can also introduce awkward situations where it is hard to make a square shape or panel, because the width measure must first be explicitly converted into a height measure in order to be used as such. That might be considered correct, but it is also expensively awkward and pedantic.
ho_schi · 15h ago
I'm not familiar with Go. Please correct me, but this reads like object oriented programming i.e. OOP for every kind of data?
Coming from C++, this kind of types with classes makes sense. But they are also a maintenance burden with further issues, where proper variable naming often matters. Likely a good balance is the key.
Jtsummers · 15h ago
This isn't an OO thing at all. In C, to contrast with Go, a typedef is an alias. You can use objects (general sense) of the type `int` interchangeably with something like `myId` which is created through `typedef int myId`.
That is, this is perfectly acceptable C:
int x = 10;
myId id = x; // no problems
In Go the equivalent would be an error because it will not, automatically, convert from one type to another just because it happens to be structurally identical. This forces you to be explicit in your conversion. So even though the type happens to be an int, an arbitrary int or other types which are structurally ints cannot be accidentally converted to a myId unless you somehow include an explicit but unintended conversion.
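The Go counterpart to that C snippet, for contrast; the implicit assignment is rejected and the conversion must be spelled out:

package main

type myId int

func main() {
	x := 10
	// var id myId = x // compile error: cannot use x (type int) as type myId
	id := myId(x) // explicit conversion required
	_ = id
}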
ho_schi · 14h ago
Thank you for the answer :)
This helped me! Especially because you started with typedef from C. Therefore I could relate. Others just downvote and don't explain.
But yeah, maybe expressive-enough refinement typing leads to hard-to-write and slow type inference engines.
I think the reasons are predominantly social, not theoretical.
For every engineer out there that gets excited when I say the words "refinement types" there are twenty that either give me a blank stare or scoff at the thought, since they a priori consider any idea that isn't already in their favorite (primitivistic) language either too complicated or too useless.
Then they go and reinvent it as a static analysis layer on top of the language and give it their own name and pat themselves on the back for "inventing" such a great check. They don't read computer science papers.
While I'm not entirely convinced myself whether it is worth the effort, it offers the ability to express "a number greater than 0". Using type narrowing and intersection types, open/closed intervals emerge naturally from that. Just check `if (a > 0 && a < 1)` and its type becomes `(>0)&(<1)`, so the interval (0, 1).
I also built a simple playground that has a PoC implementation: https://nikeee.github.io/typescript-intervals/
[1]: https://github.com/microsoft/TypeScript/issues/43505
My specific use case is pattern matching http status codes to an expected response type, and today I'm able to work around it with this kind of construct https://github.com/mnahkies/openapi-code-generator/blob/main... - but it's esoteric, and feels likely to be less efficient to check than what you propose / a range type.
There's runtime checking as well in my implementation, but it's a priority for me to provide good errors at build time
Otherwise you could have type level asserts more generally. Why stop at a range check when you could check a regex too? This makes the difficulty more clear.
For the simplest range case (pure assignment) you could just use an enum?
mainly I find Raku (and the community) much -Ofun
https://play.rust-lang.org/?version=stable&mode=debug&editio...
rust-analyzer gives an error directly in IDE.
Among popular languages like golang, rust or python, typescript has the most powerful type system.
How about a type with a number constrained between 0 and 10? You can already do this in typescript.
You can even programmatically define functions at the type level, so you can create a function that outputs a type between 0 and N. The issue is that it's a bit awkward: you want these types to compose, right? If I add two constrained numbers, say one with a max value of 3 and another with a max value of 2, the result should have a max value of 5. Typescript doesn't support this out of the box with ordinary addition, but you can create a type-level function that does it. The catch is that to build these functions you have to use tuples to do addition at the type level, and you need recursion as well; Typescript's type-level recursion stops at a depth of about 100, so there are limits.
Additionally, it's not intrinsic to the type system. You'd need Peano numbers built into the number system, and built in by default across the entire language, for this to work perfectly. That means the code in the function is not itself type checked, but if you assume that code is correct, then the function type checks when composed with other primitives of your program.
I get an error that I can't assign something that seems to me assignable, and to figure out why I need to study functions at type level using tuples and recursion. The cure is worse than the disease.
If you trust the type, then it's fine. The code is safer. In the world of the code itself, things are easier.
Of course, as you're complaining about, this opens up the possibility of more bugs in the world of types, and debugging that can be a pain. Trade-offs.
In practice people usually don't go crazy with type-level functions. They can do small stuff, but usually nothing super crazy. So Typescript by design sort of fits the complexity dynamic you're looking for: yes, you can write type-level functions that are super complex, but the language is not designed around it and doesn't promote that style either. But you CAN go a little deeper with types than in a language with a less powerful type system, like Rust.
I'll take a modern Hindley-Milner variant any day. Sophisticated enough to model nearly any type information you'll have need of, without blurring the lines or admitting the temptation of encoding complex logic in it.
https://youtu.be/0mCsluv5FXA
In practice nobody goes too crazy with it. You have a problem with a feature almost nobody uses. It's there, and Range<N> is about the upper bound of complexity I've seen in production, and even that is extremely rare.
There is no "temptation" to code complex logic in it at all, as the language doesn't promote these features. They're just available if needed. It's not well known, but typescript types can easily be used one-to-one with any Hindley-Milner variant. It's the reputational baggage of JS and the frontend that keeps this fact from being well known.
In short: Typescript is more powerful than Hindley-Milner, a subset of it has one-to-one parity with it, and the parts that are more powerful than Hindley-Milner aren't popular or widely used, nor does the flow of the language itself promote their usage. The features are just there if you need them.
If you want a language where you do this stuff in practice take a look at Idris. That language has these features built into the language AND it's an ML style language like haskell.
Static typing / dynamic typing refers to whether types are checked at compile time or runtime. "Static" = compile time (eg C, C++, Rust). "Dynamic" = runtime (eg Javascript, Ruby, Excel)
Strong / weak typing refers to how "wibbly wobbly" the type system is. x86 assembly language is "weakly typed" because registers don't have types. You can do (more or less) any operation with the value in any register. Like, you can treat a register value as a float in one instruction and then as a pointer during the next instruction.
Ruby is strongly typed because all values in the system have types. Types affects what you can do. If you treat a number like its an array in ruby, you get an error. (But the error happens at runtime because ruby is dynamically typed - thus typechecking only happens at runtime!).
Sure it stops you from running into "'1' + 2" issues, but won't stop you from yeeting VeryRawUnvalidatedResponseThatMightNotBeAuthorized to a function that takes TotalValidatedRequestCanUseDownstream. You won't even notice an issue until:
- you manually validate
- you call a method that is unavailable on the wrong object.
Related Stack Overflow post: https://stackoverflow.com/questions/2690544/what-is-the-diff...
So yeah I think we should just give up these terms as a bad job. If people mean "static" or "dynamic" then they can say that, those terms have basically agreed-upon meanings, and if they mean things like "the type system prohibits [specific runtime behavior]" or "the type system allows [specific kind of coercion]" then it's best to say those things explicitly with the details filled in.
It says:
> I give the following general definitions for strong and weak typing, at least when used as absolutes:
> Strong typing: A type system that I like and feel comfortable with
> Weak typing: A type system that worries me, or makes me feel uncomfortable
https://news.ycombinator.com/item?id=42367644
A month before that:
https://news.ycombinator.com/item?id=41630705
I've given up since then.
[1] https://doc.rust-lang.org/book/ch03-01-variables-and-mutabil...
let a = 1;
let a = '1';
Strong typing means I can't do 1 + '1'. Variable names and shadowing have nothing to do with it being strongly typed.
This is where the concept of “Correct by construction” comes in. If any of your code has a precondition that a UUID is actually unique then it should be as hard as possible to make one that isn’t. Be it by constructors throwing exceptions, inits returning Err or whatever the idiom is in your language of choice, the only way someone should be able to get a UUID without that invariant being proven is if they really *really* know what they’re doing.
(Sub UUID and the uniqueness invariant for whatever type/invariants you want, it still holds)
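A minimal Go sketch of the shape (checking format here rather than uniqueness, but the idea is the same): keep the field unexported so the only way to obtain a UUID is through a constructor that establishes the invariant. The regex is deliberately simplified:

package uuid

import (
	"fmt"
	"regexp"
)

var pattern = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)

// UUID can only be built via New, so holding one proves it was validated.
type UUID struct{ value string }

// New returns an error instead of ever handing back a malformed UUID.
func New(s string) (UUID, error) {
	if !pattern.MatchString(s) {
		return UUID{}, fmt.Errorf("invalid uuid: %q", s)
	}
	return UUID{value: s}, nil
}

func (u UUID) String() string { return u.value }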
This is one of the basic features of object-oriented programming that a lot of people tend to overlook these days in their repetitive rants about how horrible OOP is.
One of the key things OO gives you is constructors. You can't get an instance of a class without having gone through a constructor that the class itself defines. That gives you a way to bundle up some data and wrap it in a layer of validation that can't be circumvented. If you have an instance of Foo, you have a firm guarantee that the author of Foo was able to ensure the Foo you have is a meaningful one.
Of course, writing good constructors is hard because data validation is hard. And there are plenty of classes out there with shitty constructors that let you get your hands on broken objects.
But the language itself gives you direct mechanism to do a good job here if you care to take advantage of it.
Functional languages can do this too, of course, using some combination of abstract types, the module system, and factory functions as convention. But it's a pattern in those languages where it's a language feature in OO languages. (And as any functional programmer will happily tell you, a design pattern is just a sign of a missing language feature.)
Does this count as a missing language feature by requiring a "factory pattern" to achieve that?
Throwing an error is doing exactly that though; it's exactly the same thing in theory.
What you are asking for is just more syntactic sugar around error handling; otherwise all of that already exists in most languages. If you are talking about performance, that can easily be optimized at compile time for those short throw/catch syntactic-sugar blocks.
Java even forces you to handle those errors in code, so don't say that these are silent; there is no reason they need to be.
Convention in OOP languages is (un?)fortunately to just throw an exception though.
Nothing stops you from returning Result<CorrectObject, ConstructorError> from a CorrectObject::new(..) function, because it's just a regular function; struct field visibility takes care of you not being able to construct an incorrect CorrectObject.
What sucks about OOP is that it also holds your hand into antipatterns you don't necessarily want, like adding behavior to what you really just wanted to be a simple data type because a class is an obvious junk drawer to put things.
And, like your example of a problem in FP, you have to be eternally vigilant with your own patterns to avoid antipatterns, like when you accidentally create a system where you have to instantiate and wire together multiple classes to do what would otherwise be a simple `transform(a: ThingA, b: ThingB, c: ThingC): ThingZ`.
Finally, as "correct by construction" goes, doesn't it all boil down to `createUUID(string): Maybe<UUID>`? Even in an OOP language you probably want `UUID.from(string): Maybe<UUID>`, not `new UUID(string)` that throws.
One way to think about exceptions is that they are a pattern matching feature that privileges one arm of the sum type with regards to control flow and the type system (with both pros and cons to that choice). In that sense, every constructor is `UUID.from(string): MaybeWithThrownNone<UUID>`.
In other words, exceptions are for cases where the programmer screwed up. While programmers screwing up isn't unusual at all, programmers like to think that they don't make mistakes, and thus in their eye it is unusual. That is what sets it apart from environmental failures, which are par for the course.
To put it another way, it is for signalling at runtime what would have been a compiler error if you had a more advanced compiler.
Just Java (and Javascript by extension, as it was trying to copy Java at the time), really. You do have a point that Java programmers have infected other languages with their bad habits. For example, Ruby was staunchly in the "return errors as values and leave exception handling for exceptions" camp before Rails started attracting Java developers, but these days all bets are off. But the "purists" don't advocate for it.
I still follow TDD-with-a-test for all new features, all edge cases, and all bugs where I can't trigger the failure by changing the type system.
However, red-green-refactor-with-the-type-system is usually quick and can be used to provide hard guarantees against entire classes of bug.
It is always great when something is so elegantly typed that I struggle to think of how to write a failing test.
What drives me nuts is when there are tests left around basically testing the compiler that were never "red" then "greened"; it makes me wonder if there is some subtle edge case I am missing.
Now I just think of types as the test suite’s first line of defense. Other commenters who mention the power of types for documentation and refactoring aren’t wrong, but I think that’s because types are tests… and good tests, at almost any level, enable those same powers.
However, I'm convinced that they're both part of the same class of thing, and that "TDD" or red/green/refactor or whatever you call it works on that class, not specifically just on tests.
Documentation is a funny one too - I use my types to generate API and other sorts of reference docs and tests to generate how-to docs. There is a seemingly inextricable connection between types and reference docs, tests and how-to docs.
You can always enforce nominal types if you really need it.
Welcome to typescript. Where generics are at the heart of our generic generics that throw generics of some generic generic geriatric generic that Bob wrote 8 years ago.
Because they can’t reason about the architecture they built, they throw it at the type system to keep them in line. It works most of the time. Rust’s is beautiful at barking at you when you’re wrong. Ultimately it’s us failing to design flexibility amid ever-increasing complexity.
Remember when “Components” were “Controls” and you only had like a dozen of them?
Remember when a NN was only a few hundred thousand parameters?
As complexity increases with computing power, so must our understanding of it in our mental model.
However you need to keep that mental model in check, use it. If it’s typing, do it. If it’s rigorous testing, write your tests. If it’s simulation, run it my friend. Ultimately, we all want better quality software that doesn’t break in unexpected ways.
You might go with something like this (an illustrative sketch; the operation names are invented):
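  // a tagged union of operations
  type Op =
    | { kind: "parseInt"; input: string }    // conceptually produces a number
    | { kind: "stringify"; input: number };  // conceptually produces a string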
And so on. That looks nice, but when you try to pattern match on it and have your pattern matching return the types that are associated with the specific operation, it won't work. The reason is that TypeScript does not natively support GADTs. Libraries like ts-pattern use some tricks to get close-ish, at least.
And while this might not be very important for most application developers, it is very important for library authors, especially to make libraries interoperable with each other and extend them safely and typesafe.
https://beartype.readthedocs.io/en/latest/
First, the library author cannot reasonably define what is and isn't a checked exception in their public API. That really is up to the decision of the client. This wouldn't be such a big deal if it weren't so verbose to handle exceptions though: if you could trivially convert an exception to another type, or even declare it as runtime, maybe at the module or application level, you wouldn't be forced to handle them in these ways.
Second, as to signature brittleness, the standard advice is to create domain-specific exceptions anyway. Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose... see above.
Ultimately, I love checked exceptions. I just hate the ergonomics around exceptions in Java. I wish designers focused more on fixing that than throwing the baby out with the bathwater.
https://news.ycombinator.com/item?id=44551088
https://news.ycombinator.com/item?id=44432640
> Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose
The problem just compounds, too. People start declaring checked exceptions that they can’t handle from the functions they’re calling. The callers upstream can’t possibly handle an error from the code you’re calling; they have no idea why it’s being called.
I also hate IOException. It’s so extremely unspecific. It’s the worst way to do exceptions. Did the entire disk die, was the file just not found, or do I not have permissions to write to it? IOException has no meaning.
Part of me secretly hopes Swift takes over because I really like its error handling.
Personally I use checked exceptions whenever I can't use Either<> and avoid unchecked like a plague.
Yeah, it's pretty sad the Java language designers just completely deserted exception handling. I don't think there's any kind of improvement related to exceptions between Java 8 and 24.
To me they seem completely isomorphic?
But after experimenting a bit with checked exceptions, I realized how neglected exceptions are in Java:
- There's no way to handle checked exceptions other than a try-catch block
- They play very badly with APIs that use functional interfaces; many APIs don't provide a checked-throws variant
- A catch block can't use a generic / parameterized type; you need to catch Exception or Throwable and then operate on it at runtime
After rolling my own Either<L,R>, it felt like a customizable typesafe macro for exception handling. It addresses all the annoyances I had with checked exception handling, and it plays nicely with exhaustive pattern matching using `sealed`.
Granted, it has the drawback that sometimes I have to explicitly spell out types due to local type inference failing to do so. But so far it has been a pleasant experience of handling errors gracefully.
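For reference, a minimal sketch of that shape with Java 21 sealed interfaces and record patterns (all names invented):

  sealed interface Either<L, R> {
      record Left<L, R>(L error) implements Either<L, R> {}
      record Right<L, R>(R value) implements Either<L, R> {}
  }

  // exhaustive thanks to `sealed`: adding a third case breaks this switch at compile time
  static String render(Either<Exception, Integer> result) {
      return switch (result) {
          case Either.Left<Exception, Integer>(Exception e) -> "error: " + e.getMessage();
          case Either.Right<Exception, Integer>(Integer v) -> "ok: " + v;
      };
  }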
try/catch has significantly more complex call sites because it affects control flow.
[1] https://ericlippert.com/2008/09/10/vexing-exceptions/
So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).
It doesn't; you can just declare that the function throws these as well, so you don't have to handle it directly.
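e.g., a tiny sketch (readFile is invented):

  void readFile(String path) throws IOException { /* some I/O */ }

  // no try/catch needed in the middle: just re-declare the checked exception
  void loadConfig() throws IOException {
      readFile("app.conf");
  }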
This is annoying enough to deal with in concrete code, but interfaces make it a nightmare.
A problem easily solved by writing business logic in pure java code without any IO and handling the exceptions gracefully at the boundary.
Think of the complaints around function coloring with async, how it's "contagious". Checked exceptions have the same function color problem. You either call the potential thrower from inside a try/catch or you declare that the caller will throw an exception.
Incidentally, for exceptions, Java had (b), but for a long time didn't have (a) (although I think this changed?), leading to (b) being abused.
That's the point! The whole reason for checked exceptions is to gain the benefit of knowing if a function starts throwing an exception that it didn't before, so you can decide how to handle it. It's a good thing, not a bad thing! It's no different from having a type system which can tell you if the arguments to a function change, or if its return type does.
In fact, at each layer, if you want to propagate an error, you have to convert it to one specific to that layer.
I also think its a bit cleaner to have a nicely pattern matched handler blocks than bespoke handling at every level. That said, if unwrapped error results have a robust layout then its probably pretty equivalent.
Checked exceptions feel like a bad mix of error returns and colored functions to me.
But for one, Java checked exceptions don't work with generics.
I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Maybe an elaborate type system is worth it, but maybe not (especially if there are good tests).
https://grugbrain.dev/#grug-on-type-systems
Presumably you need to know what an Account and a User are to use that software in the first place. I can't imagine a reasonable person easily understanding a getAccountById function which takes one argument of type UUID, but having trouble understanding a getAccountById function which takes one argument of type AccountId.
What he means is that by introducing a layer of indirection via a new type you hide the physical reality of the implementation (int vs. string).
The physical type matters if you want to log it, save to a file etc.
So now for every such type you add a burden of having to undo that indirection.
At which point "is it worth it?" is a valid question.
You made some (but not all) mistakes impossible but you've also introduced that indirection that hides things and needs to be undone by the programmer.
> There is a UI for memorialising users, but I assured her that the pros simply ran a bit of code in the PHP debugger. There’s a function that takes two parameters: one the ID of the person being memorialised, the other the ID of the person doing the memorialising. I gave her a demo to show her how easy it was....And that’s when I entered Clowntown....I first realised something was wrong when I went back to farting around on Facebook and got prompted to login....So in case you haven’t guessed what I got wrong yet, I managed to get the arguments the wrong way round. Instead of me memorialising my test user, my test user memorialised me.
I'd much rather deal with the 2nd version than the first. It's self-documenting and prevents errors like calling "foo(userId, accountId)" letting the compiler test for those cases. It also helps with more complex data structures without needing to create another type.
I now know that I never know whether "a UUID" is stored or represented as a GUIDv1 or a UUIDv4/UUIDv7.
I know it's supposed to be "just 128 bits", but somehow, I had a bunch of issues running old Java servlets+old Java persistence+old MS SQL stack that insisted, when "converting" between java.util.UUID to MS SQL Transact-SQL uniqueidentifier, every now and then, that it would be "smart" if it flipped the endianess of said UUID/GUID to "help me". It got to a point where the endpoints had to manually "fix" the endianess and insert/select/update/delete for both the "original" and the "fixed" versions of the identifiers to get the expected results back.
(My educated guess it's somewhat similar to those problems that happens when your persistence stack is "too smart" and tries to "fix timezones" of timestamps you're storing in a database for you, but does that wrong, some of the time.)
They are generated with different algorithms; if you find these distinctions to be semantically useful to operations, carry that distinction into the type.
Seems like 98% of the time it wouldn’t matter.
I generally agree that it's easy to over-do, but can be great if you have a terse, dense, clear language/framework/docs, so you can instantly learn about UserID.
It is however useful to return a UUID type, instead of a [16]byte, or a HTMLNode instead of a string etc. These discriminate real, computational differences. For example the method that gives you a string representation of an UUID doesn't care about the surrounding domain it is used in.
Distinguishing a UUID from an AccountID, or UserID is contextual, so I rather communicate that in the aggregate. Same for Celsius and Fahrenheit. We also wouldn't use a specialized type for date times in every time zone.
Yes, that’s exactly the point. If you don’t know how to acquire an AccountID you shouldn’t just be passing a random string or UUID into a function that accepts an AccountID hoping it’ll work, you should have acquired it from a source that gives out AccountIDs!
never escape anything, either
just hand my users a raw SQL connection
https://news.ycombinator.com/item?id=44677515
In general, I think this largely falls when you have code that wants to just move bytes around intermixed with code that wants to do some fairly domain specific calculations. I don't have a better way of phrasing that, at the moment. :(
There are cases where you have the data in hand but now you have to look for how to create or instantiate the types before you can do anything with it, and it can feel like a scavenger hunt in the docs unless there's a cookbook/cheatsheet section.
One example is where you might have to use createVector(x, y, z): Vector when you already have { x, y, z }. And only then can you createFace(vertices: Vector[]): Face even though Face is just { vertices }. And all that because Face has a method to flip the normal or something.
Another example is a library like Java's BouncyCastle where you have the byte arrays you need, but you have to instantiate like 8 different types and use their methods on each other just to create the type that lets you do what you wish was just `hash(data, "sha256")`.
I’ve also done variations of this in TypeScript and Rust.
[1] enum class from C++11, classic enums have too many implicit conversions to be of any use.
They're fairly useful still (and since C++11 you can specify their underlying type), you can use them as namespaced macro definitions
Kinda hard to do "bitfield enums" with enum class
The name means "Value Object Generator" as it uses Source generation to generate the "Value object" types.
That readme has links to similar libraries and further reading.
Once you have several of these types, and they have validation and other concerns then the cost-benefit might flip.
FYI, In modern c#, you could try using "readonly record struct" in order to get lots of equality and other concerns generated for you. It's like a "whole library" but it's a compiler feature.
I prefer to have the generated code to be the part of the code repo. That's why I use code templates instead of source generators. But a properly constructed ID type has a non-trivial amount of code: https://github.com/vborovikov/pwsh/blob/main/Templates/ItemT...
That is correct, I've looked at the generated code and it's non-trivial, especially when validation, serialisation and casting concerns are present. That's why I'd want it to be common, tested code.
I want it for a case where it seems very well suited - all customer ids are strings, but only very specific strings are customer ids. And there are other string ids around as well.
IMHO Migration won't be hard - you could allow casts to/from the primitive type while you change code. Temporarily disallowing these casts will show you where you need to make changes.
I don't know yet how "close to the edges" you would have to go back to the primitive types in order for JSON and db serialisation to work.
But it would be easier to get in place in a new "green field" codebase. I pitched it as a refactoring, but the other people were well, "antithetical" is a good word.
I have tried to bring that to the prolog world [2] but I don't think my fellow prolog programmers are very receptive to the idea ^^.
[1] https://mpusz.github.io/mp-units/latest/
[2] https://github.com/kwon-young/units
https://kotlinlang.org/docs/inline-classes.html
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3320.htm
https://doc.rust-lang.org/rust-by-example/generics/new_types...
My biggest problem has been people not specifying their units. On our own code end I'm constantly getting people to suffix variables with the units. But there's still data from clients, standard library functions, etc. where the units aren't specified!
Almost nothing is a number. A length is not a number, an age is not a number, a phone number is not a number - sin(2inches) is meaningless, 30years^2 is meaningless, phone#*2 is meaningless, and 2inches+30years is certainly meaningless - but most of our languages permit us to construct, and use, and confuse these meaningless things.
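A branded-types sketch in TypeScript (all names invented) shows how a type system can reject those meaningless combinations:

  type Inches = number & { readonly unit: "inches" };
  type Years = number & { readonly unit: "years" };
  const inches = (n: number) => n as Inches;
  const years = (n: number) => n as Years;

  // only same-unit arithmetic is exposed
  const addLengths = (a: Inches, b: Inches) => inches(a + b);

  addLengths(inches(2), inches(3));    // fine
  // addLengths(inches(2), years(30)); // compile error: Years is not Inches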
Go will not automatically cast a variable of one type to another. That still has to be done explicitly.
https://go.dev/play/p/4eNQOJSmGqD
I teach Go a few times a year, and this comes up a few times a year. I've not got a good answer why this is consistent with such an otherwise-explicit language.
When you write 42 in Go, it’s not an int32 or int64 or some more specific type. It’s automatically inferred to have the correct type. This applies even for user-defined numeric types.
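A quick sketch of that behaviour:

  type Celsius float64

  const roomTemp = 21      // untyped constant: no fixed type yet
  var c Celsius = roomTemp // fine: the constant adapts to Celsius

  var n int = 21
  // var c2 Celsius = n    // compile error: needs an explicit Celsius(n)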
Wrapper structs are the idiomatic way to achieve this, and with ExpressibleByStringLiteral are pretty ergonomic, but I wonder if there's a case for something like a "strong" typealias ("typecopy"?) that indicates e.g. "this is just a String but it's a particular kind of String and shouldn't be mixed with other Strings".
I guess the examples in TFA are golang? It's kind of nice that you don't have to define those wrapper types, they do make things a bit more annoying.
In C++ you have to be extra careful even with wrapper classes, because types are allowed to implicitly convert by default. So if Foo has a constructor that takes a single int argument, then you can pass an int anywhere Foo is expected. Fine as long as you remember to mark your constructors as explicit.
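For instance (a minimal sketch):

  struct Foo {
      explicit Foo(int id) : id_(id) {}  // drop `explicit` and bar(42) compiles
      int id_;
  };

  void bar(Foo) {}

  int main() {
      bar(Foo(42)); // fine: explicit construction
      // bar(42);   // error with `explicit`; a silent conversion without it
  }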
In OOP languages as long as the type you want to specialize isn't final you can just create a subclass. It's cheap (no additional wrappers or boxes), easy, and you can specialize behavior if you want to.
Unfortunately for various good reasons Java makes String final, and String is one of the most useful types to specialize on.
https://lukasschwab.me/blog/gen/deriving-safe-id-types-in-go...
https://lukasschwab.me/blog/gen/safe-incompatibility.html
I think Rich Hickey was completely right, this is all information and we just need to get better at managing information like we are supposed to.
The downside of this approach is that these systems are tremendously brittle, as changing requirements make you contort your original data model to fit the new requirements.
Most OOP devs have seen at least one library with over 1000 classes. Rust doesn't solve this problem no matter how much I love it. It's the same problem: comparing two things that are the same but are just different types requires a bunch of glue code, which can itself lead to new bugs.
Data as code seems to be the right abstraction. Schemas give validation à la carte while still allowing information to be passed, merged, and managed using generic tools, rather than needing to build a whole API for every new type you define in your mega monolith.
This is an important concept to keep in mind. It applies to programming, it applies to politics, it applies to nearly every situation you can think of. Any time you find yourself wishing that everyone would just do X and the world would be a better place, realize that that is never going to happen, and that some people will choose to do Y — and some of them will even be right to do so, because you do not (and cannot) know the specific needs of every human being on the planet, so X will not actually be right for some of them.
Not because it's a bad idea. Quite the contrary. I've sung the praises of it myself.
But because it's like the most basic way you can use a type system to prevent bugs. In both the sense used in the article, and in the sense that it is something you have to do before the even more powerful tools that type systems offer can be brought to bear on the problem.
And yet, in the real world, I am constantly explaining this to people and constantly fighting uphill battles to get people to do it, and not bypass it by using primitives as much as possible then bashing it into the strict type at the last moment, or even just trying to remove the types.
Here on HN we debate the finer points of whether we should be using dependent typing, and in the real world I'm just trying to get people to use a Username type instead of a string type.
Not always. There are some exceptions. And considered over my entire career, the trend is positive overall. But there's still a lot of basic explanations about this I have to give.
I wonder what the trend of LLM-based programming will result in after another few years. Will the LLMs use this technique themselves, or will people lean on LLMs to "just" fix the problems from using primitive types everywhere?
It's a step past normal "strong typing", but I've loved this concept for a while and I'd love to have a name to refer to it by so I can help refer others to it.
[1] https://doc.rust-lang.org/rust-by-example/generics/new_types...
"There exists an identifiable programming style based on the widespread use of type information handled through mechanical typechecking techniques. This typeful programming style is in a sense independent of the language it is embedded in; it adapts equally well to functional, imperative, object-oriented, and algebraic programming, and it is not incompatible with relational and concurrent programming."
[1] Luca Cardelli, Typeful Programming, 1991. http://www.lucacardelli.name/Papers/TypefulProg.pdf
[2] https://news.ycombinator.com/item?id=18872535
https://en.wikipedia.org/wiki/Strongly_typed_identifier
> The strongly typed identifier commonly wraps the data type used as the primary key in the database, such as a string, an integer or universally unique identifier (UUID).
Different people draw the line in different places for this. I've never tried writing code that takes every domain concept, no matter how small, and made a type out of it. It's always been on my bucket list though to see how it works out. I just never had the time or in-the-moment inclination to go that far.
Some languages, like C++, have added a contracts concept that lets you make these checks more formal.
As some people indicated, the auto-casting in many languages could make the implementation of these primitive-based types complicated and fragile, and provide more nuisance than value.
To keep building on history, I'd suggest Hungarian types.
Relevant terms are "Value object" (1) and avoiding "Primitive obsession" where everything is "stringly typed".
Strongly typed ids should be Value Objects, but not all value objects are ids. e.g. I might have a value object that represents an x-y co-ordinate, as I would expect an object with value (2,3) to be equal to a different object with the same value.
1) https://martinfowler.com/bliki/ValueObject.html
https://en.wikipedia.org/wiki/Value_object
1. https://www.typescriptlang.org/docs/handbook/2/template-lite...
But depending on the format it can sometimes be tricky to narrow a string back down to that format.
We have type guards to do that narrowing. (see: https://www.typescriptlang.org/docs/handbook/2/narrowing.htm..., but their older example is a little easier to read: https://www.typescriptlang.org/docs/handbook/advanced-types....)
If writing the check is too tricky, sometimes it can just be easier to track the type of a value with the value (if you can be told the type externally) with tagged unions (AKA: Discriminated unions). See: https://www.typescriptlang.org/docs/handbook/typescript-in-5...
And if the formats themselves are generated at runtime, you can use the "unique" keyword to make sure different kinds of data are treated as separate (see: https://www.typescriptlang.org/docs/handbook/symbols.html#un...).
You can combine `unique symbol` with tagged unions and type predicates to make it easier to tell them apart.
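Putting those together might look like this (sketch; the prefix convention is invented):

  declare const userBrand: unique symbol;
  type UserTag = string & { readonly [userBrand]: true };

  // a type predicate narrows a plain string down to the branded type
  function isUserTag(s: string): s is UserTag {
      return s.startsWith("user_");
  }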
I think it's a pretty good idea. I'm just wondering how this translated to other systems.
The only drawback was marshalling the types when they come out of the db layer. Since the db library's types were string, we had to hard-cast them to the correct types; really my only pain. That isn't such a big deal, but it means some object creation and memory waste, like (invented names):
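  type ObjectId = string & { readonly __brand: "ObjectId" };
  interface Row { id: string; createdAt: string }

  function marshal(row: Row) {
      return {
          id: row.id as ObjectId,             // hard cast from the db's string
          createdAt: new Date(row.createdAt), // a fresh object per row
      };
  }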
We normally didn't do it, but at that point you could have had some `function isObjectId(id: string): id is ObjectId { return id.startsWith("object:"); }` wrapper for formal verification (and maybe throw exceptions on bad keys). And we were probably doing some type conversions anyway (e.g. `new Date(result.createdAt)`).

If we were reading stuff from the client or network, we would often do the verification step with proper error handling.
This is going to have the biggest impact on my coding style this year.
The goal is to encode the information you learn while parsing your data into your type system. This unlocks so many capabilities: better error handling, making illegal states unrepresentable, better compiler checking, better autocompletion etc.
[1]https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...
In the example, they are (it seems) converting between Celsius and Fahrenheit, using floating point. There is the possibility of minor rounding errors, although if you are converting between Celsius and Kelvin with integers only then these rounding errors do not occur.
In some cases, a function might be able to work with any units as long as the units match.
> Public and even private functions should often avoid dealing in floats or integers alone
In some cases it makes sense to use those types directly, e.g. many kinds of purely mathematical functions (such as checking if a number is prime). When dealing with physical measurements, bit fields, ID numbers, etc, it does make sense to have types specifically for those things, although the compiler should allow overriding the requirement of the more specific type in specific cases via an explicit operator.
There is another article about string types, but I think the underlying problem is using text-based formats, which leads to many of these problems, including the need for escaping, etc.
https://github.com/Mk-Chan/libchess/blob/master/internal/Met... https://github.com/Mk-Chan/libchess/blob/master/Square.h
[1] https://typing.python.org/en/latest/spec/aliases.html
There is no duck, just primitive types organized duck-wise.
The sooner you embrace the truth of mereological nihilism the better your abstractions will be.
Almost everything at every layer of abstraction is structure.
Understanding this will allow you to still use types, just not abuse them because you think they are "real".
Especially and particularly attributes/fields/properties in an enterprise solution.
You want to associate various metadata - including at runtime - with a _value_ and use that as attribute/field/property in a container.
You want to be able to transport and combine these values in different ways, especially if your business domain is subject to many changes.
If you are tempted to use "classes" for this, you will sign up for significant pain later down the road.
I think it's a much better idea to do something along these lines (sketch):
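  // a fully opaque ID; clients never see the underlying string
  declare const tag: unique symbol;
  type UserID = { readonly [tag]: "UserID" };

  // the casts live in one small module and are invisible to clients
  const toUserID = (raw: string) => raw as unknown as UserID;
  const rawID = (id: UserID) => id as unknown as string;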
Now clients of `UserID` no longer know anything about the representation. Like with the original approach you need a bit of casting, but that can be neatly encapsulated, as it would be in the original approach anyway.

The examples were a bit less contrived than this, encoding business rules where you'd want the nickname for most UI but the real name for official notifications, and the type system prevented future devs from using the wrong one when adding new UI or emails.
vs. an instant compile error.
I have seen errors like this many times, both written by myself and others. I think it's great to use the type system to eliminate this class of error!
(Especially in languages like Go that make it very low-friction to define the newtype.)
Another benefit if you're working with any sort of static data system is it makes it very easy to validate the data -- e.g. just recursively scan for instances of FooId and make sure they are actually foo, instead of having to write custom logic or schema for everywhere a FooId might occur.
The compiler tests the type is correct wherever you use it. It is also documentation.
Still have tests! But types are great.
But sadly, in practice I don't often use a type per ID type because it is not idiomatic to code bases I work on. It's a project of its own to move a code base to be like that if it wasn't in the outset. Also most programming languages don't make it ergonomic.
Suppose you make two simple types, one for Kelvin (K) and the other for Fahrenheit (F) or degrees (D).
And you implement the conversions between them in the types.
But then you have something like:

  d: D = 10
  for i = 1...100000:
      ...   # body mixing D and K values
  end

Then you will implicitly have many, many automatic conversions that are not useful. How do you handle this? Is it easily caught by the compiler when the functions are way more complex?
My response is: these conversions are unlikely to be the slow step in your code, don’t worry about it.
I do agree though, that it would be nice if the compiler could simplify the math to remove the conversions between units. I don’t know of any languages that can do that.
For example (it's not my actual case), it's like having to convert between two image representations (a matrix multiply per pixel) every time.
I'm scared that this kind of 'automatic conversion' slowness will be extremely difficult to debug and to monitor.
On your case about swapping between image representations: let’s say you’re doing an FFT to transform between real and reciprocal representations of an image - you probably have to do that transformation in order to do the work you need doing in reciprocal space. There’s no getting around it. Or am I misunderstanding?
Please don’t take my response as criticism, I’m genuinely interested here, and enjoying the discussion.
When I tried to refactor using types, this kind of problem became obvious, and it forced more conversions than intended.
So I'm really curious because, apart from rewriting everything, I don't see how to avoid this problem. It's more natural for some applications to have data format 1 and for others data format 2. And forcing one over the other would make the application slow.
The problem arises only in 'hybrid' pipelines, when new scientists need to use some existing functions, some of them in the first data format and the others in the second.
As a simple example, you can write rotations in a software in many ways, some will use matrix multiply, some Euler angles, some quaternions, some geometric algebra. It depends on the application at hand which one works the best as it maps better with the mental model of the current application. For example geometric algebra is way better to think about a problem, but sometimes Euler angles are output from a physical sensor. So some scientists will use the first, and the others the second. (of course, those kind of conversions are quite trivial and we don't care that much, but suppose each conversion is very expensive for one reason or another)
I didn't find it a criticism :)
Also relevant https://refactoring.guru/smells/primitive-obsession
That refactoring guru raccoon reminds me of Minix for some reason.
Moreover: you can separate types based on admitted values and perform runtime checks. Percentage, Money, etc.
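For instance, a sketch of such a runtime-checked value type (names invented):

  class Percentage {
      private constructor(readonly value: number) {}
      static of(value: number): Percentage {
          if (value < 0 || value > 100) {
              throw new RangeError(`expected 0..100, got ${value}`);
          }
          return new Percentage(value);
      }
  }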
But if you have a function that works with different types you should make it more reusable.
It’s a good marker to yourself or to a review agent
Being forced to think early on types has a payoff at the medium complexity scale
I understand the line of reasoning here, but the examples are bad. Those aren't good reasons to introduce new types. If you follow this advice, you'll end up with an insufferable codebase where 80% LoC is type casting.
Types are like database schemas. You should spend a lot of time thinking about semantics, not simply introduce new types because you want to avoid (hypothetical) programmer errors.
"It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."
I think Rich Hickey has a point that bugs like this almost certainly get caught by running the program. If they make it into production it usually results in an obscure edge case.
I’m sure there are exceptions but unless you’re designing for the worst case (safety critical etc) rather than average case (web app), types come with a lot of trade offs.
I’ve been on the fence about types for a long time, but having built systems fast at a startup for years, I now believe dynamic typing is superior. Folks I know who have built similar systems and are excellent coders also prefer dynamic typing.
In my current startup we use typescript because the other team members like it. It does help replace comments when none are available, and it stops some bugs, but it also makes the codebase very hard to read and slows down dev.
A high quality test suite beats everything else hands down.
No types anywhere, so making a change is SCARY! And all the original engineers have usually moved on. Fun times. Types are a form of forced documentation after all, and help catch an entire class of bugs. If you’re really lucky, the project has good unit tests.
I think dynamic typing is wonderful for making software quickly, and it can be a force multiplier for startups. I also enjoy it when creating small services or utilities. But for a large web app, you’ll pay a price eventually. Or more accurately…the poor engineer that inherits your code in 10 years will pay the price. God bless them if they try to do a medium sized refactor without types lol. I’ve been on both ends of the spectrum here.
Pros and cons. There’s _always_ a tradeoff for the business.
But most startups aren’t building for 10 years out. If you use a lot of typing, you’ll probably die way before then. But yeah if you’re building a code base for the long term then use types unless you’re disciplined enough to write comments and good code.
As for refactoring, that is exactly what test suites are for.
That is certainly correct... but that doesn't make it a good thing. One wants to catch bugs before the program is running, not after.
[0]: https://wiki.c2.com/?PrimitiveObsession
Also, you can still do integer things with them, such as:

  nonsense = UserId(1) + UserId(2)
(TypeScript's Zod and Clojure's Malli are counterexamples, although not official offerings.)
Following OP's example, what prevents you from getting an AccountID parsed as a UserID at runtime, in production? In production it's all UUIDs, indistinguishable from one another.
A truly safe approach would use distinct value prefixes – one per object type. Slack does this I believe.
That's part of the point of being static. If we can statically determine properties of the system and use that information in the derived machine code (or byte code or whatever), then we may be able to discard that information at runtime (though there are reasons not to discard it).
> Following OP's example, what prevents you from getting a AccountID parsed as a UserID at runtime, in production? In production it's all UUIDs, undistinguishable from one another.
If you're receiving information from the outside and converting it into data in your system you have to parse and validate it. If the UUID does not correspond to a UserID in your database or whatever, then the attempted conversion should fail. You'd have a guard like this:
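  // sketch: the db API and brand are invented; the point is that the
  // string-to-UserId conversion is exactly where validation happens
  declare const db: { userExists(id: string): Promise<boolean> };
  type UserId = string & { readonly __brand: "UserId" };

  async function parseUserId(raw: string): Promise<UserId | undefined> {
      return (await db.userExists(raw)) ? (raw as UserId) : undefined;
  }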
Static typing is just a tool, aiming to help with a subset of all possible problems you may find. If you think it's an absolute oracle of every possible problem you may find, sorry, that's just not true, and trivially demonstrable.
Your example already is a runtime check that makes no particular use of the type system. It's a simple "set contains" check (value-oriented, not type-oriented) which also is far more expensive than simply verifying the string prefix of a Slack-style object identifier.
Ultimately I'm not even saying that types are bad, or that static typing is bad. If you truly care about correctness, you'd use all layers at your disposition - static and dynamic.
A strong enough type system would be a lot more useful.
  from typing import NewType

  UserId = NewType('UserId', int)
  some_id = UserId(524313)
Coming from C++, this kind of types-as-classes makes sense. But they are also a maintenance task with further issues, where often proper variable naming matters. Likely a good balance is the key.
That is, this is perfectly acceptable C:
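  /* illustrative sketch: typedef makes an alias, not a distinct type */
  typedef int myId;

  void lookup(myId id);

  void demo(void) {
      int n = 42;
      lookup(n); /* compiles fine: any int passes where myId is expected */
  }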
In Go the equivalent would be an error, because it will not automatically convert from one type to another just because it happens to be structurally identical. This forces you to be explicit in your conversion. So even though the type happens to be an int, an arbitrary int or other types which are structurally ints cannot be accidentally converted to a myId unless you somehow include an explicit but unintended conversion.

This helped me! Especially because you started with typedef from C. Therefore I could relate. Others just downvote and don't explain.