Names are not type safety (2020)

42 azhenley 37 8/3/2025, 10:55:28 PM lexi-lambda.github.io ↗

Comments (37)

5pl1n73r · 4h ago

My peers and I work on a language centered around "constructive data modeling" (the first time I've heard it called that). We implement integers, and indeed things like non-empty lists, using algebraic data types, for example. You can both have a theory of values that doesn't rely on trapdoors like "int32" or "string", as well as encode invariants, as this article covers.

As I understand it, the primary purpose of newtypes is actually just to work around typeclass issues like in the examples mentioned at the end of the article. They are specifically designed to be zero cost, because you want to not pay when you work around the type class instance already being taken for the type you want to make an instance for. When you make an abstract data type by not exporting the data constructors, that can be done with or without newtype.

eru · 4h ago

The alternative to newtypes is probably to go the same route as OCaml and have people explicitly bring their own instances for typeclasses, instead of allowing each type only one instance?

I think OCaml calls these things modules or so. But the concepts are similar. For most cases, when there's one obvious instance that you want, having Haskell pick the instance is less of a hassle.

nixpulvis · 5h ago

In Rust I find myself gaining a good bit of type safety without losing ergonomics by wrapping types in a newtype then implementing Deref for them. At first it might seem like a waste, but it prevents accidentally passing the wrong type of thing to a function (e.g. a user UUID as a post UUID).

chiffaa · 7m ago

I want to point out that, technically, using Deref for this is an anti-pattern, as Deref is intended exclusively for smart pointers. Nothing really wrong with doing this outside of some loss in opacity (and unexpected behaviour if you're writing a library), but it's worth pointing out.

weinzierl · 19m ago

For this, and for the use case from the article, we will hopefully gain pattern types in Rust soon.

They do not solve every problem that constructive data modeling does, but in my opinion they cover a large portion of what actually occurs in everyday programs. Since they are zero-cost I'd say their cost-benefit ratio is pretty good.

Ada and Pascal have also handled "encode the range in the type" nicely for decades.

lmm · 4h ago

IME this is exactly backwards: type safety is mostly about names, everything else is a nice-to-have. Yes, you can bypass your name checks if you want to, but you can bypass any type check if you want to. Most relevant type relationships in most programming are business relationships that would be prohibitively expensive to express in a full formalism if that was even possible. But putting names on them is cheap, easy, and effective. The biggest win from typed languages comes from using these basic techniques.

b_e_n_t_o_n · 3h ago

Hmm, IME the preferred type systems are structural - a function shouldn't care what the name is of the struct passed to it, it should just work if it has the correct fields.

rendaw · 40m ago

If someone encodes "Meter" and "Yard", your type system wouldn't provide any errors if a meter is used in a yard calculation or vice versa. If someone encodes "RGBColor" and "LinearRGBColor", both structs with 3 floats, your type system wouldn't provide any errors if a LinearRGB color is passed into an RGB calculation. You also wouldn't have any error if you accidentally passed a Vertex3 (again, struct of 3 floats) into your RGB calculation.
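A minimal TypeScript sketch of the Meter/Yard point, assuming a "branded" number type as the workaround. All names here (Meter, Yard, asMeter, asYard, metersToFeet) are hypothetical. Note that the casts are exactly the kind of name-only trapdoor the article is about: the checker trusts whoever calls them.

```typescript
// Hypothetical names throughout. A "brand" is a phantom property that
// exists only at the type level; at runtime these are plain numbers.
type Meter = number & { readonly __unit: "meter" };
type Yard = number & { readonly __unit: "yard" };

// The casts below are the only places where a plain number becomes a unit.
const asMeter = (n: number): Meter => n as Meter;
const asYard = (n: number): Yard => n as Yard;

function metersToFeet(m: Meter): number {
  return m * 3.28084;
}

const trackLength = asMeter(100);
const fieldLength = asYard(100);

const feet = metersToFeet(trackLength); // accepted
// metersToFeet(fieldLength); // rejected by the type checker: Yard is not Meter
// metersToFeet(100);         // rejected as well: a bare number has no brand
```

Without the brand, both would be plain `number` and either call would type-check.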

Also, preferred by who?

lmm · 2h ago

I think that's backwards - ultimately everything on a computer is just bytes, so if you push that philosophy to the limit then you would write untyped functions and they can "just work" on any input (just not necessarily giving results that are sensible or useful if the input is wrong). The point of a type system is to help you avoid writing semantically wrong code, to bring errors forward, and actually the most important and valuable use case is distinguishing values that are structurally identical but semantically different (e.g. customer ID vs product ID, x coordinate vs y coordinate, immutable list vs read view of mutable list, sorted vs unsorted...).

b_e_n_t_o_n · 2h ago

I think the structural type approach leans heavily into the "computation is just data and its transformations", so it makes sense for it to treat data as the most important thing. You end up thinking less about classification and more about the transformations.

I'm not saying the nominal approach to types is wrong or bad, I just find my way of thinking is better suited for structural systems. I'm thinking less about the semantics around product_id vs user_id and more about what transforms are relevant - the semantics show up in the domain layer.

Take a vec3 for example, in a structural system you could apply a function designed for a vec2 on it, which has practical applications.
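A small sketch of that vec2/vec3 case in TypeScript. The type and function names (Vec2, Vec3, planarLength) are made up for illustration.

```typescript
// Hypothetical types and function, for illustration only.
type Vec2 = { x: number; y: number };
type Vec3 = { x: number; y: number; z: number };

// Length of the projection onto the xy-plane; only x and y are required.
function planarLength(v: Vec2): number {
  return Math.sqrt(v.x * v.x + v.y * v.y);
}

const velocity: Vec3 = { x: 3, y: 4, z: 12 };

// Accepted: Vec3 has every field Vec2 asks for, so it is structurally a Vec2.
const groundSpeed = planarLength(velocity); // 5
```

The same rule is what lets a Vec3 slip into a function that expects an RGB colour with identically named fields, which is the objection raised elsewhere in the thread.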

lmm · 59m ago

> I'm not saying the nominal approach to types is wrong or bad, I just find my way of thinking is better suited for structural systems. I'm thinking less about the semantics around product_id vs user_id and more about what transforms are relevant - the semantics show up in the domain layer.

But that domain layer should make use of the type system! That's where the type system is most useful!

valenterry · 2h ago

> I think the structural type approach leans heavily into the "computation is just data and its transformations"

But it's never "just data". My password is different in many ways than my username. Don't you ever log/print it by accident! So even if structurally the same, we MUST treat it differently. Hence any approach that always only looks at things structurally is deeply flawed in the context of safe software development.
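One way to sketch that in TypeScript is a wrapper class that refuses to print its contents. The Password class below is hypothetical, and the protection is partial: TypeScript's `private` is only checked at compile time, so passing the object itself to `console.log` in Node can still show the field.

```typescript
// Hypothetical class; a sketch of a wrapper that avoids accidental printing.
class Password {
  private readonly secret: string;

  constructor(secret: string) {
    this.secret = secret;
  }

  // Comparison happens inside the class, so the raw value never has to leave it.
  matches(candidate: string): boolean {
    return this.secret === candidate;
  }

  // String conversion and JSON serialization both yield a placeholder.
  toString(): string {
    return "[redacted]";
  }

  toJSON(): string {
    return "[redacted]";
  }
}

const pw = new Password("hunter2");
const logLine = `login attempt with password ${pw}`;
const payload = JSON.stringify({ user: "alice", password: pw });
// logLine: login attempt with password [redacted]
// payload: {"user":"alice","password":"[redacted]"}
```

A function that takes a plain `string` username will not accept a `Password`, so the two can no longer be swapped by accident.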

b_e_n_t_o_n · 1h ago

Yeah you bring up a good point. A { name: string } dict needs to be treated differently from a { user_pw: string } dict. The difference is that happens in the domain layer instead of the type layer.

valenterry · 27m ago

> The difference is that happens in the domain layer instead of the type layer.

What are those layers you are talking about? In my domain-logic code I use types of course so there is no dedicated "type layer".

SkiFire13 · 40m ago

That's no different than using newtype structs. If you remove the extra layer you are left with `string` for both of them.

> The difference is that happens in the domain layer instead of the type layer

This view greatly reduces the usefulness of the type layer though, as that's the only automated tool that can help the domain layer with handling cases like this.

b_e_n_t_o_n · 21m ago

It's not really automated though, it's just another layer of code written by a human, prone to the same types of human error.

seanmcdirmid · 2h ago

Structural type systems mostly don’t support encapsulation (private members that store things like account numbers) without some sort of weird add on, while nominal type systems support encapsulation directly (because the name hides structure). The canonical example is a cowboy and picture that both have a draw method.

b_e_n_t_o_n · 2h ago

Both Go and TS are structural and support encapsulation fine, I'm not sure why that would be an issue.

seanmcdirmid · 2h ago

TS doesn’t really. TS simply treats private fields as public ones when it comes to structural type checks. TS is unsound anyways, so not providing hard guarantees about field access safety is right up its alley. More to the point, if you specify a class type with private fields as a requirement, whatever you plug into that requirement has to have those private fields, they are part of the type’s public signature.

To see where structural type systems fall down, think about a bad case: you are dealing with native state and have a private long field with a pointer hiding in it, used in native calls. Any “type” that provides that long will fit the type, leading to seg faults. A nominal type system allows you to make assurances behind the class name.

Anyways, this was a big deal in the late 90s, eg see opaque types https://en.wikipedia.org/wiki/Opaque_data_type.

b_e_n_t_o_n · 1h ago

Typescript had to support JS's quirks... :/

   class Foo {
      public bar = 1;
      private _value = 'hello';
      static doSomething(f: Foo) {
         console.log(f._value);
      }
   }
   class MockFoo { public bar = 1; }
   let mock = new MockFoo();
   Foo.doSomething(mock); // Fails to type-check: MockFoo lacks the private _value property

Which is why you'd generally use interfaces, either declared or inline.

In the pointer example, if the long field is private then it's not part of the public interface and you shouldn't run into that issue no?

seanmcdirmid · 48m ago

_value is part of the type for Foo; it’s as if it were a public field. You can forge a reference to Foo by adding _value to your mock. TS deals with private fields by pretending they are public when it comes to matching. There are more rigorous ways of hiding and then revealing private state in structurally typed languages, but they involve something that is suspiciously like using a name, and really, it makes sense. The only way you can hide something and recover it later is via some kind of name (unless you can somehow capture the private structure in a type variable so it’s just passing through the parts that can’t see it).

You can do a lot just by hiding the private state and providing methods that operate on that private state in the type (using interfaces for example), but that approach doesn’t allow for binary methods (you need to reveal private state on a non-receiver in a method).
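A hedged TypeScript sketch of what "binary method" means here, with made-up names (Comparable, Account, equalTo). A binary method takes a second value of the same type and needs to look inside it, not just inside the receiver.

```typescript
// Hypothetical example. Written against an interface, a binary method can
// only call the public methods of `other`; hidden state stays hidden, so
// there is nothing to compare unless the state is exposed through a getter.
interface Comparable {
  equalTo(other: Comparable): boolean;
}

// With a class, `other` is known by name to be an Account, and that name is
// what allows the method to reveal private state on a non-receiver.
class Account {
  private readonly cents: number;

  constructor(cents: number) {
    this.cents = cents;
  }

  equalTo(other: Account): boolean {
    return this.cents === other.cents; // reads private state of `other`
  }
}

const sameBalance = new Account(500).equalTo(new Account(500)); // true
const differentBalance = new Account(500).equalTo(new Account(700)); // false
```

Equality, comparison, and merging two collections that share an internal representation are the usual cases where this comes up.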

b_e_n_t_o_n · 28m ago

Can you explain the last part more? I don't think I'm grasping what you mean.

o11c · 3h ago

The critical problem with structural typing is that it requires weird and arbitrary branding when dealing with unions of singletons.

whilenot-dev · 56m ago

Branding doesn't need to be weird and arbitrary, see Python's NewType: https://docs.python.org/3/library/typing.html#typing.NewType

Reading TFA now, Python's NewType seems to be equal to Haskell's newtype. Yes, it's a hack for the type checker to work around existing language semantics and feels unergonomic at times when Parse, Don't Validate needs to fall back to plain validation, but I wouldn't call it either weird or arbitrary.

b_e_n_t_o_n · 3h ago

You mean like if you have two types which are identical but you want your type system to treat them as distinct? To me that's a data modelling issue rather than something wrong with the type system, but I understand how it can sometimes be unavoidable and you need to work around it.

I think it also makes more sense in immutable functional languages like Clojure. Oddly enough I like it in Go too, despite it being very different from Clojure.

andyferris · 3h ago

If I understand you correctly - in popular structurally typed languages, sure.

It seems ok in upcoming languages with polymorphic sum types (eg Roc “tags”) though?

stirfish · 3h ago

> should just work if it has the correct fields.

Correct fields by...name? By structure? I'm trying to understand.

b_e_n_t_o_n · 3h ago

By name, type, and structure. In typescript for example:

   let full_name = (p: { first: string, last: string }) => p.first + " " + p.last

Then you can use this function on any data type that satisfies that signature, regardless of if it's User, Dog, Manager etc.
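A usage sketch for that, with hypothetical User and Dog record types. The function is restated as `fullName` with a parameter called `p` so the snippet stands on its own.

```typescript
// Hypothetical record types; neither one refers to the function's parameter type.
type User = { first: string; last: string; email: string };
type Dog = { first: string; last: string; breed: string };

const fullName = (p: { first: string; last: string }) => p.first + " " + p.last;

const user: User = { first: "Ada", last: "Lovelace", email: "ada@example.com" };
const dog: Dog = { first: "Rex", last: "Barker", breed: "collie" };

// Both calls type-check because both values have `first` and `last` strings.
const userName = fullName(user); // "Ada Lovelace"
const dogName = fullName(dog);   // "Rex Barker"
```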

auggierose · 27m ago

These kinds of types are just a waste of time. It is going to be OneToSix or OneToSeven very soon...

It's just an example! Well, if you cannot come up with a good example, maybe you don't have a point.

andyferris · 2h ago

These are possibly situations where I’d resort to a panic on the extra branch rather than complicate the return type.

Providing a proof of program correctness is pretty challenging even in languages that support it. In most cases careful checking of invariants at runtime (where not possible at compile time) and crashing loudly and early is sufficient for reliable-enough software.
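A minimal sketch of the "panic on the extra branch" style in TypeScript, assuming a hypothetical function ordinalWord whose callers promise a value in 1..5. The type (plain number) cannot express that promise, so the impossible branch throws instead of widening the return type.

```typescript
// Hypothetical function: the 1..5 invariant lives in the caller's head, so
// the extra branch fails loudly and early rather than returning a fallback.
function ordinalWord(n: number): string {
  switch (n) {
    case 1: return "first";
    case 2: return "second";
    case 3: return "third";
    case 4: return "fourth";
    case 5: return "fifth";
    default:
      throw new RangeError("ordinalWord: expected 1..5, got " + n);
  }
}

const third = ordinalWord(3); // "third"
// ordinalWord(6) throws a RangeError at runtime; the compiler raises no objection.
```

The trade-off is the one the article describes: the return type stays simple, but nothing at compile time stops a caller from passing 6.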

kazinator · 4h ago

What if I want a type called MinusIntMaxToPlusIntMax?

In other words the full range of Int?

Is newtype still bad?

In other words how much of this criticism has to do with newtype not providing sub-ranging for enumerable types?

It seems that it could be extended to do that.

skybrian · 3h ago

The author seems concerned about compile-time range checking: did you handle the full range of inputs?

Range checking can be very annoying to deal with if you take it too seriously. This comes up when writing a property testing framework. It's easy to generate test data that will cause out of memory errors - just pass in maximum-length strings everywhere. Your code accepts any string, right? That's what the type signature says!

In practice, setting compile-time limits on string sizes for the inputs to every internal function would be unreasonable. When using dynamically allocated memory, the maximum input size is really a system property: how much memory does the system have? Limits on input sizes need to be set at system boundaries.

valenterry · 2h ago

Title should have been "names are not ENOUGH for type-safety" but then no one would have read it I guess...

b_e_n_t_o_n · 3h ago

Perhaps it's because I'm not a haskeller but I'm not sure if I'm sold on encoding this into the type system. In Go (and other languages), for example, you would simply use a struct with a hidden Int, and receiver methods for construction/modification/access. I'm not sure I see the benefit of the type ceremony around it.
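A rough sketch of that hidden-field approach, written in TypeScript rather than Go to match the other snippets in the thread. The class name OneToFive and its methods are hypothetical; the only way to obtain a value is the checked `from` function.

```typescript
// Hypothetical wrapper: the number is hidden and every instance is created
// through a range check, so holders of a OneToFive never re-validate it.
class OneToFive {
  private constructor(private readonly value: number) {}

  // Returns undefined instead of throwing when the input is out of range
  // or not a whole number.
  static from(n: number): OneToFive | undefined {
    const isWhole = Math.floor(n) === n;
    return isWhole && n >= 1 && n <= 5 ? new OneToFive(n) : undefined;
  }

  toNumber(): number {
    return this.value;
  }
}

const accepted = OneToFive.from(3); // a OneToFive wrapping 3
const rejected = OneToFive.from(6); // undefined
// new OneToFive(9); // rejected by the type checker: the constructor is private
```

As the article argues, the guarantee rests on the code inside the class being right; code that calls `toNumber()` is back to handling an arbitrary number as far as the checker can tell.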

cryptonector · 1h ago

> you would simply use a struct with a hidden

In such languages that's the equivalent of a newtype in Haskell.

the_af · 3h ago

Isn't the whole article a discussion of the kind of guarantees such an approach (which can also be done in Haskell) cannot provide?

b_e_n_t_o_n · 2h ago

Right, I'm just unsure how valuable those guarantees really are. Especially if I'm extracting an Int out of the type to interface with other code.