I had a somewhat related idea to this which was simply to track reference stability (i.e., is there any operation beyond freeing the data outright that will invalidate them).
A reference into a vector is unstable so you can only have 1 mut xor N read. A reference into an arena is stable so you can have all the muts you want, but they cannot be sent across threads.
Reference to object vs reference to contents is also related, you can safely have multiple mut references to the same object that cannot invalidate each other, even if they invalidate references to the contents.
This can be tracked with path analysis as employed by Hylo (Hylo only has second class references, but I mean the algorithm they use to track paths is applicable).
It was interesting enough that I knew I had to write a post about it. Happy to answer any questions!
mrkeen · 1h ago
> Because of those "inaccessible" rules, we can never have a readwrite reference and a readonly reference to an object at the same time.
I can't not see this as a good thing. It's almost at the level of "the only thing an ownership system does".
If my thread is operating on a struct of 4 int64s, do I now have to think about another read-only thread seeing that struct in an invalid partially-written state?
wavemode · 1h ago
Ideally, the rules for single-threaded references and references that are allowed to be shared across threads would be different.
zozbot234 · 1h ago
That's why Cell<T> and RefCell<T> are a thing. Both allow you to mutate shared references, but disable shared access across threads. The qcell crate even includes a version of Cell/RefCell that's "branded" by a region/lifetime, just like in this proposal.
Joker_vD · 33m ago
There is also "Tree Borrows" [0] proposal for Rust, which is aimed at the same problem.
I'm not sure if I understand it correctly. So, if I have a hashtable, adding or removing an element would invalidate existing pointers to any other element in the hashtable? I guess it makes sense from a memory release POV, but... I end up thinking that for databases using GC or RC is a better approach. Maybe I'm biased, but I have found far easier to work on databases written in C or C#. For that kind of programs I felt Rust overrestrictive and forcing to more dangerous patterns such as replacing pointers with offsets.
verdagon · 1h ago
Yep, adding or removing an element would invalidate existing pointers to any other element in the hash table. This is generally regarded as a good thing if your elements are stored contiguously in the hash table, because a resize would cause any existing pointers to dangle. This should be true for C, and might be true for C# if you're using `struct`s which put the data inline (memory's a bit fuzzy on C#'s rules for references to structs though, maybe someone can chime in).
This new approach still requires us to be mindful of our data layout. Not caring about data layout is still definitely a strength of GC and RC. I'm actually hoping to find a way to blend Nick's approach seamlessly with reference counting (preferably without risking panics or deadlocks) to get the best of both worlds, so that we can consider it for Mojo. I consider that the holy grail of memory safety, and some recent developments give me some hope for that!
(Also, I probably shouldn't mention it since it's not ready, but Nick's newest model might have found a way to solve that for separate-chaining hash maps where addresses are stable. We might be able to express that to the type system, which would be pretty cool.)
kibwen · 2h ago
> In any language, when we hand a function a reference to an object, that function can't destroy the object, nor change its type
This isn't quite true. While Rust doesn't currently support this, people have proposed the concept of `&own` references that take ownership of the object and free it when the reference goes out of scope (consider that Rust's standard destructor signature, `fn drop(&mut)`, should probably take one of these hypothetical owning references). I addition, I believe that languages with typestate can cause types to change as a result of function calls, although I don't quite understand the details.
modulared · 1h ago
> I believe that languages with typestate can cause types to change as a result of function calls
Do you have any specific languages?
verdagon · 2h ago
Great point =) That's true for a lot of languages (including Mojo too), so I should have said "non-owning reference" there. I'll update the post to clarify. Thanks for catching that!
skywal_l · 2h ago
I don't understand your comment. `drop` takes a *mutable* reference. But by default, references in Rust are immutable.
Ygg2 · 1h ago
Kibwen is saying that there was an idea to have `fn drop(&mut x)` from Drop trait become `fn drop_own(&own x)`. Then `&own` would do mutation and drop the owner of the reference. A regular &mut reference shouldn't do that outside of `drop(&mut x)`.
littlestymaar · 1h ago
shared references are immutable in Rust, mut references and shared references are both references.
wavemode · 1h ago
An &own reference seems like it would just be equivalent to a Box.
littlestymaar · 1h ago
Intriguing, what would be the purpose of such owning references compared to just passing ownership? Is that to have a way to reliably avoid memcopying large objects when passing them by value?
kibwen · 58m ago
> Is that to have a way to reliably avoid memcopying large objects when passing them by value?
I believe so, yes. Currently the only way to transfer ownership is by-value, and LLVM might optimize away the memcpy, but also it might not.
nixpulvis · 1h ago
No mention of how this is safe when operating on data across threads? One of the biggest wins for Rust is sane treatment of references when using parallelism.
TL;DR: Mutability is tracked at the group level, so we can share an immutable group with any number of threads (especially good with structured concurrency) or lend a mutable group to a single other thread. References themselves are still aliasable, regardless of the group's mutability.
Taking an existing (mutable, aliasing) group and temporarily interpreting it as immutable has precedent (I did it in Vale [0]) so I like the approach, but I might be biased ;)
(This is from my memory of how Nick's proposal works, I'll ask him to give a better answer once morning hits his Australia timezone)
A reference into a vector is unstable so you can only have 1 mut xor N read. A reference into an arena is stable so you can have all the muts you want, but they cannot be sent across threads.
Reference to object vs reference to contents is also related, you can safely have multiple mut references to the same object that cannot invalidate each other, even if they invalidate references to the contents.
This can be tracked with path analysis as employed by Hylo (Hylo only has second class references, but I mean the algorithm they use to track paths is applicable).
It was interesting enough that I knew I had to write a post about it. Happy to answer any questions!
I can't not see this as a good thing. It's almost at the level of "the only thing an ownership system does".
If my thread is operating on a struct of 4 int64s, do I now have to think about another read-only thread seeing that struct in an invalid partially-written state?
[0] https://perso.crans.org/vanille/treebor/aux/preprint.pdf
This new approach still requires us to be mindful of our data layout. Not caring about data layout is still definitely a strength of GC and RC. I'm actually hoping to find a way to blend Nick's approach seamlessly with reference counting (preferably without risking panics or deadlocks) to get the best of both worlds, so that we can consider it for Mojo. I consider that the holy grail of memory safety, and some recent developments give me some hope for that!
(Also, I probably shouldn't mention it since it's not ready, but Nick's newest model might have found a way to solve that for separate-chaining hash maps where addresses are stable. We might be able to express that to the type system, which would be pretty cool.)
This isn't quite true. While Rust doesn't currently support this, people have proposed the concept of `&own` references that take ownership of the object and free it when the reference goes out of scope (consider that Rust's standard destructor signature, `fn drop(&mut)`, should probably take one of these hypothetical owning references). I addition, I believe that languages with typestate can cause types to change as a result of function calls, although I don't quite understand the details.
Do you have any specific languages?
I believe so, yes. Currently the only way to transfer ownership is by-value, and LLVM might optimize away the memcpy, but also it might not.
TL;DR: Mutability is tracked at the group level, so we can share an immutable group with any number of threads (especially good with structured concurrency) or lend a mutable group to a single other thread. References themselves are still aliasable, regardless of the group's mutability.
Taking an existing (mutable, aliasing) group and temporarily interpreting it as immutable has precedent (I did it in Vale [0]) so I like the approach, but I might be biased ;)
(This is from my memory of how Nick's proposal works, I'll ask him to give a better answer once morning hits his Australia timezone)
[0] https://verdagon.dev/blog/zero-cost-borrowing-regions-part-1...