I agree with the author, we need better primitives, if you need functionality now:
Major tools that exist today for partial structure traversal and focused manipulation:
- Optics (Lenses, Prisms, Traversals)
Elegant, composable ways to zoom into, modify, and rebuild structures.
Examples: Haskell's `lens`, Scala's Monocle, Clojure's Specter.
Think of these as programmable accessors and updaters.
- Zippers
Data structures with a "focused cursor" that allow local edits without manually traversing the whole structure.
Examples: Huet’s original Zipper (1997), Haskell’s `Data.Tree.Zipper`, Clojure’s built-in zippers.
- Query Languages (for semantic traversal and deep search)
When paths aren't enough and you need semantic conditionals:
- SPARQL (semantic web graph querying)
- Datalog (logic programming and query over facts)
- Cypher (graph traversal in Neo4j)
- Prolog (pure logic exploration)
These approaches let you declaratively state what you want instead of manually specifying traversal steps.
Traversable and lenses are very closely linked. If you go to the original paper leading to Traversable [1] and read through it, it feels basically identical to reading through the parts of the lens library that lay down the core abstractions and the laws implementations must follow if you want to be able to blindly manipulate them. In fact, the traverse function is a Traversal, and so fits trivially into the lens ecosystem.
Arent haskell traversables different in that they preserve the structure if you were to map over them, as compared to the solution posted in the article, where they get flattened to what amounts to a list?
Avshalom · 20h ago
To point out a prolog thing which is also applicable to other languages with good patter matching: the break/return/prune examples are all ergonomic to implement as recursion in a way that fails in C++ style type based dispatch.
jasperry · 1d ago
These are what I think the author is looking for. But it shouldn't be a "primitive" in terms of code automatically generated by the compiler, but an interface or typeclass like your examples (in a language advanced enough to have them.)
The problem is that 'lens', 'monocle', etc. are famously abstract and difficult for people to apply to their actual problems. IMO, the solution would be for standard libraries to specify interfaces called 'BreadthFirstTraverse', 'DepthFirstTraverse', etc.
T-R · 20h ago
I definitely agree for traversals, but Lenses need some sort of primitive support - even in Haskell they're mostly generated with TemplateHaskell, and the language developers have spent a long time trying to make the `record.field` accessor syntax overloadable enough to work with lenses[1][2]. Hopefully someday we'll be free from having to memorize all the lens operators.
Optics are famously abstract in implementation, but I don't think people have trouble applying them - people seem to like JQuery/CSS selectors, and insist on `object.field` syntax; it's kind of wild that no mainstream language has a first-class way to pass around the description of a location in an arbitrary data structure.
Optics let you concisely describe the location, but defer the dereferencing, so you could definitely approximate optics, not by passing around pointers you compute with `offsetof`, but passing around functions that use `offsetof` to return memory locations to reference (read/write to). You could certainly write a composition operator for `*(*T) => List<*R>`... Some people have done something like it[1][2]:
Account acc = getAccount();
QVERIFY(acc.person.address.house == 20);
auto houseLens = personL() to addressL() to houseL();
std::function<int(int)> modifier = [](int old) { return old + 6; };
Account newAcc2 = over(houseLens, newAcc1, modifier);
These also use templating to get something that still feels maybe a little less ergonomic than it could be, though.
Haskell has 'Traversable' (and 'Foldable' etc) which are a lot more approachable than the fully generalised lens library.
naasking · 1d ago
> These are what I think the author is looking for. But it shouldn't be a "primitive" in terms of code automatically generated by the compiler
I think people are often too enamored by general purpose languages that can express such abstractions natively. I don't see an issue with a language that provides this as a primitive without being able to express it itself, constraints can be useful for other properties. Once you can traverse trees, most programming problems can be tackled even in such constrained languages, eg. SQL with CTE.
jll29 · 9h ago
In Rust, you can extend the syntax a bit within the language itself.
For C++, you could define yourself a template that expands to the two functions you listed.
For any language, you could write yourself a pre-processor that adds for_tree notation and expands it,
either with pre-processor semantics or working on the abstract syntax tree (which is more "proper" but also more work"). I would recommend the former to test notations, and then you can live with them for a while to experiment with them and see if and how often you really need the construct (recall that is how C++ evolved from C via "C with classes" - it was first a pre-processor).
Once you and others are 100% convinced of the new syntax, go to your prefered language's working group/ISO committee and propose inclusion.
My own feeling is that this is not something for which I need more than recursion; calling inside traverse() traverse(left) and traverse(right) for binary trees or using a normal for loop to iterate over all this->children() from 0 to this->children.size() is something that occurs in graph libraries and parsers/compilers once in a while but not often enough to warrant its own notation. Rather, when I look at languages like C++, I'd have a language that is simpler, cleaner, more homogeneous and more orthogonal; C++ is complicated, convoluted, too large to implement it correctly by a single person in reasonable time (compared to beauties like C, Pascal, Scheme), so I stand on the side of minimalism.
timeninja · 7h ago
The "syntax extension" thing is precisely the reason I have an interest in (and advocacy for) Common Lisp.
Be that as it may, for C++, Eric Neibler's [Boost.Proto](https://www.boost.org/doc/libs/1_84_0/doc/html/proto.html) could likely help conveniently connect syntax to implementation to achieve something similar to what the author is taking about.
rebeccaskinner · 1d ago
Don’t forget recursion schemes. The last example in the article was just asking for a hylomorphism.
adamgordonbell · 17h ago
Even just traverse from scala, haskell, kotlin, etc will work iterating over trees.
Seriously, I thought of those the moment I read the title of this post.
larodi · 23h ago
how about we also get regex-parsable streams (IO::Async in perl has something like it, suboptimal perhaps) and regex-parsable treestructures (totally possible)? seems like just having the ~= work on structures (or whatever the API is called in other languages, this being Perl5)?
CyberDildonics · 1d ago
This seems like dramatically over complicated iterators. Do they really need to be called 'optics' and 'lenses' and 'prisms' ?
Think of these as programmable accessors and updaters.
How is iterating through something already not 'programmable' ?
bts · 17h ago
Indeed they call for new names, as they encompass far more than iterators.
If you read a bit more about them, I think you will be surprised to see the breadth of what these abstractions can be used for. To start, they've been used to build a new compositional foundation for game theory[1], and they can be used to model gradient-based learning in machine learning[2].
As for their simpler applications as getters and setters, they are programmable in the sense that you can think of lens/prism types as interfaces that can be implemented arbitrarily. So you can create your own lenses and combine them like legos to construct new, "bigger" getters and setters from the smaller components.
This thread is about traversing a tree. At what point do we take a step back and think that iterating through a data structure and "building new compositional foundations for game theory" shouldn't be conflated together?
When does someone give up on the pagentry of programming and just get something done by looping through data instead of "constructing getters and setters to model gradient based machine learning".
It really seems like the straight forward way of doing things is to make simple iteration and worry about the game theory after that is all figured out.
throwaway17_17 · 11h ago
I do get, and to some extent sympathize with, your position. But the comments to the article are, in large part, fundamentally addressing a higher level of abstraction than the more narrow scope of the article’s premise. As several commenters have referenced, traversing a tree in a ‘simple’ and narrow manner is addressed at the appropriate level of abstraction using already established mechanisms, particularly in Haskell using Traversable (a typeclass with a standardized implementation for trees). The discussion of Optics and Lenses are more of a side discussion about the syntax and implementations of a more broad set of data structures than merely trees.
I think your comment is valuable in spirit because it could bring a more thorough discussion of the creation or popularization of a syntactically clean and a shift of the idiomatic-ness of such a solution to traversing trees. Leaving the more abstract and general Optics discussion to be a side dish for commenters to discuss as well.
Also, just a nitpick, but traversing a tree is a broader topic than iteration. Iteration implies, both colloquially and formally, performing a computation on each element of a data structure, while traversal is a more general algorithm that could be the basis of iteration, but may also be more akin to a search through the data structure without expectation of performing computation until the searched for ‘member’ is returned from the traversal. So it’s not an apples-to-oranges comparison with your conflation of iteration and traversal, but does seem a little like Canis Familiaris-to-Mammal comparison.
CyberDildonics · 3h ago
I read your whole comment but I'm still not getting where the actual rubber meets the road. If I have a tree data structure what am I missing? What is it that I can't do that makes this extreme iteration complexity worthwhile? To me, having simple access to the data structure is enough because that's what I want a data structure for.
Someone · 21m ago
I would start with a tree primitive. That’s difficult enough, with variations on branching factor, whether the tree is self-balancing, whether it is immutable, etc)
I guess/expect that will have to expose the implementation detail of the branching factor (yes, a binary tree is a special-case of a https://en.wikipedia.org/wiki/B-tree for B = 2, but algorithms tend to be different)
On top of that, build a library of tree traversal routines (breadth first, depth first with a few variants), allowing the user to specify code that decides, on each node visited, whether to navigate further down and whether to stop the traversal.
Now, whether the language proper needs a generic way to specify tree traversals? I doubt it, and doubt even more that it is possible too that in a way so that a) most graph traversal algorithms can use the primitive and b) the result is easier to use/looks significantly nicer than just calling those routines.
9. It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.
"Let's move on."
nimih · 1d ago
In addition, clojure.core has the handy tree-seq function:
(defn tree-seq
"Returns a lazy sequence of the nodes in a tree, via a depth-first walk.
branch? must be a fn of one arg that returns true if passed a node
that can have children (but may not). children must be a fn of one
arg that returns a sequence of the children. Will only be called on
nodes for which branch? returns true. Root is the root node of the
tree."
{:added "1.0"
:static true}
[branch? children root]
(let [walk (fn walk [node]
(lazy-seq
(cons node
(when (branch? node)
(mapcat walk (children node))))))]
(walk root)))
Xmd5a · 21h ago
(defn tree-seq-breadth
"Like tree-seq, but in breadth-first order"
[branch? children root]
(let [walk (fn walk [node]
(when (branch? node)
(let [cs (children node)]
(lazy-cat cs (mapcat walk cs)))))]
(cons root (walk root))))
Didn't Wirth(may be it was someone else) say that it is better to have a complex data structure and simple algorithm/code that works on them than having simple data structures and complex code.
Complex data structures absorb lot of the complexity of the problem and reduce the complexity of the rest of the code.
Vosporos · 8h ago
Linus Torvalds also spoke in favour of using adequate data structures instead of the code.
parrit · 13h ago
It is even better to have 100 functions work on 100 data structures. Powerful programming languages like Lisp and Haskell give you that. Generics gives you most of that.
If every ounce of performance matters, e.g. in a database, you want 10000 functions, 100 for each data structure.
MathMonkeyMan · 12h ago
Lisp gives you an infinite amount of functions that operate on one data structure: the cons cell. :P
I guess all you really need are dynamically allocated arrays. A cons cell is an array of two. A struct with N fields is an array of N. Everything else is built on that.
pgt · 22h ago
Nathan Marz' talk on Specter (Clojure Library that decouples navigation from transformation) is must-watch if you deal with data: https://www.youtube.com/watch?v=VTCy_DkAJGk
I use it in every project for data navigation and transformation, and it's more performant than standard Clojure data manipulation, while retaining types (instead of coercing back from seqs).
E.g. if you have a map and you want to increment every value in the map:
(require '[com.rpl.specter :as S])
Now let's say you have a map of vectors and want to increment all of those?
(->> {:a 5, :b 6, :c 7}
(S/transform [S/MAP-VALS S/ALL] inc)) ;; note the navigator juts got another level of nesting
=> {:a [2 3], :b [4 5], :c [6 7]}works for all clj data types, and of course it has navigators for recursive walking .
It took me a while to get good at Specter, but it was worth it. I hear Rama uses Specter navigators internally.
erichocean · 3h ago
Meander is also useful if you need to map from one structure to another, and Malli if you need to write coercisions that check the validity of structures at runtime (aka "schema checking").
To some extent, this seems like it's just a symptom of "writing iterators in C++ is harder than it ought to be". You don't need to have a tree in memory to build an iterator over it - folks write iterators over implicit data all the time.
Here's the range based for loop counterexample from the blog post, as a python generator:
import itertools
def iterate_tree():
for length in range(1, 8):
for combo in itertools.product("abc", repeat=length):
yield ''.join(combo)
for s in iterate_tree():
print(s)
uzerfcwn · 7h ago
C++ also has generators[1], but the author is perhaps unaware of them.
"yield from" is also very useful to iterate trees in python.
jerf · 1d ago
That's "just" a particular kind of fancy iterator that you should be able to implement in any language with iterator support. Here's one in Python:
# Expects:
# Tuple of iterateOn where iterateOn can be None
def fancyIt(init, *options):
if init != None:
yield(init)
for f in options:
newVal = f(init)
yield from fancyIt(newVal, *options)
class Tree:
def __init__(self, val, left = None, right = None):
self.val = val
self.left = left
self.right = right
def left(self):
return self.left
def right(self):
return self.right
myTree = Tree(
1,
Tree(2,
Tree(3),
Tree(4, None, Tree(5))),
Tree(6, None, Tree(7)))
for node in fancyIt(myTree, Tree.left, Tree.right):
print(node.val)
which prints the numbers 1 through 7 in order.
Breadth-first is slightly trickier, but only slightly trickier one time.
imglorp · 1d ago
Yes, it seems easy to implement.
Yes, students should absolutely implement the classic algorithms to learn.
Yes, there are some occasions when you need to home grow one at $work.
BUT, in my opinion, most of the time, professional code should use a battle tested, vuln hardened library or builtin version. These things are VERY HARD to get exactly right. Jon Bently's Programming Pearls famously had a latent bug in its binary search for 20 years before someone caught it.
So yeah, it looks easy but don't do it. Stand on some giant's shoulders instead.
jerf · 1d ago
Your reply is not relevant to my reply. The original poster is asking for this functionality and appears to believe it is something other than an iterator and requires some sort of special language support. However, it is completely implementable as an iterator, in a reasonably usable manner, with no additional language support. My specific code is written only to show that fact off.
Anyone who copies and pastes it is welcome to both pieces when it breaks. Others have already alluded to possible improvements that could be made, and I already have my own analysis in a grandchild reply as to why I don't think this is a terribly pressing need or necessarily even a good idea.
The reason I provide code is that it gets past the "oh, you say it's just an iterator, but I still don't believe you, since you haven't spelled it out to the n'th degree". When code is provided, belief ceases to be an issue. It is clearly something an iterator can implement, in existing languages, with existing iterator support.
Unless you're going to claim it is somehow impossible to provide this functionality in a tested manner, you're completely changing the topic in an uninteresting direction, since it is always true that functionality generally needs testing and bits of code slammed into an HN conversation just to make a particular point probably shouldn't be copied wholesale into your production code.
Amadiro · 1d ago
Sure but all pre-made, battle-tested tree datastructures you'd use in production in all languages already come with some form of iterator that you can just for-loop over, so the original articles point is still moot.
thayne · 1d ago
Sure, but it can be done by a library. There's no reason it needs to be built into the language.
packetlost · 1d ago
Yeah, tree traversal is really easy and implementing it as an iterator is natural. Maybe don't use a recursive technique if you plan on working with non-toy datasets, Python's default stack limit is pretty small (1000), but this is otherwise a very flexible API.
While easy, I think bisect would be a good addition to every stdlib too.
jerf · 1d ago
I did the tree just to match the author's example. I would agree that a bespoke iterator for breadth- and depth-first iteration for any given tree is probably a better way to go. As long as we're in a language like Python, build in something that allows you to examine a branch and decline to descend into it while you're at it.
I don't think this is a large problem in practice because you shouldn't be using dozens of tree types in a given code base, so adding iterators to a tree is no big deal. In general there aren't enough types of iteration available to a given data structure that you need to describe how to iterate on it from the "outside". (Generally when you are doing that, it's too non-trivial to fit into this pattern anyhow; see the Visitor pattern in general.) This strikes me as maybe the sort of default tool you might slap in a library somewhere, but it should be a niche tool. If you're using it all the time you're probably doing something wrong. By default your data structures should be providing iteration packaged with them and it should generally be what you need. And your language should support aborting iteration, in whatever that looks like normally. I'm not sure I know a language that doesn't, it's a fairly basic element of iterator support when you get into implementation.
There are also many cases where a tree iterator will perform significantly better, including CPython. I don't have enough experience with PyPy to know if it could inline the Tree.left and Tree.right calls down to zero penalty at JIT time. Rust and C++ and the other static languages with sophisticated compilers might be able to get that down to fully inlined and zero-cost, but even if they can it's probably better not to push that on to the optimizer as the optimizers will eventually give up if this is composed with enough other stuff. Better to just have an efficient implementation in the first place.
I kind of agree, but at the same time, for loops are just a fancy control structure that any student should be able to implement with goto.
IshKebab · 18h ago
I mean people do use iterator instead of for loops quite a lot.
IMO the thing that would be really nice is if control flow like `for` was actually the same as using an iterator. This would really help in Rust too where handling errors inside iterator callbacks is a right pain.
I've seen a few languages try this but it seems to not be very popular. I think it can get a bit confusing how control flow keywords like `return` and `break` work if you turn `if` into syntactic sugar for a function call taking a closure etc.
charrondev · 17h ago
In what way is for different than an iterator?
In PHP you loop through an iterator with the foreach control structure.
In JavaScript you use for of.
In rust it’s for in.
What am I missing?
IshKebab · 30m ago
You're missing that I am an idiot and was mixing up iterators with functional style programming :D
bawolff · 15h ago
That's why i said "goto". Iterators are already a high level cintrol structure.
porphyra · 23h ago
Well, the whole point of the blog post is to argue for a new kind of syntactic sugar and the author explicitly mentions iterators.
> Well a range based for loop requires that your tree exist in memory AND that you have an iterator defined for your tree. With for_tree you could operate on an entirely imperative tree, without needing to define any iterators or generator functions. Here's an example where I'm checking every single string composed of "a", "b", and "c" of length 8 or less.
for_tree(string x = ""; x.size() <= 8; x : {x+"a", x+"b", x+"c"}){
print(x);
}
You could definitely find every string composed of "a", "b", and "c" of length 8 or less by defining a custom iterator but it would be a verbose and unpleasant way of writing it:
class StringIterator {
public:
using iterator_category = std::forward_iterator_tag;
using value_type = std::string;
using difference_type = std::ptrdiff_t;
using pointer = const std::string*;
using reference = const std::string&;
StringIterator(bool begin = false) : is_end_(!begin) { if (begin) s_ = ""; }
const std::string& operator*() const {
if (is_end_) throw std::out_of_range("End iterator");
return s_;
}
StringIterator& operator++() {
if (is_end_) return *this;
if (s_.size() < 8) return s_.push_back('a'), *this;
while (!s_.empty() && s_.back() == 'c') s_.pop_back();
if (s_.empty()) is_end_ = true;
else s_.back() = s_.back() == 'a' ? 'b' : 'c';
return *this;
}
StringIterator operator++(int) { auto tmp = *this; ++(*this); return tmp; }
bool operator==(const StringIterator& other) const {
return is_end_ == other.is_end_ && (is_end_ || s_ == other.s_);
}
bool operator!=(const StringIterator& other) const { return !(*this == other); }
private:
std::string s_;
bool is_end_;
};
int main() {
StringIterator begin(true), end;
int count = 0;
for (auto it = begin; it != end; ++it) ++count;
std::cout << (count == 9841 ? "Pass" : "Fail") << std::endl;
return 0;
}
abbeyj · 12h ago
It seems like it should be possible to factor out most of the iterator boilerplate into a helper class. Then each place where you want to iterate you can construct an instance of that helper class and supply a lambda that specifies how to descend into children. If you're doing the same iteration in several places then you can use a named function instead of a lambda, which means less typing for each `for` loop. Here's a rough sketch: https://godbolt.org/z/x94WY77rv
samus · 19h ago
Same thing, but the iterator is instead hidden inside the language implementation. `foreach` is already a quite general construct, and iterators are the way to extend them. However, I can see the benefit of designing a library to help implement more intricate iterators.
Ceterum censeo this would be a family of simple macros in LISP.
jerf · 23h ago
def itWithStop(init, stop, *options):
if init is not None and not stop(init):
yield(init)
for f in options:
newVal = f(init)
yield from itWithStop(newVal, stop, *options)
for s in itWithStop("",
lambda x: len(x) > 2,
lambda x: x + "a",
lambda x: x + "b",
lambda x: x + "c"):
print(s)
yields the combinations of 0 - 2 length strings with a, b, and c.
Python has a number of ways to achieve this depending on exactly how you want to pass the arguments; multiple functions, optional arguments, etc. How nice the final call looks is more about your local language's closures look.
The main point here is that this will happily iterate on things that don't "exist".
module Tmp where
iter :: forall a. (a -> Bool) -> [a -> a] -> a -> [a]
iter p opts x = if p x then x:concatMap (iter p opts) (opts <*> [x]) else []
ghci> :l tmp.hs
[1 of 1] Compiling Tmp ( tmp.hs, interpreted )
Ok, one module loaded.
ghci> iter (\x -> length x < 3) [(++ "a"), (++ "b"), (++ "c")] ""
["","a","aa","ab","ac","b","ba","bb","bc",
"c","ca","cb","cc"]
(Since things are lazy in Haskell, functions that return lists effectively are iterators. There's probably something in the standard library somewhere for (opts <*> [x]) to avoid the wrapping x in an unnecessary list, but my Haskell is rusty.)
porphyra · 23h ago
The iterator in my example will also happily iterate on things that don't exist. We all agree that it's possible to do in any language. But the main point is that the blog post is talking about syntactic sugar for an easier way to do it.
And yes, Haskell is amazing at this sort of thing.
jerf · 22h ago
The syntactic sugar being asked for in this case is awfully thin. I definitely put this in the class of "stop pining for it and just use what's there".
If the poster wants to particularize this to C++ because C++'s syntax can't support it in any reasonable manner, that's fine, but that's a C++ problem, not a "Programming languages..." problem. Which would be perfectly understandable and I'm not really complaining, more clarifying that most of the rest of the world can just rub together three or four existing constructs in a pretty reasonable manner to get this.
jerf · 1d ago
Whoops, my edit window closed, but that first comment derives from a previous version that had a different signature for the functions in the "options" list. Ignore it.
yxhuvud · 1d ago
No, that seems like language bloat. Better to make your language have internal iterators, as then data structures can add their own logic that matches their need for iteration. Then special cases don't need additional syntax.
naasking · 1d ago
Trees aren't really a "special case". Sequences, trees and graphs are core abstractions to most programming, why try to flatten all trees and graphs to sequences?
smallnamespace · 1d ago
Because your code is actual running serially so no matter what there is an iteration order, and the order matters for performance even if your code does not care.
For example if you literally don’t care about the order then your code can map over a default iterator that is efficient (DFS).
naasking · 23h ago
Not true due to parallelism. Also, SQL demonstrates that you can describe the desired result declaratively without being concerned about iteration order. I see SQL's CTEs as a good example of the kind of primitive the article is talking about.
alterom · 21h ago
Sure, and perhaps that's the reason why we don't have a built-in for_tree: for some people, order matters; for others, it doesn't.
Then, for some cases, depth-first traversal is needed; for others, breadth-first.
Then, there's parallelism, and even the plain old for loops aren't parallel by default.
By the time you specify exactly what you need from a tree traversal, you've written code to do it.
And if you're fine with some default choice — you already can use the default iterator with the for_each loop.
I don't see what need there is for adding an extra for_tree syntax to do that.
bloppe · 17h ago
Also a lot of trees have fixed arity, like binary trees. But to model a general tree, you'd need a dynamically allocated list of children for each node, adding a layer of indirection that can noticeably worsen performance.
Trees are often considered too diverse to include even in the standard library, let alone as a primitive. Even Python doesn't have trees in the standard library. I'm sure it's been proposed and rejected at some point
naasking · 16h ago
> But to model a general tree, you'd need a dynamically allocated list of children for each node, adding a layer of indirection that can noticeably worsen performance.
I hate to be a broken record, but this again goes back to the assumption of a specific sequential computation model. Modelling relations as in SQL automatically supports for N-ary child relations. There are other computation models! eg. logic, relational, etc.
smallnamespace · 2h ago
Those other models lack native mechanical sympathy with the hardware.
Let's be practical about it: the reason you might want to say map_tree is so it could potentially be optimized to run more quickly, e.g. via automatic parallelization.
But to even parallelize a tree read operation we'd have to pull in other threads, split work, we'd want metadata on each node (child counts), and at the end of the day within each worker you'd still have a serial execution order.
For map_tree to have practical benefit your language would need to make a ton of opinionated choices about trees and parallel programming. Feasible? Yes. Worth making it a basic primitive? Well, even LISPs don't offer map_tree, despite building the whole language on top of car/cdr.
yxhuvud · 5h ago
The point wasn't that it should be treated as a sequence of not, but that with good base abstractions you can move the decision of what to do, and how, into the data structure itself, rather than having to have the language decide for it.
lmm · 5h ago
> Sequences, trees and graphs are core abstractions to most programming, why try to flatten all trees and graphs to sequences?
Because "trees" and "graphs" aren't actually single concepts, they're families of concepts that can vary along many dimensions. Most programming languages offer one or a small handful of sequence types, because most sequence use patterns are the same. Most languages don't offer a tree or graph type, and as soon as you try to implement one you see why: there are multiple different kinds of trees and graphs, and multiple different representations that are appropriate for different use cases, and any one you might pick would be unsuitable for most programs people want to write.
monkeyelite · 1d ago
This is solved by iterators in C++. The idea of an iterators is to generalize the concept of a pointer —- something which refers to a location in your data structure.
For example the most basic operations of a pointer are to advance and dereference.
std::map is actually implemented as a tree. To iterator its members you can do
for (cost auto &pair : map)
The only requirement for your custom data structure to work is to implement begin() and end() which return iterators - “pointer like” objects.
rerdavies · 9h ago
I'm surprised nobody's brought up C++ generators yet (which avoid potential problems with recursion stack depth).
for (const auto&node: await depth_first_tree(node))
And generators have the added advantage that walking trees is just a special case of the more general case of traversing a directed graph of elements, which is equally easy.
I do think recursion schemes are pretty cool, particularly the idea that a list fold could be generalized to the concept of catamorphisms, which could be mechanically derived upon defining a data structure, because the shape of the recursion that's defined is the same as the shape of the recursion of the data constructors. It makes me realize that programming languages still could be higher-level; what if a programming language automatically derived the equivalent of the "fold" and "map" functions for any recursive data type defined by the programmer?
At the same time, the way it's implemented in Haskell's recursion-schemes library might be hard to wrap one's head around at first, kind of like how the list functions "foldr" and "foldl" are also often confusing to newbies, even though they're like the go-to, default way to make list functions in Haskell.
s-zeng · 5h ago
Haskell already supports deriving Functor and Foldable
trealira · 5h ago
Forgot about that lol, my bad. I was really thinking of deriving the equivalent of catamorphisms.
dtech · 1d ago
Folding and traversing can be trivially done in all relevant programming languages if you have an iterator
ookdatnog · 7h ago
Don't iterators necessarily "forget" the structure of the tree? I can see how you can write an iterator to, for example, compute the size of a tree, or generate the set of all variables that occur in an expression tree.
But it's not obvious to me how you'd write a generic iterator that supports something like finding all free variables in an expression tree, or even just express a tree mapping function that constructs a new tree with the same structure (for example, "make all variables in the expression uppercase"). It's been a while since I worked with iterators so correct me if I'm wrong and this is in fact easy with iterators.
With something like Haskell's recursion-schemes library [0] these operations are all just a few lines long, guaranteed to terminate, and don't need to be updated if your data structure changes (for example you add new expression nodes). I'm not aware of any non-functional programming language that can do that. See for example the freeVars function at the bottom of the linked recursion-schemes doc.
I feel like programming an iterator like this akin to a state machine is just not that convenient or trivial, though. Below is pseudo-Java.
class InOrderTreeIterator {
Stack stack;
TreeNode cursor;
InOrderTreeIterator(TreeNode root) {
cursor = root;
s = new Stack;
}
bool hasNext() {
return cursor != null || !stack.empty();
}
TreeNode next() {
if (cursor != null) {
while (cursor.left != null) {
stack.push(cursor);
cursor = cursor.left;
}
} else if (!stack.empty()) {
cursor = stack.pop();
} else {
throw new NoSuchElementException();
}
TreeNode ret = cursor;
cursor = cursor.right
return ret;
}
}
munificent · 21h ago
It's much easier if the language has built-in support for generators. Here's an in-order tree iterator in Dart:
Iterable<TreeNode> inOrder(TreeNode node) sync* {
if (node.left != null) yield* inOrder(node.left!);
yield node;
if (node.right != null) yield* inOrder(node.right!);
}
trealira · 21h ago
Yeah, if it has built-in support for generators, it basically makes the state machine implicit for you, which is nice and convenient; it's like going from having to manage your own stack call stack to using a programming language that just allows function calls and recursion.
munchler · 1d ago
Agreed. It’s amusing to see procedural programmers slowly rediscovering what functional programmers have known for years. Paul Graham called this “The Blub Paradox”.
pmontra · 1d ago
I remember that I was using trees quite often last century, when I was writing programs in C. I seldom use tree or comparably complex data structures nowadays, when I almost only write web apps (mostly backend in Ruby or Python but some frontend in JS.) I'm bet that both Ruby and Python have plenty of trees written in C inside their interpreters. I do everything with arrays (lists) and hash tables (dicts) in Ruby and Python. Maybe a language construct for trees would be nice for lower level languages and almost out of scope for higher level ones.
The same from another angle: there are a lot of trees in the indices of SQL databases (example [1]) but we don't zoom in to that level of detail very often when defining our tables.
User interface widgets form a forest - windows contain layouts that contain widgets which also can be layouts, etc, etc.
To implement Brown's algorithm to optimize class-based language models I had to implement a complex forest (DAG, actually) in Python using lists of fixed length. That was not especially nice to work with.
scotty79 · 4h ago
Nested dicts are trees.
walleeee · 1d ago
Do you keep your dicts fastidiously flat?
pmontra · 1d ago
Usually yes. Web apps are simple. There isn't much to do inside a request params to db to response call. But sometimes, not every year, there is something to really reason about. I still remember some project from 10 or 20 years ago.
JaumeGreen · 6h ago
One of the projects that I have only in my mind space is an Iversonian language with more data structures and the tools to easily navigate an traverse them. Trees, graphs, ... the sky is the limit. And with the power of notation as a tool of thought, and a pen interface, it would be fun to do whiteboard sessions that produced real code that could be executed as is.
OTOH I read/heard that the beauty of array languages is that they have few data types, but they can be easily worked on. So maybe the answer is not easier tree traversal primitives, but better literature/training on how to transform it in a more manageable data type.
Sometimes the answer is to adapt our vision to our world, sometimes the answer is to adapt the world to our vision.
Retr0id · 1d ago
> "Why not just use recursive functions"
One great reason not to use recursive functions for traversing trees is that you can allocate your own stack data structure rather than relying on the call stack itself. In most languages/runtimes, the call stack has a maximum depth which limits the depth of trees you can process, usually on the order of thousands of stack frames.
Managing your own stack usually produces weirder looking code (personally I find "naive" recursive approaches more readable) - but having it as a first-class language feature could solve that!
cobbal · 1d ago
If we're fixing the language, may as well fix the real problem and not artificially limit stack space.
Retr0id · 1d ago
The language doesn't get much of a say, stack limits are usually inherited from the OS itself. It's fixable by not using the OS-provided stack but that's a much more invasive change than a new syntax/stdlib feature.
worthless-trash · 12h ago
The compiler/language absolutely gets a say in this, you can do tail call recursion without stack saturation.
Retr0id · 4h ago
Tail call elimination is an optimisation that is only possible for certain code patterns. How would you write a tail-recursive tree walk, without using a separate stack or queue data structure to store state?
soegaard · 1d ago
Pick a language that allows users to define their own control structures and test your idea in practise.
Candidates: Racket, Scheme, Rust.
k__ · 1d ago
You don't even have to got that far.
Defining your own iterator would be enough for most cases.
scotty79 · 4h ago
The point is not to do that.
scotty79 · 4h ago
I think Scala is also close. It has really flexible syntax.
cdrini · 23h ago
I'm not sure if it needs to be at the syntax level, but I think having built in standard library support for graphs (general trees) would help make graphs more common in programming. And seeing how powerful they are, I think that would be a good thing!
I explored this idea with gstd, a standard library for graphs inspired by the JS Array interface: https://github.com/cdrini/gstd/
Tree traversal is an odd one to want to try and make primitive. There are a lot of hidden questions that you need to consider when traversing a tree. Would you want your syntax to indicate what order traversal? Should it indicate a way to control the extra memory that you allow for the common stack/queue used in basic traversal? If you have a threaded tree, should the syntax work without any extra memory usage?
injidup · 1d ago
That's what c++20 coroutines paired with c++23 generators are for. It's easy to define whatever traversal you want and then expose it as generic code.
Sure you can do it any way but the OP was about adding language features for iteration of a specific data structure. Iteration in C++ is state machine based and coroutines give you a neat way to implicitly model them without having to manage the boiler plate yourself.
ellis0n · 16h ago
Great example!
mpweiher · 8h ago
Adaptive Programming[1] by Karl Lieberherr is the most comprehensive attempt I have seen to formalize and separate iteration over arbitrary structures as a distinct PL construct.
It allows you to specify traversal in a structure-shy way. Personally, I think it goes a little too far, but it's definitely a good inspiration.
I bet you could do something generic like this in languages that have deferred execution like C#'s IEnumerable. Something like
foreach (Node node in EnumerateNodes(root, x => x != null, x => [x.Left, x.Right]))
where EnumerateNodes uses `yield return` (i.e. is a generator) and calls itself recursively. Though it'd probably be easier / better performance to write an implementation specific to each node type.
andyferris · 16h ago
My thoughts on this are to add a third keyword alongside `break` and `continue` to loops that allow you to push another (loop) stack frame and descend into a tree-like data structure.
For exposition, lets seperate loop initialization from the rest, make `continue` mandatory to repeat the loop with another specified value (i.e. the loop implicitly ends without a continue) and lets say `descend` to recurse into the tree.
sum = 0;
loop (node = tree.root) {
sum += node.value;
if (node.left) descend node.left;
if (node.right) descend node.right;
}
Compare with a linear loop:
sum = 0;
loop (i = 0) {
if (i >= array.length) break;
sum += array[i];
continue i + 1;
}
The `descend` keyword differs from `continue` in that its more of a continuation - control flow will continue from that point once it is done.
You could make this more functional and less imperative (and I'd be surprised if such things don't exist as libraries in some functional languages). Yes ultimately we can transform this to recursive functions and compile it that way... but for imperative languages (especially those without closures) something like the above might be useful.
Arguably the above loops are _more_ imperative and less declaritive than the standard ones from the C family. You could add a functional twist by breaking with a value (which is the result of the overall `loop` expression).
drbojingle · 3h ago
What programming languages need is dimensional analysis. The fact that I can make variables like
Const km = 1
Const velocity = 11
And do km + velocity = 12 is all kinds of silly.
zupa-hu · 2h ago
That exists, it is called a nominal type system, while in your example the language has a structural type system.
Edit: Assuming your example is even statically typed.
erichocean · 3h ago
What programming languages need is linguistic analysis. The fact that I can make variables like
Const true = False Const false = True
And do if (true) "won't happen" else if (false) "will happen" is all kinds of silly.
HelloNurse · 1d ago
The simplifying assumption that all tree nodes have the same type and the same function should be called on them seems unrealistic.
This kind of imperative iteration seems better served by the traditional visitor design pattern: more verbose (more explicit, not more complex) and more general.
hyperhello · 23h ago
This is interesting, but the syntax doesn't seem to have the right expressiveness for such a large change.
for recursive (Node t = tree.root; t != NULL;) {
puts(t.value);
if (t.value == target) break;
if (t.value == dontfollow) continue;
if (t.left) continue t.left;
if (t.right) continue t.right;
}
return t;
Regular 'break' is to really break out of the structure like a regular for, as regular 'continue' is to do the next iteration. But if continue has a value to recurse on, it reenters the for loop like a subroutine.
As a bonus, I think this is tail-call-optimization friendly.
toxik · 22h ago
Now allow it to do BFS or DFS, suddenly you're not far from just a vanilla BFS or DFS if you have a sensible stack/queue API.
hyperhello · 20h ago
DFS is pure flow control, like iterating an array. BFS isn't really the same simple deal as DFS, it either requires approaching the next level and saving it for the next depth somewhere, or navigating the entire previous level again.
func Search(Root : Tree; Desired_Value : Tree_Value)
-> optional Tree_Id is
var Identifier : optional Tree_Id := null;
for T => Root then Left(T) || Right(T)
while T not null concurrent loop
if Value(T) == Desired_Value then
// Found desired node,
// exit with its identifier
exit loop with Identifier => Key(T);
end if;
end loop;
return Identifier;
end func Search;
I actually don't use this feature much, because usually there is a key of some sort that determines which side of the tree is of interest.
hinkley · 21h ago
I think Java did the right thing by making tree shaped collections, since they’re iterable and that covers the for loop situation, but it doesn’t expose the tree structure, so you can’t tell if two nodes are siblings or relatives or indeed if they are even in the same tree.
But I think grafting those two together is the right answer, not inventing a new loop construct.
Trees are easy to write iterators for. DAGs are a bit harder and full graphs are an advanced interview question.
vinceguidry · 1d ago
When I went looking for this in Ruby, I eventually landed on RubyTree[1]. Being able to subclass the TreeNode makes everything so much friendlier. Sure, I could implement them myself, but getting converters to/from primitives for free is a lot off my mind. Would be nice to have it built into the language but honestly Ruby has a whole lot of stdlibs and default gems already.
Not as ergonomic as a direct tree-iterator, but I can't see of an elegant way to introduce that in an imperative language while keeping the forking/recursion aspect clear
adamc · 20h ago
Or maybe you need a different language, where implementing an efficient tree traversal operation is easier?
tbrownaw · 1d ago
Standard tree traversal loop (yes, missing some obvious conditionals I didn't feel like typing out):
var todo = new List<T>();
todo.append(root);
while (var item = todo.pop_front()) {
todo.append(item.left); // or .prepend for depth-first
todo.append(item.right); // or .prepend()
// do stuff...
}
meltyness · 1d ago
For perspective, I'm a novice with algorithm design, I've been grinding leetcode for the past 6 months or so, almost exclusively in Rust. I was bewildered by the same concern since I had initially set out to, not only focus on Rust, but to primarily maximize use of the Iterator construct, since I was not intricately familiar with it. A few months in I discovered that there was an appropriate Iterator construct which accomplishes the same thing.
// Comments for the non-Rust native reader, regarding this Function declaration:
// successors is a function that accepts an `Option` container for some Value of type T, called `first`
// and a Closure called `succ`, constrained below:
pub fn successors<T, F>(first: Option<T>, succ: F) -> Successors<T, F> ⓘ
where
// `succ` must receive the iterated state, and return the next iterated state
F: FnMut(&T) -> Option<T>,
// Each time the `next()` function is called on the returned Iterator (a Successors-flavored iterator),
// the state of `first` is yielded, and then
// `succ` is called to progress
// until a `None` type is reported by `succ`
I'm not sure where the concept came from, but it's not dissimilar to the author's implementation, but instead of the ControlFlow enum, it relies simply on the Option enum. I know though, that it was initially built in the Itertools crate as unfold and then upstreamed some time later.
Essentially you use `first` to contain a Queue, Stack, or Level for the different traversals, and define traversal or activities from there.
It's fairly ergonomic in practice, ergonomic enough for Leetcode.
Couldn't you just do this with a regular for loop and a few datatypes/functions? (This pseudocode is an Elm/Haskell/Rust/TypeScript-inspired abomination, but pretty portable to any language...)
type Node = { value: Any, left: Node, right: Node }
type Direction = Left | Right
type TreePosition = { root: Node, currentNode: Node = root, position: Direction[] = [] }
# Implementation left as an exercise but should be obvious and run in O(1), I believe. Returns Nothing when we're out of nodes.
function nextPosition(position: TreePosition): Option<TreePosition>
# The tree you want to iterate through
const myTree: Node = ...
# The loop
for(let position: TreePosition? = TreePosition(root: myTree); position != Nothing; position = nextPosition(position) {
node = position!.currentNode
# Your loop code
}
I'd argue this doesn't belong as a language-level feature, but maybe an API/stdlib-level feature.
xxs · 1d ago
Indeed, I don't see any need to have a tree as a language primitive along with a traversal function (iterator). Tree traversal is not really different than iterating over a vector/list. E.g. even java has stuff like: TreeSet.forEach(consumerRef) or for(Type val : tree) doStuffWith(val)
HelloNurse · 5h ago
Trees are an insignificant middle ground between lists (which, by the way, are degenerate trees) and general iterators (on general object graphs, or not even materialized).
foota · 22h ago
One thing I wish for is for standard library trees to present a way to take advantage of a trees structure, even if they don't directly expose the tree itself. For instance, C++ map could expose a way to start find or e.g., lower_bound from some node. While I'm making a wishlist, I also wish lower_bound worked with reverse iterators (see e g., https://stackoverflow.com/questions/64351187/stdlower-bound-...) :)
I think allowing for starting this kind of search from a given node would cover most (though not all) of what you'd want to expose the trees structure for directly.
pbohun · 1d ago
I think this kind of control mechanism is not a great idea and could lead to easy bugs.
I think it's a better idea to write a function that applies a function to the elements of a tree. Ideally you'd make a function for each traversal order. This makes it obvious what is happening.
map_tree_pre(my_tree, my_func);
map_tree_in(my_tree, my_func);
map_tree_post(my_tree, my_func);
map_tree_bfs(my_tree, my_func);
demarq · 22h ago
I would think iterators are exactly this. And they exist in almost every language.
bee_rider · 1d ago
What is this note before the code about?
> This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Also I’m slightly confused by this example.
for_tree(string x = ""; x.size() <= 8; x : {x+"a", x+"b", x+"c"}){
print(x);
}
So, our “next node” operation is to concatenate to x. Won’t we either have to have a method for modifying x to go “up” a node, or we’ll have to keep a record of what x was upon entering each node? Like in this example we’ll end up with x=“aaaaaaaa” and then go up a node, over to a “b” node, and get x=“aaaaaaaab”, right?
voidUpdate · 1d ago
The author said it could be implemented recursively, so a call with x="aaaaaaa" would call three more functions with x="aaaaaaaa", "aaaaaaab" and "aaaaaaac". you never need to go up a node
dmurray · 1d ago
I think the intention is that you keep a record of what x is in the current node, and at every node above the current node. In the recursive implementation which the author describes as equivalent, these values are kept in the stack.
bee_rider · 1d ago
Seems a bit inefficient…
I guess we can delete the a node’s copy of x after all of a node’s child nodes are visited, at least.
dmurray · 1d ago
Well, it doesn't need to be any worse than the stack-based implementation of a recursive tree traversal, and it can't be significantly better. You have to store that state somewhere.
Perhaps it can be optimized to be a little better than the recursive version, depending on how much overhead your language uses for a stack frame that it won't need for this special case.
bjourne · 1d ago
I'm 100% sure smug Haskellers have something very important to say here. :)
chowells · 17h ago
You know, if Haskellers already have that functionality, maybe it's less that they're smug and more that they have already solved the problem. Do you call someone smug when you complain about being freezing and they invite you in out of the cold?
Joel_Mckay · 1d ago
Normally someone would respond, but they are prone to lazy evaluation...
Thank you... I'll see myself out... lol =3
kelseyfrog · 1d ago
I've written about this before, but suggesting recursion for tree traversal is equivalent to suggesting goto is a replacement for if/for/while.
We don't have syntax and semantics for recursion schemes in any programming language - it's always deferred to library support at best. As far as I'm concerned, this is an open problem in programing language design where we finally replace the vestigial recursion hack with a proper structured programming solution.
skribanto · 18h ago
Well, recursion is well-structured and can be reasoned about through induction. Goto is unstructured…
JonChesterfield · 1d ago
This implementation recurses on the call stack and calls that out as a feature. A built into the language tree walk which overflows the stack on large trees is perfect for C++, get a paper over to WG21.
Or for a possibly more useful comment, constant space tree traversal is genuinely quite difficult to implement. I don't know a general solution to it (other than walk from the start N times, quadratic style), would be interested to hear of one.
HelloNurse · 1d ago
Constant space tree traversal is impossible and unnecessary. On the other hand predictable space, including adding parent pointers to cheat, is easy to achieve by counting the total number of nodes or the depth of the tree.
somat · 19h ago
"Constant space tree traversal is impossible..."
Naively I would expect something like left-hand-rule maze traversal to be constant space. The tree may need additional structures to support this sort of travel.
Other thoughts: most state machines are constant space.... there may be something there.
HelloNurse · 6h ago
The "left-hand-rule maze traversal" requires backing out of dead ends (in a tree data structure, up from leaf nodes), which amounts to one permanent parent pointer per node instead of a transient stack of nodes that we have not finished visiting (in addition to the baseline of child pointers).
kevinventullo · 1d ago
If nodes have pointers to their parent, constant space tree traversal is easy enough.
fedeb95 · 9h ago
programming languages shouldn't have such primitives, because they're not primitive. Otherwise, people programming different stuff all the time, would need other primitives, and you'd have unmaintainable languages full of bugs.
thatguysaguy · 18h ago
It's not at the language level, but for python JAX has the notion of a pytree (arbitrarily nested combination of lists, dicts, and tuples), and includes map and reduce operations for those objects. It's very convenient!
MontagFTB · 1d ago
Sean Parent developed ‘forest’ for C++ that automatically maintains the invariant that its structure is always a hierarchy. It includes full-order iteration through its default iterator: https://stlab.cc/2020/12/01/forest-introduction.html
usrnm · 1d ago
Don't need to go that far, std::map is a tree (the standard does not dictate it, but it is in all ilmplementations), and you could always traverse it. The whole idea of iterators was popularized by C++ and its STL.
MontagFTB · 20h ago
std::map uses a tree as an implementation detail to achieve certain performance guarantees. It is not a tree from the user’s perspective, however. That is, there are no parent/child relationships between the elements in a std::map.
lucasoshiro · 1d ago
This is heavy biased to C++, which is a language that already has too many features that people just want to use. This is a very specific case to be placed as a language feature, and could be just a lib.
If this becomes a C++ feature, imagine how many data structures we would need to support?
Many other languages, specially the FP languages, allow to do that as a library. Even the languages that are only inspired by FP. Example, Ruby:
class BinTree
include Enumerable
def initialize v, l, r
@v, @l, @r = v, l, r
end
def each &block
@l.each(&block) unless @l.nil?
yield @v
@r.each(&block) unless @r.nil?
end
end
Using the Enumerable mixin includes many FP-based methods, such as map, filter and reduce by only defining each, which in this case is DFS.
Once you have algebraic data-types in a language, writing a recursive visitor pattern is pretty simple.
Encoding the semantics of a tree traversal operator likewise is difficult in the general case. What exactly would the order be, what if I want to traverse in a non-standard ordering, what about skipping branches; all would be difficult to cleanly represent.
I have seen it done where you return actions with key ones being recurse, stop, replace, and replace & then carry out some function, but again, this is pretty simple to implement.
Though Haskell's Traversable is similar in name, then depending on what the developer intends, both Functor, Foldable or Monad could also help with the traversal. I.e, Haskell already has what the blog post asks for.
MJGrzymek · 1d ago
sticking to dfs, there is still a difference between pre-order/in-order/post-order
f1shy · 23h ago
Well common lisp has it.
scotty79 · 4h ago
Branching iteration seems like a great idea. But I think the syntax shouldn't come from tree traversal. It should be something general. You tell the normal for loop how to go to the next sibling (`increment`) and when to end naturally (`condition`) or prematurely (`break`) and how to skip current sibling (in whole or partially) (`continue`). Branching iteration would also need to know how and when to branch, that includes skipping branching for this particular item, end whole iteration naturally and prematurely.
So something similar to `break` and `continue` should be the method of branching (descending into the tree).
I think Scala might be a great language to prototype implementation of such construct.
lutusp · 1d ago
The article's thesis relies on the idea that a genuinely primitive traversal action exists, in the way that a for-loop is primitive and widely applicable, or adding two floats is common enough to justify building it into the language (or processor).
But tree traversal doesn't have this universal property. There are too many methods and purposes for traversing a tree, sufficient that IMHO no single primitive embodiment could materially improve a language. Also, modern compilers efficiently break down high-level traversal code so well that expressing the idea at a high level incurs no serious penalty compared to having a primitive for that purpose, or a series of them.
fngjdflmdflg · 1d ago
I read a similar argument for graphs in general,[0] where this is more obviously true.
A TXR Lisp macro for-tree for traversing a tree, given a variable, an expression for the root node, and expressions for how to to left and right relative to the variable.
A node structure that for-tree knows nothing about. No iterator abstraction or API, nothing.
(defstruct node ()
value
left
right)
Example tree (a balanced binary tree whose numeric values happen to be sorted):
(defvar example-tree #S(node value 5
left #S(node value 3
left #S(node value 1)
right #S(node value 4))
right #S(node value 7
left #S(node value 6)
right #S(node value 9)))
The left-expr and right-expr must be expressions that involve the var variable; the construct evaluates these to find the left or right child. When the value is nil it means there is no child.
This is almost exactly what the blog is asking for, transliterated to Lisp syntax.
Original concept in fantasy C syntax:
for_tree(Node* N = mytreeroot; N != NULL; N : {N->left, N->right}
{
print(N->value);
}
As required, the construct specifies the node variable to be the loop dummy, the root expression giving its initial value and the two accessor expressions for left and right traversal.
The termination test N != NULL is made implicit in the Lisp version, so early termination requires a break out of the loop. It could be arranged.
The fantasy C syntax specifies a variable name in the navigation declarator: N : { N->left, N->right }. Presumably, the variable here can be renamed to anything you want; it just arbitrarily happens to have the same name as the loop dummy.
I didn't replicate this feature exactly because it just adds verbosity for nothing. It's okay if the navigation declarator just refers to the loop's one and only dummy variable.
Anyway, programming languages should definitely not have a tree traversal primitive. Rather, languages should be Lisp.
kazinator · 18h ago
Here is the same for-tree macro processing TXR Lisp's built-in search tree type. That does not use structs, but built in node and tree types.
The left, right and key fields of a node object are accessed with like-named functions. We must call tree-root on a tree object to get to the encapsulated root node:
How about a tree of integers. Let's say the root integer is 1, and to go left is to multiply by 2, and to go right is to multiply by 2 and mask in a 1. And let's say the depth caps out at the value 16:
3> (for-tree n 1 (if (< n 8) (* 2 n)) (if (< n 8) (+ 1 (* 2 n)))
(format t "~b\n" n))
1000
100
1001
10
1010
101
1011
1
1100
110
1101
11
1110
111
1111
nil
Let's right-align digits to see different patterns in it:
2> (for-tree n 1 (if (< n 8) (* 2 n)) (if (< n 8) (+ 1 (* 2 n)))
(format t "~6b\n" n))
1000
100
1001
10
1010
101
1011
1
1100
110
1101
11
1110
111
1111
nil
Nah, left-aligned wins.
kazinator · 18h ago
I forgot I can remove the outer parentheses, and also use infix with C-like operators, since I have those modes turned on in the REPL. (Infix is an unreleased feature, coming in TXR 300).
It's called sum types and pattern matching, and this is like 80% of the reason why I like Swift and Rust.
eru · 21h ago
Haskell (for example) doesn't even have a looping primitive. Why would it have a tree traversal primitive?
Both looping and tree traversal can be done with library functions.
In general, everything that can be done with library functions, should be done with library functions.
> Doesn't for_tree(...) look a lot nicer and simpler and less error prone than needing to implement a recursive function for each operation you would want to do on a tree?
Eh, in Haskell you can derive many of these things automatically. And even Rust has some automatic derivation.
gue5t · 19h ago
It seems like this is grasping for languages to expose recursors[1] or induction principles. These factor out the structure of an inductive type (e.g. trees) and leave the caller to simply plug in the behavior to perform for each possible situation that might arise during traversal.
For isn't "linear traversal", it's linear execution. It's up to the data structure to define what linear traversal is. For arrays and lists that's trivial. For trees and, more generally, graphs there are many ways to do traversal. This is what iterators are for.
jdeaton · 19h ago
Its called recursion
> Doesn't for_tree(...) look a lot nicer and simpler and less error prone than needing to implement a recursive function for each operation you would want to do on a tree?
No it does not
austin-cheney · 14h ago
Operating systems have this with file systems.
Web browsers have this. It’s the DOM and it scares the shit out of most JavaScript developers. The fun part is confronting people about their irrational fears watching them spin their wheels with all the imagined excuses.
I am a tremendous fan of tree systems. Tree systems should completely replace inheritance in programming to enforce the composition over inheritance principle.
jonstewart · 23h ago
I worked for Guidance Software in the aughts and their main product, EnCase, had its own proprietary scripting language built in, called EnScript. Everything in EnCase inherited from “NodeClass”, a linked list for creating trees.
EnScript had a forall(NodeClass n in tree.GetRoot(){} construct that was very easy. It was essentially a depth-first iterator.
Mikhail_K · 23h ago
Ordering pizza is much more important and frequent part of a developer's work than tree traversal. Therefore, programming languages need pizza ordering primitive even more, than tree traversal ones.
chess_buster · 1d ago
Nonsense.
Programming languages should have less primitives like this and instead we should have better foundation libraries for the languages, i.e., containing iterator/-interfaces like Rust and Python (or Smalltalk).
specialist · 1d ago
FWIW, I use Null Objects to eliminate null checks. Makes recursion (tree climbing) more concise, legible.
Bonus: Java's HotSpot magick will NOP (most?) methods of Null Objects, making this a zero cost abstraction.
I should probably write a concrete example for a blog or something.
TLDR: For every base class such as TreeNode, create a NullTreeNode that does nothing, then replace all uses of null with NullTreeNode. Voilá, no more null checks or NPEs.
meindnoch · 1d ago
Horrible idea.
1. BFS is not supported.
2. Traversal type (inorder, preorer, postorder, or even mixed order!) is also not handled.
3. The syntax doesn't make it apparent that stack overflow can occur, e.g. by doing DFS on a linked list.
danieloj · 23h ago
As a fullstack web engineer I've never had to implement a tree structure at work. I'd love to hear examples of what kinds of companies/platforms people are writing these structures regularly. If anyone is willing to share where they use them I'd appreciate it
rcxdude · 6h ago
I've used them (though more generally a directed acyclic graph) when doing certain kinds of analysis of code execution paths (working in embedded software). They do tend to show up in UI work as well (in an ad-hoc, implicit fashion), in my experience, though generally when you're looking at something across the whole UI as opposed to implementing a specific element of it. I've also used them in a somewhat ad-hoc build-system like utility as well.
(And to be clear, none of these involve sitting down and writing some generic Tree<T> structure, they're all cases of "Well, there's some tree-like datastructure that already exists or is a natural fit for this system and I'm going to be traversing it depth-first or breadth-first to do something to the elements of it or calculate something based on what nodes I pass through on a given traversal")
mvc · 22h ago
You might not implement them but as a web engineer you're using them all the time. So all these tools that you use will have tree implementations in them.
- Every html document is a tree structure. And css documents have special syntax to address nodes within those trees
- If it's any good, you're routing framework probably uses some kind of tree to quickly match the request to it's handler
- The database you write to uses a tree to quickly find the location it needs to write to
chuckadams · 21h ago
I think that’s kind of the GP’s point, that there’s no “implementing a tree” so much as just using the tree that’s naturally there. When I make nested objects, I don’t think of trees or some boxes-and-arrows diagram, I just think “product type”. It’s good to know your graph algorithms, sure, but thinking about the data representation isn’t something one needs to have in the front of their mind in any decently abstracted language.
Philpax · 22h ago
The (V)DOM is a tree. Knowing that is useful for manipulating and composing it.
Major tools that exist today for partial structure traversal and focused manipulation:
- Optics (Lenses, Prisms, Traversals)
- Zippers - Query Languages (for semantic traversal and deep search)[1] https://www.cs.ox.ac.uk/jeremy.gibbons/publications/iterator...
The problem is that 'lens', 'monocle', etc. are famously abstract and difficult for people to apply to their actual problems. IMO, the solution would be for standard libraries to specify interfaces called 'BreadthFirstTraverse', 'DepthFirstTraverse', etc.
Optics are famously abstract in implementation, but I don't think people have trouble applying them - people seem to like JQuery/CSS selectors, and insist on `object.field` syntax; it's kind of wild that no mainstream language has a first-class way to pass around the description of a location in an arbitrary data structure.
[1] https://ghc-proposals.readthedocs.io/en/latest/proposals/002...
[2] https://ghc-proposals.readthedocs.io/en/latest/proposals/015...
[1] https://en.cppreference.com/w/cpp/types/offsetof
[1] https://github.com/graninas/cpp_lenses [2] https://github.com/jonsterling/Lens.hpp
I think people are often too enamored by general purpose languages that can express such abstractions natively. I don't see an issue with a language that provides this as a primitive without being able to express it itself, constraints can be useful for other properties. Once you can traverse trees, most programming problems can be tackled even in such constrained languages, eg. SQL with CTE.
For C++, you could define yourself a template that expands to the two functions you listed.
For any language, you could write yourself a pre-processor that adds for_tree notation and expands it, either with pre-processor semantics or working on the abstract syntax tree (which is more "proper" but also more work"). I would recommend the former to test notations, and then you can live with them for a while to experiment with them and see if and how often you really need the construct (recall that is how C++ evolved from C via "C with classes" - it was first a pre-processor). Once you and others are 100% convinced of the new syntax, go to your prefered language's working group/ISO committee and propose inclusion.
My own feeling is that this is not something for which I need more than recursion; calling inside traverse() traverse(left) and traverse(right) for binary trees or using a normal for loop to iterate over all this->children() from 0 to this->children.size() is something that occurs in graph libraries and parsers/compilers once in a while but not often enough to warrant its own notation. Rather, when I look at languages like C++, I'd have a language that is simpler, cleaner, more homogeneous and more orthogonal; C++ is complicated, convoluted, too large to implement it correctly by a single person in reasonable time (compared to beauties like C, Pascal, Scheme), so I stand on the side of minimalism.
Be that as it may, for C++, Eric Neibler's [Boost.Proto](https://www.boost.org/doc/libs/1_84_0/doc/html/proto.html) could likely help conveniently connect syntax to implementation to achieve something similar to what the author is taking about.
https://wiki.haskell.org/index.php?title=Research_papers/Gen...
Think of these as programmable accessors and updaters.
How is iterating through something already not 'programmable' ?
If you read a bit more about them, I think you will be surprised to see the breadth of what these abstractions can be used for. To start, they've been used to build a new compositional foundation for game theory[1], and they can be used to model gradient-based learning in machine learning[2].
As for their simpler applications as getters and setters, they are programmable in the sense that you can think of lens/prism types as interfaces that can be implemented arbitrarily. So you can create your own lenses and combine them like legos to construct new, "bigger" getters and setters from the smaller components.
[1] https://arxiv.org/pdf/1603.04641 [2] https://arxiv.org/abs/2103.01931
EDIT: fixed typo
When does someone give up on the pagentry of programming and just get something done by looping through data instead of "constructing getters and setters to model gradient based machine learning".
It really seems like the straight forward way of doing things is to make simple iteration and worry about the game theory after that is all figured out.
I think your comment is valuable in spirit because it could bring a more thorough discussion of the creation or popularization of a syntactically clean and a shift of the idiomatic-ness of such a solution to traversing trees. Leaving the more abstract and general Optics discussion to be a side dish for commenters to discuss as well.
Also, just a nitpick, but traversing a tree is a broader topic than iteration. Iteration implies, both colloquially and formally, performing a computation on each element of a data structure, while traversal is a more general algorithm that could be the basis of iteration, but may also be more akin to a search through the data structure without expectation of performing computation until the searched for ‘member’ is returned from the traversal. So it’s not an apples-to-oranges comparison with your conflation of iteration and traversal, but does seem a little like Canis Familiaris-to-Mammal comparison.
I guess/expect that will have to expose the implementation detail of the branching factor (yes, a binary tree is a special-case of a https://en.wikipedia.org/wiki/B-tree for B = 2, but algorithms tend to be different)
On top of that, build a library of tree traversal routines (breadth first, depth first with a few variants), allowing the user to specify code that decides, on each node visited, whether to navigate further down and whether to stop the traversal.
Now, whether the language proper needs a generic way to specify tree traversals? I doubt it, and doubt even more that it is possible too that in a way so that a) most graph traversal algorithms can use the primitive and b) the result is easier to use/looks significantly nicer than just calling those routines.
https://clojuredocs.org/clojure.zip/zipper
Complex data structures absorb lot of the complexity of the problem and reduce the complexity of the rest of the code.
If every ounce of performance matters, e.g. in a database, you want 10000 functions, 100 for each data structure.
I guess all you really need are dynamically allocated arrays. A cons cell is an array of two. A struct with N fields is an array of N. Everything else is built on that.
I use it in every project for data navigation and transformation, and it's more performant than standard Clojure data manipulation, while retaining types (instead of coercing back from seqs).
E.g. if you have a map and you want to increment every value in the map: (require '[com.rpl.specter :as S])
``` (->> {:a 5, :b 6, :c 7} (S/transform [S/MAP-VALS] inc)) => {:a 6, :b 7, :c 8} ```
^ try to write that in normal clojure.
Now let's say you have a map of vectors and want to increment all of those? (->> {:a 5, :b 6, :c 7} (S/transform [S/MAP-VALS S/ALL] inc)) ;; note the navigator juts got another level of nesting => {:a [2 3], :b [4 5], :c [6 7]}works for all clj data types, and of course it has navigators for recursive walking .
It took me a while to get good at Specter, but it was worth it. I hear Rama uses Specter navigators internally.
https://www.metosin.fi/blog/transforming-data-with-malli-and...
Here's the range based for loop counterexample from the blog post, as a python generator:
[1] https://en.cppreference.com/w/cpp/coroutine/generator
Breadth-first is slightly trickier, but only slightly trickier one time.
Yes, students should absolutely implement the classic algorithms to learn.
Yes, there are some occasions when you need to home grow one at $work.
BUT, in my opinion, most of the time, professional code should use a battle tested, vuln hardened library or builtin version. These things are VERY HARD to get exactly right. Jon Bently's Programming Pearls famously had a latent bug in its binary search for 20 years before someone caught it.
https://research.google/blog/extra-extra-read-all-about-it-n...
So yeah, it looks easy but don't do it. Stand on some giant's shoulders instead.
Anyone who copies and pastes it is welcome to both pieces when it breaks. Others have already alluded to possible improvements that could be made, and I already have my own analysis in a grandchild reply as to why I don't think this is a terribly pressing need or necessarily even a good idea.
The reason I provide code is that it gets past the "oh, you say it's just an iterator, but I still don't believe you, since you haven't spelled it out to the n'th degree". When code is provided, belief ceases to be an issue. It is clearly something an iterator can implement, in existing languages, with existing iterator support.
Unless you're going to claim it is somehow impossible to provide this functionality in a tested manner, you're completely changing the topic in an uninteresting direction, since it is always true that functionality generally needs testing and bits of code slammed into an HN conversation just to make a particular point probably shouldn't be copied wholesale into your production code.
While easy, I think bisect would be a good addition to every stdlib too.
I don't think this is a large problem in practice because you shouldn't be using dozens of tree types in a given code base, so adding iterators to a tree is no big deal. In general there aren't enough types of iteration available to a given data structure that you need to describe how to iterate on it from the "outside". (Generally when you are doing that, it's too non-trivial to fit into this pattern anyhow; see the Visitor pattern in general.) This strikes me as maybe the sort of default tool you might slap in a library somewhere, but it should be a niche tool. If you're using it all the time you're probably doing something wrong. By default your data structures should be providing iteration packaged with them and it should generally be what you need. And your language should support aborting iteration, in whatever that looks like normally. I'm not sure I know a language that doesn't, it's a fairly basic element of iterator support when you get into implementation.
There are also many cases where a tree iterator will perform significantly better, including CPython. I don't have enough experience with PyPy to know if it could inline the Tree.left and Tree.right calls down to zero penalty at JIT time. Rust and C++ and the other static languages with sophisticated compilers might be able to get that down to fully inlined and zero-cost, but even if they can it's probably better not to push that on to the optimizer as the optimizers will eventually give up if this is composed with enough other stuff. Better to just have an efficient implementation in the first place.
(this is for iterating over nested JSON-like objects, which are just weird trees)
There are a lot of ways you could avoid the recursion, but that's a particularly nice way!
[1] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.bin...
IMO the thing that would be really nice is if control flow like `for` was actually the same as using an iterator. This would really help in Rust too where handling errors inside iterator callbacks is a right pain.
I've seen a few languages try this but it seems to not be very popular. I think it can get a bit confusing how control flow keywords like `return` and `break` work if you turn `if` into syntactic sugar for a function call taking a closure etc.
In PHP you loop through an iterator with the foreach control structure.
In JavaScript you use for of.
In rust it’s for in.
What am I missing?
> Well a range based for loop requires that your tree exist in memory AND that you have an iterator defined for your tree. With for_tree you could operate on an entirely imperative tree, without needing to define any iterators or generator functions. Here's an example where I'm checking every single string composed of "a", "b", and "c" of length 8 or less.
You could definitely find every string composed of "a", "b", and "c" of length 8 or less by defining a custom iterator but it would be a verbose and unpleasant way of writing it:Ceterum censeo this would be a family of simple macros in LISP.
Python has a number of ways to achieve this depending on exactly how you want to pass the arguments; multiple functions, optional arguments, etc. How nice the final call looks is more about your local language's closures look.
The main point here is that this will happily iterate on things that don't "exist".
(Since things are lazy in Haskell, functions that return lists effectively are iterators. There's probably something in the standard library somewhere for (opts <*> [x]) to avoid the wrapping x in an unnecessary list, but my Haskell is rusty.)And yes, Haskell is amazing at this sort of thing.
If the poster wants to particularize this to C++ because C++'s syntax can't support it in any reasonable manner, that's fine, but that's a C++ problem, not a "Programming languages..." problem. Which would be perfectly understandable and I'm not really complaining, more clarifying that most of the rest of the world can just rub together three or four existing constructs in a pretty reasonable manner to get this.
For example if you literally don’t care about the order then your code can map over a default iterator that is efficient (DFS).
Then, for some cases, depth-first traversal is needed; for others, breadth-first.
Then, there's parallelism, and even the plain old for loops aren't parallel by default.
By the time you specify exactly what you need from a tree traversal, you've written code to do it.
And if you're fine with some default choice — you already can use the default iterator with the for_each loop.
I don't see what need there is for adding an extra for_tree syntax to do that.
Trees are often considered too diverse to include even in the standard library, let alone as a primitive. Even Python doesn't have trees in the standard library. I'm sure it's been proposed and rejected at some point
I hate to be a broken record, but this again goes back to the assumption of a specific sequential computation model. Modelling relations as in SQL automatically supports for N-ary child relations. There are other computation models! eg. logic, relational, etc.
Let's be practical about it: the reason you might want to say map_tree is so it could potentially be optimized to run more quickly, e.g. via automatic parallelization.
But to even parallelize a tree read operation we'd have to pull in other threads, split work, we'd want metadata on each node (child counts), and at the end of the day within each worker you'd still have a serial execution order.
For map_tree to have practical benefit your language would need to make a ton of opinionated choices about trees and parallel programming. Feasible? Yes. Worth making it a basic primitive? Well, even LISPs don't offer map_tree, despite building the whole language on top of car/cdr.
Because "trees" and "graphs" aren't actually single concepts, they're families of concepts that can vary along many dimensions. Most programming languages offer one or a small handful of sequence types, because most sequence use patterns are the same. Most languages don't offer a tree or graph type, and as soon as you try to implement one you see why: there are multiple different kinds of trees and graphs, and multiple different representations that are appropriate for different use cases, and any one you might pick would be unsuitable for most programs people want to write.
For example the most basic operations of a pointer are to advance and dereference.
std::map is actually implemented as a tree. To iterator its members you can do
The only requirement for your custom data structure to work is to implement begin() and end() which return iterators - “pointer like” objects.Are generators not a thing in other languages?
At the same time, the way it's implemented in Haskell's recursion-schemes library might be hard to wrap one's head around at first, kind of like how the list functions "foldr" and "foldl" are also often confusing to newbies, even though they're like the go-to, default way to make list functions in Haskell.
But it's not obvious to me how you'd write a generic iterator that supports something like finding all free variables in an expression tree, or even just express a tree mapping function that constructs a new tree with the same structure (for example, "make all variables in the expression uppercase"). It's been a while since I worked with iterators so correct me if I'm wrong and this is in fact easy with iterators.
With something like Haskell's recursion-schemes library [0] these operations are all just a few lines long, guaranteed to terminate, and don't need to be updated if your data structure changes (for example you add new expression nodes). I'm not aware of any non-functional programming language that can do that. See for example the freeVars function at the bottom of the linked recursion-schemes doc.
[0] https://hackage.haskell.org/package/recursion-schemes-5.2.3
The same from another angle: there are a lot of trees in the indices of SQL databases (example [1]) but we don't zoom in to that level of detail very often when defining our tables.
[1] https://www.postgresql.org/docs/current/btree.html
To implement Brown's algorithm to optimize class-based language models I had to implement a complex forest (DAG, actually) in Python using lists of fixed length. That was not especially nice to work with.
OTOH I read/heard that the beauty of array languages is that they have few data types, but they can be easily worked on. So maybe the answer is not easier tree traversal primitives, but better literature/training on how to transform it in a more manageable data type.
Sometimes the answer is to adapt our vision to our world, sometimes the answer is to adapt the world to our vision.
One great reason not to use recursive functions for traversing trees is that you can allocate your own stack data structure rather than relying on the call stack itself. In most languages/runtimes, the call stack has a maximum depth which limits the depth of trees you can process, usually on the order of thousands of stack frames.
Managing your own stack usually produces weirder looking code (personally I find "naive" recursive approaches more readable) - but having it as a first-class language feature could solve that!
Candidates: Racket, Scheme, Rust.
Defining your own iterator would be enough for most cases.
I explored this idea with gstd, a standard library for graphs inspired by the JS Array interface: https://github.com/cdrini/gstd/
https://godbolt.org/z/fnGzszf3j
Here is how I one could write in C: https://godbolt.org/z/P3ErP4T4d or possibly with the setup code moved into the helper function for convenience: https://godbolt.org/z/soz7W5z1G
It allows you to specify traversal in a structure-shy way. Personally, I think it goes a little too far, but it's definitely a good inspiration.
[1] https://www2.ccs.neu.edu/research/demeter/
Presentation: https://www2.ccs.neu.edu/research/demeter/talks/overview/qua...
For exposition, lets seperate loop initialization from the rest, make `continue` mandatory to repeat the loop with another specified value (i.e. the loop implicitly ends without a continue) and lets say `descend` to recurse into the tree.
Compare with a linear loop: The `descend` keyword differs from `continue` in that its more of a continuation - control flow will continue from that point once it is done.You could make this more functional and less imperative (and I'd be surprised if such things don't exist as libraries in some functional languages). Yes ultimately we can transform this to recursive functions and compile it that way... but for imperative languages (especially those without closures) something like the above might be useful.
Arguably the above loops are _more_ imperative and less declaritive than the standard ones from the C family. You could add a functional twist by breaking with a value (which is the result of the overall `loop` expression).
Const km = 1 Const velocity = 11
And do km + velocity = 12 is all kinds of silly.
Edit: Assuming your example is even statically typed.
Const true = False Const false = True
And do if (true) "won't happen" else if (false) "will happen" is all kinds of silly.
This kind of imperative iteration seems better served by the traditional visitor design pattern: more verbose (more explicit, not more complex) and more general.
As a bonus, I think this is tail-call-optimization friendly.
But I think grafting those two together is the right answer, not inventing a new loop construct.
Trees are easy to write iterators for. DAGs are a bit harder and full graphs are an advanced interview question.
1: https://github.com/evolve75/RubyTree
Not as ergonomic as a direct tree-iterator, but I can't see of an elegant way to introduce that in an imperative language while keeping the forking/recursion aspect clear
Essentially you use `first` to contain a Queue, Stack, or Level for the different traversals, and define traversal or activities from there.
It's fairly ergonomic in practice, ergonomic enough for Leetcode.
Here's a BFS: https://leetcode.com/problems/course-schedule-iv/solutions/6...
[0] https://doc.rust-lang.org/std/iter/fn.successors.html
[1] https://docs.rs/itertools/latest/itertools/fn.unfold.html
I think allowing for starting this kind of search from a given node would cover most (though not all) of what you'd want to expose the trees structure for directly.
I think it's a better idea to write a function that applies a function to the elements of a tree. Ideally you'd make a function for each traversal order. This makes it obvious what is happening.
map_tree_pre(my_tree, my_func);
map_tree_in(my_tree, my_func);
map_tree_post(my_tree, my_func);
map_tree_bfs(my_tree, my_func);
> This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Also I’m slightly confused by this example.
}So, our “next node” operation is to concatenate to x. Won’t we either have to have a method for modifying x to go “up” a node, or we’ll have to keep a record of what x was upon entering each node? Like in this example we’ll end up with x=“aaaaaaaa” and then go up a node, over to a “b” node, and get x=“aaaaaaaab”, right?
I guess we can delete the a node’s copy of x after all of a node’s child nodes are visited, at least.
Perhaps it can be optimized to be a little better than the recursive version, depending on how much overhead your language uses for a stack frame that it won't need for this special case.
Thank you... I'll see myself out... lol =3
We don't have syntax and semantics for recursion schemes in any programming language - it's always deferred to library support at best. As far as I'm concerned, this is an open problem in programing language design where we finally replace the vestigial recursion hack with a proper structured programming solution.
Or for a possibly more useful comment, constant space tree traversal is genuinely quite difficult to implement. I don't know a general solution to it (other than walk from the start N times, quadratic style), would be interested to hear of one.
Naively I would expect something like left-hand-rule maze traversal to be constant space. The tree may need additional structures to support this sort of travel.
Other thoughts: most state machines are constant space.... there may be something there.
If this becomes a C++ feature, imagine how many data structures we would need to support?
Many other languages, specially the FP languages, allow to do that as a library. Even the languages that are only inspired by FP. Example, Ruby:
Using the Enumerable mixin includes many FP-based methods, such as map, filter and reduce by only defining each, which in this case is DFS.Then we can proceed to define a binary tree:
Iterate over all the elements: Iterate over the even elements: Stop iteration when finding a value: And so on. The same can be done in Python, Kotlin and many others.C++ already solved that problem. Iterator are designed so that an algorithm can be written once and used with multiple data structures.
I really don't see your point
https://cs3110.github.io/textbook/chapters/data/trees.html
Encoding the semantics of a tree traversal operator likewise is difficult in the general case. What exactly would the order be, what if I want to traverse in a non-standard ordering, what about skipping branches; all would be difficult to cleanly represent.
I have seen it done where you return actions with key ones being recurse, stop, replace, and replace & then carry out some function, but again, this is pretty simple to implement.
So something similar to `break` and `continue` should be the method of branching (descending into the tree).
I think Scala might be a great language to prototype implementation of such construct.
But tree traversal doesn't have this universal property. There are too many methods and purposes for traversing a tree, sufficient that IMHO no single primitive embodiment could materially improve a language. Also, modern compilers efficiently break down high-level traversal code so well that expressing the idea at a high level incurs no serious penalty compared to having a primitive for that purpose, or a series of them.
[0] https://www.hillelwayne.com/post/graph-types/ and https://news.ycombinator.com/item?id=39592444
A TXR Lisp macro for-tree for traversing a tree, given a variable, an expression for the root node, and expressions for how to to left and right relative to the variable.
A node structure that for-tree knows nothing about. No iterator abstraction or API, nothing.
Example tree (a balanced binary tree whose numeric values happen to be sorted): Walk it: Code: The left-expr and right-expr must be expressions that involve the var variable; the construct evaluates these to find the left or right child. When the value is nil it means there is no child.This is almost exactly what the blog is asking for, transliterated to Lisp syntax.
Original concept in fantasy C syntax:
Lispified: As required, the construct specifies the node variable to be the loop dummy, the root expression giving its initial value and the two accessor expressions for left and right traversal.The termination test N != NULL is made implicit in the Lisp version, so early termination requires a break out of the loop. It could be arranged.
The fantasy C syntax specifies a variable name in the navigation declarator: N : { N->left, N->right }. Presumably, the variable here can be renamed to anything you want; it just arbitrarily happens to have the same name as the loop dummy.
I didn't replicate this feature exactly because it just adds verbosity for nothing. It's okay if the navigation declarator just refers to the loop's one and only dummy variable.
Anyway, programming languages should definitely not have a tree traversal primitive. Rather, languages should be Lisp.
The left, right and key fields of a node object are accessed with like-named functions. We must call tree-root on a tree object to get to the encapsulated root node:
How about a tree of integers. Let's say the root integer is 1, and to go left is to multiply by 2, and to go right is to multiply by 2 and mask in a 1. And let's say the depth caps out at the value 16: Let's see that in binary: Let's right-align digits to see different patterns in it: Nah, left-aligned wins.Both looping and tree traversal can be done with library functions.
In general, everything that can be done with library functions, should be done with library functions.
> Doesn't for_tree(...) look a lot nicer and simpler and less error prone than needing to implement a recursive function for each operation you would want to do on a tree?
Eh, in Haskell you can derive many of these things automatically. And even Rust has some automatic derivation.
[1]: https://lean-lang.org/doc/reference/latest//The-Type-System/...
> Doesn't for_tree(...) look a lot nicer and simpler and less error prone than needing to implement a recursive function for each operation you would want to do on a tree?
No it does not
Web browsers have this. It’s the DOM and it scares the shit out of most JavaScript developers. The fun part is confronting people about their irrational fears watching them spin their wheels with all the imagined excuses.
I am a tremendous fan of tree systems. Tree systems should completely replace inheritance in programming to enforce the composition over inheritance principle.
EnScript had a forall(NodeClass n in tree.GetRoot(){} construct that was very easy. It was essentially a depth-first iterator.
Programming languages should have less primitives like this and instead we should have better foundation libraries for the languages, i.e., containing iterator/-interfaces like Rust and Python (or Smalltalk).
Bonus: Java's HotSpot magick will NOP (most?) methods of Null Objects, making this a zero cost abstraction.
I should probably write a concrete example for a blog or something.
TLDR: For every base class such as TreeNode, create a NullTreeNode that does nothing, then replace all uses of null with NullTreeNode. Voilá, no more null checks or NPEs.
1. BFS is not supported.
2. Traversal type (inorder, preorer, postorder, or even mixed order!) is also not handled.
3. The syntax doesn't make it apparent that stack overflow can occur, e.g. by doing DFS on a linked list.
(And to be clear, none of these involve sitting down and writing some generic Tree<T> structure, they're all cases of "Well, there's some tree-like datastructure that already exists or is a natural fit for this system and I'm going to be traversing it depth-first or breadth-first to do something to the elements of it or calculate something based on what nodes I pass through on a given traversal")
- Every html document is a tree structure. And css documents have special syntax to address nodes within those trees
- If it's any good, you're routing framework probably uses some kind of tree to quickly match the request to it's handler
- The database you write to uses a tree to quickly find the location it needs to write to