> It also has 6 seconds of inactivity before starting any useful work. For comparison, ninja takes 0.4 seconds to start compiling the 2,468,083 line llvm project. Ninja is not a 100% fair comparison to other tools, because it benefits from some “baked in” build logic by the tool that created the ninja file, but I think it’s a reasonable “speed of light” performance benchmark for build systems.
This is an important observation that is often overlooked. What’s more, changes to the information on which this “baked in” build logic is based are not tracked very precisely.
How close can we get to this “speed of light” without such “baking in”? I ran a little benchmark (not 100% accurate for various reasons but good enough as a general indication) which builds the same project (Xerces-C++) both with ninja as configured by CMake and with build2, which doesn’t require a separate step and does configuration management as part of the build (and with precise change tracking). Ninja builds this project from scratch in 3.23s while build2 builds it in 3.54s. If we omit some of the steps done by CMake (like generating config.h) by not cleaning the corresponding files, then the time goes down to 3.28s. For reference, the CMake step takes 4.83s. So a fully from-scratch CMake+ninja build actually takes 8s, which is what you would normally pay if you were using this project as a dependency.
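As a quick check of the arithmetic above, the from-scratch totals can be worked out from the quoted timings. This is only a sketch: the variable names are made up for illustration, and the numbers are the ones reported in this comment, not new measurements.

```python
# Timings quoted in the comment above, in seconds.
cmake_configure = 4.83   # separate CMake configuration step
ninja_build = 3.23       # ninja build from scratch
build2_build = 3.54      # build2 build from scratch (configuration included)

# A fully from-scratch CMake + ninja build pays for both steps.
cmake_total = cmake_configure + ninja_build

print(f"CMake + ninja: {cmake_total:.2f}s")
print(f"build2:        {build2_build:.2f}s")
print(f"ratio:         {cmake_total / build2_build:.2f}x")
```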
Night_Thastus · 41m ago
I am extremely interested in this.
I am stuck in an environment with CMake, GCC and Unix Make (no clang, no ninja) and getting detailed information about WHY the build is taking so long is nearly impossible.
It's also a bit of a messy build with steps like copying a bunch of files from the source into the build folder. Multiple languages (C, C++, Fortran, Python), custom cmake steps, etc.
If this tool can handle that kind of mess, I'll be very interested to see what I can learn.
bgirard · 1h ago
That's really cool. Fascinating to think about all the problems that get missed due to poor or missing visualizations like this.
I did a lot of work to improve the Mozilla build system a decade ago where I would have loved this tool. Wish they would have said what problem they found.
dhooper · 55m ago
(OP here) Thanks!
My call with the Mozilla engineer was cut short, so we didn't have time to go into detail about what he found. I want to look into it myself.
tiddles · 29m ago
Nice, I’ve been looking for something like this for a while.
I’ve noticed on my huge catkin cmake project that cmake is checking the existence of the same files hundreds of times too. Is there anything that can hook into fork() and provide a cached value after the first invocation?
lights0123 · 15m ago
My tips for speeding up builds (from making this same project but with ebpf):
- switch to ninja to avoid that exact issue since CMake + Make spawns a subprocess for every directory (use the binary from PyPI for jobserver integration)
- catkin as in ROS? rm /opt/ros/noetic/etc/catkin/profile.d/99.roslisp.sh to remove 2 python spawns per package
aanet · 35m ago
This is fabulous!!
Is there a version available for macOS today?? I'd love to give it a whirl... For Rust, C++ / Swift and other stuff.
Thanks!
dhooper · 32m ago
I'll be sending out a macOS version to another wave of beta users after I fix an outstanding issue. If you sign up (at the bottom of the article) and mention this comment, I can make sure you're in that wave.
aanet · 22m ago
Thanks. Signed up
Night_Thastus · 31m ago
It looks like it doesn't have a public release for any OS yet, but has a way to enter for early access.
supportengineer · 49m ago
Amazing! Great job!
What limits your tool to compiler/build tools? Can it be used for any arbitrary process?
dhooper · 44m ago
Thank you! Yeah it can be used for any type of program, but I haven't been able to think of anything besides compilation that creates enough processes to be interesting. I'm open to ideas!
DiddlyWinks · 40m ago
Video encoding and 3-D rendering are a couple that come to mind; I'd think they'd launch quite a few.
This looks like a really cool tool!
klik99 · 17m ago
Nice! Leaving a comment to easily find this later. Don't have anything to add except that this looks cool.
mgaunard · 30m ago
The real solution is to eliminate build systems where you have to define your own targets.
Developers always get it wrong and do it badly.
xuhu · 1h ago
Is there a tool that records the timestamp of each executed command during a build, and when you rebuild, it tells you how much time is left instead of "building obj 35 out of 1023" ?
Or (for cmake or ninja) use a CSV that says how long each object takes to build and use it to estimate how much is left ?
dhooper · 50m ago
OP here. That's an interesting idea. What The Fork knows all the commands run, and every path they read/write, so I should be able to make it estimate build time just by looking at what files were touched.
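A minimal sketch of the CSV idea from this sub-thread, assuming a history file with one `object,seconds` row per build step recorded on a previous run. The file layout and the function names `load_durations` and `estimate_remaining` are hypothetical, and this is not how What The Fork, CMake, or ninja actually work. Dividing by the job count is a crude approximation that ignores dependencies between steps.

```python
import csv
import io

def load_durations(csv_text):
    """Parse 'object,seconds' rows from a previous build into a dict."""
    reader = csv.reader(io.StringIO(csv_text))
    return {name: float(seconds) for name, seconds in reader}

def estimate_remaining(durations, finished, jobs=1):
    """Sum the recorded durations of unfinished objects.

    Dividing by `jobs` assumes perfect parallelism, which real
    builds rarely achieve, so treat the result as a rough guess.
    """
    remaining = sum(t for name, t in durations.items() if name not in finished)
    return remaining / jobs

# Hypothetical history recorded during an earlier build.
history = "a.o,2.0\nb.o,4.0\nc.o,6.0\n"
durations = load_durations(history)

# a.o is already built; b.o and c.o remain, with two parallel jobs.
eta = estimate_remaining(durations, finished={"a.o"}, jobs=2)
print(f"about {eta:.1f}s left")
```

Objects with no recorded duration (new files, for example) are simply not counted here; a fuller version might fall back to the average of the known durations.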
corysama · 49m ago
Looks like a general `fork()` visualizer to me. Which is great!
brcmthrowaway · 26m ago
What about OSes that don't use fork()?
dhooper · 25m ago
I use whatever the equivalent is on that OS.
Surac · 55m ago
but why? I have to admit it's a fun project
rvrb · 34m ago
here, I'll copy the first paragraph of TFA for you:
> Many software projects take a long time to compile. Sometimes that’s just due to the sheer amount of code, like in the LLVM project. But often a build is slower than it should be for dumb, fixable reasons.