>"1.1 High level overview
Barrelfish is “multikernel” operating system [3]: it consists of a small kernel running on each core (one kernel per core), and while rest of the OS is structured as a distributed system of single-core processes atop these kernels. Kernels share no memory, even on a machine with cache-coherent shared RAM, and the rest of the OS does not use shared memory except for transferring messages and data between cores, and booting other cores."
Was I the only one confused by this? It wasn't just me, right? I love when I see things like this: "The cool thing about our kernel is that you cannot share memory! It's super secure. Except for, you know, ..." and then they list nearly everything. What were they trying to provide or gain with this proposal?
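One way to read the quoted paragraph is that kernel state is never shared, while the message channels between cores are. Here is a minimal sketch of that reading, not Barrelfish code: `CoreKernel`, `inbox`, and `demo` are hypothetical names, and a plain in-process queue stands in for the channel that, on real hardware, would be the one piece of memory two cores both touch.

```python
from collections import deque


class CoreKernel:
    """Hypothetical per-core kernel: private state plus an inbox, nothing else shared."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.state = {}        # private replica, never read or written by other cores
        self.inbox = deque()   # stands in for a message channel between cores

    def send(self, other, key, value):
        # The only way to affect another core is to enqueue a message for it.
        other.inbox.append((self.core_id, key, value))

    def update(self, peers, key, value):
        # Apply the change locally, then notify every peer so replicas converge.
        self.state[key] = value
        for peer in peers:
            if peer is not self:
                self.send(peer, key, value)

    def drain(self):
        # Each core applies incoming updates to its own replica.
        while self.inbox:
            _sender, key, value = self.inbox.popleft()
            self.state[key] = value


def demo():
    cores = [CoreKernel(i) for i in range(4)]
    cores[0].update(cores, "page_0x1000", "mapped")
    for core in cores:
        core.drain()
    return [core.state for core in cores]
```

Under this reading the "except for transferring messages" clause is the mechanism rather than a loophole: every core ends up with the same replica, but no core ever dereferences another core's data structures.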
I don't know how you got from Barrelfish (a message-passing OS) to a RISC-V CPU.
Bit of a stretch. Just because they are both message-passing distributed systems?
transpute · 1d ago
Apologies, mistake in my notes. Should be Enzian (48-core Arm + FPGA).
> Building and using a research computer called Enzian for experimentation with hardware/software codesign for servers.. If academics can’t do relevant, impactful, and medium-to-long-term system software research using commodity platforms, and they can’t do it using someone else’s cost-optimized application-specific custom hardware, what can they do? Our response is to build Enzian: a computer.. optimized for exploring the design space for custom hardware/software co-design.. over-engineered relative to any off-the-shelf hardware.. optimized for flexibility and configurability rather than unit cost, efficiency, or performance along any particular dimension.
yencabulator · 11h ago
How do you manufacture a connection between a hypervisor and a kernel that does nothing at all with virtualization? Did you just want to mention Xen?
kfreds · 2d ago
Interesting. Do you know of any good SoK papers or articles that summarize the current state of the art, or explain this genealogy?
A 2018 video by Ian Pratt covers Xen, uXen and AX (2005-2015), https://news.ycombinator.com/item?id=44135977#44141164. Citrix acquired XenSource. Pratt left to work at Bromium, which was later acquired by HP (which had previously acquired a BIOS company from a Bromium co-founder). The former CTO of XenSource co-founded Qumranet (KVM), which was acquired by Red Hat.
AWS began with Xen, then migrated to a subset of KVM. Nitro used Arm hardware to virtualize I/O (storage, network) paths, leaving KVM responsible for x86 CPU and memory virtualization, https://www.youtube.com/watch?v=e8DVmwj3OEs & https://news.ycombinator.com/item?id=24515019#24516523. Parallels could be drawn to the Apple T2 enclave (Arm) coprocessor being used for disk encryption on x86 Apple MacBooks.
Under the "Confidential Computing" umbrella, Intel has TDX and a new (closed?) hypervisor on servers, using SGX and new hardware privilege levels.
Apple recently added Secure eXclaves to iOS, and Apple Silicon hardware supports nested virtualization, which is what Google pKVM uses on Pixel (and upcoming ChromeOS?) devices, https://news.ycombinator.com/item?id=43314657
For production code, pKVM deserves attention because it's open (upstreamed to mainline Linux), exists in the real world (Pixel phones), stands in stark contrast to Apple's neutered iPads and has the potential to improve upon TrustZone security, https://news.ycombinator.com/item?id=41523758.
Finally, to bring this thread back to Barrelfish, Google OpenTitan open silicon root of trust (OCP servers, Chromebooks) is partly under Pulp Platform research, alongside Snitch (descended from Barrelfish research) open hardware from ETH Zurich. So progress is being made in both mainstream-compatible systems software and greenfield hardware cores.
(hopefully readers can correct any errors or gaps above)
kfreds · 2d ago
The virtualization of I/O is fascinating, as is VirtIO's progress from the Linux kernel to hardware implementations. My only wish is that Linux would support inter-VM shared memory as a VirtIO transport in addition to pci and mmio.
Yes. I've used Qubes on and off since 2012. What I'd love to do is run Linux on top of seL4, and virtio-backends in VMs. There is a patch for ivshmemv2, but it seems abandoned.
kfreds · 2d ago
Thank you! I realize now that I was thinking about a different aspect of systems research, but failed to say so.
Barrelfish (multikernel) and your username made me think of manycore systems and the scheduling challenges we will surely face as systems become more heterogeneous. I'm in a period of trying to learn more about that. Any and all recommendations are much appreciated.
> compute.. is handled by 140 of Tenstorrent's Tensix cores, each of which is composed of five "Baby RISC-V" cores, a pair of routers, a compute complex, and some L1 cache.. Tensix cores account for 700 of the 752 so-called baby RISC-V cores on board.. TT-Metalium low-level programming model.. kernels themselves are plain C++ with APIs.. Tenstorrent aims to support running any AI model on its accelerators using commonly used runtimes like PyTorch, ONNX, JAX, TensorFlow, and vLLM.
> A novel mapping interface provides explicit programmer controlled placement of data in the memory hierarchy and assignment of tasks to processors in a way that is orthogonal to correctness, thereby enabling easy porting and tuning of Legion applications to new architectures.. Legion is developed as an open source project, with major contributions from LANL, NVIDIA Research, SLAC, and Stanford.
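The "orthogonal to correctness" claim in the quote can be illustrated with a toy sketch. This is not the Legion API: `run`, `round_robin`, `all_on_zero`, and `TASKS` are hypothetical names, and "processors" are just numbered buckets. The point is only that swapping the mapper changes where tasks are placed, not what they compute.

```python
def run(tasks, mapper, num_procs):
    """Run every task; the mapper only decides which processor each one is assigned to."""
    placement = {proc: [] for proc in range(num_procs)}
    results = {}
    for name, fn, arg in tasks:
        proc = mapper(name, num_procs)
        placement[proc].append(name)
        results[name] = fn(arg)  # the result never depends on proc
    return results, placement


def round_robin():
    """Build a mapper that spreads tasks across processors in turn."""
    counter = {"next": 0}

    def mapper(_name, num_procs):
        proc = counter["next"] % num_procs
        counter["next"] += 1
        return proc

    return mapper


def all_on_zero(_name, _num_procs):
    """A mapper that places every task on processor 0."""
    return 0


TASKS = [
    ("square", lambda x: x * x, 7),
    ("double", lambda x: 2 * x, 7),
    ("negate", lambda x: -x, 7),
]
```

Calling `run(TASKS, round_robin(), 2)` and `run(TASKS, all_on_zero, 2)` gives identical results with different placements, which is the property that lets tuning for a new machine happen without touching application logic.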
kfreds · 2d ago
It seems we read the same stuff. :)
I assume you're also aware of the Oxide and Friends podcast, and the Microarch Club podcast?
transpute · 2d ago
Yes on Oxide, will check out Microarch Club, thanks!
bionsystem · 2d ago
So far when Jim starts something it's a massive success, can't wait to see how this one goes.
I find these types of efforts somewhat disappointing. So much OS research boils down to “We’ll handle scheduling and rudimentary peripheral multiplexing, good luck on the rest”. These basics are so far from a useful system that you’d have to slap Linux on top and immediately lose most/all benefits of the new architecture.
yencabulator · 11h ago
It's a research project. It has been influential enough. Researchers have also made hardware prototypes that pushed this message-passing-cores design even further.
- Akaros, an OS for manycore systems: http://akaros.org/news.html
- VMThreads, an interesting paper on scheduling challenges, related to Akaros: https://iwp9.org/11iwp9proceedings.pdf
From the profile of Mothy (Barrelfish researcher), https://people.inf.ethz.ch/troscoe/ & https://enzian.systems/why-enzian/
Thanks for the pKVM tip, and the connection between OpenTitan and Barrelfish.
Speaking of security and open-source hardware, shameless plug of stuff I work on:
- dev.tillitis.se (FPGA-based OSHW RoT)
- system-transparency.org (related to CC, TDX, SNP)
- sigsum.org
Virtio on Xen is still a work in progress, https://wiki.xenproject.org/wiki/Virtio_On_Xen
Legion, from the Stanford research team that led to CUDA, https://legion.stanford.edu/ & https://elliottslaughter.com/2024/02/legion-paper-history
The Barrelfish project is no longer active. See https://systems.ethz.ch/ for information about our current research activities.
It is still interesting though.
Tenstorrent (Jim Keller) shipped a RISC-V manycore design for inference.
In the non-academic world, HPC and realtime people have reimplemented some of these ideas in Linux, so that a core can be fully dedicated to an application, not receiving interrupts ("tickless"), not handling any kernel tasks, etc. For example, https://htor.inf.ethz.ch/ross2012/slides/ross2012-akkan.pdf https://insidehpc.com/2009/10/tilera-100-core-x86-architectu... https://lwn.net/Articles/549580/ https://lwn.net/Articles/816298/
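The user-space half of dedicating a core can be sketched in a few lines, assuming Linux and Python's `os.sched_setaffinity`. The helper name `pin_to_cpu` is hypothetical. Pinning only restricts where this process runs; keeping other tasks and the timer tick off that core is configured separately, for example with kernel boot parameters such as `isolcpus` and `nohz_full`, which this sketch does not touch.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU (Linux-only call)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not available on this platform")
    allowed = os.sched_getaffinity(0)  # 0 means the calling process
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})
    # Return the new affinity mask so the caller can confirm the change.
    return os.sched_getaffinity(0)
```

A typical use would be to call `pin_to_cpu(n)` at startup, where `n` is a core that was isolated at boot, before entering the latency-sensitive loop.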