What would a Kubernetes 2.0 look like

NathanFlurry · 1h ago
The #1 problem with Kubernetes is it's not something that "Just Works." There's a very small subset of engineers who can stand up services on Kubernetes without having it fall over in production – not to mention actually running & maintaining a Kubernetes cluster on your own VMs.

In response, there's been a wave of "serverless" startups because the idea of running anything yourself has become understood as (a) a time sink, (b) incredibly error prone, and (c) very likely to fail in production.

I think a Kubernetes 2.0 should consider what it would look like to have a deployment platform that engineers can easily adopt and feel confident running themselves – while still maintaining itself as a small-ish core orchestrator with strong primitives.

I've been spending a lot of time building Rivet to scratch my own itch for an orchestrator & deployment platform that I can self-host and scale trivially: https://github.com/rivet-gg/rivet

We currently advertise it as the "open-source serverless platform," but I often think of the problem as "what does Kubernetes 2.0 look like." People are already adopting it to push the limits into things that Kubernetes would traditionally be good at. We've found the biggest strong point is that you're able to build roughly the equivalent of a Kubernetes controller trivially. This unlocks features like more complex workload orchestration (game servers, per-tenant deploys), multitenancy (vibe coding per-tenant backends, LLM code interpreters), metered billing per tenant, more powerful operators, etc.

stuff4ben · 1h ago
I really dislike this take and I see it all the time. Also I'm old and I'm jaded, so it is what it is...

Someone decides X technology is too heavy-weight and wants to just run things simply on their laptop because "I don't need all that cruft". They spend time and resources inventing technology Y to suit their needs. Technology Y gets popular and people add to it so it can scale, because no one runs shit in production off their laptops. Someone else comes along and says, "damn, technology Y is too heavyweight, I don't need all this cruft..."

"There are neither beginnings nor endings to the Wheel of Time. But it was a beginning.”

adrianmsmith · 1h ago
It’s also possible for things to just be too complex.

Just because something’s complex doesn’t necessarily mean it has to be that complex.

mdaniel · 38m ago
IMHO, the rest of that sentence is "be too complex for some metric within some audience"

I can assure you that trying to reproduce kubernetes with a shitload of shell scripts, autoscaling groups, cloudwatch metrics, and hopes-and-prayers is too complex for my metric within the audience of people who know Kubernetes

wongarsu · 20m ago
Or too generic. A lot of the complexity is from trying to support all use cases. For each new feature there is a clear case of "we have X happy users, and Y people who would start using it if we just added Z". But repeat that often enough and the whole thing becomes so complex and abstract that you lose those happy users.

The tools I've most enjoyed (including deployment tools) are those with a clear target group and vision, along with leadership that rejects anything that falls too far outside of it. Yes, it usually doesn't have all the features I want, but it also doesn't have a myriad of features I don't need

otterley · 44m ago
First, K8S doesn't force anyone to use YAML. It might be idiomatic, but it's certainly not required. `kubectl apply` has supported JSON since the beginning, IIRC. The endpoints themselves speak JSON and grpc. And you can produce JSON or YAML from whatever language you prefer. Jsonnet is quite nice, for example.
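For illustration, a minimal manifest written as JSON (the name and data are made up) that `kubectl apply -f` takes as-is:

  {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": { "name": "example-config" },
    "data": { "greeting": "hello" }
  }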

Second, I'm curious as to why dependencies are a thing in Helm charts and why dependency ordering is being advocated, as though we're still living in a world of dependency ordering and service-start blocking on Linux or Windows. One of the primary idioms in Kubernetes is looping: if the dependency's not available, your app is supposed to treat that as a recoverable error and try again until the dependency becomes available. Or, crash, in which case the ReplicaSet controller will restart the app for you.
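If the app itself can't retry, one common low-tech pattern is an init container that blocks until the dependency answers; a sketch, where the service name, port, and images are placeholders:

  apiVersion: v1
  kind: Pod
  metadata:
    name: app
  spec:
    initContainers:
      - name: wait-for-db
        image: busybox:1.36
        # block until the dependency's Service accepts connections, then let the real container start
        command: ["sh", "-c", "until nc -z db 5432; do echo waiting for db; sleep 2; done"]
    containers:
      - name: app
        image: example/app:1.0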

You can't have dependency conflicts in charts if you don't have dependencies (cue "think about it" meme here), and you install each chart separately. Helm does let you install multiple versions of a chart if you must, but woe be unto those who do that in a single namespace.

If an app truly depends on another app, one option is to include the dependency in the same Helm chart! Helm charts have always allowed you to have multiple application and service resources.
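A sketch of what that can look like: one chart's templates directory shipping both the app and the thing it depends on as separate documents in the same file (names and images are placeholders):

  # templates/all.yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: app
  spec:
    selector: {matchLabels: {app: app}}
    template:
      metadata: {labels: {app: app}}
      spec:
        containers: [{name: app, image: example/app:1.0}]
  ---
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: redis
  spec:
    selector: {matchLabels: {app: redis}}
    template:
      metadata: {labels: {app: redis}}
      spec:
        containers: [{name: redis, image: redis:7}]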

Arrowmaster · 30m ago
You say supposed to. That's great when building your own software stack in house, but how much software is available that can run on Kubernetes but was created before it existed? Somebody figured out it could run in Docker, and then later someone realized it's not that hard to make it run in Kubernetes because it already runs in Docker.

You can make an opinionated platform that does things how you think is the best way to do them, and people will do it how they want anyway with bad results. Or you can add the features to make it work multiple ways and let people choose how to use it.

delusional · 29m ago
> One of the primary idioms in Kubernetes is looping

Indeed, working with kubernetes I would argue that the primary architectural feature of kubernetes is the "reconciliation loop". Observe the current state, diff a desired state, apply the diff. Over and over again. There is no "fail" or "success" state, only what we can observe and what we wish to observe. Any difference between the two is iterated away.

I think it's interesting that the dominant "good enough technology" of mechanical control, the PID feedback loop, is quite analogous to this core component of kubernetes.
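That loop is visible in every object's spec/status split: spec is the state you declare, status is what the controllers last observed, and the controllers exist to close the gap. A Deployment trimmed to the relevant fields, for illustration:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: web
  spec:
    replicas: 3            # desired: what you asked for
  status:
    replicas: 3
    readyReplicas: 1       # observed: what the loop saw on its last pass
    unavailableReplicas: 2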

nunez · 2h ago
I _still_ think Kubernetes is insanely complex, despite all that it does. It seems less complex these days because it's so pervasive, but complex it remains.

I'd like to see more emphasis on UX for v2 for the most common operations, like deploying an app and exposing it, then doing things like changing service accounts or images without having to drop into kubectl edit.

Given that LLMs are it right now, this probably won't happen, but no harm in dreaming, right?

Pet_Ant · 2h ago
Kubernetes itself contains so many layers of abstraction. There are pods, which is the core new idea, and it's great. But now there are deployments, and rep sets, and namespaces... and it makes me wish we could just use Docker Swarm.

Even Terraform seems to live on just a single-layer and was relatively straight-forward to learn.

Yes, I am in the middle of learning K8s so I know exactly how steep the curve is.

jakewins · 2h ago
The core idea isn’t pods. The core idea is reconciliation loops: you have some desired state - a picture of how you’d like a resource to look or be - and little controller loops that indefinitely compare that to the world, and update the world.

Much of the complexity then comes from the enormous amount of resource types - including all the custom ones. But the basic idea is really pretty small.

I find terraform much more confusing - there’s a spec, and the real world.. and then an opaque blob of something I don’t understand that terraform sticks in S3 or your file system and then.. presumably something similar to a one-shot reconciler that wires that all together each time you plan and apply?

jonenst · 26m ago
To me the core of k8s is pod scheduling on nodes, networking ingress (e.g. nodeport service), networking between pods (everything addressable directly), and colocated containers inside pods.

Declarative reconciliation is (very) nice but not irreplaceable (and actually not mandatory, e.g. kubectl run xyz)
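Two of those fit in about twenty lines of YAML: colocated containers sharing a pod's network namespace, exposed through a NodePort Service (names, images, and port numbers here are arbitrary):

  apiVersion: v1
  kind: Pod
  metadata:
    name: web
    labels: {app: web}
  spec:
    containers:
      - name: app
        image: example/app:1.0
        ports: [{containerPort: 8080}]
      - name: metrics
        image: example/exporter:1.0   # reaches "app" over localhost inside the pod
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: web
  spec:
    type: NodePort
    selector: {app: web}
    ports: [{port: 80, targetPort: 8080, nodePort: 30080}]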

mdaniel · 1h ago
> a one-shot reconciler that wires that all together each time you plan and apply?

You skipped the { while true; do tofu plan; tofu apply; echo "well shit"; patch; done; } part since the providers do fuck-all about actually, no kidding, saying whether the plan could succeed

vrosas · 1h ago
Someone saying "This is complex but I think I have the core idea" and someone responding "That's not the core idea at all" is hilarious and sad. BUT ironically what you just laid out about TF is exactly the same - you just manually trigger the loop (via CI/CD) instead of a thing waiting for new configs to be loaded. The state file you're referencing is just a cache of the current state and TF reconciles the old and new state.
jauco · 50m ago
Always had the conceptual model that terraform executes something that resembles a merge using a three way diff.

There’s the state file (base commit, what the system looked like the last time terraform successfully executed), the current system (the main branch, which might have changed since you “branched off”), and the terraform files (your branch).

Running terraform then merges your branch into main.

Now that I’m writing this down, I realize I never really checked if this is accurate, tf apply works regardless of course.

mdaniel · 30m ago
and then the rest of the owl is working out the merge conflicts :-D

I don't know how to have a cute git analogy for "but first, git deletes your production database, and then recreates it, because some attribute changed that made the provider angry"

NathanFlurry · 1h ago
We're still missing a handful of these features, but this is the end goal with what we're building over at Rivet: https://github.com/rivet-gg/rivet

This whole thing started scratching my own itch of wanting an orchestrator that I can confidently stand up, deploy to, then forget about.

stackskipton · 18m ago
Ops type here, after looking at Rivet, I've started doing The Office "Dear god no, PLEASE NO"

Most people are looking for a container management runtime with an HTTP(S) frontend that will handle automatic certificates from Let's Encrypt.

I don't want Functions/Actors, nor do I want to require this massive suite:

FoundationDB: Actor state

CockroachDB: OLTP

ClickHouse: Developer-facing monitoring

Valkey: Caching

NATS: Pub/sub

Traefik: Load balancers & tunnels

This is just swapping Kubernetes cloud lock-in (with KEDA and some other more esoteric operators) for Rivet Cloud lock-in. At least Kubernetes is slightly more portable than this.

Oh yea, I don't know what ClickHouse is doing with monitoring, but the Prometheus/Grafana suite called and said they would love for you to come home.

mdaniel · 31m ago
I recognize that I'm biased, but you'll want to strongly consider whether https://rivet.gg/docs/config is getting your audience where they can be successful, as compared to (e.g.) https://kubernetes.io/docs/reference/generated/kubernetes-ap...
coderatlarge · 59m ago
Where is that in the design space relative to where Google's internal cluster management has converged after the many years and the tens of thousands of engineers who have sanded it down under heavy fire since the original Borg?
throwaway5752 · 2h ago
I've come to think that it is a case of "the distinctions between types of computer programs are a human construct" problem.

I agree with you on a human level. Operators and controllers remind me of COM and CORBA, in a sense. They are highly abstract things that are intrinsically so flexible that they allow judgement (and misjudgement) in design.

For simple implementations, I'd want k8s-lite, that was more opinionated and less flexible. Something which doesn't allow for as much shooting oneself in the foot. For very complex implementations, though, I've felt existing abstractions to be limiting. There is a reason why a single cluster is sometimes the basis for cell boundaries in cellular architectures.

I sometimes wonder if one single system - kubernetes 2.0 or anything else - can encompass the full complexity of the problem space while being tractable to work with by human architects and programmers.

nine_k · 1h ago
> I'd want k8s-lite, that was more opinionated and less flexible

You seem to want something like https://skateco.github.io/ (still compatible with k8s manifests).

Or maybe even something like https://uncloud.run/

Or if you still want real certified Kubernetes, but small, there is https://k3s.io/

mountainriver · 2h ago
We have started working on a sort of Kubernetes 2.0 with https://github.com/agentsea/nebulous -- still pre-alpha

Things we are aiming to improve:

* Globally distributed
* Lightweight, can easily run as a single binary on your laptop while still scaling to thousands of nodes in the cloud
* Tailnet as the default network stack
* Bittorrent as the default storage stack
* Multi-tenant from the ground up
* Live migration as a first-class citizen

Most of these needs were born out of building modern machine learning products, and the subsequent GPU scarcity. With ML taking over the world though this may be the norm soon.

hhh · 2h ago
Wow… Cool stuff, the live migration is very interesting. We do autoscaling across clusters across clouds right now based on pricing, but actual live migration is a different beast
Thaxll · 2h ago
This is not Kubernetes; this is a custom-made solution to run GPUs.
nine_k · 1h ago
Since it still can consume Kubernetes manifests, it's of interest for k8s practitioners.

Since k8s manifests are a language, there can be multiple implementations of it, and multiple dialects will necessarily spring up.

mountainriver · 38m ago
Which is the future of everything, and which Kubernetes does a very bad job at.
znpy · 2h ago
> * Globally distributed

Non-requirement?

> * Tailnet as the default network stack

That would probably be the first thing I look to rip out if I ever was to use that.

Kubernetes assuming the underlying host only has a single NIC has been a plague for the industry, setting it back ~20 years and penalizing everyone that's not running on the cloud. Thank god there are multiple CNI implementations.

Only recently, with Multus (https://www.redhat.com/en/blog/demystifying-multus), does some sense seem to be coming back into that part of the infrastructure.

> * Multi-tenant from the ground up

How would this be any different from kubernetes?

> * Bittorrent as the default storage stack

Might be interesting, unless you also mean seeding public container images. Egress traffic is crazy expensive.

mountainriver · 33m ago
>> * Globally distributed

> Non-requirement?

It is a requirement because you can't find GPUs in a single region reliably, and Kubernetes doesn't run across multiple regions.

>> * Tailnet as the default network stack

> That would probably be the first thing I look to rip out if I ever was to use that.

This is fair, we find it very useful because it easily scales cross clouds and even bridges them locally. It was the simplest solution we could implement to get those properties, but in no way would we need to be married to it.

>> * Multi-tenant from the ground up

> How would this be any different from kubernetes?

Kubernetes is deeply not multi-tenant; anyone who has tried to build a multi-tenant solution on top of kube has dealt with this. I've done it at multiple companies now, and it's a mess.

>> * Bittorrent as the default storage stack

> Might be interesting, unless you also mean seeding public container images. Egress traffic is crazy expensive.

Yeah, egress cost is a concern here, but it's lazy so you don't pay for it unless you need it. This seemed like the lightest solution to sync data when you do live migrations cross cloud. For instance, I need to move my dataset and ML model to another cloud, or just replicate it there.

nine_k · 1h ago
> Non-requirement

> the first thing I look to rip out

This only shows how varied the requirements are across the industry. One size does not fit all, hence multiple materially different solutions spring up. This is only good.

mdaniel · 1h ago
heh, I think you didn't read the room given this directory https://github.com/agentsea/nebulous/blob/v0.1.88/deploy/cha...

Also, ohgawd please never ever do this ever ohgawd https://github.com/agentsea/nebulous/blob/v0.1.88/deploy/cha...

mountainriver · 30m ago
Why not? We can run on Kube and extend it to multi-region when needed, or we can run on any VM as a single binary, or just your laptop.

If you mean Helm, yeah I hate it but it is the most common standard. Also not sure what you mean by the secret, that is secure.

pm90 · 5h ago
Hard disagree with replacing yaml with HCL. Developers find HCL very confusing. It can be hard to read. Does it support imports now? Errors can be confusing to debug.

Why not use protobuf, or similar interface definition languages? Then let users specify the config in whatever language they are comfortable with.

vanillax · 1h ago
Agree, HCL is terrible. K8s YAML is fine. I have yet to hit a use case that can't be solved with its types. If you are doing too much, perhaps a config map is the wrong choice.
geoctl · 5h ago
You can very easily build and serialize/deserialize HCL, JSON, YAML or whatever you can come up with outside Kubernetes from the client-side itself (e.g. kubectl). This has actually nothing to do with Kubernetes itself at all.
dilyevsky · 5h ago
Maybe you know this but Kubernetes interface definitions are already protobufs (except for crds)
cmckn · 5h ago
Sort of. The hand-written go types are the source of truth and the proto definitions are generated from there, solely for the purpose of generating protobuf serializers for the hand-written go types. The proto definition is used more as an intermediate representation than an “API spec”. Still useful, but the ecosystem remains centered on the go types and their associated machinery.
dilyevsky · 4h ago
Given that I can just take generated.proto and ingest it in my software, then marshal any built-in type and apply it via the standard k8s API, why would I even need all the boilerplate crap from apimachinery? Perfectly happy with existing REST-y semantics - full gRPC would be going too far
dochne · 1h ago
My main beef with HCL is a hatred for how it implemented for loops.

Absolutely loathsome syntax IMO

znpy · 2h ago
> Hard disagree with replacing yaml with HCL.

I see some value instead. Lately I've been working on Terraform code to bring up a whole platform in half a day (aws sub-account, eks cluster, a managed nodegroup for karpenter, karpenter deployment, ingress controllers, LGTM stack, public/private dns zones, cert-manager and a lot more) and I did everything in Terraform, including Kubernetes resources.

What I appreciated about creating Kubernetes resources (and helm deployments) in HCL is that it's typed and has a schema, so any ide capable of talking to an LSP (language server protocol - I'm using GNU Emacs with terraform-ls) can provide meaningful auto-completion as well as proper syntax checking (I don't need to apply something to see it fail, emacs (via the language server) can already tell me what I'm writing is wrong).

I really don't miss having to switch between my ide and the Kubernetes API reference to make sure I'm filling each field correctly.

dangus · 2h ago
Confusing? Here I am working on the infrastructure side thinking that I’m working with a baby configuration language for dummies who can’t code when I use HCL/Terraform.

The idea that someone who works with JavaScript all day might find HCL confusing seems hard to imagine to me.

To be clear, I am talking about the syntax and data types in HCL, not necessarily the way Terraform processes it, which I admit can be confusing/frustrating. But Kubernetes wouldn’t have those pitfalls.

mdaniel · 1h ago
orly, what structure does this represent?

  outer {
    example {
      foo = "bar"
    }
    example {
      foo = "baz"
    }
  }
it reminds me of the insanity of toml

  [lol]
  [[whut]]
  foo = "bar"
  [[whut]]
  foo = "baz"
only at least with toml I can $(python3.13 -c 'import tomllib, sys; print(tomllib.loads(sys.stdin.read()))') to find out, but with hcl too bad
johngossman · 6h ago
Not a very ambitious wishlist for a 2.0 release. Everyone I talk to complains about the complexity of k8s in production, so I think the big question is could you do a 2.0 with sufficient backward compatibility that it could be adopted incrementally and make it simpler. Back compat almost always means complexity increases as the new system does its new things and all the old ones.
herval · 4h ago
The question is always what part of that complexity can be eliminated. Every “k8s abstraction” I’ve seen to date either only works for a very small subset of stuff (eg the heroku-like wrappers) or eventually develops a full blown dsl that’s as complex as k8s (and now you have to learn that job-specific dsl)
mdaniel · 3h ago
Relevant: Show HN: Canine – A Heroku alternative built on Kubernetes - https://news.ycombinator.com/item?id=44292103 - June, 2025 (125 comments)
herval · 3h ago
yep, that's the latest of a long lineage of such projects (one of which I worked on myself). Others include kubero, dokku, porter, kr0, etc. There was a moment back in 2019 where every big tech company was trying to roll out their own K8s DSL (I know of Twitter, Airbnb, WeWork, etc).

For me, the only thing that really changed was LLMs - chatgpt is exceptional at understanding and generating valid k8s configs (much more accurately than it can do coding). It's still complex, but it feels I have a second brain to look at it now

mrweasel · 6h ago
What I would add is "sane defaults", as in unless you pick something different, you get a good enough load balancer/network/persistent storage/whatever.

I'd agree that YAML isn't a good choice, but neither is HCL. Ever tried reading Terraform, yeah, that's bad too. Inherently we need a better way to configure Kubernetes clusters and changing out the language only does so much.

IPv6, YES, absolutely. Everything Docker, container and Kubernetes should have been IPv6-only internally from the start. Want IPv4? That should be handled by a special-case ingress controller.

zdw · 5h ago
Sane defaults is in conflict with "turning you into a customer of cloud provider managed services".

The longer I look at k8s, the more I see it as "batteries not included" around storage, networking, etc, with the result being that the batteries come with a bill attached from AWS, GCP, etc. K8s is less an open source project and more a way to encourage dependency on these extremely lucrative gap-filler services from the cloud providers.

JeffMcCune · 5h ago
Except you can easily install calico, istio, and ceph on used hardware in your garage and get an experience nearly identical to every hyper scaler using entirely free open source software.
zdw · 5h ago
Having worked on on-prem K8s deployments, yes, you can do this. But getting it to production grade is very different than a garage-quality proof of concept.
mdaniel · 5h ago
I think OP's point was: but how much of that production grade woe is the fault of Kubernetes versus, sure, turns out booting up a PaaS from scratch is hard as nails. I think that k8s pluggable design also blurs that boundary in most people's heads. I can't think of the last time the control plane shit itself, versus everyone and their cousin has a CLBO story for the component controllers installed on top of k8s
zdw · 4h ago
CLBO?
mdaniel · 4h ago
Crash Loop Back Off
benced · 2h ago
I found Kubernetes insanely intuitive coming from the frontend world. I was used to writing code that took in data and made the UI react to that - now I write code that the control plane uses to reconcile resources with config.
mootoday · 14m ago
Why containers when you can have Wasm components on wasmCloud :-)?!

https://wasmcloud.com/

darkwater · 5h ago
I totally dig the HCL request. To be honest I'm still mad at Github that initially used HCL for Github Actions and then ditched it for YAML when they went stable.
carlhjerpe · 5h ago
I detest HCL, the module system is pathetic. It's not composable at all and you keep doing gymnastics to make sure everything is known at plan time (like using lists where you should use dictionaries) and other anti-patterns.

I use Terranix to make config.tf.json which means I have the NixOS module system that's composable enough to build a Linux distro at my fingertips to compose a great Terraform "state"/project/whatever.

It's great to be able to run some Python to fetch some data, dump it in JSON, read it with Terranix, generate config.tf.json and then apply :)

jitl · 4h ago
What’s the list vs dictionary issue in Terraform? I use a lot of dictionaries (maps in tf speak), terraform things like for_each expect a map and throw if handed a list.
carlhjerpe · 4h ago
Internally a lot of modules cast dictionaries to lists of the same length because the keys of the dict might not be known at plan time or something. The Terraform AWS VPC module does this internally for many things.

I couldn't tell you exactly, but modules always end up either not exposing enough or exposing too much. If I were to write my module with Terranix I can easily replace any value in any resource from the module I'm importing using "resource.type.name.parameter = lib.mkForce "overridenValue";" without having to expose that parameter in the module "API".

The nice thing is that it generates "Terraform"(config.tf.json) so the supremely awesome state engine and all API domain knowledge bound in providers work just the same and I don't have to reach for something as involved as Pulumi.

You can even mix Terranix with normal HCL since config.tf.json is valid in the same project as HCL. A great way to get started is to generate your provider config and other things where you'd reach to Terragrunt/friends. Then you can start making options that makes resources at your own pace.

The terraform LSP sadly doesn't read config.tf.json yet so you'll get warnings regarding undeclared locals and such but for me it's worth it, I generally write tf/tfnix with the provider docs open and the language (Nix and HCL) are easy enough to write without full LSP.

https://terranix.org/ says it better than me, but by doing it with Nix you get programmatic access to the biggest package library in the world to use at your discretion (build scripts to fetch values from weird places, run impure scripts with null_resource or its replacements) and an expressive functional programming language where you can do recursion and stuff, you can use derivations to run any language to transform strings with ANY tool.

It's like Terraform "unleashed" :) Forget "dynamic" blocks, bad module APIs and hacks (While still being able to use existing modules too if you feel the urge).

Groxx · 2h ago
Internally... in what? Not HCL itself, I assume? Also I'm not seeing much that implies HCL has a "plan time"...

I'm not familiar with HCL so I'm struggling to find much here that would be conclusive, but a lot of this thread sounds like "HCL's features that YAML does not have are sub-par and not sufficient to let me only use HCL" and... yeah, you usually can't use YAML that way either, so I'm not sure why that's all that much of a downside?

I've been idly exploring config langs for a while now, and personally I tend to just lean towards JSON5 because comments are absolutely required... but support isn't anywhere near as good or automatic as YAML :/ HCL has been on my interest-list for a while, but I haven't gone deep enough into it to figure out any real opinion.

jitl · 3h ago
I think Pulumi is in a similar spot, you get a real programming language (of your choice) and it gets to use the existing provider ecosystem. You can use the programming language composition facilities to work around the plan system if necessary, although their plans allow more dynamic stuff than Terraform.

The setup with Terranix sounds cool! I am pretty interested in build system type things myself, I recently wrote a plan/apply system too that I use to manage SQL migrations.

I want to learn nix, but I think that like Rust, it's just a bit too wide/deep for me to approach on my own time without a tutor/co-worker or a forcing function like a work project to push me through the initial barrier.

carlhjerpe · 2h ago
Yep it's similar, but you bring all your dependencies with you through Nix rather than a language specific package manager.

Try using something like devenv.sh initially just to bring tools into $PATH in a distro agnostic & mostly-ish MacOS compatible way (so you can guarantee everyone has the same versions of EVERYTHING you need to build your thing).

Learn the language basics after it brings you value already, then learn about derivations and then the module system which is this crazy composable multilayer recursive magic merging type system implemented on top of Nix, don't be afraid to clone nixpkgs and look inside.

Nix derivations are essentially Dockerfiles on steroids, but Nix language brings /nix/store paths into the container, sets environment variables for you and runs some scripts, and all these things are hashed so if any input changes it triggers automatic cascading rebuilds, but also means you can use a binary cache as a kind of "memoization" caching thingy which is nice.

It's a very useful tool, it's very non-invasive on your system (other than disk space if you're not managing garbage collection) and you can use it in combination with other tools.

Makes it very easy to guarantee your DevOps scripts runs exactly your versions of all CLI tools and build systems and whatever even if the final piece isn't through Nix.

Look at "pgroll" for Postgres migrations :)

jitl · 2h ago
pgroll seems neat but I ended up writing my own tools for this one because I need to do somewhat unique shenanigans like testing different sharding and resource allocation schemes in Materialize.com (self hosted). I have 480 source input schemas (postgres input schemas described here if you're curious, the materialize stuff is brand new https://www.notion.com/blog/the-great-re-shard) and manage a bunch of different views & indexes built on top of those; create a bunch of different copies of the views/indexes striped across compute nodes, like right now I'm testing 20 schemas per whole-aws-instance node, versus 4 schemas per quarter-aws-node, M/N*Y with different permutations of N and Y. With the plan/apply model I just need to change a few lines in TypeScript and get the minimal changes to all downstream dependencies needed to roll it out.
mdaniel · 3h ago
Sounds like the kustomize mental model: take code you potentially don't control, apply patches to it until it behaves like you wish, apply

If the documentation and IDE story for kustomize was better, I'd be its biggest champion
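For anyone who hasn't seen it, a minimal overlay looks roughly like this (the upstream URL, names, and tag are made up):

  # kustomization.yaml
  apiVersion: kustomize.config.k8s.io/v1beta1
  kind: Kustomization
  resources:
    - github.com/example/upstream-app/deploy   # manifests you don't control
  images:
    - name: example/app
      newTag: "1.2.3"
  patches:
    - target:
        kind: Deployment
        name: app
      patch: |-
        - op: replace
          path: /spec/replicas
          value: 5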

carlhjerpe · 3h ago
You can run Kustomize in a Nix derivation with inputs from Nix and apply the output using Terranix and the kubectl provider, which gives you a very nice reproducible way to apply Kubernetes resources with the Terraform state engine. I like how Terraform manages the lifecycle of CRUD with cascading changes and replacements, which is often pretty optimal-ish, at least.

And since it's Terraform you can create resources using any provider in the registry to create resources according to your Kubernetes objects too, it can technically replace things like external-dns and similar controllers that create stuff in other clouds, but in a more "static configuration" way.

Edit: This works nicely with Gitlab Terraform state hosting thingy as well.

geoctl · 5h ago
I would say k8s 2.0 needs:

1. gRPC/proto3-based APIs, to make controlling k8s clusters easier from any programming language, not just practically Golang as is the case currently; this can even make dealing with k8s controllers easier and more manageable, even though it admittedly might complicate things at the API-server side when it comes to CRDs.

2. PostgreSQL, or a pluggable storage backend, by default instead of etcd.

3. A clear identity-based, L7-aware, ABAC-based access control interface that can be implemented by CNIs, for example.

4. Applying userns by default.

5. An easier pluggable per-pod CRI system where microVMs and container-based runtimes can easily co-exist based on the workload type.
jitl · 5h ago
All the APIs, including CRDs, already have a well described public & introspectable OpenAPI schema you can use to generate clients. I use the TypeScript client generated and maintained by Kubernetes organization. I don’t see what advantage adding a binary serialization wire format has. I think gRPC makes sense when there’s some savings to be had with latency, multiplexing, streams etc but control plane things like Kubernetes don’t seem to me like it’s necessary.
ofrzeta · 2m ago
I think the HTTP API with OpenAPI schema is part of what's so great about Kubernetes and also a reason for its success.
geoctl · 4h ago
I haven't used CRDs myself for a few years now (probably since 2021), but I still remember developing CRDs was an ugly and hairy experience to say the least, partly due to the flaws of Golang itself (e.g. no traits like in Rust, no macros, no enums, etc...). With protobuf you can easily compile your definitions to any language with clear enum, oneof implementations, you can use the standard protobuf libraries to do deepCopy, merge, etc... for you and you can also add basic validations in the protobuf definitions and so on. gRPC/protobuf will basically allow you to develop k8s controllers very easily in any language.
mdaniel · 3h ago
CRDs are not tied to golang in any way whatsoever; <https://www.redhat.com/en/blog/writing-custom-controller-pyt...> and <https://metacontroller.github.io/metacontroller/guide/create...> are two concrete counter-examples, with the latter being the most "microservices" extreme. You can almost certainly implement them in bash if you're trying to make the front page of HN
geoctl · 3h ago
I never said that CRDs are tied to Golang. I said that the experience of compiling CRDs to Golang types (back then with gen-controller, or whatever is being used these days) was simply ugly, partly due to the flaws of the language itself. What I mean is that gRPC can standardize the process of compiling both k8s's own resource definitions as well as CRDs, to make the process of developing k8s controllers in any language simply much easier. However, this will probably complicate the logic of the API server trying to understand and decode the binary-based protobuf resource serialized representations compared to the current text-based JSON representations.
znpy · 2h ago
> have a well described public & introspectable OpenAPI schema you can use to generate clients.

Last time I tried loading the openapi schema in the swagger ui on my work laptop (this was ~3-4 years ago, and i had an 8th gen core i7 with 16gb ram) it hung my browser, leading to a tab crash.

mdaniel · 44m ago
Loading it in what? I just slurped the 1.8MB openapi.json for v1.31 into Mockroon and it fired right up instantly
dilyevsky · 4h ago
1. The built-in types are already protos. IMO gRPC wouldn't be a good fit - it would actually make the system harder to use.

2. Already achievable today via kine[0].

3. Couldn't you build this today via a regular CNI? Cilium NetworkPolicies and others basically do this already.

4,5 probably don't require 2.0 - can be easily added within existing API via KEP (cri-o already does userns configuration based on annotations)

[0] - https://github.com/k3s-io/kine

geoctl · 3h ago
Apart from 1 and 3, probably everything else can be added today if the people in charge have the will to do that, and that's assuming that I am right and these points are actually that important to be standardized. However the big enterprise-tier money in Kubernetes is made from dumbing down the official k8s interfaces especially those related to access control (e.g. k8s own NetworkPolicy compared to Istio's access control related resources).
jitl · 5h ago
I feel like I’m already living in the Kubernetes 2.0 world because I manage my clusters & its applications with Terraform.

- I get HCL, types, resource dependencies, data structure manipulation for free

- I use a single `tf apply` to create the cluster, its underlying compute nodes, related cloud stuff like S3 buckets, etc; as well as all the stuff running on the cluster

- We use terraform modules for re-use and de-duplication, including integration with non-K8s infrastructure. For example, we have a module that sets up a Cloudflare ZeroTrust tunnel to a K8s service, so with 5 lines of code I can get a unique public HTTPS endpoint protected by SSO for whatever running in K8s. The module creates a Deployment running cloudflared as well as configures the tunnel in the Cloudflare API.

- Many infrastructure providers ship signed well documented Terraform modules, and Terraform does reasonable dependency management for the modules & providers themselves with lockfiles.

- I can compose Helm charts just fine via the Helm terraform provider if necessary. Many times I see Helm charts that are just “create namespace, create foo-operator deployment, create custom resource from chart values” (like Datadog). For these I opt to just install the operator & manage the CRD from terraform directly, or via a thin Helm pass-through chart that just echoes whatever HCL/YAML I put in from Terraform values.

Terraform’s main weakness is orchestrating the apply process itself, similar to k8s with YAML or whatever else. We use Spacelift for this.

ofrzeta · 4m ago
In a way it's redundant to have the state twice: once in Kubernetes itself and once in the Terraform state. This can lead to problems when resources are modified through mutating webhooks or similar. Then you need to mark your properties as "computed fields" or something like that. So I am not a fan of managing applications through TF. Managing clusters might be fine, though.
zdw · 6h ago
Related to this, a 2020 take on the topic from the MetalLB dev: https://blog.dave.tf/post/new-kubernetes/
jauntywundrkind · 5h ago
152 comments on A Better Kubernetes, from the Ground Up, https://news.ycombinator.com/item?id=25243159
woile · 48m ago
What bothers me:

- it requires too much RAM to run on small machines (1GB RAM). I want to start small but not have to worry about scalability. docker swarm was nice in this regard.

- use KCL lang or CUE lang to manage templates

nikisweeting · 1h ago
It should natively support running docker-compose.yml configs, essentially treating them like swarm configurations and "automagically" deploying them with sane defaults for storage and network. Right now the gap between compose and full-blown k8s is too big.
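i.e. a file like this (a toy example, names made up) would ideally deploy as-is, with the orchestrator filling in storage and a routable address by default:

  services:
    web:
      image: example/web:1.0
      ports: ["8080:8080"]
      depends_on: [db]
    db:
      image: postgres:16
      volumes: ["dbdata:/var/lib/postgresql/data"]
  volumes:
    dbdata: {}
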
mdaniel · 23m ago
So, what I'm hearing is that it should tie itself to a commercial company, who now have a private equity master to answer to, versus an open source technology run by a foundation

Besides, easily half of this thread is whining about helm for which docker-compose has no answer whatsoever. There is no $(docker compose run oci://example.com/awesome --version 1.2.3 --set-string root-user=admin)

d4mi3n · 2h ago
I agree with the author that YAML as a configuration format leaves room for error, but please, for the love of whatever god or ideals you hold dear, do not adopt HCL as the configuration language of choice for k8s.

While I agree type safety in HCL beats that of YAML (a low bar), it still leaves a LOT to be desired. If you're going to go through the trouble of considering a different configuration language anyway, let's do ourselves a favor and consider things like CUE[1] or Starlark[2] that offer either better type safety or much richer methods of composition.

1. https://cuelang.org/docs/introduction/#philosophy-and-princi...

2. https://github.com/bazelbuild/starlark?tab=readme-ov-file#de...

mdaniel · 1h ago
I repeatedly see this "yaml isn't typesafe" claim but have no idea where it's coming from since all the Kubernetes APIs are OpenAPI, and thus JSON Schema, and since YAML is a superset of JSON it is necessarily typesafe

Every JSON Schema aware tool in the universe will instantly know this PodSpec is wrong:

  kind: 123
  metadata: [ {you: wish} ]
I think what is very likely happening is that folks are -- rightfully! -- angry about using a text templating language to try and produce structured files. If they picked jinja2 they'd have the same problem -- it does not consider any textual output as "invalid" so jinja2 thinks this is a-ok

  jinja2.Template("kind: {{ youbet }}").render(youbet=True)
I am aware that helm does *YAML* sanity checking, so one cannot just emit whatever crazypants yaml they wish, but it does not then go one step further to say "uh, your json schema is fubar friend"
solatic · 35m ago
I don't get the etcd hate. You can run single-node etcd in simple setups. You can't easily replace it because so much of the Kubernetes API is a thin wrapper around etcd APIs like watch that are quite essential to writing controllers and don't map cleanly to most other databases, certainly not sqlite or frictionless hosted databases like DynamoDB.

What actually makes Kubernetes hard to set up by yourself are a) CNIs, in particular if you intend to avoid cloud-provider-specific CNIs, support all networking (and security!) features, and still have high performance; and b) all the cluster PKI with all the certificates for all the different components, which Kubernetes made an absolute requirement because, well, production-grade security.

So if you think you're going to make an "easier" Kubernetes, I mean, you're avoiding all the lessons learned and why we got here in the first place. CNI is hardly the naive approach to the problem.

Complaining about YAML and Helm are dumb. Kubernetes doesn't force you to use either. The API server anyway expects JSON at the end. Use whatever you like.

mdaniel · 5h ago
> Allow etcd swap-out

From your lips to God's ears. And, as they correctly pointed out, this work is already done, so I just do not understand the holdup. Folks can continue using etcd if it's their favorite, but mandating it is weird. And I can already hear the butwhataboutism yet there is already a CNCF certification process and a whole subproject just for testing Kubernetes itself, so do they believe in the tests or not?

> The Go templates are tricky to debug, often containing complex logic that results in really confusing error scenarios. The error messages you get from those scenarios are often gibberish

And they left off that it is crazypants to use a textual templating language for a whitespace sensitive, structured file format. But, just like the rest of the complaints, it's not like we don't already have replacements, but the network effect is very real and very hard to overcome

That barrier of "we have nicer things, but inertia is real" applies to so many domains, it just so happens that helm impacts a much larger audience

dzonga · 4h ago
I thought this would be written along the lines of an LLM going through your code - spinning up a railway file, then say having tf for the few manual dependencies etc. that can't be easily inferred.

& get automatic scaling out of the box etc. a more simplified flow rather than wrangling yaml or hcl

in short, imagine if k8s was a 2-3 (max 5) line docker-compose-like file

jcastro · 6h ago
For the confusion around verified publishing, this is something the CNCF encourages artifact authors and their projects to set up. Here are the instructions for verifying your artifact:

https://artifacthub.io/docs/topics/repositories/

You can do the same with just about any K8s related artifact. We always encourage projects to go through the process but sometimes they need help understanding that it exists in the first place.

Artifacthub is itself an incubating project in the CNCF, ideas around making this easier for everyone are always welcome, thanks!

(Disclaimer: CNCF Staff)

calcifer · 4h ago
> We always encourage projects to go through the process but sometimes they need help understanding that it exists in the first place.

Including ingress-nginx? Per OP, it's not marked as verified. If even the official components don't bother, it's hard to recommend it to third parties.

rwmj · 5h ago
Make there be one, sane way to install it, and make that method work if you just want to try it on a single node or single VM running on a laptop.
mdaniel · 5h ago
My day job makes this request of my team right now, and yet when trying to apply this logic to a container and cloud-native control plane, there are a lot more devils hiding in those details. Use MetalLB for everything, even if NLBs are available? Use Ceph for storage even if EBS is available? Definitely don't use Ceph on someone's 8GB laptop. I can keep listing "yes, but" items that make doing such a thing impossible to troubleshoot because there's not one consumer

So, to circle back to your original point: rke2 (Apache 2) is a fantastic, airgap-friendly, intelligence community approved distribution, and pairs fantastic with rancher desktop (also Apache 2). It's not the kubernetes part of that story which is hard, it's the "yes, but" part of the lego build

- https://github.com/rancher/rke2/tree/v1.33.1%2Brke2r1#quick-...

- https://github.com/rancher-sandbox/rancher-desktop/releases

moomin · 4h ago
Let me add one more: give controllers/operators a defined execution order. Don’t let changes flow both ways. Provide better ways for building things that don’t step on everyone else’s toes. Make whatever replaces helm actually maintain stuff rather than just splatting it out.
clvx · 2h ago
This is a hard no for me. This is the whole point of the reconciliation loop. You can just push something to the api/etcd and eventually it will become ready when all the dependencies exist. Now, rejecting manifests because CRDs don't exist yet is a different discussion. I'm down to have a cache of manifests to be deployed waiting for the CRD, but if the CRD isn't deployed, then a garbage-collection-like tool removes them from the cache. This is what fluxcd and argocd already do in a way, but I would like to have it natively.
fragmede · 1h ago
Instead of yaml, json, or HCL, how about starlark? It's a stripped down Python, used in production by bazel, so it's already got the go libraries.
mdaniel · 1h ago
As the sibling comment points out, I think that would be a perfectly fine helm replacement, but I would never ever want to feed starlark into k8s apis directly
fjasdfwa · 1h ago
kube-apiserver uses a JSON REST API. You can use whatever serializes to JSON. YAML is the most common and already works directly with kubectl.

I personally use TypeScript since it has unions and structural typing with native JSON support but really anything can work.

mdaniel · 19m ago
Fun fact, while digging into the sibling comment's complaint about the OpenAPI spec, I learned that it actually advertises multiple content-types:

  application/json
  application/json;stream=watch
  application/vnd.kubernetes.protobuf
  application/vnd.kubernetes.protobuf;stream=watch
  application/yaml
which I presume all get coerced into protobuf before being actually interpreted
dijit · 6h ago
Honestly; make some blessed standards easier to use and maintain.

Right now running K8S on anything other than cloud providers and toys (k3s/minikube) is a disaster waiting to happen unless you're a really seasoned infrastructure engineer.

Storage/state is decidedly not a solved problem, debugging performance issues in your longhorn/ceph deployment is just pain.

Also, I don't think we should be removing YAML; we should instead get better at using it as an ILR (intermediate language representation) and generating the YAML that we want instead of trying to do some weird in-place generation (Argo/Helm templating) - Kubernetes sacrificed a lot of simplicity to be eventually consistent with manifests, and our response was to ensure we use manifests as little as possible, which feels incredibly bizarre.

Also, the design of k8s networking feels like it fits ipv6 really well, but it seems like nobody has noticed somehow.

zdc1 · 5h ago
I like YAML since anything can be used to read/write it. Using Python / JS / yq to generate and patch YAML on-the-fly is quite nifty as part of a pipeline.

My main pain point is, and always has been, helm templating. It's not aware of YAML or k8s schemas and puts the onus of managing whitespace and syntax onto the chart developer. It's pure insanity.

At one point I used a local Ansible playbook for some templating. It was great: it could load resource template YAMLs into a dict, read separately defined resource configs, and then set deeply nested keys in said templates and spit them out as valid YAML. No helm `indent` required.
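A rough sketch of that pattern with stock Ansible modules (the file names and the app_overrides variable are placeholders):

  - hosts: localhost
    gather_facts: false
    tasks:
      - name: Load the base resource template into a dict
        ansible.builtin.include_vars:
          file: templates/deployment.yaml
          name: base

      - name: Deep-merge the per-app config over it
        ansible.builtin.set_fact:
          rendered: "{{ base | combine(app_overrides, recursive=true) }}"

      - name: Write it back out as valid YAML, no indent filter needed
        ansible.builtin.copy:
          content: "{{ rendered | to_nice_yaml(indent=2) }}"
          dest: out/deployment.yaml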

pm90 · 5h ago
yaml is just not maintainable if you’re managing lots of apps for eg a midsize company or larger. Upgrades become manual/painful.
lawn · 5h ago
k3s isn't a toy though.
dijit · 5h ago
* Uses Flannel bi-lateral NAT for SDN

* Uses local-only storage provider by default for PVC

* Requires entire cluster to be managed by k3s, meaning no freebsd/macos/windows node support

* Master TLS/SSL Certs not rotated (and not talked about).

k3s is very much a toy - a nice toy though, very fun to play with.

zug_zug · 6h ago
Meh, imo this is wrong.

What Kubernetes is missing most is a 10 year track record of simplicity/stability. What it needs most to thrive is a better reputation of being hard to foot-gun yourself with.

It's just not a compelling business case to say "Look at what you can do with kubernetes, and you only need a full-time team of 3 engineers dedicated to this technology at the cost of a million a year to get bin-packing to the tune of $40k."

For the most part Kubernetes is becoming the common tongue, despite all the chaotic plugins and customizations that interact with each other in a combinatoric explosion of complexity/risk/overhead. A 2.0 would be what I'd propose if I was trying to kill Kubernetes.

candiddevmike · 6h ago
Kubernetes is what happens when you need to support everyone's wants and desires within the core platform. The abstraction facade breaks and ends up exposing all of the underlying pieces because someone needs feature X. So much of Kubernetes' complexity is YAGNI (for most users).

Kubernetes 2.0 should be a boring pod scheduler with some RBAC around it. Let folks swap out the abstractions if they need it instead of having everything so tightly coupled within the core platform.

sitkack · 6h ago
Kubernetes is when you want to sell complexity, because complexity makes money and naturally gets you vendor lock-in even while being ostensibly vendor neutral. Never interrupt the customer while they are foot-gunning themselves.

Swiss Army Buggy Whips for Everyone!

wredcoll · 5h ago
Not really. Kubernetes is still wildly simpler than what came before, especially accounting for the increased capabilities.
KaiserPro · 5h ago
the fuck it is.

The problem is k8s is both an orchestration system and a service provider.

Grid/batch/tractor/cube are all much, much simpler to run at scale. Moreover, they can support complex dependencies (but mapping storage is harder).

but k8s fucks around with DNS and networking, disables swap.

Making a simple deployment is fairly simple.

But if you want _any_ kind of ci/cd you need flux, any kind of config management you need helm.

JohnMakin · 5h ago
> But if you want _any_ kind of ci/cd you need flux, any kind of config management you need helm.

Absurdly wrong on both counts.

jitl · 5h ago
K8s has swap now. I am managing a fleet of nodes with 12TB of NVMe swap each. Each container gets (memory limit / node memory) * (total swap) as its swap limit. There is no way to specify swap demand on the pod spec yet, so it needs to be managed “by hand” with taints or some other correlation.
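For reference, the node-side switch for that is a kubelet config along these lines (a sketch; on versions where NodeSwap is still beta the feature gate is needed):

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  failSwapOn: false           # allow the kubelet to start on a node with swap enabled
  featureGates:
    NodeSwap: true
  memorySwap:
    swapBehavior: LimitedSwap   # burstable pods get a proportional share of node swap
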
cogman10 · 5h ago
Yup. Having migrated from a puppet + custom scripts environment and terraform + custom scripts. K8S is a breath of fresh air.

I get that it's not for everyone, I'd not recommend it for everyone. But once you start getting a pretty diverse ecosystem of services, k8s solves a lot of problems while being pretty cheap.

Storage is a mess, though, and something that really needs to be addressed. I typically recommend people wanting persistence to not use k8s.

oceanplexian · 10m ago
> Yup. Having migrated from a puppet + custom scripts environment and terraform + custom scripts. K8S is a breath of fresh air.

Having experience in both the former and the latter (in big tech) and then going on to write my own controllers and deal with fabric and overlay networking problems, not sure I agree.

In 2025 engineers need to deal with persistence, they need storage, they need high performance networking, they need HVM isolation and they need GPUs. When a philosophy starts to get in the way of solving real problems and your business falls behind, that philosophy will be left on the side of the road. IMHO it's destined to go the way as OpenStack when someone builds a simpler, cleaner abstraction, and it will take the egos of a lot of people with it when it does.

mdaniel · 5h ago
> Storage is a mess, though, and something that really needs to be addressed. I typically recommend people wanting persistence to not use k8s.

I have actually come to wonder if this is actually an AWS problem, and not a Kubernetes problem. I mention this because the CSI controllers seem to behave sanely, but they are only as good as the requests being fulfilled by the IaaS control plane. I secretly suspect that EBS just wasn't designed for such a hot-swap world

Now, I posit this because I haven't had to run clusters in Azure nor GCP to know if my theory has legs

I guess the counter-experiment would be to forego the AWS storage layer and try Ceph or Longhorn but no company I've ever worked at wants to blaze trails about that, so they just build up institutional tribal knowledge about treating PVCs with kid gloves

wredcoll · 3h ago
Honestly this just feels like kubernetes just solving the easy problems and ignoring the hard bits. You notice the pattern a lot after a certain amount of time watching new software being built.
mdaniel · 3h ago
Apologies, but what influence does Kubernetes have over the way AWS deals with attach and detach behavior of EBS?

Or is your assertion that Kubernetes should be its own storage provider and EBS can eat dirt?

wredcoll · 2h ago
I was tangenting, but yes, kube providing no storage systems has led to a lot of annoying 3rd party ones
selcuka · 6h ago
> Let folks swap out the abstractions if they need it instead of having everything so tightly coupled within the core platform.

Sure, but then one of those third party products (say, X) will catch up, and everyone will start using it. Then job ads will start requiring "10 year of experience in X". Then X will replace the core orchestrator (K8s) with their own implementation. Then we'll start seeing comments like "X is a horribly complex, bloated platform which should have been just a boring orchestrator" on HN.

echelon · 6h ago
Kubernetes doesn't need a flipping package manager or charts. It needs to do one single job well: workload scheduling.

Kubernetes clusters shouldn't be bespoke and weird with behaviors that change based on what flavor of plugins you added. That is antithetical to the principle of the workloads you're trying to manage. You should be able to headshot the whole thing with ease.

Service discovery is just one of many things that should be a different layer.

KaiserPro · 5h ago
> Service discovery is just one of many things that should be a different layer.

Hard agree. It's like jenkins: good idea, but it's not portable.

12_throw_away · 1h ago
> Its like jenkins

Having regretfully operated both k8s and Jenkins, I fully agree with this, they have some deep DNA in common.

cyberax · 40m ago
I would love:

1. Instead of recreating the "gooey internal network" anti-pattern with CNI, provide strong zero-trust authentication for service-to-service calls.

2. Integrate with public networks. With IPv6, there's no _need_ for an overlay network.

3. Interoperability between several K8s clusters. I want to run a local k3s controller on my machine to develop a service, but this service still needs to call a production endpoint for a dependent service.

znpy · 2h ago
I'd like to add my points of view:

1. Helm: make it official, ditch the text templating. The helm workflow is okay, but templating text is cumbersome and error-prone. What we should be doing instead is patching objects. I don't know how, but I should be setting fields, not making sure my values contain text that is correctly indented (how many spaces? 8? 12? 16?)

2. Can we get a rootless kubernetes already, as a first-class citizen? This opens a whole world of possibilities. I'd love to have a physical machine at home where I'm dedicating only an unprivileged user to it. It would have limitations, but I'd be okay with it. Maybe some setuid-binaries could be used to handle some limited privileged things.

Melatonic · 5h ago
MicroVM's
Dedime · 2h ago
From someone who was recently tasked with "add service mesh" - make service mesh obsolete. I don't want to install a service mesh. mTLS or some other form of encryption between pods should just happen automatically. I don't want some janky ass sidecar being injected into my pod definition ala linkerd, and now I've got people complaining that cilium's god mode is too permissive. Just have something built-in, please.
mdaniel · 1h ago
For my curiosity, what threat model is mTLS and encryption between pods driving down? Do you run untrusted workloads in your cluster and you're afraid they're going to exfil your ... I dunno, SQL login to the in-cluster Postgres?

As someone who has the same experience you described with janky sidecars blowing up normal workloads, I'm violently anti service-mesh. But, cert expiry and subjectAltName management is already hard enough, and you would want that to happen for every pod? To say nothing of the TLS handshake for every connection?

recursivedoubts · 4h ago
please make it look like old heroku for us normies
tayo42 · 5h ago
> where k8s is basically the only etcd customer left.

Is that true? Is no one really using it?

I think one thing k8s would need is some obvious answer for stateful systems (at scale, not mysql at a startup). I think there are some ways to do it? Where I work basically everything is on k8s, but all the databases are on their own crazy special systems to support; they insist it's impossible and costs too much. I work in the worst of all worlds now supporting this.

re: comments about k8s should just schedule pods. mesos with aurora or marathon was basically that. If people wanted that those would have done better. The biggest users of mesos switched to k8s

haiku2077 · 5h ago
I had to go deep down the etcd rabbit hole several years ago. The problems I ran into:

1. etcd did an fsync on every write and required all nodes to complete a write to report a write as successful. This was not configurable and far higher a guarantee than most use cases actually need - most Kubernetes users are fine with snapshot + restore an older version of the data. But it really severely impacts performance.

2. At the time, etcd had a hard limit of 8GB. Not sure if this is still there.

3. Vanilla etcd was overly cautious about what to do if a majority of nodes went down. I ended up writing a wrapper program to automatically recover from this in most cases, which worked well in practice.

In conclusion there was no situation where I saw etcd used that I wouldn't have preferred a highly available SQL DB. Indeed, k3s got it right using sqlite for small deployments.

nh2 · 5h ago
For (1), I definitely want my production HA databases to fsync every write.

Of course configurability is good (e.g. for automated fasts tests you don't need it), but safe is a good default here, and if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable (e.g. 1000 fsyncs/second).

haiku2077 · 5h ago
> I definitely want my production HA databases to fsync every write.

I didn't! Our business DR plan only called for us to restore to an older version with short downtime, so fsync on every write on every node was a reduction in performance for no actual business purpose or benefit. IIRC we modified our database to run off ramdisk and snapshot every few minutes which ran way better and had no impact on our production recovery strategy.

> if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable

At the time one of the problems I ran into was that public cloud regions in southeast asia had significantly worse SSDs that couldn't keep up. This was on one of the big three cloud providers.

1000 fsyncs/second is a tiny fraction of the real world production load we required. An API that only accepts 1000 writes a second is very slow!

Also, plenty of people run k8s clusters on commodity hardware. I ran one on an old gaming PC with a budget SSD for a while in my basement. Great use case for k3s.

dilyevsky · 5h ago
1 and 2 can be overridden via flag. 3 is practically the whole point of the software
haiku2077 · 5h ago
With 3 I mean that in cases where there was an unambiguously correct way to recover from the situation, etcd did not automatically recover. My wrapper program would always recover from those situations. (It's been a number of years and the exact details are hazy now, though.)
dilyevsky · 4h ago
If the majority of quorum is truly down, then you’re down. That is by design. There’s no good way to recover from this without potentially losing state so the system correctly does nothing at this point. Sure you can force it into working state with external intervention but that’s up to you
haiku2077 · 3h ago
Like I said I'm hazy on the details, this was a small thing I did a long time ago. But I do remember our on-call having to deal with a lot of manual repair of etcd quorum, and I noticed the runbook to fix it had no steps that needed any human decision making, so I made that wrapper program to automate the recovery. It wasn't complex either, IIRC it was about one or two pages of code, mostly logging.
dilyevsky · 5h ago
That is decisively not true. A number of very large companies use etcd directly for various needs
singularity2001 · 4h ago
More like wasm?
mdaniel · 3h ago
As far as I know one can do that right now, since wasmedge (Apache 2) exposes a CRI interface https://wasmedge.org/docs/develop/deploy/oci-runtime/crun#pr... (et al)
jonenst · 5h ago
What about kustomize and kpt? I'm using them (instead of helm), but:

* kpt is still not 1.0

* both kustomize and kpt require complex setups to programmatically generate configs (even for simple things like replicas = replicas x 2)

fatbird · 5h ago
How many places are running k8s without OpenShift to wrap it and manage a lot of the complexity?
raincom · 4h ago
Openshift, if IBM and Redhat want to milk the license and support contracts. There are other vendors that sell k8s: Rancher, for instance. SuSe bought Rancher.
jitl · 4h ago
I’ve never used OpenShift nor do I know anyone irl who uses it. Sample from SF where most people I know are on AWS or GCP.
coredog64 · 2h ago
You can always go for the double whammy and run ROSA: RedHat OpenShift on AWS