What Would a Kubernetes 2.0 Look Like

Bogdanp · 6/19/2025, 12:00:54 PM · matduggan.com

Comments (51)

jitl · 5m ago
I feel like I’m already living in the Kubernetes 2.0 world because I manage my cluster & its applications with Terraform.

- I get HCL, types, resource dependencies, data structure manipulation for free

- I use a single `tf apply` to create the cluster, its underlying compute nodes, related cloud resources like S3 buckets, etc., as well as all the stuff running on the cluster

- We use Terraform modules for reuse and deduplication, including integration with non-K8s infrastructure. For example, we have a module that sets up a Cloudflare ZeroTrust tunnel to a K8s service, so with 5 lines of code I can get a unique public HTTPS endpoint protected by SSO for whatever is running in K8s. The module creates a Deployment running cloudflared as well as configures the tunnel in the Cloudflare API.

- Many infrastructure providers ship signed, well-documented Terraform modules, and Terraform does reasonable dependency management for the modules & providers themselves with lockfiles.

- I can compose Helm charts just fine via the Helm Terraform provider if necessary. Many times I see Helm charts that are just "create namespace, create foo-operator deployment, create custom resource from chart values" (like Datadog). For these I opt to just install the operator & manage the CRD from Terraform directly, or via a thin Helm pass-through chart that just echoes whatever HCL/YAML I put in from Terraform values.

Terraform’s main weakness is orchestrating the apply process itself, similar to k8s with YAML or whatever else. We use Spacelift for this.

pm90 · 52m ago
Hard disagree with replacing yaml with HCL. Developers find HCL very confusing. It can be hard to read. Does it support imports now? Errors can be confusing to debug.

Why not use protobuf, or similar interface definition languages? Then let users specify the config in whatever language they are comfortable with.

geoctl · 46m ago
You can very easily build and serialize/deserialize HCL, JSON, YAML, or whatever you can come up with outside Kubernetes, from the client side itself (e.g. kubectl). This actually has nothing to do with Kubernetes itself at all.
dilyevsky · 29m ago
Maybe you know this, but Kubernetes interface definitions are already protobufs (except for CRDs)
cmckn · 17m ago
Sort of. The hand-written Go types are the source of truth, and the proto definitions are generated from there, solely for the purpose of generating protobuf serializers for the hand-written Go types. The broader machinery/semantics of the Kubernetes API are not expressed in the proto definitions.
johngossman · 1h ago
Not a very ambitious wishlist for a 2.0 release. Everyone I talk to complains about the complexity of k8s in production, so I think the big question is: could you do a 2.0 with sufficient backward compatibility that it could be adopted incrementally, and make it simpler? Back compat almost always means complexity increases, as the new system does its new things plus all the old ones.
mrweasel · 1h ago
What I would add is "sane defaults", as in unless you pick something different, you get a good enough load balancer/network/persistent storage/whatever.

I'd agree that YAML isn't a good choice, but neither is HCL. Ever tried reading Terraform, yeah, that's bad too. Inherently we need a better way to configure Kubernetes clusters and changing out the language only does so much.

IPv6, YES, absolutely. Everything Docker, container, and Kubernetes should have been IPv6-only internally from the start. Want IPv4? That should be handled by a special-case ingress controller.

zdw · 37m ago
"Sane defaults" is in conflict with "turning you into a customer of cloud-provider managed services".

The longer I look at k8s, the more I see it as "batteries not included" around storage, networking, etc., with the result being that the batteries come with a bill attached from AWS, GCP, etc. K8s is less an open source project and more a way to encourage dependency on these extremely lucrative gap-filler services from the cloud providers.

JeffMcCune · 28m ago
Except you can easily install Calico, Istio, and Ceph on used hardware in your garage and get an experience nearly identical to every hyperscaler, using entirely free open source software.
zdw · 23m ago
Having worked on on-prem K8s deployments, yes, you can do this. But getting it to production grade is very different than a garage-quality proof of concept.
mdaniel · 16m ago
I think OP's point was: how much of that production-grade woe is the fault of Kubernetes, versus, sure, it turns out booting up a PaaS from scratch is hard as nails? I think that k8s's pluggable design also blurs that boundary in most people's heads. I can't think of the last time the control plane shit itself, versus everyone and their cousin has a CLBO (CrashLoopBackOff) story for the component controllers installed on top of k8s.
geoctl · 53m ago
I would say k8s 2.0 needs:

1. gRPC/proto3-based APIs, to make controlling k8s clusters easier from any programming language (not just, practically, Golang as is the case currently). This could even make k8s controllers easier and more manageable to write, even though it admittedly might complicate things on the API-server side when it comes to CRDs.

2. PostgreSQL, or a pluggable storage backend, by default instead of etcd.

3. A clear identity-based, L7-aware, ABAC-based access control interface that can be implemented by CNIs, for example.

4. userns applied by default.

5. An easier pluggable per-pod CRI system where microVMs and container-based runtimes can co-exist based on the workload type.
jitl · 29m ago
All the APIs, including CRDs, already have a well-described, public, introspectable OpenAPI schema you can use to generate clients. I use the TypeScript client generated and maintained by the Kubernetes organization. I don't see what advantage adding a binary serialization wire format has. I think gRPC makes sense when there are savings to be had with latency, multiplexing, streams, etc., but for control-plane things like Kubernetes it doesn't seem necessary to me.
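For illustration, a minimal sketch of what that generated-client workflow looks like with the @kubernetes/client-node package (shown against the 0.x API line, an assumption; newer releases changed method signatures to object-style parameters):

```typescript
import * as k8s from '@kubernetes/client-node';

const kc = new k8s.KubeConfig();
kc.loadFromDefault(); // reads ~/.kube/config, or in-cluster credentials

const core = kc.makeApiClient(k8s.CoreV1Api);

async function listPods(namespace: string): Promise<void> {
  // Typed request/response objects are generated from the OpenAPI schema
  const res = await core.listNamespacedPod(namespace);
  for (const pod of res.body.items) {
    console.log(pod.metadata?.name, pod.status?.phase);
  }
}

listPods('default').catch((err) => console.error(err));
```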
darkwater · 51m ago
I totally dig the HCL request. To be honest, I'm still mad at GitHub, which initially used HCL for GitHub Actions and then ditched it for YAML when it went stable.
carlhjerpe · 5m ago
I detest HCL; the module system is pathetic. It's not composable at all, and you keep doing gymnastics to make sure everything is known at plan time (like using lists where you should use dictionaries) and other anti-patterns.

I use Terranix to generate config.tf.json, which means I have the NixOS module system (composable enough to build a Linux distro) at my fingertips to compose a great Terraform "state"/project/whatever.

It's great to be able to run some Python to fetch some data, dump it in JSON, read it with Terranix, generate config.tf.json and then apply :)

zdw · 2h ago
Related to this, a 2020 take on the topic from the MetalLB dev: https://blog.dave.tf/post/new-kubernetes/
jauntywundrkind · 13m ago
152 comments on A Better Kubernetes, from the Ground Up, https://news.ycombinator.com/item?id=25243159
mdaniel · 45m ago
> Allow etcd swap-out

From your lips to God's ears. And, as they correctly pointed out, this work is already done, so I just do not understand the holdup. Folks can continue using etcd if it's their favorite, but mandating it is weird. And I can already hear the butwhataboutism, yet there is already a CNCF certification process and a whole subproject just for testing Kubernetes itself, so do they believe in the tests or not?

> The Go templates are tricky to debug, often containing complex logic that results in really confusing error scenarios. The error messages you get from those scenarios are often gibberish

And they left off that it is crazypants to use a textual templating language for a whitespace-sensitive, structured file format. But, just like the rest of the complaints, it's not like we don't already have replacements; it's that the network effect is very real and very hard to overcome.

That barrier of "we have nicer things, but inertia is real" applies to so many domains; it just so happens that Helm impacts a much larger audience.

jcastro · 1h ago
For the confusion around verified publishing, this is something the CNCF encourages artifact authors and their projects to set up. Here are the instructions for verifying your artifact:

https://artifacthub.io/docs/topics/repositories/

You can do the same with just about any K8s related artifact. We always encourage projects to go through the process but sometimes they need help understanding that it exists in the first place.

Artifacthub is itself an incubating project in the CNCF, ideas around making this easier for everyone are always welcome, thanks!

(Disclaimer: CNCF Staff)

dijit · 1h ago
Honestly: make some blessed standards easier to use and maintain.

Right now, running K8s on anything other than cloud providers and toys (k3s/minikube) is a disaster waiting to happen unless you're a really seasoned infrastructure engineer.

Storage/state is decidedly not a solved problem; debugging performance issues in your Longhorn/Ceph deployment is just pain.

Also, I don't think we should be removing YAML; we should instead get better at using it as an ILR (intermediate language representation) and generating the YAML we want, instead of trying to do some weird in-place generation (Argo/Helm templating). Kubernetes sacrificed a lot of simplicity to be eventually consistent with manifests, and our response was to ensure we use manifests as little as possible, which feels incredibly bizarre.
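A minimal sketch of that YAML-as-ILR idea, assuming a TypeScript toolchain and the `yaml` npm package (both assumptions, not anything the commenter uses): manifests are built as plain data, and YAML is only emitted at the very end.

```typescript
import { stringify } from 'yaml';

// Build the manifest as ordinary data; YAML is just an output format here.
function deployment(name: string, image: string, replicas = 2) {
  return {
    apiVersion: 'apps/v1',
    kind: 'Deployment',
    metadata: { name, labels: { app: name } },
    spec: {
      replicas,
      selector: { matchLabels: { app: name } },
      template: {
        metadata: { labels: { app: name } },
        spec: { containers: [{ name, image, ports: [{ containerPort: 8080 }] }] },
      },
    },
  };
}

// Emit the intermediate representation; kubectl/Argo consume the result as-is.
process.stdout.write(stringify(deployment('web', 'nginx:1.27')));
```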

Also, the design of k8s networking feels like it fits IPv6 really well, but it seems like nobody has noticed somehow.

zdc1 · 53m ago
I like YAML since anything can be used to read/write it. Using Python / JS / yq to generate and patch YAML on-the-fly is quite nifty as part of a pipeline.

My main pain point is, and always has been, helm templating. It's not aware of YAML or k8s schemas and puts the onus of managing whitespace and syntax onto the chart developer. It's pure insanity.

At one point I used a local Ansible playbook for some templating. It was great: it could load resource template YAMLs into a dict, read separately defined resource configs, and then set deeply nested keys in said templates and spit them out as valid YAML. No helm `indent` required.
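That load-patch-emit workflow is easy to sketch outside Ansible too; here, hypothetically, with TypeScript and the `yaml` npm package (an assumption; any YAML library works), where the serializer owns the whitespace instead of the chart author:

```typescript
import { parse, stringify } from 'yaml';

// Stand-in for a resource template loaded from disk (hypothetical content).
const template = `
apiVersion: apps/v1
kind: Deployment
metadata:
  name: placeholder
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: app
          image: placeholder
`;

const doc = parse(template);
// Set deeply nested keys directly; indentation is handled by the serializer.
doc.metadata.name = 'payments-api';
doc.spec.replicas = 3;
doc.spec.template.spec.containers[0].image = 'registry.example.com/payments:1.4.2';

process.stdout.write(stringify(doc)); // valid YAML out, no `indent` filter needed
```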

pm90 · 50m ago
YAML is just not maintainable if you're managing lots of apps for, e.g., a midsize company or larger. Upgrades become manual/painful.
lawn · 57m ago
k3s isn't a toy though.
dijit · 54m ago
* Uses Flannel bi-lateral NAT for SDN

* Uses local-only storage provider by default for PVC

* Requires the entire cluster to be managed by k3s, meaning no FreeBSD/macOS/Windows node support

* Master TLS/SSL Certs not rotated (and not talked about).

k3s is very much a toy - a nice toy though, very fun to play with.

rwmj · 59m ago
Make there be one, sane way to install it, and make that method work if you just want to try it on a single node or single VM running on a laptop.
mdaniel · 39m ago
My day job makes this request of my team right now, and yet when trying to apply this logic to a container and cloud-native control plane, there are a lot more devils hiding in those details. Use MetalLB for everything, even if NLBs are available? Use Ceph for storage even if EBS is available? Definitely don't use Ceph on someone's 8GB laptop. I can keep listing "yes, but" items that make doing such a thing impossible to troubleshoot because there's not one consumer

So, to circle back to your original point: rke2 (Apache 2) is a fantastic, airgap-friendly, intelligence-community-approved distribution, and it pairs fantastically with Rancher Desktop (also Apache 2). It's not the Kubernetes part of that story that's hard; it's the "yes, but" part of the Lego build.

- https://github.com/rancher/rke2/tree/v1.33.1%2Brke2r1#quick-...

- https://github.com/rancher-sandbox/rancher-desktop/releases

jonenst · 20m ago
What about kustomize and kpt? I'm using them (instead of Helm), but:

* kpt is still not 1.0

* both kustomize and kpt require complex setups to programmatically generate configs (even for simple things like replicas = replicas × 2)
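For comparison, a hedged sketch of that "replicas = replicas × 2" transform done directly in TypeScript with the `yaml` npm package, sidestepping kustomize/kpt entirely (file names and package choice are assumptions):

```typescript
import { readFileSync, writeFileSync } from 'node:fs';
import { parseAllDocuments, stringify } from 'yaml';

// Double spec.replicas on every Deployment in a multi-document manifest file.
const docs = parseAllDocuments(readFileSync('manifests.yaml', 'utf8'));

const out = docs
  .map((d) => d.toJS())
  .map((obj) => {
    if (obj?.kind === 'Deployment' && typeof obj.spec?.replicas === 'number') {
      obj.spec.replicas *= 2;
    }
    return obj;
  })
  .map((obj) => stringify(obj))
  .join('---\n');

writeFileSync('manifests.out.yaml', out);
```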

Melatonic · 56m ago
MicroVMs
zug_zug · 1h ago
Meh, imo this is wrong.

What Kubernetes is missing most is a 10-year track record of simplicity/stability. What it needs most to thrive is a better reputation for being hard to foot-gun yourself with.

It's just not a compelling business case to say "Look at what you can do with kubernetes, and you only need a full-time team of 3 engineers dedicated to this technology at tho cost of a million a year to get bin-packing to the tune of $40k."

For the most part Kubernetes is becoming the common tongue, despite all the chaotic plugins and customizations that interact with each other in a combinatoric explosion of complexity/risk/overhead. A 2.0 is what I'd propose if I were trying to kill Kubernetes.

candiddevmike · 1h ago
Kubernetes is what happens when you need to support everyone's wants and desires within the core platform. The abstraction facade breaks and ends up exposing all of the underlying pieces because someone needs feature X. So much of Kubernetes' complexity is YAGNI (for most users).

Kubernetes 2.0 should be a boring pod scheduler with some RBAC around it. Let folks swap out the abstractions if they need it instead of having everything so tightly coupled within the core platform.

selcuka · 1h ago
> Let folks swap out the abstractions if they need it instead of having everything so tightly coupled within the core platform.

Sure, but then one of those third-party products (say, X) will catch up, and everyone will start using it. Then job ads will start requiring "10 years of experience in X". Then X will replace the core orchestrator (K8s) with their own implementation. Then we'll start seeing comments like "X is a horribly complex, bloated platform which should have been just a boring orchestrator" on HN.

sitkack · 1h ago
Kubernetes is when you want to sell complexity, because complexity makes money and naturally gets you vendor lock-in even while being ostensibly vendor-neutral. Never interrupt the customer while they are foot-gunning themselves.

Swiss Army Buggy Whips for Everyone!

wredcoll · 1h ago
Not really. Kubernetes is still wildly simpler than what came before, especially accounting for the increased capabilities.
KaiserPro · 44m ago
the fuck it is.

The problem is k8s is both an orchestration system and a service provider.

Grid/batch/tractor/cube are all much, much simpler to run at scale. Moreover, they can support complex dependencies (but mapping storage is harder).

But k8s fucks around with DNS and networking, and disables swap.

Making a simple deployment is fairly simple.

But if you want _any_ kind of CI/CD you need Flux, and for any kind of config management you need Helm.

jitl · 21m ago
K8s has swap now. I am managing a fleet of nodes with 12TB of NVMe swap each. Each container gets a swap limit of (memory limit / node memory) * (total swap). There's no way to specify swap demand in the pod spec yet, so it needs to be managed "by hand" with taints or some other correlation.
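A worked example of that proportional formula (the node and container sizes below are hypothetical; only the 12TB swap figure comes from the comment, treated here as TiB for simplicity):

```typescript
// containerSwap = (containerMemoryLimit / nodeMemory) * nodeSwap, per the comment above.
const GiB = 2 ** 30;
const nodeMemory = 512 * GiB;          // assumed node RAM
const nodeSwap = 12 * 1024 * GiB;      // the 12TB of NVMe swap described
const containerMemoryLimit = 8 * GiB;  // assumed container memory limit

const containerSwapLimit = (containerMemoryLimit / nodeMemory) * nodeSwap;
console.log(`swap limit ≈ ${(containerSwapLimit / GiB).toFixed(0)} GiB`); // 192 GiB
```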
JohnMakin · 32m ago
> But if you want _any_ kind of ci/cd you need flux, any kind of config management you need helm.

Absurdly wrong on both counts.

cogman10 · 45m ago
Yup. Having migrated from a Puppet + custom scripts environment and a Terraform + custom scripts one, K8s is a breath of fresh air.

I get that it's not for everyone, I'd not recommend it for everyone. But once you start getting a pretty diverse ecosystem of services, k8s solves a lot of problems while being pretty cheap.

Storage is a mess, though, and something that really needs to be addressed. I typically recommend people wanting persistence to not use k8s.

mdaniel · 34m ago
> Storage is a mess, though, and something that really needs to be addressed. I typically recommend people wanting persistence to not use k8s.

I have come to wonder if this is actually an AWS problem, and not a Kubernetes problem. I mention this because the CSI controllers seem to behave sanely, but they are only as good as the requests being fulfilled by the IaaS control plane. I secretly suspect that EBS just wasn't designed for such a hot-swap world.

Now, I posit this because I haven't had to run clusters in Azure or GCP to know if my theory has legs.

I guess the counter-experiment would be to forgo the AWS storage layer and try Ceph or Longhorn, but no company I've ever worked at wants to blaze trails on that, so they just build up institutional tribal knowledge about treating PVCs with kid gloves.

echelon · 1h ago
Kubernetes doesn't need a flipping package manager or charts. It needs to do one single job well: workload scheduling.

Kubernetes clusters shouldn't be bespoke and weird with behaviors that change based on what flavor of plugins you added. That is antithetical to the principle of the workloads you're trying to manage. You should be able to headshot the whole thing with ease.

Service discovery is just one of many things that should be a different layer.

KaiserPro · 42m ago
> Service discovery is just one of many things that should be a different layer.

Hard agree. It's like Jenkins: good idea, but it's not portable.

tayo42 · 1h ago
> where k8s is basically the only etcd customer left.

Is that true? Is no one really using it?

I think one thing k8s would need is some obvious answer for stateful systems (at scale, not MySQL at a startup). I think there are some ways to do it? Where I work, basically everything is on k8s, but all the databases are on their own crazy special systems to support; they insist it's impossible and costs too much. I work in the worst of all worlds now, supporting this.

Re: comments that k8s should just schedule pods: Mesos with Aurora or Marathon was basically that. If people wanted that, those would have done better. The biggest users of Mesos switched to k8s.

haiku2077 · 55m ago
I had to go deep down the etcd rabbit hole several years ago. The problems I ran into:

1. etcd did an fsync on every write and required all nodes to complete a write before reporting it as successful. This was not configurable and a far higher guarantee than most use cases actually need - most Kubernetes users are fine with snapshotting and restoring an older version of the data. But it really severely impacts performance.

2. At the time, etcd had a hard limit of 8GB. Not sure if this is still there.

3. Vanilla etcd was overly cautious about what to do if a majority of nodes went down. I ended up writing a wrapper program to automatically recover from this in most cases, which worked well in practice.

In conclusion, there was no situation where I saw etcd used in which I wouldn't have preferred a highly available SQL DB. Indeed, k3s got it right by using SQLite for small deployments.

nh2 · 43m ago
For (1), I definitely want my production HA databases to fsync every write.

Of course configurability is good (e.g. for automated fast tests you don't need it), but safe is a good default here, and if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable (e.g. 1000 fsyncs/second).

haiku2077 · 20m ago
> I definitely want my production HA databases to fsync every write.

I didn't! Our business DR plan only called for us to restore to an older version with short downtime, so fsync on every write was a reduction in performance for no actual business purpose or benefit.

> if somebody sets up a Kubernetes cluster, they can and should afford enterprise SSDs where fsync of small data is fast and reliable

At the time, one of the problems I ran into was that public cloud regions in Southeast Asia had significantly worse SSDs that couldn't keep up. This was on one of the big three cloud providers.

1000 fsyncs/second is a tiny fraction of the real world production load we required. An API that only accepts 1000 writes a second is very slow!

Also, plenty of people run k8s clusters on commodity hardware. I ran one on an old gaming PC with a budget SSD for a while in my basement. Great use case for k3s.

dilyevsky · 26m ago
1 and 2 can be overridden via flag. 3 is practically the whole point of the software
haiku2077 · 18m ago
With 3 I mean that in cases where there was an unambiguously correct way to recover from the situation, etcd did not automatically recover. My wrapper program would always recover from those situations. (It's been a number of years and the exact details are hazy now, though.)
dilyevsky · 24m ago
That is decisively not true. A number of very large companies use etcd directly for various needs
fatbird · 59m ago
How many places are running k8s without OpenShift to wrap it and manage a lot of the complexity?