At Next '26 I was in a lab where the instructor had us deploy an agent on Cloud Run. Halfway through he just shrugged and said, "Cloud Run is fine," and that's pretty much how I treat it. Every container I run in my personal projects lives on Cloud Run, because it scales to zero, I genuinely don't care about cold starts, and it's far less of a headache for me to set up than the alternative.
But it nagged at me, because someone had recently asked me the obvious follow-up: so when would you actually reach for Kubernetes instead? I gave them an answer, felt fine about it, then looked it up later and realised I didn't really understand my own answer. So this is me sorting that out.
Kubernetes isn't better, just more
Let's get this out of the way first, because it's the assumption that trips everyone up: Kubernetes isn't better than Cloud Run just because it does more stuff. More features don't mean more value. If anything it's the opposite, because all that power is something you then have to configure, secure, patch, and keep in your head. Cloud Run wins for me precisely because it does less and hides the rest.
So the real question isn't "which one's better," it's "what does Kubernetes give me that's actually worth the extra hassle?" Here's where I landed.
Where Kubernetes actually pulls its weight
State that sticks around
Kubernetes is good at workloads that hold on to state, and that's a genuine edge. A StatefulSet gives each pod a stable identity and its own persistent volume, so a pod that restarts picks up where it left off. A stateless system has to re-establish everything it needs on every single request, which means the same payload gets sent over and over. It's a bit like opening a fresh chat with an LLM: it has forgotten everything you told it last time, so now you're paying to explain the whole thing again. Stateful infrastructure doesn't make you do that.
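To put a rough number on that re-explaining cost, here's a toy sketch. It has nothing to do with Kubernetes APIs; it just counts how many bytes cross the wire when the context travels with every request versus once per session. The 1000-byte context and the three questions are made-up values.

```python
# Toy model: CONTEXT stands in for whatever setup data a session needs.
CONTEXT = "x" * 1000
QUESTIONS = ["q1", "q2", "q3"]


def stateless_bytes(context: str, questions: list[str]) -> int:
    """Every request carries the full context again."""
    return sum(len(context) + len(q) for q in questions)


def stateful_bytes(context: str, questions: list[str]) -> int:
    """The context is sent once, then only the questions travel."""
    return len(context) + sum(len(q) for q in questions)


print("stateless:", stateless_bytes(CONTEXT, QUESTIONS))
print("stateful: ", stateful_bytes(CONTEXT, QUESTIONS))
```

The gap grows with every extra request in the session, which is the whole argument in one line of arithmetic.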
Sidecars
You can run a second container right next to your main one inside the same pod, and let it handle the boring-but-necessary stuff like logging, encryption, or traffic shaping. The benefit isn't really the sidecar itself, it's what it does to your actual container: it leaves it doing nothing but business logic, so your code stops dragging around plumbing it was never supposed to own. To be fair, Cloud Run can run sidecar containers too these days, so the Kubernetes edge here is the finer control, like having a service mesh inject them for you.
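Here's roughly what that looks like as a pod manifest, built as a Python dict and printed as JSON (which kubectl accepts alongside YAML). It's a minimal sketch: the image names, the pod name, and the log path are placeholders I made up, and the sidecar is assumed to read whatever the app writes into the shared volume.

```python
import json


def pod_with_sidecar(app_image: str, sidecar_image: str) -> dict:
    """Build a Pod with an app container and a log-shipping sidecar."""
    # Both containers mount the same emptyDir volume, so the sidecar
    # can read the log files the app writes.
    shared_logs = {"name": "logs", "mountPath": "/var/log/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app-with-sidecar"},  # hypothetical name
        "spec": {
            "volumes": [{"name": "logs", "emptyDir": {}}],
            "containers": [
                {"name": "app", "image": app_image,
                 "volumeMounts": [shared_logs]},
                {"name": "log-shipper", "image": sidecar_image,
                 "volumeMounts": [shared_logs]},
            ],
        },
    }


# Placeholder images, not real ones.
manifest = pod_with_sidecar("example.com/app:1.0",
                            "example.com/log-shipper:1.0")
print(json.dumps(manifest, indent=2))
```

The app container only knows it writes to a directory. Where those logs end up is the sidecar's problem.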
CRDs and operators
A CRD (custom resource definition) lets you teach the cluster a new "kind" of thing, and an operator is a controller that watches objects of that kind and keeps nudging the real world until it matches what they declare. My platform people love this and I get why, because once you've described what a thing should look like, you've turned one-off setup into something repeatable that more or less heals itself.
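The "heals itself" part comes from the reconcile loop at the heart of every operator: compare what was declared with what exists, then work out the difference. Below is a toy version of that idea in plain Python. It is not a real operator and doesn't touch the Kubernetes API; the resource names and counts are invented for the example.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return the actions needed to make `actual` match `desired`.

    Both dicts map a resource name to how many copies should exist
    (desired) or do exist (actual).
    """
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(f"create {want - have} x {name}")
        elif have > want:
            actions.append(f"delete {have - want} x {name}")
    # Anything running that nobody declared gets cleaned up.
    for name, have in actual.items():
        if name not in desired:
            actions.append(f"delete {have} x {name}")
    return actions


# One "db" replica died and an undeclared "old-cache" is still around.
print(reconcile({"db": 3}, {"db": 2, "old-cache": 1}))
# Nothing to do once the world matches the declaration.
print(reconcile({"db": 3}, {"db": 3}))
```

A real operator runs this comparison continuously against live cluster state, which is why drift gets corrected without anyone filing a ticket.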
GPUs and TPUs
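This one used to be blunt, and it's now more of a sliding scale. Cloud Run does offer GPUs these days, but the menu is short and, as far as I can tell, there are no TPUs. So if your workload needs TPUs, several GPUs on one machine, or a specific accelerator type Cloud Run doesn't list, the decision has already been made for you.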
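On the Kubernetes side, asking for an accelerator is just another resource limit on the container. A minimal sketch, assuming the cluster has GPU nodes with the NVIDIA device plugin installed (that plugin is what exposes the `nvidia.com/gpu` resource); the pod name and image are placeholders.

```python
import json

# Hypothetical pod: the name and image are made up for illustration.
gpu_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "inference"},
    "spec": {
        "containers": [{
            "name": "model-server",
            "image": "example.com/model-server:1.0",
            # The scheduler will only place this pod on a node
            # that has one unallocated GPU.
            "resources": {"limits": {"nvidia.com/gpu": 1}},
        }],
    },
}

print(json.dumps(gpu_pod, indent=2))
```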
Performance, under the right conditions
This one comes with a big asterisk, because Cloud Run is plenty fast for most things. But if you've got steady, high-volume traffic and you're willing to actually tune the thing, Kubernetes lets you get at levers Cloud Run mostly keeps out of reach. You can pick your node types, pin CPU and memory, keep pods warm so there's no cold start to think about, bin-pack workloads tightly to squeeze more out of every machine, and hold long-lived connections open instead of working request-by-request. None of that matters when you're scaling to zero and serving sporadic traffic, but at scale, with the tuning done, it's where Kubernetes can genuinely pull ahead.
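Bin-packing is the least obvious of those levers, so here's a toy illustration using the classic first-fit-decreasing heuristic. To be clear, this is not how the Kubernetes scheduler works internally; it's a sketch of why packing pods by their CPU requests changes how many nodes you pay for. The request sizes and the 4000-millicore node are invented numbers.

```python
def first_fit_decreasing(requests_mcpu: list[int],
                         node_capacity_mcpu: int) -> list[list[int]]:
    """Pack CPU requests (in millicores) onto as few nodes as the
    first-fit-decreasing heuristic manages. Returns one list per node."""
    nodes: list[list[int]] = []
    # Placing the biggest requests first tends to leave fewer awkward gaps.
    for req in sorted(requests_mcpu, reverse=True):
        if req > node_capacity_mcpu:
            raise ValueError(f"request {req}m does not fit on any node")
        for node in nodes:
            if sum(node) + req <= node_capacity_mcpu:
                node.append(req)
                break
        else:
            # No existing node has room, so start a new one.
            nodes.append([req])
    return nodes


pods = [500, 1500, 1000, 2000, 250, 750]  # hypothetical CPU requests
packed = first_fit_decreasing(pods, node_capacity_mcpu=4000)
print(len(packed), "nodes:", packed)
```

Six pods totalling 6000 millicores land on two nodes here, which is the best any packing could do with 4000-millicore machines.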
They're closer to the same thing than the debate admits
The whole "Cloud Run vs Kubernetes" framing makes it sound like a fight, and it really isn't. Cloud Run grew out of the Kubernetes world: its API follows Knative Serving, which is a Kubernetes project, even though the fully managed service runs on Google's own infrastructure rather than on a GKE cluster you could go and poke at. Both take a container image and run it, both autoscale, and both give you rolling deploys and health checks. Traffic splitting comes built into Cloud Run, while on Kubernetes you usually add an ingress controller or a service mesh to get it.
The difference you actually feel isn't the engine, it's how much of the machinery gets handed to you. Cloud Run gives you the container and keeps everything underneath hidden, while Kubernetes hands you the whole cluster and says "here, it's yours." That's exactly what you want when you need to reach in and shape stateful workloads, sidecars, custom resources, or accelerators. It's exactly what you don't want when you just need a container to serve traffic and scale to zero.
The takeaway
Use the least powerful tool that gets the job done. That's not settling for less, it's a discipline worth keeping: start with Cloud Run and make Kubernetes prove it's needed, whether that's stateful workloads, sidecar offloading, a CRD-driven platform, accelerators, or performance you've actually tuned for. If none of that is in play, the extra power is just weight you're carrying for no reason.
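If I had to write that rule down as code, it would look something like this. It's a sketch of my own checklist, not an official decision tree; the field names are just the five reasons from this post.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """The five things that can justify Kubernetes, all off by default."""
    stateful: bool = False
    sidecar_offloading: bool = False
    crd_platform: bool = False
    accelerators: bool = False
    tuned_performance: bool = False


def reasons_for_kubernetes(w: Workload) -> list[str]:
    """List which of the five reasons actually apply to this workload."""
    checks = {
        "stateful workloads": w.stateful,
        "sidecar offloading": w.sidecar_offloading,
        "CRD-driven platform": w.crd_platform,
        "accelerators": w.accelerators,
        "tuned performance": w.tuned_performance,
    }
    return [reason for reason, needed in checks.items() if needed]


def pick_platform(w: Workload) -> str:
    """Cloud Run unless Kubernetes has proved it's needed."""
    return "Kubernetes" if reasons_for_kubernetes(w) else "Cloud Run"


agent = Workload()  # a plain stateless endpoint
trainer = Workload(stateful=True, accelerators=True)
print(pick_platform(agent))
print(pick_platform(trainer), reasons_for_kubernetes(trainer))
```

The default matters more than the logic: every field starts as False, so Cloud Run is what you get until you can name a reason.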
And in the AI era, where most of what we're shipping is stateless agents and inference endpoints, "Cloud Run is fine" turns out to be a pretty solid place to stand.