
Google Cloud Run

Compute · Serverless Container Execution

Google Cloud Run runs containers in a fully serverless environment. Deploy a Docker image and get auto-scaling, HTTPS, and custom domains out of the box — with instances scaling down to zero when there's no traffic.

Architecture Diagram

[Interactive diagram; dashed-line animations indicate the flow direction of data or requests]

Why do you need it?

You've packaged your app in a container, but getting it into production means either setting up a Kubernetes cluster or installing Docker directly on a VM. Running a full cluster for a simple API is overkill.

Why did this approach emerge?

As containers became the standard deployment unit, teams containerized their apps expecting a simpler path to production. Instead, they discovered they had to learn Kubernetes: setting up a control plane, writing YAML manifests for deployments and services, configuring networking policies, and managing cluster upgrade cycles — all just to run a single API. The operational overhead was disproportionate to the workload. Knative emerged as an open-source abstraction layer on top of Kubernetes, offering request-driven autoscaling and scale-to-zero semantics. Google took that model and productized it as Cloud Run — giving teams container-based deployment without any Kubernetes knowledge required.

How does it work inside?

Deploy a container image to Cloud Run and it creates an HTTPS endpoint. Cloud Run monitors the number of concurrent requests hitting each running instance. When concurrency exceeds the maximum you configured (default 80), new instances spin up to absorb the load. When requests drop, instances are removed — and when traffic falls to zero, all instances are removed so you pay nothing. Revision-based deployments let you gradually shift traffic to new versions, so you can canary a release at 5% before rolling it out fully.
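The concurrency-driven scaling described above can be sketched in a few lines. This is an illustrative model, not Cloud Run's actual autoscaler (which also weighs CPU utilization, startup latency, and min/max instance settings); it captures only the core rule: run just enough instances so no instance exceeds the configured concurrency limit, and run zero instances when there is no traffic.

```python
import math

def desired_instances(concurrent_requests: int, max_concurrency: int = 80) -> int:
    """Rough sketch of concurrency-based scaling (max_concurrency defaults
    to 80, mirroring Cloud Run's default per-instance concurrency)."""
    if concurrent_requests <= 0:
        return 0  # scale to zero: no traffic means no instances and no cost
    # Enough instances that no single one handles more than max_concurrency.
    return math.ceil(concurrent_requests / max_concurrency)

# 80 concurrent requests fit on one instance; the 81st forces a second.
print(desired_instances(0))    # → 0
print(desired_instances(80))   # → 1
print(desired_instances(81))   # → 2
```

The same ceiling-division shape explains why a brief traffic spike can fan out to many instances and then collapse back to zero once requests drain.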

What is it often confused with?

Cloud Run and Cloud Functions are both serverless, but they differ in deployment unit and use case. Cloud Functions connects a single function to an event — one trigger, one handler. Cloud Run deploys an entire container, so it can serve multiple endpoints, carry complex dependency trees, and handle any protocol that listens on a port. Use Cloud Run for apps with routing logic or heavyweight dependencies; use Cloud Functions for isolated event handlers.

Cloud Run and GKE both run containers, but they sit at different points on the control spectrum. GKE exposes the full Kubernetes feature set — scheduling policies, service mesh, DaemonSets, custom operators — which matters when you need fine-grained orchestration across many services. Cloud Run removes all of that in exchange for simplicity: no nodes to manage, no YAML manifests to maintain, no upgrade cycles to track. Choose GKE when Kubernetes features are a requirement, not just a familiar tool.

When should you use it?

Ideal for HTTP-based container workloads: web APIs, microservices, and SSR frontends. Not the best fit when traffic is consistently high (making scale-to-zero pointless) or when you need Kubernetes-level orchestration.

Web API servers · Microservices · Backend tasks · Frontend SSR