⚖️

Google Cloud Load Balancing

Networking

Global and Regional Traffic Distribution

Google Cloud Load Balancing is a managed service that distributes incoming requests across backends and sends each one to the most appropriate healthy target. It can front HTTP(S), TCP, and UDP traffic, and it can route to VM, container, and serverless backends behind a single entry point.

Architecture diagram (nodes shown): Users, Global IP, Cloud LB, URL Map, Cloud Run, GKE, Compute Engine, Health Check.

Why do you need it?

As application instances multiply, you need a reliable way to decide which server should handle each request. If traffic keeps flowing to failed backends, users see the outage immediately.

Why did this approach emerge?

Teams used to install and operate their own load balancers, VIPs, health checks, and failover logic. As cloud backends expanded across VMs, containers, and serverless platforms, managed front doors that unify all of them became necessary.

How does it work inside?

Cloud Load Balancing accepts requests at a global or regional front door, then selects a destination using URL maps and backend service policies. Health checks continuously evaluate each backend and automatically remove unhealthy targets from routing.
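The routing idea described above can be illustrated with a small, self-contained Python model. This is only a conceptual sketch, not the Cloud Load Balancing API: the names `Backend`, `BackendService`, and `route`, the longest-prefix matching rule, and the round-robin selection are all assumptions made for illustration. The real service supports richer matching rules and balancing modes.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Backend:
    """Hypothetical backend target; `healthy` stands in for a health check result."""
    name: str
    healthy: bool = True


@dataclass
class BackendService:
    """Hypothetical backend service: a named group of backends."""
    name: str
    backends: list
    _counter: count = field(default_factory=count, repr=False)

    def pick(self):
        # Only backends that currently pass their health check are eligible.
        healthy = [b for b in self.backends if b.healthy]
        if not healthy:
            return None
        # Simple round-robin over the healthy set (an assumed policy).
        return healthy[next(self._counter) % len(healthy)]


def route(url_map, default_service, path):
    """Pick a backend for `path`: longest matching prefix wins, else the default."""
    matches = [prefix for prefix in url_map if path.startswith(prefix)]
    service = url_map[max(matches, key=len)] if matches else default_service
    return service.pick()


# Example wiring: API paths go to one service, everything else to another.
api = BackendService("api", [Backend("run-a"), Backend("run-b")])
web = BackendService("web", [Backend("vm-1")])
url_map = {"/api/": api}

first = route(url_map, web, "/api/users")
api.backends[0].healthy = False          # simulate a failed health check
second = route(url_map, web, "/api/users")
fallback = route(url_map, web, "/index.html")
print(first.name, second.name, fallback.name)
```

Once `run-a` is marked unhealthy, it is no longer a candidate, so later `/api/` requests go to `run-b` without the caller changing anything.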

Boundaries & Distinctions

Cloud Load Balancing and Cloud CDN both sit in front of user traffic, but Load Balancing decides where a request should go while Cloud CDN reuses an already-generated response from edge caches. If backend selection and high availability are the core problem, Load Balancing is central; if delivering static content faster is the core problem, CDN takes the lead.
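To make the distinction concrete, here is a minimal sketch of the two roles in plain Python. It is a toy model under stated assumptions, not how either product is implemented: `make_edge`, `handle`, and the `cacheable` flag are hypothetical names, and `fetch_from_backend` stands in for whatever backend selection the load balancer performs.

```python
def make_edge(fetch_from_backend):
    """Return a request handler that consults a cache before the backend."""
    cache = {}  # stands in for an edge cache (assumption: keyed by path only)

    def handle(path, cacheable):
        # CDN role: reuse a response that was already generated.
        if cacheable and path in cache:
            return cache[path], "cache-hit"
        # Load balancing role: the request must reach a backend.
        body = fetch_from_backend(path)
        if cacheable:
            cache[path] = body
        return body, "backend"

    return handle


calls = []


def backend(path):
    # Hypothetical backend that records every request it actually serves.
    calls.append(path)
    return f"body:{path}"


handle = make_edge(backend)
static_first = handle("/logo.png", cacheable=True)
static_second = handle("/logo.png", cacheable=True)
dynamic_first = handle("/api/cart", cacheable=False)
dynamic_second = handle("/api/cart", cacheable=False)
print(static_second[1], dynamic_second[1], len(calls))
```

The repeated static request never reaches the backend, while every dynamic request does. That is why backend selection and health matter most for dynamic traffic and caching matters most for static content.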

When should you use it?

Cloud Load Balancing is a good fit for services that span multiple regions or multiple execution environments. It is less compelling for a single-backend internal service with no failover or path-routing needs.

Multi-backend routing, Highly available web services, Global entry point, Incremental migrations