Google Compute Engine

Compute · Cloud Virtual Machines

Google Compute Engine lets you run virtual machines on Google's infrastructure. Combine CPUs and memory freely with custom machine types and save costs with per-second billing.

Architecture Diagram

Components shown: Users, Cloud DNS, Cloud LB, CE (Zone-a), CE (Zone-b), Persistent Disk, Instance Group. Dashed lines in the diagram indicate the direction in which data or requests flow.

Why do you need it?

You need to run an application long-term, but buying physical servers means high upfront costs and weeks-long lead times for scaling. If your traffic predictions are wrong, you either waste money or your service slows down.

Why did this approach emerge?

Before cloud VMs, getting a new server meant a capacity-planning cycle that could stretch months: forecast traffic, write a purchase order, wait for hardware to arrive, rack and cable it, install an OS, configure networking. All of it was capital expenditure tied to a three-to-five-year hardware refresh cycle. If a traffic spike hit before the next refresh, you had no recourse — the fixed capacity was all you had, and users hit errors or slow responses until the spike passed. Over-provisioning to avoid that risk meant paying for idle hardware year-round. Cloud VMs replaced this with minute-level provisioning: request an instance, get it running, and delete it when the load subsides.

How does it work inside?

When a request to create an instance arrives, the hypervisor on a physical host allocates the specified CPU and memory and boots the VM with your chosen machine type and OS image. The instance lands inside a VPC, governed by the firewall rules you defined. For production workloads, a Managed Instance Group watches load metrics (CPU utilization, request count, or a custom signal) and compares them against the target you set. When the metric rises above the target, new instances are added; when it drops below, excess instances are removed. A Persistent Disk can outlive the instance it was attached to because it is network-attached storage, not a local drive, so you can detach it and reattach it to a different instance without losing data. To keep a disk when its instance is deleted, make sure auto-delete is turned off for that disk; boot disks have it turned on by default.
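The scaling decision described above can be illustrated with a small sketch. This is a simplified model and not Compute Engine's actual autoscaler: the `ScalingPolicy` class, its field names, and the `recommended_size` function are hypothetical names introduced here. It assumes a single metric (average CPU utilization, in whole percent) and sizes the group so that the same total load would land at the target utilization.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy object; not a Compute Engine API type."""
    target_utilization_pct: int  # e.g. 60 means "aim for 60% average CPU"
    min_instances: int
    max_instances: int


def recommended_size(current_instances: int, avg_utilization_pct: int,
                     policy: ScalingPolicy) -> int:
    """Return the instance count that would bring average load back to the target."""
    if current_instances < 1:
        return policy.min_instances
    # Total load is (instances * utilization). Dividing by the target gives
    # the number of instances needed; integer ceiling avoids float rounding.
    total_load = current_instances * avg_utilization_pct
    needed = -(-total_load // policy.target_utilization_pct)
    # Never go below the floor or above the ceiling set in the policy.
    return max(policy.min_instances, min(policy.max_instances, needed))


policy = ScalingPolicy(target_utilization_pct=60, min_instances=2, max_instances=10)
print(recommended_size(4, 90, policy))  # load above target, scale out: prints 6
print(recommended_size(4, 30, policy))  # load below target, scale in: prints 2
```

The minimum and maximum bounds matter as much as the target: the floor keeps the service available when traffic is quiet, and the ceiling caps spend if a metric misbehaves. A real autoscaler also smooths its input over time so that short spikes do not cause instances to be added and removed repeatedly, which this sketch leaves out.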

Boundaries & Distinctions

Both Compute Engine and Cloud Run run applications, but Compute Engine gives you direct VM management while Cloud Run handles infrastructure automatically once you deploy a container. Choose Compute Engine when you need OS-level control, custom drivers, or long-running processes.

When should you use it?

Ideal for web servers, long-running batch jobs, and stateful applications that need to stay up continuously. Not the right fit when you only need to run code briefly in response to incoming requests.

Web application hosting · Batch processing · High-performance computing · Dev/test environments