Using load balancing for highly available applications

This tutorial explains how to use load balancing with a regional managed instance group to redirect traffic away from busy or unavailable VM instances, allowing you to provide high availability even during a zonal outage.

A regional managed instance group distributes an application on multiple instances across multiple zones. A global load balancer directs traffic across multiple regions via a single IP address. By using both of these services to distribute your application across multiple zones, you can help ensure that your application is available even in extreme cases, like a zonal disruption.

Load balancers can be used to direct a variety of traffic types. This tutorial shows you how to create a global load balancer that directs external HTTP traffic, but much of the content of this tutorial is still relevant to other types of load balancers. To learn about other types of traffic that can be directed with a load balancer, see Types of Cloud Load Balancing.

This tutorial includes detailed steps for launching a web application on a regional managed instance group, configuring network access, creating a load balancer for directing traffic to the web application, and observing the load balancer by simulating a zonal outage. Depending on your experience with these features, this tutorial takes about 45 minutes to complete.

Objectives

Launch a demo web application on a regional managed instance group.
Configure a global load balancer that directs HTTP traffic across multiple zones.
Observe the effects of the load balancer by simulating a zonal outage.

Costs

In this document, you use the following billable components of Google Cloud:

Compute Engine

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

Before you begin

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

Go to project selector

Make sure that billing is enabled for your Google Cloud project.

Application architecture

The application includes the following Compute Engine components:

VPC network: a virtual network within Google Cloud that can provide global connectivity using its own routes and firewall rules.
Firewall rule: a Google Cloud firewall lets you allow or deny traffic to your instances.
Instance template: a template used to create each VM instance in the managed instance group.
Regional managed instance group: a group of VM instances running the same application across multiple zones.
Global static external IP address: a static IP address that is accessible on external networks and can be attached to a global resource.
Global load balancer: a load balancer that allows backend instances to be distributed across multiple regions. Use a global load balancer when your users need access to the same applications and content, and you want to provide access using a single anycast IP address.
Health check: a policy used by the load balancer to evaluate the responsiveness of the application on each VM instance.

Launching the web application

This tutorial uses a web application that is stored on GitHub. If you would like to learn more about how the application was implemented, see the GoogleCloudPlatform/python-docs-samples repository on GitHub.

Launch the web application on every VM in an instance group by including a startup script in an instance template. Additionally, run the instance group in a dedicated VPC network to keep this tutorial's firewall rules from interfering with any existing resources running in your project.

Create a VPC network

Using a VPC network protects existing resources in your project from being affected by the resources that you will create for this tutorial. A VPC network is also required to restrict incoming traffic so that it must go through the load balancer.

Create a VPC network to encapsulate the firewall rules for the demo web application:

In the Google Cloud console, go to the VPC networks page.

Go to VPC networks
Click Create VPC Network.
Under Name, enter web-app-vpc.
Set Subnet creation mode to Custom.
Create a new subnet as follows:
1. In the Subnets section, in the Name field, enter web-app-vpc-subnet.
2. In the Region drop-down, select us-central1.
3. Make sure that the IP stack type option is set to IPv4.
4. In the Primary IPv4 range section, enter the IPv4 range 10.2.0.0/24.
At the bottom of the page, click Create.

Wait until the VPC network is created before continuing.
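
As an alternative to the console steps above, the same network and subnet can be created from a terminal. The following is a minimal sketch, assuming an authenticated gcloud CLI with a default project set (for example, in Cloud Shell). The function name create_web_app_network is illustrative and not part of the tutorial.

```shell
# Illustrative helper (not part of the original tutorial).
# Step 1 creates a custom-mode VPC network, so no subnets are added automatically.
# Step 2 adds one IPv4 subnet in us-central1 with the 10.2.0.0/24 primary range.
create_web_app_network() {
    gcloud compute networks create web-app-vpc --subnet-mode=custom &&
    gcloud compute networks subnets create web-app-vpc-subnet \
        --network=web-app-vpc \
        --region=us-central1 \
        --range=10.2.0.0/24
}
```

Run create_web_app_network once. If the network already exists because you created it in the console, the first command fails and the subnet command is skipped.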

Create a firewall rule

After the VPC network is created, set up a firewall rule to allow HTTP traffic to the VPC network:

Note: This example creates an ingress allow VPC firewall rule whose target is all instances in the network. For production applications, consider using a more specific target. You can also use rules in a global network firewall policy, regional network firewall policy, or hierarchical firewall policy. For more information, see Firewall policies and best practices for network security.

In the Google Cloud console, go to the Firewalls page.

Go to Firewalls
Click Create firewall rule.
In the Name field, enter allow-web-app-http.
Set Network to web-app-vpc.
Make sure that the following options are set as given:
- Direction of traffic option is set to Ingress.
- Action on match option is set to Allow.
In the Targets drop-down, select All instances in the network.
Set Source filter to IPv4 ranges.
In the Source IP ranges field, enter 130.211.0.0/22, 35.191.0.0/16 to allow for load balancer health checks.

Note: Health check probes for the load balancer come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16. For this tutorial, your health check uses the HTTP protocol, so the firewall rule should allow connections to port 80. For more information on firewall rules for health checks, see Probe IP ranges and firewall rules.
Under Protocols and ports, do the following:
1. Select Specified protocols and ports.
2. Select TCP.
3. In the Ports field, enter 80 to allow access for HTTP traffic.
Click Create.
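
The firewall rule configured above can also be expressed as a single gcloud command. This is a sketch under the same assumptions as the console steps (network web-app-vpc, TCP port 80, and the two health check probe ranges). The wrapper function name is hypothetical.

```shell
# Illustrative helper (not part of the original tutorial).
# Creates an ingress allow rule for TCP port 80 from the health check probe ranges.
# No target tags or service accounts are given, so the rule applies to all
# instances in the network, matching the console selection.
create_web_app_firewall_rule() {
    gcloud compute firewall-rules create allow-web-app-http \
        --network=web-app-vpc \
        --direction=INGRESS \
        --action=ALLOW \
        --rules=tcp:80 \
        --source-ranges=130.211.0.0/22,35.191.0.0/16
}
```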

Create an instance template

Create a template that you will use to create a group of VM instances. Each instance created from the template launches a demo web application by using a startup script.

In the Google Cloud console, go to the Instance templates page.

Go to Instance templates
Click Create instance template.
Under Name, enter load-balancing-web-app-template.
Under Machine configuration, set the Machine type to e2-medium.
Click the Advanced options section to expand.
Click the Networking section and do the following:
1. In the Network interfaces section, delete any existing network interfaces by clicking the delete icon next to them.
2. Click Add a network interface, and then select the web-app-vpc network. This forces each instance created with this template to run on the previously created network.
3. In the Subnetwork drop-down, select web-app-vpc-subnet.
4. Click Done.

Click the Management section and do the following:

In the Automation section, enter the following startup script:

apt-get update
apt-get -y install git python3-pip python3-venv
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
python3 -m venv venv
./venv/bin/pip3 install -Ur ./python-docs-samples/compute/managed-instances/demo/requirements.txt
./venv/bin/pip3 install gunicorn
./venv/bin/gunicorn --bind 0.0.0.0:80 app:app --daemon --chdir ./python-docs-samples/compute/managed-instances/demo

The script gets, installs, and launches the web application when a VM instance starts up.

Leave the default values for the other options.
Click Create.

Wait until the template is created before continuing.
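
A command-line version of the template can be sketched as follows. It assumes that you have saved the startup script shown above to a local file. The file name passed as the first argument (startup.sh in the usage note) and the function name are assumptions made for illustration.

```shell
# Illustrative helper (not part of the original tutorial).
# Usage: create_web_app_template PATH_TO_STARTUP_SCRIPT
# --region names the region of the subnet; --metadata-from-file loads the
# startup script from the given local file into the startup-script metadata key.
create_web_app_template() {
    gcloud compute instance-templates create load-balancing-web-app-template \
        --machine-type=e2-medium \
        --network=web-app-vpc \
        --subnet=web-app-vpc-subnet \
        --region=us-central1 \
        --metadata-from-file=startup-script="$1"
}
```

For example, create_web_app_template startup.sh creates the template from a script saved as startup.sh in the current directory.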

Create a regional managed instance group

To run the web application, use the instance template to create a regional managed instance group:

In the Google Cloud console, go to the Instance groups page.

Go to Instance groups
Click Create instance group.
For Name, enter load-balancing-web-app-group.
For Instance template, select load-balancing-web-app-template.
Set Number of instances to 6. If this field is disabled, turn off autoscaling first.

To turn off autoscaling, go to the Autoscaling section. In the Autoscaling mode drop-down, select Off: do not autoscale.

Pro Tip: When creating a regional managed instance group, Compute Engine recommends that you provision enough instances so that, if all of the instances in any one zone are unavailable, the remaining instances still meet the minimum number of instances that you require. However, provisioning more instances than you need might incur additional costs. For more information, see How to increase availability by overprovisioning.
For Location, select Multiple zones.

Pro Tip: To ensure your application is available during extreme events, like zonal outages, Compute Engine recommends that you distribute your application across multiple zones.
For Region, select us-central1.
For Zones, select the following zones from the drop-down list:
- us-central1-b
- us-central1-c
- us-central1-f
Leave the default values for the other options.
Click Create. This redirects you back to the Instance groups page.

You might need to wait a few minutes until all of the instances in the group are running.
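
The overprovisioning advice above comes down to simple arithmetic. The sketch below assumes that the group spreads its instances evenly across zones, so the fullest zone holds the total divided by the zone count, rounded up. Both function names are illustrative.

```shell
# Instances still running if the fullest zone becomes unavailable.
# Usage: instances_after_zone_loss TOTAL_INSTANCES ZONE_COUNT
instances_after_zone_loss() {
    # The fullest zone holds ceil(TOTAL / ZONES) instances.
    echo $(( $1 - ($1 + $2 - 1) / $2 ))
}

# Smallest group size that keeps at least MIN_INSTANCES running after
# losing any one zone. ZONE_COUNT must be 2 or more.
# Usage: instances_needed MIN_INSTANCES ZONE_COUNT
instances_needed() {
    # ceil(MIN * ZONES / (ZONES - 1)) using integer arithmetic.
    echo $(( ($1 * $2 + $2 - 2) / ($2 - 1) ))
}

instances_after_zone_loss 6 3   # prints 4
instances_needed 4 3            # prints 6
```

With this tutorial's values, 6 instances across 3 zones leave 4 instances running during a single-zone outage.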

Configuring the load balancer

To use a load balancer to direct traffic to your web application, you must reserve an external IP address to receive all incoming traffic. Then, create a load balancer that accepts traffic from that IP address and redirects that traffic to the instance group.

Reserve a static IP address

Use a global static external IP address to provide the load balancer with a single point of entry for receiving all user traffic. Compute Engine preserves static IP addresses even if you change or delete any affiliated Google Cloud resources. This allows the web application to always have the same entry point, even if other parts of the web application might change.

In the Google Cloud console, go to the IP addresses page.

Go to IP addresses
Click Reserve external static IP address.
In the Name field, enter web-app-ipv4.
Set IP version to IPv4.
Set Type to Global.
Click Reserve.

Create a load balancer

This section explains the steps required to create a global load balancer that directs HTTP traffic.

This load balancer uses a frontend to receive incoming traffic and a backend to distribute this traffic to healthy instances. Because the load balancer is made of multiple components, this task is divided into five parts:

Select the load balancer type
Name the load balancer
Configure the frontend
Configure the backend
Review and finalize

Complete all the parts to create the load balancer.

Note: For simplicity, this tutorial uses an HTTP load balancer. To learn how to support HTTPS and HTTP/2, see Creating content-based HTTP(S) load balancing. For other types of traffic, see Choosing a load balancer.

Select the load balancer type

In the Google Cloud console, go to the Load balancing page.

Go to Load balancing
Click Create load balancer.
For Type of load balancer, select Application Load Balancer (HTTP/HTTPS) and click Next.
For Public facing or internal, select Public facing (external) and click Next.
For Global or single region deployment, select Best for global workloads and click Next.
For Load balancer generation, select Global external Application Load Balancer and click Next.
Click Configure.

Name the load balancer

In the left panel, for Load balancer name, enter web-app-load-balancer.

Configure the frontend

On the Frontend configuration page, under Name, enter web-app-ipv4-frontend.
Set the Protocol to HTTP.
Set the IP version to IPv4.
Set the IP address to web-app-ipv4.
Set the Port to 80.
Click Done to create the frontend.

Configure the backend

In the left panel, click Backend configuration.
Click the Backend services & backend buckets drop-down to open a menu, and then click Create a backend service.
In the new window, for the Name of the backend service, enter web-app-backend.
In the Backends section, do the following:
1. Set Instance group to load-balancing-web-app-group.
2. Set Port numbers to 80. This allows HTTP traffic between the load balancer and the instance group.
3. Under Balancing mode, select Utilization.
4. Click Done.
Create the health check for the backend of the load balancer as follows:
Pro Tip: Health checks are used for both load balancing and autohealing, but for different purposes:
- Health checks for load balancing are used for detecting unresponsive instances and directing traffic away from them.
- Health checks for autohealing are used for detecting and recreating failed instances.
Use separate health checks for load balancing and for autohealing. Using the same health check for these services would remove the distinction between unresponsive instances and failed instances, causing unnecessary latency and/or unavailability for your users. For more information, see Health check concepts.
1. Click the Health check drop-down, and then click Create a health check. A new window opens.
2. In the new window under Name, enter web-app-load-balancer-check.
3. Set the Protocol to HTTP.
4. Under Port, enter 80.
5. For this tutorial, set the Request path to /health, which is a path that the demo web application is set up to respond to.
6. Set the following Health criteria:
  1. Set Check interval to 3 seconds. This defines the amount of time from the start of one probe to the start of the next one.
  2. Set Timeout to 3 seconds. This defines the amount of time that Google Cloud waits for a response to a probe. Its value must be less than or equal to the check interval.
  3. Set Healthy Threshold to 2 consecutive successes. This defines the number of sequential probes that must succeed in order for the instance to be considered healthy.
  4. Set Unhealthy Threshold to 2 consecutive failures. This defines the number of sequential probes that must fail in order for the instance to be considered unhealthy.
  Pro Tip: For information about refining the Check interval and Timeout values for your own application, see How health checks work. For detailed information about optimizing and measuring latency, see Optimizing Application Latency with Load Balancing.
7. Click Create to create the health check.
Leave the default values for the other options.
Click Create to create the backend service.
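
The health criteria above interact: an instance changes state only after the threshold number of consecutive probes agree. As a rough estimate, the time to detect a failed instance is about the check interval multiplied by the unhealthy threshold. This is a simplifying assumption for illustration that ignores probe timeouts and other probing details. The function name is hypothetical.

```shell
# Rough estimate of the seconds needed to mark an instance unhealthy.
# Usage: estimate_detection_seconds CHECK_INTERVAL TIMEOUT UNHEALTHY_THRESHOLD
estimate_detection_seconds() {
    # The timeout must be less than or equal to the check interval.
    if [ "$2" -gt "$1" ]; then
        echo "timeout ($2 s) must not exceed the check interval ($1 s)" >&2
        return 1
    fi
    # One check interval per consecutive failed probe.
    echo $(( $1 * $3 ))
}

estimate_detection_seconds 3 3 2   # prints 6
```

With this tutorial's settings, the estimate is about 6 seconds before the load balancer stops sending traffic to an unresponsive instance.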

Review and finalize

Verify your load balancing settings before creating the load balancer:

In the left panel of the Create global external Application Load Balancer page, click Review and finalize.
On the Review and finalize page, verify that Frontend uses an IP address with a Protocol of HTTP.
On the same page, verify the following Backend settings:
- The Backend service is web-app-backend.
- The Endpoint protocol is HTTP.
- The Health check is web-app-load-balancer-check.
- The Instance group is load-balancing-web-app-group.
Click Create to finish creating the load balancer.

You might need to wait a few minutes for the load balancer to finish being created.

Test the load balancer

Verify that you can connect to the web application by using the load balancer as follows:

In the Google Cloud console, go to the Load balancing page.

Go to Load balancing
In the Name column, click web-app-load-balancer to expand the load balancer you just created.
To connect to the web application using the external static IP address, do the following:
1. In the Frontend section, copy the IP address shown in the IP:Port column.
2. Open a new browser tab and paste the IP address into the address bar. This should display the demo web application.
Notice that, whenever you refresh the page, the load balancer connects to different instances in different zones. This happens because you are not connecting to an instance directly; you are connecting to the load balancer, which selects the instance you are redirected to.

When you are done, close the browser tab for the demo web application.
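
You can make a similar check from a terminal by requesting the /health path through the load balancer. The following is a sketch that assumes curl is installed. The function name and the example address in the usage note are placeholders, not values from this tutorial.

```shell
# Prints the HTTP status code returned for the /health path.
# Usage: lb_health_status LOAD_BALANCER_IP
lb_health_status() {
    # -s hides progress output, -o /dev/null discards the response body,
    # and -w prints only the numeric status code.
    curl -s -o /dev/null -w '%{http_code}\n' "http://$1/health"
}
```

For example, lb_health_status 203.0.113.10 queries a placeholder address. Substitute the IP address that you copied from the Frontend section. A 200 status suggests that the load balancer reached an instance that reports itself healthy.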

Simulating a zonal outage

You can observe the functionality of the load balancer by simulating the widespread unavailability of a zonal outage. This simulation works by forcing all of the instances located in a specified zone to report an unhealthy status on the /health request path. When these instances report an unhealthy status, they fail the load balancing health check, prompting the load balancer to stop directing traffic to these instances.

Monitor which zones the load balancer is directing traffic to.
1. In the Google Cloud console, go to Cloud Shell.
  
  Open Cloud Shell
  
  Cloud Shell opens in a pane of the Google Cloud console. It can take a few seconds for the session to initialize.
  
  Pro Tip: You can open Cloud Shell from any Google Cloud console page by using the Activate Cloud Shell button.
2. Save the static external IP address of your load balancer as follows:
  1. Get the external IP address from the frontend forwarding rule of the load balancer by entering the following command in your terminal:
```
gcloud compute forwarding-rules describe web-app-ipv4-frontend --global
```
    The output looks as follows. Copy the EXTERNAL_IP_ADDRESS from the output.
```
IPAddress: EXTERNAL_IP_ADDRESS
...
```
  2. Create a local bash variable:
```
export LOAD_BALANCER_IP=EXTERNAL_IP_ADDRESS
```
    Replace EXTERNAL_IP_ADDRESS with the external IP address that you copied.
3. To monitor which zones the load balancer is directing traffic to, run the following bash script:
```
while true
do
    # Fetch the demo page through the load balancer.
    BODY=$(curl -s "$LOAD_BALANCER_IP")
    # Extract the instance name suffix and the zone of the instance that served the page.
    NAME=$(echo -n "$BODY" | grep "load-balancing-web-app-group" | perl -pe 's/.+?load-balancing-web-app-group-(.+?)<.+/\1/')
    ZONE=$(echo -n "$BODY" | grep "us-" | perl -pe 's/.+?(us-.+?)<.+/\1/')
    echo "$ZONE"
    sleep 2 # Wait for 2 seconds
done
```
  This script continuously attempts to connect to the web application by using the IP address for the frontend of the load balancer, and outputs which zone the web application is running from for each connection.
  
  The resulting output should include zones us-central1-b, us-central1-c, and us-central1-f:
```
us-central1-f
us-central1-b
us-central1-c
us-central1-f
us-central1-f
us-central1-c
us-central1-f
us-central1-c
us-central1-c
```
  Keep this terminal open.
  
  Note: Leave this monitor running. To stop it at any time, press Control+C in the terminal.
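
To see the traffic distribution as counts rather than as a scrolling list, the monitoring idea above can be sketched as a fixed-size sample. This is a hedged sketch, not part of the tutorial: the helper names are hypothetical, LOAD_BALANCER_IP must already be set, and extract_zone assumes that the page body contains a zone name of the form us-central1-f, as in the output above.

```bash
# Print the first zone name (for example, us-central1-f) found on stdin.
extract_zone() {
  grep -oE 'us-[a-z]+[0-9]+-[a-z]' | head -n 1
}

# Read one zone name per line on stdin and print "ZONE COUNT" for each zone.
tally_zones() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Send $1 requests to the load balancer and tally the zones that answered.
sample_zones() {
  local i
  for i in $(seq 1 "$1"); do
    curl -s "$LOAD_BALANCER_IP" | extract_zone
    sleep 1
  done | tally_zones
}
```

Running `sample_zones 30` in a separate terminal prints one line per zone with the number of responses that it served. During a simulated outage, the disabled zone should drop out of the tally.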
While your monitor is running, begin simulating the zonal outage.
1. In Cloud Shell, open a second terminal session by clicking the Add button.
2. Create a local bash variable for the project ID:
```
export PROJECT_ID=PROJECT_ID
```
  Replace PROJECT_ID with the project ID for your current project, which is displayed on each new line in Cloud Shell:
```
user@cloudshell:~ (PROJECT_ID)$
```
3. Create a local bash variable for the zone that you want to disable. To simulate a failure of zone us-central1-f, use the following command:
```
export DISABLE_ZONE=us-central1-f
```
  Then, run the following bash script. This script causes the demo web application instances in the disabled zone to output unhealthy responses to the load balancer health check. Unhealthy responses prompt the load balancer to direct traffic away from these instances.
```
export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($DISABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
for i in $MACHINES;
do
  NAME=$(echo "$i" | cut -f1 -d,)
  IP=$(echo "$i" | cut -f2 -d,)
  echo "Simulating zonal failure for zone $DISABLE_ZONE, instance $NAME"
  curl -q -s --retry 2 "http://$IP/makeUnhealthy" >/dev/null
done
```
  After a short delay, the load balancer stops directing traffic to the instances in the unhealthy zone, so the output in the first terminal window stops listing zone us-central1-f:
```
us-central1-c
us-central1-c
us-central1-c
us-central1-b
us-central1-b
us-central1-c
us-central1-b
us-central1-c
us-central1-c
```
  This indicates that the load balancer is directing traffic only to the healthy, responsive instances.
  
  Note: Optionally, you can repeat this step to simulate failures of zones us-central1-b and us-central1-c.
  
  Keep both terminals open.
4. In the second terminal, create a local bash variable for the zone that you want to restore. To restore traffic to zone us-central1-f, use the following command:
```
export ENABLE_ZONE=us-central1-f
```
  Then, run the following bash script. This script causes the demo web application instances in the enabled zone to output healthy responses to the load balancer health check. Healthy responses prompt the load balancer to begin distributing traffic back toward these instances.
```
export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($ENABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
for i in $MACHINES;
do
  NAME=$(echo "$i" | cut -f1 -d,)
  IP=$(echo "$i" | cut -f2 -d,)
  echo "Simulating zonal restoration for zone $ENABLE_ZONE, instance $NAME"
  curl -q -s --retry 2 "http://$IP/makeHealthy" >/dev/null
done
```
  After a few minutes, the output in the first terminal window begins to list zone us-central1-f again:
```
us-central1-b
us-central1-b
us-central1-c
us-central1-f
us-central1-c
us-central1-c
us-central1-b
us-central1-c
us-central1-f
```
  This indicates that the load balancer is directing incoming traffic to all zones again.
  
  Note: If you also disabled zone us-central1-b or zone us-central1-c, you can repeat this step to restore traffic to them.
  
  Close both terminals when you have finished.
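
The disable and restore scripts above differ only in the zone and in the endpoint that they call, so they can be folded into one function. The following is a sketch under stated assumptions, not part of the tutorial: set_zone_health is a hypothetical name, PROJECT_ID must already be set, and the instances are assumed to serve the /makeHealthy and /makeUnhealthy endpoints over plain HTTP.

```bash
# Hypothetical helper that combines the two scripts above.
# $1: zone, for example us-central1-f
# $2: endpoint to call on each instance: makeUnhealthy or makeHealthy
set_zone_health() {
  local zone=$1 action=$2 name ip
  case "$action" in
    makeHealthy|makeUnhealthy) ;;
    *)
      echo "usage: set_zone_health ZONE makeHealthy|makeUnhealthy" >&2
      return 1
      ;;
  esac
  # List "name,external IP" pairs for the zone, keep only the instances in
  # the tutorial's instance group, and call the endpoint on each of them.
  gcloud --project="$PROJECT_ID" compute instances list \
      --filter="zone:($zone)" \
      --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" |
    grep "load-balancing-web-app-group" |
    while IFS=, read -r name ip; do
      echo "Calling /$action in zone $zone, instance $name"
      curl -q -s --retry 2 "http://$ip/$action" >/dev/null
    done
}
```

With this helper, `set_zone_health us-central1-f makeUnhealthy` simulates the outage and `set_zone_health us-central1-f makeHealthy` restores the zone.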

Clean up

After you finish the tutorial, you can clean up the resources that you created so that they stop using quota and incurring charges. The following sections describe how to delete or turn off these resources.

If you created a separate project for this tutorial, delete the entire project. Otherwise, if the project has resources that you want to keep, only delete the resources created in this tutorial.

Deleting the project

Caution: Deleting a project has the following effects:

Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

In the Google Cloud console, go to the Manage resources page.
Go to Manage resources
In the project list, select the project that you want to delete, and then click Delete.
In the dialog, type the project ID, and then click Shut down to delete the project.
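
The console steps above have a command-line counterpart. As a minimal sketch, assuming that the gcloud CLI is installed and authorized, the hypothetical delete_project helper below wraps gcloud projects delete and refuses to run without a project ID.

```bash
# Hypothetical wrapper around the gcloud counterpart of the console steps.
# $1: ID of the project to shut down.
delete_project() {
  if [ -z "${1:-}" ]; then
    echo "usage: delete_project PROJECT_ID" >&2
    return 1
  fi
  # gcloud asks for confirmation before it shuts the project down.
  gcloud projects delete "$1"
}
```

The same caution applies as in the console: everything in the project is deleted.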

Deleting specific resources

The following sections describe how to delete the specific resources that you created during this tutorial.

Deleting the load balancer

In the Google Cloud console, go to the Load balancing page.

Go to Load balancing
Click the checkbox next to web-app-load-balancer.
Click Delete at the top of the page.
In the new window, select all checkboxes. Then, click Delete load balancer and selected resources to confirm the deletion.

Deleting the static external IP address

Wait until the load balancer is deleted before deleting the static external IP address.

In the Google Cloud console, go to the External IP addresses page.

Go to External IP addresses
Click the checkbox next to web-app-ipv4.
Click Release static address at the top of the page. In the new window, click Release to confirm the release.

Deleting the instance group

Wait until the load balancer is deleted before deleting the instance group.

In the Google Cloud console, go to the Instance groups page.
Go to Instance groups
Select the checkbox for your load-balancing-web-app-group instance group.
To delete the instance group, click Delete.

Deleting the instance template

You must finish deleting the instance group before deleting the instance template. You cannot delete an instance template if a managed instance group is using it.

In the Google Cloud console, go to the Instance templates page.

Go to Instance templates
Click the checkbox next to load-balancing-web-app-template.
Click Delete at the top of the page. In the new window, click Delete to confirm the deletion.
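
If you prefer the command line, the three deletions above (static address, instance group, and instance template) can be sketched with the gcloud CLI. This is a hedged sketch rather than part of the tutorial: cleanup_compute_resources is a hypothetical name, and the code assumes that the load balancer is already deleted, that the static address is global, and that the instance group is a regional group in us-central1.

```bash
# Hypothetical helper: deletes the resources in dependency order.
# Each step runs only if the previous one succeeded, so the instance
# template is never deleted while the instance group still uses it.
cleanup_compute_resources() {
  # --quiet skips the interactive confirmation prompts.
  gcloud compute addresses delete web-app-ipv4 --global --quiet &&
  gcloud compute instance-groups managed delete load-balancing-web-app-group \
      --region=us-central1 --quiet &&
  gcloud compute instance-templates delete load-balancing-web-app-template --quiet
}
```

The VPC network is left out on purpose, because deleting it from the command line can also require removing the firewall rules and subnets that reference it.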

Deleting the VPC network

You must finish deleting the instance group before deleting the VPC network. You cannot delete a VPC network if other resources still use it.

In the Google Cloud console, go to the VPC networks page.

Go to VPC networks
Click web-app-vpc.
Click Delete VPC network at the top of the page. In the new window, click Delete to confirm the deletion.

What's next

Try another tutorial:
- Using autohealing for highly available applications.
- Using autoscaling for highly scalable applications.
Learn more about Managed Instance Groups.
Learn more about Load Balancing.
Learn more about Optimizing Application Latency with Load Balancing.
Learn more about Designing Robust Systems.
Learn more about Building Scalable and Resilient Web Applications on Google Cloud.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-07-16 UTC.

