Redundancy and failover

Redundancy is an important part of ensuring the high availability of your Fastly service for end users. Delivering content from our edge cache or constructing responses at the edge is extremely fast and not dependent on the availability of any other system.

To maximize the number of end user requests you can answer from the edge:

  1. Follow caching best practices to ensure the maximum amount of content is cacheable at the edge, for as long as possible.
  2. Enable serving stale content, which allows Fastly to use stale content while refreshing it from your origin or while your origin servers are unavailable.
  3. Enable shielding to focus requests from the edge through a single POP, reducing the load on your origins and maximizing cache performance.

When requests must be sent to a backend, you can also take steps to make retrieving content from backend servers more reliable. This starts with ensuring that every backend has a health check. With health information available, you can manage redundancy in one of two ways:

  • Load balancing, where you provide more than one backend and spread traffic across them all the time. If one fails, Fastly adapts automatically to send traffic only to healthy backends. Learn more about load balancing.
  • Fallback, where you provide more than one backend but use only one by default, and switch to the second one only if the first fails.

This page describes a variety of redundancy scenarios and the tools available to implement solutions.

HINT: Some of these patterns require you to know the VCL name of a backend. The VCL name is set in the backend declaration in your VCL or, if the backend was generated by Fastly, is visible in the generated VCL. View generated VCL using the API or by choosing Show VCL in the web interface.

WARNING: The Compute platform does not expose the health status of backends, so it is not possible to create failover logic in Compute programs.

Using automatic load balancing

Using automatic load balancing in VCL services is a simple way to perform a number of straightforward failover patterns. Solutions that use automatic load balancing for failover support shielding but do not support weights, quorum, or non-random allocation policies such as consistent hashing. Automatic load balancing is not available in Compute services.

Basic fallback

A single primary backend taking all traffic, with a single backup backend to be used if the primary backend is down.

  1. Configure automatic load balancing on both backends.
  2. Add a health check to both backends.
  3. Create a request condition (via API or web interface) on the backup backend.
    • Set the condition to backend.{PRIMARY_BACKEND_NAME}.healthy == false.

Do not apply any condition to the primary backend. When requests are received, Fastly will first consider the backup backend, because it has a condition. Provided that the primary backend is healthy, the condition will be false, so the backup backend will not be selected, leaving the primary backend to be selected by default.
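The selection behavior described above can be pictured as hand-written VCL. The following is only a sketch of the equivalent logic, not the VCL that Fastly generates for automatic load balancing, and F_primary and F_backup are hypothetical backend names. Unlike the condition-based setup, assigning req.backend directly like this would not preserve shielding.

```vcl
sub vcl_recv {
  #FASTLY RECV

  # Default choice: the primary backend (hypothetical name) takes all traffic.
  set req.backend = F_primary;

  # Equivalent of the request condition attached to the backup backend:
  # it only matches while the primary is reported unhealthy.
  if (!backend.F_primary.healthy) {
    set req.backend = F_backup;
  }
}
```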

Fallback to load-balanced group

Multiple primary backends with traffic distributed between them, and multiple backup backends to be used if all the primary backends are down. In this scenario, if one primary backend is down, Fastly will continue to use only the remaining primary backends and won't switch to the backup group unless all primary backends are down.

  1. Configure automatic load balancing on all backends.
  2. Add health checks to all backends.
  3. Create a request condition (via API or web interface) on one of the backup backends:
    • Set the condition to backend.autodirector_.healthy == false.
  4. Apply the same condition to the other backup backend (or backends, if there are more than two):
    • Don't apply any condition to the primary backends.
    • Don't create more than one condition: it's important that all backup backends share the same condition.

When you enable automatic load balancing, Fastly generates a director in your VCL called autodirector_, which groups together all the backends that don't have any conditions attached. Applying a condition to only the backup backends will exclude them from the primary director, but will use them in preference to the primary director if the condition matches.

Active-active with failover

In some cases you may have backends that are selected based on distinct criteria, for example:

  • geographic routing (e.g. requests from the USA go to backends A and B)
  • path prefix routing (e.g. requests to URLs starting /account go to backends C and D)

Even if all your backends can serve all kinds of traffic, this pattern can allow lower latency routes to be preferred, or for backends to use their internal caches more efficiently. But if the designated origin is down, you may want to use one of the others as a backup. To do this:

  1. Configure automatic load balancing on all backends.
  2. Add health checks to all backends.
  3. Create a request condition (via API or web interface) for each distinct group of backends other than the default one:
    • Set a condition that combines the health status of the other backends with the criteria for selecting this group of backends in normal operation. For example:
      backend.{OTHER_BACKEND_NAME}.healthy == false || client.geo.country_code == "US"
  4. Apply the conditions to each backend.

HINT: For example, if you have four backends (usa1, usa2, eur1, and eur2), you could create a condition ("US traffic") that distinguishes the US backends from the European ones:

client.geo.country_code == "US" || (!backend.eur1.healthy && !backend.eur2.healthy)

Using directors

Using custom director declarations in VCL services is the most flexible way to design load balancing and failover solutions. Directors support quorum, weighting, a choice of selection policies, and can be nested. This section describes fallback scenarios using custom directors.

Because directors can only be assigned using explicit custom VCL, using them will typically override Fastly's backend selection logic and therefore prevent shielding from working. However, this can be worked around by allowing a shielded backend to be auto-assigned and then replacing it with a director in vcl_miss and vcl_pass - see combining with shielding for details.
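As a rough sketch of that workaround, the director can be swapped in only when the request is about to go to an origin rather than to a shield POP. This assumes that vcl_recv leaves req.backend to Fastly's automatic assignment, and that the req.backend.is_origin variable is used to tell the two cases apart. The director and backend names are illustrative, borrowed from the fallback example in this section.

```vcl
# Illustrative director: primary first, backup only if the primary is unhealthy.
director origin_director fallback {
  { .backend = F_prod; }
  { .backend = F_prod_backup; }
}

sub vcl_miss {
  #FASTLY MISS
  # Replace the auto-assigned backend only when this fetch is going to origin,
  # so requests routed to a shield POP are left untouched.
  if (req.backend.is_origin) {
    set req.backend = origin_director;
  }
}

sub vcl_pass {
  #FASTLY PASS
  # Same substitution for requests that bypass the cache.
  if (req.backend.is_origin) {
    set req.backend = origin_director;
  }
}
```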

Directors are not available in Compute services.

Basic fallback

A single primary backend taking all traffic, with a single secondary backend to be used if the primary backend is down.

director origin_director fallback {
  { .backend = F_prod; }
  { .backend = F_prod_backup; }
}
sub vcl_recv {
  #FASTLY RECV
  set req.backend = origin_director;
}

Fallback to load-balanced group

Multiple primary backends with traffic distributed between them, and multiple backup backends to be used if the primary backends are down. In this scenario, in contrast to the solution powered by automatic load balancing above, Fastly will switch to the backup group if any of the primary backends is down (because the director declaration sets .quorum = 100%).

Assuming you have defined backends: F_primary1, F_primary2, F_primary3, F_backup1, F_backup2:

director primary_group random {
  .quorum = 100%;
  .retries = 3;
  { .backend = F_primary1; .weight = 1; }
  { .backend = F_primary2; .weight = 3; }
  { .backend = F_primary3; .weight = 3; }
}
director backup_group random {
  { .backend = F_backup1; .weight = 1; }
  { .backend = F_backup2; .weight = 3; }
}
director origin_director fallback {
  { .backend = primary_group; }
  { .backend = backup_group; }
}
sub vcl_recv {
  #FASTLY RECV
  set req.backend = origin_director;
}

Active-active with failover

Just as with automatic load balancing, you can use custom directors to fail over to backends that are normally active for other traffic. As in the earlier example, traffic is split between two pools of servers based on the location of the end user, and if the preferred pool is not available, the other pool is used.

# One load-balanced pool per region
director director_eur random {
  { .backend = F_eur1; .weight = 1; }
  { .backend = F_eur2; .weight = 1; }
}
director director_usa random {
  { .backend = F_usa1; .weight = 1; }
  { .backend = F_usa2; .weight = 1; }
}
# Each fallback director prefers its own region and falls back to the other
director origin_director_usa fallback {
  { .backend = director_usa; }
  { .backend = director_eur; }
}
director origin_director_eur fallback {
  { .backend = director_eur; }
  { .backend = director_usa; }
}
sub vcl_recv {
  #FASTLY RECV
  if (client.geo.country_code == "US") {
    set req.backend = origin_director_usa;
  } else {
    set req.backend = origin_director_eur;
  }
}

Manually using VCL

The req.backend.healthy and backend.{NAME}.healthy VCL variables can be used to query the health status of the currently selected backend, or a nominated backend, at runtime. This information can be used as part of a hand-built routing or fallback solution. For example, if you have deployed infrastructure at geographically distributed locations, you may want to map Fastly edge locations to the most appropriate origin location. However, if one of your origins is down, you might want to instead use the next closest one.
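A minimal sketch of such a hand-built mapping might look like the following. F_eur and F_usa are hypothetical backend names, and the pattern matched against server.region assumes that European POP regions are reported with an "EU" prefix; adjust both to your own setup.

```vcl
sub vcl_recv {
  #FASTLY RECV

  # Map the POP's region to the nearest origin (region pattern is an assumption).
  if (server.region ~ "^EU") {
    set req.backend = F_eur;
    # If the preferred origin is unhealthy, use the next closest one.
    if (!req.backend.healthy) {
      set req.backend = F_usa;
    }
  } else {
    set req.backend = F_usa;
    if (!req.backend.healthy) {
      set req.backend = F_eur;
    }
  }
}
```

Here req.backend.healthy reports on whichever backend was just selected, whereas backend.{NAME}.healthy could be used instead to check a specific backend without selecting it first.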

These same variables can also be used if you wish to create an API to query the health status of your backends. In the following example, requests to /api/origin-status will be intercepted and a dynamic response created in JSON to return the current health status of each backend:
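One way such an endpoint could be sketched is shown below. This is an illustration rather than a definitive implementation: the backend names F_prod and F_prod_backup, the internal status code 601, and the JSON field names are all assumptions chosen for the example.

```vcl
sub vcl_recv {
  #FASTLY RECV
  # Intercept the status path and jump to vcl_error with a private status code.
  if (req.url.path == "/api/origin-status") {
    error 601;
  }
}

sub vcl_error {
  #FASTLY ERROR
  declare local var.primary STRING;
  declare local var.backup STRING;

  if (obj.status == 601) {
    # Render each backend's health as a JSON boolean literal.
    set var.primary = if(backend.F_prod.healthy, "true", "false");
    set var.backup = if(backend.F_prod_backup.healthy, "true", "false");

    set obj.status = 200;
    set obj.response = "OK";
    set obj.http.Content-Type = "application/json";
    # Build a one-line JSON object: the POP name plus one field per backend.
    synthetic {"{"pop":""} + server.datacenter + {"","F_prod":"} + var.primary + {","F_prod_backup":"} + var.backup + {"}"};
    return(deliver);
  }
}
```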

The JSON returned includes the server.datacenter variable, which identifies the Fastly POP, since the health status of a particular backend may be different in each POP. For a convenient way to make the request to every Fastly POP at once, see the /content/edge_check API endpoint.