Serving stale

When your servers are down, or if they take a while to generate pages, end users should be able to benefit from being served cached content - even if it's slightly stale.

[Figure: illustration of the pattern concept]

Stale content is content stored by Fastly that has exceeded its freshness lifetime. Content will sometimes be served by Fastly even though it is stale, such as during a stale-while-revalidate window, or if an origin is sick and stale-if-error is set. For more background on stale content see staleness and revalidation.

VCL can be used to modify the standard stale-serving rules to improve outcomes in three scenarios:

  1. Where an origin is erroring (returning undesirable content such as a 500 Internal server error), and stale content exists, serve the stale content instead of the error.
  2. Where an origin is down, and stale content exists, serve the stale content instead of a Fastly-generated error page.
  3. Where an origin is erroring or down, and no stale content exists, serve a branded, custom error page rather than an unfiltered error from origin or a Fastly-generated error page.

These improved outcomes can be seen in the matrix below:

[Matrix: content state vs. origin state, showing the outcome for each combination]

Either the user will see the content they want, served from the edge (😀); they will get the content, but only after a blocking fetch to origin (😴); or they will see a custom branded error page (😐). Cells marked with a red triangle are scenarios in which the user would previously have seen an unfiltered error (😡), whether generated by Fastly or returned by your origin server, but will instead, thanks to this solution, get the improved outcome.

Let's see how this works.


Allow stale serving only at the edge

First, if necessary, address the possibility that you have multiple layers of caching in your configuration. It's common for customers to route traffic from Fastly data centers not to origin, but to a single designated Fastly data center, before routing it to origin. This is a technique called shielding and is designed to increase cache hit ratio and reduce origin traffic, as well as providing performance gains for the user, but can also be responsible for undesirable behaviors.

In this case, if a request arrives at a data center (e.g., Paris) that is not the shield for your configuration and that does not have the requested object, it will pass the request to the shield data center (say, Los Angeles). If the shield has the object, but it's stale, and it meets the requirements for serving stale, then the shield will serve the stale object to the edge data center, which will then cache it as a fresh object. You should prevent this, to avoid accidentally classifying stale content as fresh.

The vcl_recv subroutine runs for every inbound request, so use that to detect and flag requests that come from other Fastly data centers:

    sub vcl_recv {
      if (fastly.ff.visits_this_service > 0) {
        set req.max_stale_while_revalidate = 0s;
        set req.max_stale_if_error = 0s;
      }
    }

Setting req.max_stale_while_revalidate and req.max_stale_if_error will override any attempt to set either of the stale grace periods, whether automatically based on headers from the backend, or explicitly by setting beresp.stale_while_revalidate and beresp.stale_if_error. Setting the maximums to zero effectively disables stale serving for this request.

The fastly.ff.visits_this_service variable tells you the number of previous Fastly servers that have seen the current request in the context of the current service configuration. So if the value is non-zero, it indicates that the request may have come from another Fastly data center.

Use stale instead of undesirable responses from origin

The vcl_fetch subroutine runs when a response starts to be received from an origin server. Use this subroutine to detect responses from origin that are undesirable:

    sub vcl_fetch {
      if (beresp.status >= 500 && beresp.status < 600) {
        if (stale.exists) {
          return(deliver_stale);
        }
        error 503;
      }
    }

Fastly will, by default, accept and serve any response that is syntactically valid HTTP, which includes error responses, and will even do so in preference to serving a stale object that exists in cache. So when an origin server returns an undesirable response and a good version of the content still exists in cache (stale.exists), a simple improvement is to deliver that stale version (return(deliver_stale)) instead of the bad content.

IMPORTANT: An 'error' response from an origin server, that is, one with an HTTP status of 400 or higher, does not trigger an error within Fastly, since Fastly is designed to route HTTP messages, and a response such as a 404 Page not found is a perfectly valid HTTP message, as is 500 Internal Server Error. Fastly will only move to an error handling flow automatically if attempts to communicate with the origin fail at a network level. For this reason it can be helpful to refer to these kinds of responses as 'undesirable' or 'bad' content, rather than as errors.

If the origin serves an undesirable response, and a stale version does not exist, then to avoid serving the undesirable response to the end user, you can trigger an error with the error statement and pass control to the vcl_error subroutine.

Set some stale period defaults (optional)

The existence of stale content, and therefore the ability to discover it with stale.exists, depends on the content having a positive beresp.stale_if_error duration. An effective way to set this is by using the stale-if-error and stale-while-revalidate directives of the Cache-Control header, but if you're unable to do this at your origin, you could instead set them in VCL. Add this to the vcl_fetch subroutine:

    sub vcl_fetch {
      if (beresp.ttl > 0s) {
        set beresp.stale_while_revalidate = 60s;
        set beresp.stale_if_error = 86400s;
      }
    }

It's a good idea to set a short revalidation window, and a longer error window, because while you may not want to serve out of date content for very long, if the origin server is down, it's either that or an error message.
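If your origin can emit these directives itself, a response header like the following would have the same effect as the VCL above (the max-age value here is illustrative; the stale directives match the durations set in the snippet):

```http
Cache-Control: max-age=300, stale-while-revalidate=60, stale-if-error=86400
```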

Deal with errors

As discussed earlier, an undesirable response from origin, like a 500 Internal Server Error, is not an error to Fastly, because it's a valid HTTP message. But you can convert these undesirable responses into errors, as you did in the vcl_fetch subroutine via the error 503; statement. When that statement is executed, Fastly creates a synthetic error object and passes control to the vcl_error subroutine.

You will also end up in vcl_error directly (without running vcl_fetch) if Fastly encounters a network error while trying to reach the origin server. This can happen if the origin is offline, if it cannot negotiate an acceptable TLS session, if it times out (see bereq.first_byte_timeout), or if it responds with data that cannot be interpreted as HTTP. In this case, Fastly will trigger an error with a 503 status automatically.

    sub vcl_error {
      if (obj.status >= 500 && obj.status < 600) {
        if (stale.exists) {
          return(deliver_stale);
        }
        set obj.status = 503;
        set obj.response = "Service unavailable";
        set obj.http.Content-Type = "text/html";
        synthetic {"
          <!DOCTYPE html>
          <html>
            <body>
              <p>
                Sorry, we are currently experiencing problems fulfilling
                your request. We've logged this problem and we'll try to
                resolve it as quickly as possible.
              </p>
            </body>
          </html>
        "};
        return(deliver);
      }
    }

There are lots of possible reasons for ending up in vcl_error, including manually triggered errors that may have a custom status to trigger a specific synthetic response, so the first step is to isolate requests that are in vcl_error with a status code in the 5XX range.

If the request is here because the error statement was invoked from vcl_fetch, then you have already checked for stale content, but if the error was triggered by Fastly automatically, then the stale.exists check hasn't happened yet, and should be done now. If stale content exists, return(deliver_stale) to serve it. This caters for the scenario where an origin is down but not yet sick.

Where there is no stale content available, the only option now is to create content to serve to the user. This is often a better option than allowing the default Fastly-generated error content to be seen. This is done using the synthetic statement.

HINT: Another option at this point is to restart, and try a different backend server by changing req.backend, or to fetch error page content from a static object store by manipulating req.url as well.
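That restart approach might look something like the following sketch, assuming a second backend has been defined in the service (the name F_backup is hypothetical; use a backend that exists in your own configuration):

```vcl
sub vcl_error {
  # Hypothetical sketch: if there's no stale content and we haven't
  # already restarted, retry the request against a backup backend.
  # "F_backup" is an assumed backend name defined elsewhere in the service.
  if (obj.status >= 500 && obj.status < 600 && !stale.exists && req.restarts == 0) {
    set req.backend = F_backup;
    restart;
  }
}
```

Bear in mind that after a restart, processing begins again at vcl_recv, so any backend-selection logic there would need to avoid overriding this choice.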

Log the error (optional)

Requests that end up in a stale object being served will be flagged as such in the fastly_info.state variable, and will run vcl_log as normal, providing an opportunity to log the incident. However, you may want to log explicitly when a synthetic error is served, in which case add a line such as this to the vcl_error logic you already have:

    if (obj.status >= 500 && obj.status < 600) {
      # ...
      log "syslog " req.service_id " log-name :: We served a synthetic 503 for " req.url;
    }

Quick install

This solution can be added directly to an existing service in a Fastly account as a set of VCL snippets. The complete solution is available as a Fastly Fiddle: you can run it there, and use its INSTALL tab to customise and upload it to your service.

Once you have the code in your service, you can further customise it if you need to.

All code on this page is provided under both the BSD and MIT open source licenses.