Staleness and revalidation

When content is stored in Fastly's cache, it has a freshness lifetime, also known as a TTL or "time to live". This is the period of time during which Fastly will serve the content in response to a compatible request, without revalidating it with the origin server.

Once the freshness lifetime expires, the content will no longer be served directly and unconditionally from cache, but the expiry does not prompt the object to be deleted. Instead, it will be marked as stale. The way objects are treated once they transition to a stale state can be customized.

Objects may have one of three distinct types of stale state, in addition to being fresh. As an example, consider an object received by Fastly with the following response header:

Cache-Control: max-age=300, stale-while-revalidate=60, stale-if-error=86400

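When processed by the readthrough cache this object will, unless evicted earlier due to lack of use, pass through the following states:

[Figure: stale phases. The object is fresh for the first 300 seconds, then within its stale-while-revalidate period for 60 seconds, then within its stale-if-error period for a further 86400 seconds, and finally expired.]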

When requests are received that match a stale object, Fastly will process the object using the following steps:

  1. If the origin is sick (i.e. is currently failing a health check), and the object is within a stale-while-revalidate or stale-if-error period, then the stale object will be served.
  2. Otherwise, if the object has a validator (an ETag or Last-Modified header), then:
     a) If it is within a stale-while-revalidate period, it will be served immediately to the client and a conditional background fetch will be made to update it. See Revalidation.
     b) Otherwise, a blocking conditional fetch is made for the object. The revalidation process is exactly the same, but if the origin responds with a new object, it will be used for the pending client request as well as to update the cache.
  3. [VCL services only] Otherwise, if the object is within a stale-if-error period and your VCL explicitly returns deliver_stale from vcl_miss, vcl_fetch, or vcl_error, the stale object will be served.
  4. Otherwise, the stale object will be ignored and Fastly will fetch a new object from origin exactly as if there was no match in the cache.

The HTTP Cache-Control header directives stale-while-revalidate and stale-if-error (which trigger scenarios 1 and 2 above) are HTTP standards defined in RFC 5861 - HTTP Cache-Control Extensions for Stale Content.

WARNING: Content is not guaranteed to be stored for the entire freshness lifetime and, especially in the case of large objects that are not frequently requested, we may evict them sooner to make space for more popular objects.

HINT: If your website were a radio station, then the CDN edge cache would be like your regional transmitter towers - essential to extend your reach to a huge audience, but useless without a signal to broadcast. Large state broadcasters have long realized this and placed local recordings of content at transmission sites "just in case" the transmitter loses its uplink to home base.

Serving content from Fastly to end users is very fast and very reliable, but by default this only happens when that content is available (and fresh) in the cache. Using stale-while-revalidate, stale-if-error, and custom VCL that intercepts origin errors can significantly improve outcomes for end users.

Staleness behaviors and revalidation are supported directly by the readthrough cache interface and can be explicitly configured using the low level core cache interface. The simple cache interface does not support staleness or revalidation.

Revalidation

When a backend fetch is triggered by a cache object being stale, and the object has a validator (an ETag or Last-Modified header), Fastly will make a conditional GET request for the resource, by sending an If-None-Match and/or If-Modified-Since header as appropriate (if both validators are present, both headers are sent). If the stale object does not have a validator, the backend request will be a normal fetch to load the entire object.

If the response to a revalidation request has status 304 (Not Modified), the lifetime of the existing object is extended based on the standard rules for calculating cache TTL, but no other aspect of the existing cached response is modified. For example, after a successful revalidation, future requests for the object will receive the headers that were attached to the original response from origin that populated the cache, not the headers present on the revalidation response. A 304 response does not trigger vcl_fetch, so any edge code intended to modify Fastly's behavior in the vcl_fetch stage of the VCL lifecycle will not run.

If the revalidation request elicits a response other than a 304, the response will be processed as normal, and will invoke vcl_fetch in VCL services.

Fastly honors staleness-related caching directives as indicated above on both the VCL and Compute platforms, but in VCL services, behavior may also be controlled more precisely using edge code.

Stale content can be explicitly selected by VCL code (scenario 3 above), if it is available, in the following subroutines:

  • In vcl_fetch: if the origin returns a response which is valid HTTP, then Fastly will by default serve the received object, and if cacheable, use it to replace the stale object in cache. However, if the response is nonsensical or an error, you may prefer in that scenario to serve the stale content instead, by using return(deliver_stale).
  • In vcl_error: if, during a fetch to origin, Fastly encounters a network level error, such as finding the origin unreachable or being unable to negotiate an acceptable TLS session, we will trigger an error and move the VCL control flow to vcl_error directly, without running vcl_fetch. By default this will result in serving a Fastly-standard error page to the end user, but if stale content exists in cache you can opt to use this instead by using return(deliver_stale) from vcl_error.

It is also possible to switch to a stale object in vcl_miss but there are few reasons to do so.

The existence of stale content is discoverable in vcl_fetch and vcl_error using the stale.exists variable, which will only be true if the object is within a stale-if-error period. Stale content in the expired stale state cannot be used from VCL.
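As a minimal sketch of the two cases above, the following custom VCL serves stale content when the origin returns a 5xx response (vcl_fetch) or when a network-level error occurs (vcl_error). The 5xx status range is an assumption about what counts as an error for your service, and the #FASTLY macro lines assume this is custom VCL rather than a VCL snippet.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # Origin returned valid HTTP, but the status is a server error
  # (the 500-599 range is an illustrative choice, not a Fastly default).
  if (beresp.status >= 500 && beresp.status < 600) {
    # stale.exists is only true within a stale-if-error period.
    if (stale.exists) {
      return(deliver_stale);
    }
  }
}

sub vcl_error {
  #FASTLY error

  # Reached directly on network-level failures, without running vcl_fetch.
  if (obj.status >= 500 && obj.status < 600) {
    if (stale.exists) {
      return(deliver_stale);
    }
  }
}
```

When no stale object is available, neither branch returns, so the request falls through to the behavior it would otherwise have had.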

Setting either the req.hash_always_miss or the req.hash_ignore_busy variable to true negates the effect of the stale-while-revalidate Cache-Control directive.

Stale while revalidate: Eliminate origin latency

stale-while-revalidate tells caches that they may continue to serve a response after it becomes stale for up to the specified number of seconds, provided that they work asynchronously in the background to fetch a new one. For example, an origin server may provide a response to Fastly with the following headers:

Cache-Control: max-age=300, stale-while-revalidate=60
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"

Upon receiving this response, Fastly will store and reuse that response for up to 5 minutes (max-age=300) as normal. Once that freshness lifetime expires, the stale-while-revalidate directive allows Fastly to continue to serve the same content for up to another 60 seconds provided that we use that time to revalidate the cached content with the origin server in the background. As soon as a new response is available, it will replace the stale content and its own cache freshness rules will take effect. However, if the 60 second revalidation period expires and we haven't been able to get updated content, the stale version will no longer be usable in this manner (it may remain stale for a further period of time if it has an additional stale-if-error directive).

Background revalidation process flow

When a cache lookup results in a stale object that is within a stale-while-revalidate period (scenario 2 in the list above), Fastly will fork the request into two paths. One path will serve the stale object to the current client, while the second will asynchronously fetch the object from the origin.

IMPORTANT: If there is already a background revalidation in progress for the resource being requested, the stale object will be served but a new revalidation will not be triggered. See request collapsing.

In Compute services, this all happens within the fetch operation, while in VCL services each stage invokes VCL subroutines as follows:

[Figure: Background revalidation flow]

VCL subroutines executing in a background revalidation context are identifiable via the req.is_background_fetch VCL variable.

Background revalidations will only repopulate cache if the process above is successful and beresp.cacheable is set to true at the end of vcl_fetch. Otherwise, the stale object will continue to be used and background revalidation will continue to be attempted on subsequent requests, until the SWR period expires.

Background revalidations are eligible for request collapsing.
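One possible use of req.is_background_fetch is to label background revalidation requests so they can be told apart in origin logs. The sketch below assumes custom VCL, and the header name X-Background-Revalidation is hypothetical, chosen only for illustration.

```vcl
sub vcl_miss {
  #FASTLY miss

  # True only when this subroutine runs as part of a background
  # (stale-while-revalidate) fetch rather than a client-blocking one.
  if (req.is_background_fetch) {
    # Hypothetical header name; use whatever your origin expects.
    set bereq.http.X-Background-Revalidation = "1";
  }
}
```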

Stale if error: Survive origin failure

stale-if-error tells Fastly that if a backend is sick (i.e. is currently failing a health check), a stale response may be used instead of returning an error, which helps preserve a good user experience even during periods of server instability.

Cache-Control: max-age=300, stale-if-error=86400

In the above example, Fastly will store and serve the fresh content for 5 minutes, just like the previous example, but this time, when the 5 minutes has expired, the next request for this content will block on a synchronous fetch to origin. Unlike stale-while-revalidate, stale-if-error doesn't allow for any asynchronous revalidation.

When an origin fetch is triggered, during a stale-if-error window:

  • if the origin is sick (i.e. is currently failing a health check), the stale content will be served automatically.
  • if the origin is erroring (i.e. responding with a valid HTTP response such as a 503 "service unavailable"), Compute services will receive the response as served by the origin, while VCL services will invoke vcl_fetch and set stale.exists to true. The stale content may be used in VCL services by explicitly selecting it.
  • if the origin is down/unreachable, Compute services will receive a response generated by Fastly, and VCL services will invoke vcl_error and set stale.exists to true. The stale content may be used in VCL services by explicitly selecting it.

Once the stale period expires, the content can no longer be used as a backup and, if the origin is sick or a failure is encountered in fetching from the origin, Fastly must serve an error.

If a response specifies both a stale-while-revalidate and a stale-if-error directive, the revalidation period comes first, and the error period is added on to it.
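In VCL services, the same three periods can also be set in edge code rather than by origin headers. The following is a sketch only: it assumes custom VCL, reuses the values from the examples above, and assumes the beresp variables behave like the equivalent header directives. Note that it overrides whatever the origin sent.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # Fresh from 0s to 300s after the object is stored.
  set beresp.ttl = 300s;

  # Revalidation period comes first: 300s to 360s.
  set beresp.stale_while_revalidate = 60s;

  # Error period is added on after it: 360s to 86760s (360 + 86400).
  set beresp.stale_if_error = 86400s;
}
```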

Applying staleness directives to Fastly only

stale-* directives apply to all caching HTTP clients, including browsers, not just CDNs and other non-browser clients. While stale-serving in browsers is also useful, if you want to apply stale behaviors only to Fastly, consider using the Surrogate-Control header. It functions similarly to Cache-Control but takes precedence when both are present, and Fastly removes it automatically, so you can control the stale logic for Fastly independently of browsers.

Surrogate-Control: max-age=300, stale-while-revalidate=60, stale-if-error=86400
Cache-Control: max-age=60

Surrogate-Control has the same spec as Cache-Control but Fastly's implementation does not support the s-maxage directive (in the context of Surrogate-Control, s-maxage would mean the same thing as max-age, so use Surrogate-Control: max-age).

Summary table

To summarize, these are the four possible freshness states that a piece of cached content can be in when it is matched by an incoming request:

  • Fresh: We have a copy of the content, and it's within its initial freshness lifetime
  • SWR: We have a stale version of the content, and we're within a stale-while-revalidate period
  • SIE: We have a stale version, and it doesn't qualify for SWR (or that period has already expired), but we're within a stale-if-error period, allowing it to be used automatically if an origin is sick.
  • None: We don't have the content, or if we do, it's expired (content in this state will still allow for conditional fetches if it has an ETag or Last-Modified date)

And there are also four possible states that an origin server can be in:

  • Healthy: Origin is up and working
  • Erroring: Origin is returning syntactically valid HTTP responses with response status codes in the 5xx range (e.g., 503 "service unavailable")
  • Down: Origin is unreachable, or unable to negotiate a TCP connection
  • Sick: Fastly has marked this origin as unusable because we've been consistently unable to fetch a health check endpoint. Origins in a down or erroring state that have a health check will eventually be transitioned to sick by the health check.

This results in 16 possible combinations, which we can visualize as a grid to show where good things happen and where bad things happen:

                Content state
Origin state    Fresh   SWR     SIE     None
Healthy         😀      😀      😴      😴
Erroring        😀      😀      😑      😑
Down            😀      😀      😑      😑
Sick            😀      😀      😀      😑

The three possible outcomes are that the user will see the content they want served from the edge (😀), they'll get the content but only after a blocking fetch to origin (😴), or they'll see an unfiltered error (😑), which could be either something generated by Fastly or whatever your origin server returned.

HINT: Some of these scenarios can be improved in VCL services by adjusting the default Fastly configuration to be more aggressive about using stale content. For more details, see our serving stale tutorial.

Shielding considerations

If you have shielding enabled in your service, the shield POP may serve stale content to the edge POP, which should avoid caching that content as fresh. By default, the right thing happens because the shield POP will send an Age header along with the response to the edge POP, and the edge POP will not cache the response because the Age already exceeds the object's freshness TTL (specified by a max-age directive).

However, in some circumstances, stale content served from a shield POP to an edge POP may be cached as if fresh:

  • when it has been purged with soft purge enabled
  • where your service configuration has directly manipulated the object's TTL in edge code, such that it no longer matches the max-age defined on the object's response headers
  • where the edge POP has made a conditional GET to the shield POP and the shield POP has returned a 304 (Not Modified) response

The inadvertent caching of stale content at the edge POP due to these edge cases can easily be prevented in VCL services by disabling the use of stale content for asynchronous revalidation when a POP is acting as a shield:

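  # Fastly VCL, inside sub vcl_recv { ... }
  if (fastly.ff.visits_this_service > 0) {
    # This POP is acting as a shield for another Fastly POP:
    # do not serve stale content while revalidating.
    set req.max_stale_while_revalidate = 0s;
  }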

This code will continue to allow stale content to be used when an origin is sick. To disable that as well, set req.max_stale_if_error to 0s.
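A sketch of the stricter variant, which disables both stale behaviors on a POP acting as a shield, might look like the following. It assumes custom VCL (hence the #FASTLY recv macro); in a VCL snippet only the if block would be needed.

```vcl
sub vcl_recv {
  #FASTLY recv

  # A non-zero visit count means the request already passed through
  # another Fastly POP, so this POP is acting as a shield.
  if (fastly.ff.visits_this_service > 0) {
    set req.max_stale_while_revalidate = 0s;
    set req.max_stale_if_error = 0s;
  }
}
```

The trade-off is that edge POPs then receive an error rather than stale content from the shield when the origin is sick, so this variant suits services where serving stale content is never acceptable.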

Compute services do not support shielding so they do not experience this problem.

Best practices

We recommend the following best practices to get the most out of stale content:

  • Specify a short stale-while-revalidate and a long stale-if-error value. If your origin is working, you don't want to subject users to content that is significantly out of date. But if your origin is down, you're probably much more willing to serve something old if the alternative is an error page.
  • Always include a validator (an ETag or Last-Modified header) on responses from origin.
  • Use shielding (in VCL services) to increase the cache hit ratio and increase the probability of having stale objects to serve.
  • Always use soft purges to ensure stale versions of objects aren't also evicted.