Staleness and revalidation
When content is stored in Fastly's cache, it has a freshness lifetime, also known as a TTL or "time to live". This is the period of time during which Fastly will serve the content in response to a compatible request, without revalidating it with the origin server.
Once the freshness lifetime expires, the content will no longer be served directly and unconditionally from cache, but the expiry does not prompt the object to be deleted. Instead, it will be marked as stale. The way objects are treated once they transition to a stale state can be controlled by VCL and standard HTTP headers. Objects may have one of three distinct types of stale state, in addition to being fresh. As an example, consider an object received by Fastly with the following response header:
Cache-Control: max-age=300, stale-while-revalidate=60, stale-if-error=86400
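This object will, unless evicted earlier due to lack of use, transition through the following states:
- Fresh: for the first 300 seconds (max-age).
- Stale while revalidate: for the next 60 seconds, from 300 to 360 seconds.
- Stale if error: for a further 86,400 seconds, from 360 to 86,760 seconds.
- Expired: from 86,760 seconds onward, until the object is evicted.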
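The timeline for the example header can be expressed as a small function. This is a rough sketch rather than Fastly's implementation: the function name and state labels are made up for illustration, and it assumes, as described later in this document, that the stale-if-error period is added on after the stale-while-revalidate period.

```python
def freshness_state(age, max_age=300, swr=60, sie=86400):
    """Classify a cached object by its age in seconds.

    Hypothetical helper: the defaults mirror the example header
    (max-age=300, stale-while-revalidate=60, stale-if-error=86400).
    Assumes the stale-if-error window starts where the
    stale-while-revalidate window ends.
    """
    if age < max_age:
        return "fresh"
    if age < max_age + swr:
        return "stale-while-revalidate"
    if age < max_age + swr + sie:
        return "stale-if-error"
    return "expired"


# Print the state on each side of every boundary.
for age in (0, 299, 300, 359, 360, 86759, 86760):
    print(age, freshness_state(age))
```

Eviction is not modeled here, so a real object may disappear from cache before reaching any of the later states.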
When requests are received that match a stale object, Fastly will process the object using the following steps:
- If the origin is sick, and the object is within a stale-while-revalidate or stale-if-error period, then the stale object will be served.
- Otherwise, if the object is within a stale-while-revalidate period, then it will be served and a background fetch will be made to update it.
- Otherwise, if the object has an ETag or Last-Modified header, then Fastly will block on a conditional fetch for the object, allowing the origin to issue a 304 Not Modified response if it wants to allow the object to continue to be used, or to serve a new object instead.
- Otherwise, if the object is within a stale-if-error period and your VCL explicitly returns deliver_stale from vcl_miss, vcl_fetch, or vcl_error, the stale object will be served.
- Otherwise, the stale object will be ignored and Fastly will fetch a new object from origin exactly as if there was no match in the cache.
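The order of these checks matters, so it can help to see them as a chain of conditions. The following is a simplified model that follows the list literally. It is not Fastly's code: the StaleObject class, the function name, and the returned labels are all hypothetical, and the VCL decision is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass
class StaleObject:
    """Hypothetical description of a stale cache hit."""
    in_swr: bool = False         # within a stale-while-revalidate period
    in_sie: bool = False         # within a stale-if-error period
    has_validator: bool = False  # has an ETag or Last-Modified header


def handle_stale_hit(obj, origin_sick=False, vcl_returns_deliver_stale=False):
    """Return a label for the step that applies, checked in list order."""
    if origin_sick and (obj.in_swr or obj.in_sie):
        return "serve_stale"
    if obj.in_swr:
        return "serve_stale_and_background_fetch"
    if obj.has_validator:
        return "blocking_conditional_fetch"
    if obj.in_sie and vcl_returns_deliver_stale:
        return "serve_stale"
    return "fetch_as_miss"


print(handle_stale_hit(StaleObject(in_swr=True)))
print(handle_stale_hit(StaleObject(in_sie=True), origin_sick=True))
print(handle_stale_hit(StaleObject(in_sie=True)))
```

Because the first matching condition wins, an object in its stale-while-revalidate period never reaches the conditional-fetch step, and a sick origin short-circuits everything else.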
WARNING: Content is not guaranteed to be stored for the entire freshness lifetime and, especially in the case of large objects that are not frequently requested, we may evict them sooner to make space for more popular objects. Also, setting the req.hash_always_miss or req.hash_ignore_busy variable to true negates the effect of stale-while-revalidate.
Stale content can be explicitly selected by VCL code (scenario 4 above), if it is available, in the following subroutines:
- In vcl_fetch: if the origin returns a response which is valid HTTP, then Fastly will by default serve the received object, and if cacheable, use it to replace the stale object in cache. However, if the response is nonsensical or an error, you may prefer in that scenario to serve the stale content instead, by using return(deliver_stale).
- In vcl_error: if, during a fetch to origin, Fastly encounters a network-level error, such as finding the origin unreachable or being unable to negotiate an acceptable TLS session, we will trigger an error and move the VCL control flow to vcl_error directly, without running vcl_fetch. By default this will result in serving a Fastly-standard error page to the end user, but if stale content exists in cache you can opt to use this instead by using return(deliver_stale) from vcl_error.
It is also possible to switch to a stale object in vcl_miss, but there are few reasons to do so.
The existence of stale content is discoverable in these VCL subroutines using the stale.exists variable, which will only be true if the object is within a stale-if-error period. Stale content in the expired stale state cannot be used from VCL.
The stale-while-revalidate and stale-if-error Cache-Control directives (which trigger scenarios 1 and 2 above) are defined in RFC 5861, HTTP Cache-Control Extensions for Stale Content.
HINT: If your website were a radio station, then the CDN edge cache would be like your transmitter towers - essential to extend your reach to a huge audience, but useless without the signal from HQ to broadcast. Large state broadcasters have long realized this and placed local recordings of content at transmission sites "just in case" the transmitter loses its uplink to home base.
Serving content from Fastly to end users is very fast, and very reliable. But this only happens, by default, when that content is available, and fresh, in the cache. Use of stale-while-revalidate, stale-if-error, and custom VCL that intercepts origin errors can significantly improve outcomes for end users.
Stale while revalidate: Eliminate origin latency
stale-while-revalidate tells caches that they may continue to serve a response after it becomes stale for up to the specified number of seconds, provided that they work asynchronously in the background to fetch a new one. Here's an example:
Cache-Control: max-age=300, stale-while-revalidate=60
Upon receiving an upstream response with this header, Fastly will store and reuse that response for up to 5 minutes (300 seconds) as normal, following the instructions in the max-age directive. Once that freshness lifetime expires, the stale-while-revalidate directive allows Fastly to continue to serve the same content for up to another 60 seconds, provided that we use that time to try to get a new one from the origin server. As soon as a new response is available, it will replace the stale content and its own cache freshness rules will take effect. However, if the 60 second revalidation period expires and we haven't been able to get updated content, the stale version will no longer be usable in this manner.
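To make the arithmetic concrete, the sketch below parses a Cache-Control value and reports the age range during which the response may be served stale while a revalidation runs. It is an illustration only: both function names are hypothetical, and the parser is deliberately naive (it splits on commas, so quoted arguments that contain commas are not handled).

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into a dict (simplified, hypothetical helper)."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        name = name.strip().lower()
        arg = arg.strip().strip('"')
        # Numeric arguments become ints; directives without an argument become True.
        directives[name] = int(arg) if arg.isdigit() else (arg or True)
    return directives


def swr_window(value):
    """Return (start, end) object ages, in seconds, of the revalidation period."""
    directives = parse_cache_control(value)
    start = directives.get("max-age", 0)
    return start, start + directives.get("stale-while-revalidate", 0)


print(swr_window("max-age=300, stale-while-revalidate=60"))  # (300, 360)
```

For the example header, the response is fresh until 300 seconds and may be served stale, with a background fetch, from 300 up to 360 seconds.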
Background revalidation process flow
When a cache lookup results in a stale object that is within a stale-while-revalidate period (scenario 2 in the list above), Fastly will fork the request into two paths. One path will serve the stale object to the current client, while the second will asynchronously fetch the object from the origin.
Subroutines executing in a background revalidation context are identifiable via the req.is_background_fetch variable.
Background revalidations will only repopulate cache if the process above is followed successfully and beresp.cacheable is set to true at the end of vcl_fetch. Otherwise, the stale object will continue to be used and background revalidation will continue to be attempted on subsequent requests, until the stale-while-revalidate period expires.
Background revalidations are eligible for request collapsing.
Stale if error: Survive origin failure
stale-if-error tells Fastly that if an error is encountered while trying to talk to an origin server, a stale response may be used instead of outputting an error, which helps preserve a good user experience even during periods of server instability.
Cache-Control: max-age=300, stale-if-error=86400
In the above example, Fastly would store and serve the fresh content for 5 minutes, just like the previous example, but this time, when the 5 minutes have expired, the next request for this content will block on a synchronous fetch to origin. Unlike stale-while-revalidate, stale-if-error doesn't allow for any asynchronous revalidation, but it does allow the stale version to be used if the origin fails to respond (whether due to a timeout, connection error or malformed response). Once the stale period expires, the content can no longer be used as a backup and, if a failure is encountered in fetching from the origin, Fastly must serve an error even if the stale version is still in storage.
If a response specifies both a stale-while-revalidate and a stale-if-error directive, the revalidation period comes first, and the error period is added on to it.
Browser caching
stale-* directives apply to all caching HTTP clients, not just CDNs and other non-browser clients. While stale-serving in browsers is also useful, if you are trying to apply stale behaviors only to Fastly, consider using the Surrogate-Control header. It functions similarly to Cache-Control, but overrides it if the two are both present and is removed by Fastly automatically, so you can control the stale logic for Fastly independently of browsers.
Surrogate-Control: max-age=300, stale-while-revalidate=60, stale-if-error=86400
Cache-Control: max-age=60
Surrogate-Control has the same spec as Cache-Control but Fastly's implementation does not support the s-maxage directive (in the context of Surrogate-Control, s-maxage would mean the same thing as max-age, so use Surrogate-Control: max-age).
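The precedence rule can be illustrated with a short model. This sketch covers only the two behaviors described above, namely that Surrogate-Control wins over Cache-Control at the edge and that Surrogate-Control is removed before the response goes downstream. The function names are hypothetical, and any other inputs Fastly may use to decide caching behavior are out of scope.

```python
def edge_caching_header(response_headers):
    """Return (name, value) of the header that governs edge caching, or None."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    if "surrogate-control" in lowered:
        return "Surrogate-Control", lowered["surrogate-control"]
    if "cache-control" in lowered:
        return "Cache-Control", lowered["cache-control"]
    return None


def headers_sent_downstream(response_headers):
    """Drop Surrogate-Control so browsers only ever see Cache-Control."""
    return {
        name: value
        for name, value in response_headers.items()
        if name.lower() != "surrogate-control"
    }


origin_response = {
    "Surrogate-Control": "max-age=300, stale-while-revalidate=60, stale-if-error=86400",
    "Cache-Control": "max-age=60",
}
print(edge_caching_header(origin_response))
print(headers_sent_downstream(origin_response))
```

With the example headers, the edge follows the 300 second lifetime and the stale directives, while the browser receives only Cache-Control: max-age=60.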
Freshness-based serving decision permutations
These are the four possible freshness states that a piece of cached content can be in when it is matched by an incoming request:
- Fresh: We have a copy of the content, and it's within its initial freshness lifetime.
- SWR: We have a stale version of the content, and we're within a stale-while-revalidate period.
- SIE: We have a stale version, and it doesn't qualify for SWR, but we're within a stale-if-error period, allowing it to be used automatically if an origin is sick.
- None: We don't have the content, or if we do, it's expired (content in this state will still allow for conditional fetches if it has an ETag or Last-Modified date).
And there are also four possible states that an origin server can be in:
- Healthy: Origin is up and working
- Erroring: Origin is returning syntactically valid HTTP responses with response status codes in the 5xx range (e.g. 503 Service Unavailable)
- Down: Origin is unreachable, or unable to negotiate a TCP connection
- Sick: Fastly has marked this origin as unusable because we've been consistently unable to fetch a healthcheck endpoint. Origins in a down or erroring state that have a healthcheck will eventually be transitioned to sick by the healthcheck.
This results in 16 possible permutations, which we can visualize as a grid to show where good things happen and where bad things happen:
Origin state \ Content state | Fresh | SWR | SIE | None
---|---|---|---|---
Healthy | 😀 | 😀 | 😴 | 😴
Erroring | 😀 | 😀 | 😡 | 😡
Down | 😀 | 😀 | 😡 | 😡
Sick | 😀 | 😀 | 😀 | 😡
Without any VCL to customize the behavior of the platform, there are only three possible outcomes for the user: either they'll see the content they want served from the edge (😀), they'll get the content but including a blocking fetch to origin (😴), or they'll see an unfiltered error (😡), which could be either something generated by Fastly or whatever your origin server returned.
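The 16 permutations collapse into a few rules, which the sketch below encodes as a lookup. It is an interpretation of the grid and the default behavior described above, not an official algorithm: the function name and the outcome labels ("edge", "blocking_fetch", "error") are hypothetical.

```python
CONTENT_STATES = ("fresh", "swr", "sie", "none")
ORIGIN_STATES = ("healthy", "erroring", "down", "sick")


def default_outcome(content, origin):
    """Outcome for the user when no custom VCL changes the defaults."""
    if content in ("fresh", "swr"):
        return "edge"            # served from cache whatever the origin is doing
    if content == "sie" and origin == "sick":
        return "edge"            # stale-if-error is applied automatically when sick
    if origin == "healthy":
        return "blocking_fetch"  # user waits for a synchronous origin fetch
    return "error"               # nothing usable in cache and origin not working


# Print the grid, one row per origin state.
for origin in ORIGIN_STATES:
    print(origin, [default_outcome(content, origin) for content in CONTENT_STATES])
```

Laid out this way, the cells that custom VCL could improve stand out: content in the SIE state while the origin is erroring or down ends in an error by default, even though a stale copy exists.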
There are some situations here that we could improve by adjusting the default Fastly configuration to be more aggressive about using stale content. For more details, see our serving stale tutorial.
Best practices
We recommend the following best practices to get the most out of stale content:
- Specify a short stale-while-revalidate and a long stale-if-error value. If your origin is working, you don't want to subject users to content that is significantly out of date. But if your origin is down, you're probably much more willing to serve something old if the alternative is an error page.
- Use shielding to increase the cache hit ratio and increase the probability of having stale objects to serve (in principle, 'shielding' is the practice of placing one layer of Fastly behind another and focusing requests on a single location).
- Always use soft purges to ensure stale versions of objects aren't also evicted.