The vcl_fetch subroutine is executed just after the headers of a syntactically correct backend response have been received. If the request arrived in this subroutine from
vcl_miss, the fetched object may be cached. If, instead, the
vcl_fetch subroutine is called from
vcl_pass, the fetched object is not cached even if
beresp.ttl is greater than zero.
The value of
beresp.ttl is set prior to execution of
fetch, based on parsing the headers from the backend response and understanding the cache semantics desired by the upstream server. This TTL parsing does not take into account
Cache-Control directives such as
no-store, so the fetch subroutine is a good place to apply additional rules to implement caching semantics other than the TTL. If Fastly is unable to determine a TTL based on response headers,
beresp.ttl will be 2 minutes at the start of this subroutine.
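As a minimal sketch of applying a caching rule that the TTL calculation does not cover, the following honors a no-store directive explicitly. It assumes custom VCL (hence the #FASTLY fetch macro line) and uses a deliberately simple pattern match; a production rule might parse the header more carefully.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # beresp.ttl was computed without regard to no-store,
  # so check for the directive here and skip caching if present.
  if (beresp.http.Cache-Control ~ "no-store") {
    return(pass);
  }

  return(deliver);
}
```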
Modifying headers such as
Cache-Control inside of
vcl_fetch will affect how those headers are presented on the response when it is delivered by Fastly (that is, they will affect browser caches or any other caches downstream of Fastly), but such modifications will not affect the TTL of the object in the Fastly cache (to change that, modify
beresp.ttl instead). Note that if your service uses shielding, then requests may pass through two Fastly data centers, and therefore the delivered client response from the shield data center is considered the response to a backend fetch made by the edge data center. In this scenario modifications made to
Cache-Control headers at the shield data center may affect the TTL applied to the object at the edge data center.
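The distinction between the cache TTL and the delivered headers can be sketched as follows. The one-hour and sixty-second values are illustrative assumptions, not recommendations.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # Controls how long Fastly keeps the object (illustrative value).
  set beresp.ttl = 3600s;

  # Controls only what downstream caches (e.g. browsers) are told;
  # this does not change the TTL set above.
  set beresp.http.Cache-Control = "max-age=60";

  return(deliver);
}
```

With shielding enabled, the rewritten Cache-Control header would also be what an edge data center sees from the shield, so it may influence the TTL computed at the edge.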
Other common uses for the fetch subroutine are:
- setting specific TTLs using
beresp.ttl based on inputs like file path or content-type
- enabling edge side compression, such as
- removing headers added by particular cloud provider backends, such as AWS S3 or Google Cloud Storage
- configuring rules for serving stale, using
- enabling streaming miss using
- flagging a response for Edge side includes processing using the
- adding variables not available in
vcl_log to response headers (beresp.http.*) so that they can be logged later
- detecting situations in which responses should not be cached, e.g. when a
Set-Cookie header is present on the response, or when a
private directive is included in a Cache-Control header
- removing Last-Modified if you want to disable conditional revalidation of the object from the edge.
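Several of the uses listed above might be combined as in the sketch below. The list does not spell out every identifier, so the variable names used here (beresp.gzip, beresp.stale_while_revalidate, beresp.stale_if_error, beresp.do_stream, the esi statement) are assumptions about which ones are meant, and the path, header names, content types, and durations are purely illustrative.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # TTL based on path (hypothetical /static/ prefix).
  if (req.url.path ~ "^/static/") {
    set beresp.ttl = 86400s;
  }

  # Edge compression for uncompressed text responses.
  # A production rule would also manage Vary: Accept-Encoding.
  if (beresp.http.Content-Type ~ "^text/" &&
      !beresp.http.Content-Encoding &&
      req.http.Accept-Encoding ~ "gzip") {
    set beresp.gzip = true;
  }

  # Remove headers added by an S3-style backend (illustrative names).
  unset beresp.http.x-amz-id-2;
  unset beresp.http.x-amz-request-id;

  # Serving-stale rules (illustrative durations).
  set beresp.stale_while_revalidate = 60s;
  set beresp.stale_if_error = 86400s;

  # Streaming miss for large media responses.
  if (beresp.http.Content-Type ~ "^video/") {
    set beresp.do_stream = true;
  }

  # Flag HTML for Edge Side Includes processing.
  if (beresp.http.Content-Type ~ "^text/html") {
    esi;
  }

  return(deliver);
}
```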
IMPORTANT: Any changes made to a response in this subroutine will become part of the object saved into the cache. Take care when attaching debug information or
Set-Cookie headers to cacheable responses in
fetch. Consider doing this in
There are two return states that are always available:
return(deliver), which will cache the object and then deliver it, or
return(pass), which will use the cache object to record the pass, saving future similar requests from having to queue due to the effects of request collapsing. However, if
vcl_fetch ends, the object is delivered without being cached and without creating a hit-for-pass, so any queued requests forced to dequeue may immediately form a new queue behind one of their number.
If a stale version of the object is in cache,
return(deliver_stale) is also available, which will discard the new response and use the cached one.
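A minimal sketch of choosing between these return states is shown below. It assumes the stale.exists variable is used to check whether a stale copy is in cache, and treats any 5xx status as a reason to prefer the stale object; both choices are illustrative.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # Prefer an existing stale object over a fresh error response.
  if (beresp.status >= 500 && beresp.status < 600) {
    if (stale.exists) {
      return(deliver_stale);
    }
  }

  # Record a pass for responses that set cookies.
  if (beresp.http.Set-Cookie) {
    return(pass);
  }

  # Otherwise cache and deliver.
  return(deliver);
}
```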
Due to the effects of clustering, this subroutine will normally run on a fetch node.
Fastly tracks the age of objects in cache and emits an
Age header on HTTP responses. If a backend response (or a response from a shield data center to an edge data center) includes an
Age header with a non-zero value, this will be considered the 'starting age' of the object when we cache it. If the Age value seen in
vcl_fetch is higher than
beresp.ttl, the object will be considered expired immediately. This doesn't necessarily mean it won't be saved into cache, since it may still be usable for conditional revalidation or serving as stale.
A common use for
vcl_fetch is to detect content that should not be cached, and intercede to prevent caching from happening. Since there are multiple ways to do this, consider the following best practices:
- If at all possible, pass the request in
vcl_recv instead. You can only do this if you know that a request will elicit a non-cacheable response before the request is sent to origin, but if you're in a position to know this, it will allow Fastly to avoid request collapsing, reducing spikiness and allowing more throughput to origin.
- Create a hit-for-pass object by using return(pass) in
vcl_fetch. This will allow any pending requests that are queued on this fetch to dequeue and be sent to origin without delay, and for a short period will disable request collapsing automatically on future requests for the same object.
vcl_fetch. This will deliver the response but will create no entries or markers in the cache. Queued requests will be dequeued but may immediately form a new queue, resulting in only one request at a time being made to origin. This situation should normally be avoided.
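The first two practices might look like the sketch below. The /account/ path is a hypothetical example of a request known in advance to be uncacheable, and the header checks in vcl_fetch are simple illustrative patterns.

```vcl
sub vcl_recv {
  #FASTLY recv

  # Known-uncacheable requests: pass before the cache lookup,
  # so no request collapsing takes place (hypothetical path).
  if (req.url.path ~ "^/account/") {
    return(pass);
  }

  return(lookup);
}

sub vcl_fetch {
  #FASTLY fetch

  # Uncacheable only in hindsight: record a pass so that queued
  # and near-future requests go to origin without waiting.
  if (beresp.http.Set-Cookie || beresp.http.Cache-Control ~ "private") {
    return(pass);
  }

  return(deliver);
}
```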
If Fastly receives a syntactically invalid HTTP response or a timeout while trying to make a request to a backend, control passes to the
vcl_error subroutine without invoking
vcl_fetch. However, be aware that syntactically correct HTTP responses include HTTP
5xx error codes.
If a 304 Not Modified response is received from a backend, and it is cacheable based on its caching headers, the cached object's
Age and TTL are updated to values based on the
304 response's headers, and the existing cached object is passed to
vcl_deliver. In this scenario, the
vcl_fetch subroutine is also not executed.
If segmented caching and streaming miss are both disabled, the maximum object size that can be cached is 2GB. With streaming miss enabled, this increases to 5GB. If segmented caching is enabled, there is no limit on file size, provided that the origin supports Range requests.
Responses that include a
Vary header are limited to 200 variations per cache key, per data center. Once 200 variants are exceeded, newer variants will start to displace the oldest. A "Too many variants" error will be triggered if 400 variants are reached.
To see this subroutine in the context of the full VCL flow, see using VCL.
The code example Overriding TTLs based on content type is a good example of the
vcl_fetch subroutine in use:
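A minimal sketch in the same spirit follows. It is not the referenced example itself, and the content types and TTL values are illustrative assumptions.

```vcl
sub vcl_fetch {
  #FASTLY fetch

  # Override the computed TTL depending on the response content type.
  if (beresp.http.Content-Type ~ "^image/") {
    set beresp.ttl = 86400s;   # images: one day
  } else if (beresp.http.Content-Type ~ "^text/html") {
    set beresp.ttl = 300s;     # HTML: five minutes
  }

  return(deliver);
}
```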
Tokens available in this subroutine
The following limited-scope VCL functions and variables are available for use in this subroutine (those in bold are available only in this subroutine):