vcl_hash

The built-in vcl_hash subroutine is executed when Fastly needs to calculate the address of an object in the cache. The address of an object differentiates it from other objects in the cache and ensures that Fastly finds and delivers the correct object to the client. By default we include req.url and req.http.host in the hash, which means that the following URLs would be considered to be different:

  • https://www.example.com/hello.html
  • https://example.com/hello.html
  • https://example.com/hello.html?foo=42

Here, the differences in hostname and query string cause Fastly to treat these as separate objects. The default hash ignores the HTTP method (e.g., GET, HEAD or POST), because responses to non-GET requests are typically not cached. It also ignores connection differences, so requests arriving over TLS will find the same object as those arriving over unencrypted HTTP.
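
The default behavior described above can be sketched in VCL as follows. This is a minimal sketch, assuming the layout of Fastly's custom VCL boilerplate, in which the "#FASTLY hash" macro line marks where Fastly inserts its own generated logic; check it against the boilerplate for your service before relying on it.

```vcl
sub vcl_hash {
  # Add the path and query string to the cache key.
  set req.hash += req.url;
  # Add the (already lowercased) hostname to the cache key.
  set req.hash += req.http.host;
  # Assumption: macro line from Fastly's VCL boilerplate, marking
  # where Fastly-generated hash logic is inserted.
  #FASTLY hash
  return(hash);
}
```

Neither the request method nor the protocol appears in the key, which is why the three URLs above map to three objects while an HTTP and an HTTPS request for the same URL map to one.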

The value of the HTTP Host header is converted to lowercase before being assigned to req.http.host to ensure consistent behavior of Fastly customer services and origins. This happens prior to vcl_recv, so no matter how your site’s domain name is capitalized in the request, the hash function (and any other VCL that uses the host value) will produce the same cache address and therefore find the same cache object. This does not apply to req.url, which is not normalized and reflects the capitalization present on the request as it was received.
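
Where an origin treats paths case-insensitively, the URL can be normalized for hashing only. The following is a minimal sketch, assuming that lowercasing is safe for every URL on the service (it is not if any path or query value is case-sensitive). It uses the std.tolower string function, and the "#FASTLY hash" macro line is assumed from Fastly's VCL boilerplate.

```vcl
sub vcl_hash {
  # Hash a lowercased copy of the URL. req.url itself is unchanged,
  # so the origin still receives the original capitalization.
  set req.hash += std.tolower(req.url);
  set req.hash += req.http.host;
  #FASTLY hash
  return(hash);
}
```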

The "generation" value

Fastly also includes the value of the req.vcl.generation variable in the hash to support purging. This value is a constant specific to your service, and only changes when a purge all operation is performed, allowing a cache to be dropped quickly by preventing future requests from matching existing cache objects. Objects rendered inaccessible by a change to the generation value will eventually be dropped from cache as part of the normal eviction process.

Variations

While the hash produced by the vcl_hash subroutine is the primary object key, that object may contain multiple variations of the content that are subject to further, more specific criteria in HTTP headers on the request, as defined by the Vary header on the response. Using the Vary header mechanism is generally a better technique than modifying the hash, and would be suitable for use cases like variations based on requested language, user location, or login state. For more information see Getting the most out of the Vary header.
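
As an illustration of the Vary approach, the sketch below appends a request header name to the Vary header of the origin response in vcl_fetch. The header X-Login-State is hypothetical and is assumed to have been set to a small number of values (for example in vcl_recv) before the lookup. The "#FASTLY fetch" macro line is assumed from Fastly's VCL boilerplate.

```vcl
sub vcl_fetch {
  #FASTLY fetch
  # X-Login-State is a hypothetical request header, assumed to be set
  # earlier in the flow to a small set of values such as "in" or "out".
  if (beresp.http.Vary) {
    # Preserve whatever the origin already varies on.
    set beresp.http.Vary = beresp.http.Vary ", X-Login-State";
  } else {
    set beresp.http.Vary = "X-Login-State";
  }
  return(deliver);
}
```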

Variations are subject to a limit of 200 per object, so if the number of variations is likely to exceed this significantly, modifying the cache key via the vcl_hash subroutine may be a better option.
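
Where the number of variants is expected to go well beyond that limit, the distinguishing value can be added to the cache key instead. The following is a minimal sketch, assuming a hypothetical request header X-Variant-Id that is always set before vcl_hash runs. The "#FASTLY hash" macro line is assumed from Fastly's VCL boilerplate.

```vcl
sub vcl_hash {
  set req.hash += req.url;
  set req.hash += req.http.host;
  # X-Variant-Id is hypothetical. Each distinct value yields a
  # separate cache object rather than a variation of one object.
  set req.hash += req.http.X-Variant-Id;
  #FASTLY hash
  return(hash);
}
```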

Clustering handoff

To support caching at Fastly's scale, once a cache key has been calculated, the request is usually handed off from the server node that initially handled it (the "delivery node") to a storage server that is responsible for fetching and caching the object (the "fetch node"), in a process called clustering. The cache lookup is first performed on the delivery node. If the object is not found there (which is likely), the lookup is performed again on the object's designated fetch node. If the lookup operation does not find a matching object, it creates an empty one as a marker and inserts it in cache. Then, the request is sent to the vcl_miss subroutine.

If an object is found (either on the delivery or fetch node), it is loaded and the request passes to the vcl_hit subroutine, unless the object is empty because a similar request received moments earlier has already initiated a backend request for this object. In that case, the new request enters a waiting list and will block on the response to the earlier similar request, in a process called request collapsing. If possible, we will deliver the same response from origin to both client requests.

It is also possible for a cache lookup to find a 'pass' marker for the object, indicating that the object is not cacheable. In this situation we will not attempt to collapse concurrent requests for the same object and will send them all to origin independently. A hit-for-pass occurs when a prior request for the same object has performed a return(pass) from vcl_fetch.
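
A pass marker of this kind could be created as in the sketch below. The condition is only an assumption for illustration (responses that set cookies are treated as uncacheable), and the "#FASTLY fetch" macro line is assumed from Fastly's VCL boilerplate.

```vcl
sub vcl_fetch {
  #FASTLY fetch
  # Assumption: responses that set cookies should not be cached.
  if (beresp.http.Set-Cookie) {
    # Creates a pass marker, so later requests for this object skip
    # request collapsing and go to origin independently.
    return(pass);
  }
  return(deliver);
}
```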

Clustering is disabled automatically if there is a hit on the delivery node, after a restart, or if you return(pass) or error from vcl_recv. It can also be disabled manually by setting the Fastly-No-Shield HTTP header to "1" in vcl_recv. Where clustering is disabled, all VCL flow stages happen on the delivery node.
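
Disabling clustering manually might look like the following sketch. The URL prefix is a hypothetical condition chosen for illustration, and the "#FASTLY recv" macro line is assumed from Fastly's VCL boilerplate.

```vcl
sub vcl_recv {
  #FASTLY recv
  # The /large-downloads/ prefix is a hypothetical example condition.
  if (req.url ~ "^/large-downloads/") {
    # Keep every VCL flow stage for this request on the delivery node.
    set req.http.Fastly-No-Shield = "1";
  }
  # The rest of your vcl_recv logic continues here.
}
```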

Return states

The only valid return from vcl_hash is return(hash). The subsequent behavior is determined by how the request entered the vcl_hash subroutine. If a return(pass) was made in vcl_recv, control moves from vcl_hash to vcl_pass. If an error was raised in vcl_recv, control moves from vcl_hash to vcl_error. Otherwise, the lookup process begins, starting on the delivery node and moving to the fetch node if needed, triggering the vcl_hit or vcl_miss subroutines as appropriate.

The exception statements restart and error cannot be used in vcl_hash.

State transitions

[Diagram: state transitions for vcl_hash]

To see this subroutine in the context of the full VCL flow, see Using VCL.

Example

The code example "Support caching of OPTIONS requests" shows the vcl_hash subroutine in use.
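
The following is a minimal sketch of the idea, not the referenced code example itself. It shows only the vcl_hash part: OPTIONS requests are given their own cache address so that their responses are kept separate from GET responses for the same URL. It assumes that other subroutines (not shown) route OPTIONS requests to a cache lookup and preserve the method on the backend request. The "#FASTLY hash" macro line is assumed from Fastly's VCL boilerplate.

```vcl
sub vcl_hash {
  set req.hash += req.url;
  set req.hash += req.http.host;
  # The default hash ignores the method, so add a marker that
  # separates OPTIONS responses from other responses for this URL.
  if (req.method == "OPTIONS") {
    set req.hash += "OPTIONS";
  }
  #FASTLY hash
  return(hash);
}
```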

Tokens available in this subroutine

The following limited-scope VCL functions and variables are available for use in this subroutine (those in bold are available only in this subroutine, those available in *all* subroutines are not listed):