VCL best practices

Best practices in Fastly VCL have changed over time to help address expectation gaps and improve maintainability. This page covers some of the most common use cases in edge logic and demonstrates how to avoid bad code, reduce risk, improve safety, and take steps to make the codebase more maintainable for large teams.

Don't use regsub for data extraction

Often you have a large lump of data in a single string. For example, you might have a session cookie that looks like this:

uid=12345:sess=01234567-89abcdef-012:name=sjones:remember=1

It doesn't really matter what each of these tokens is or what format the string is in. The point is that you want to extract just one of them. A common but misguided way to do this is with the regsub function:

set var.result = regsub(
  req.http.cookie:auth,
  "^.*?:name=([^:]*):.*?$", "\1"
);

This replaces the entire string with the text that appears after 'name=', up to but not including the next colon, so in this case it yields "sjones". However, if the string is not in the format you expected, for example because it contains fewer than three colons or has no name= section, the pattern does not match and the return value is the entire input string. This is dangerous because you might leak data that you did not intend to expose.

Solution 1: Use the if() function

The if() function provides a ternary construct that you can use along with regular expression capture variables:

set var.result = if(
  req.http.cookie:auth ~ "^(?:.*:)?name=([^:]*):", # Expression
  re.group.1,                                      # Value to return if true
  ""                                               # Value to return if false
);

This time, if the regex fails to match, the return value is an explicitly declared default (the empty string in the example above). The regex is also simpler because it's no longer necessary to anchor the pattern to both the beginning and end of the string.

Solution 2: Use subfield()

If the input string consists of key-value pairs, as it does in the example, a better solution is to use subfield:

set var.result = subfield(req.http.cookie:auth, "name", ":");
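The snippets in this section assign to var.result without showing where it comes from. In Fastly VCL a local variable has to be declared before it is used, so a complete version of the subfield example could look like the following minimal sketch (the variable name is carried over from the examples above):

```vcl
# Local variables must be declared with a type before use
declare local var.result STRING;

# Extract the "name" field from the colon-separated auth cookie
set var.result = subfield(req.http.cookie:auth, "name", ":");
```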

Use the : operator for cookies or other header subfields

In the example above, the : operator is used to access the cookie (req.http.cookie:auth). While if and subfield are good at extracting subfields from the content of a single cookie, extracting the cookie itself from the inbound Cookie header can take advantage of subfield-accessor syntax. If an HTTP header is in the common format (key=value, key=value), then use the colon in combination with the header name to access the subfield.

This also works well with headers such as Cache-Control, and can be used to write individual header fields, as well as read them:

set resp.http.Cache-Control:max-age = "3600";

You can even use this syntax to add tokens to headers that are simple token lists (key, key, key), rather than key-value pairs, by setting the value to an empty string. For example, the Vary header is a comma-separated list of other header names. You could add a header to the Vary list without overwriting the existing ones:

# If the Vary header value is "My-Header", then after the
# following statement runs, it will be "My-Header, Accept-Encoding"
set resp.http.Vary:Accept-Encoding = "";

Empty strings are always truthy

In many programming languages, implicit casting of a string to a boolean will yield false if the string is one of a few 'falsey' values, like "0", an empty string, or null. In VCL, only strings that are not set are falsey, so, for example, this does not do what you might think it does:

if (req.url.qs) {
  # Do something if the request has a query string
  # (but actually this will ALWAYS EXECUTE)
}

The above code executes on all requests. If the inbound request has no query string, req.url.qs is an empty string, which evaluates to true when used in a BOOL context.

Instead, check explicitly that it is not an empty string, or if you want a solution that considers both NULL and an empty string to be falsey, use the std.strlen function, which returns 0 in both cases:

if (std.strlen(req.url.qs) > 0) {
  # Do something if the request has a query string
}
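For comparison, the explicit empty-string check mentioned above can be sketched as follows. This form is enough for req.url.qs, which is an empty string when the request has no query string. It is not meant to cover values that may be not set, which is where the std.strlen version is the safer choice.

```vcl
# Explicit comparison: only true when the query string is non-empty
if (req.url.qs != "") {
  # Do something if the request has a query string
}
```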

Use Accept-Language, not geo, for language selection

Often Fastly customers offer content in a number of languages, and want to make life easier for users by delivering their preferred language automatically. But we sometimes see that being done using our geolocation features:

# Wrongly assume that everyone in Mexico reads Spanish and everyone in
# the United Kingdom reads English
if (client.geo.country_code == "mx") { # Mexico
  set req.url = querystring.add(req.url, "lang", "es"); # ...Spanish
} else if (client.geo.country_code == "gb") { # UK
  set req.url = querystring.add(req.url, "lang", "en"); # ...English
}

It's a reasonable bet that a user connecting to your site from Mexico would be able to read Spanish, but maybe the user is visiting from somewhere else, and using their hotel's WiFi. Just because they moved to a different country doesn't mean they suddenly speak its language.

Instead, use the Accept-Language header, which is set by browsers based on the language settings of the user's computer, and benefit from Fastly's support for normalization of the Accept-Language format:

set req.url = querystring.add(
  req.url,
  "lang",
  accept.language_lookup("en:de:fr:nl:es", "en", req.http.Accept-Language)
);

This code will read the end user's Accept-Language header, normalize it within a set of languages that your site supports, and then add the final choice to the query string.

Modifying the query string in this way will create separate cache keys for each language variant, and that can make content harder to purge from the cache. For a more complete solution to varying content by language, consider using a Vary header.
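A Vary-based approach might look like the following sketch. It assumes that your origin chooses the response language from the Accept-Language request header rather than from a query parameter, and it reuses the language list from the example above. Whether you add the Vary token at the edge or have the origin send it is a design choice; the edge-side version is shown only for illustration.

```vcl
# In vcl_recv: collapse the many possible Accept-Language values into
# one of the supported languages, so only a few variants are cached
set req.http.Accept-Language = accept.language_lookup(
  "en:de:fr:nl:es", "en", req.http.Accept-Language
);

# In vcl_fetch: record that the response depends on Accept-Language,
# without overwriting any Vary values the origin already sent
set beresp.http.Vary:Accept-Language = "";
```

Because the URL is unchanged, all language variants share one cache key and can be purged together.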

beresp.http.Cache-Control does not affect TTL in vcl_fetch

When Fastly receives a resource from an origin server, we parse the headers to determine how long we should store the object in the cache (the TTL). This is fairly complex because there are a number of different headers that can determine freshness. Whatever number we end up with is then applied to the cacheable object and exposed in VCL as beresp.ttl.

In the following VCL:

sub vcl_fetch { ... }
set beresp.http.Cache-Control = "max-age=3600";
set beresp.ttl = 60s;

The first line has no effect on how long Fastly will cache the object for. That's because we've already parsed the headers and decided on a TTL. To modify that decision, it's beresp.ttl that you need to change, so the second line does affect edge cache TTL.

However, setting beresp.ttl alone will have no effect on the cache behavior in downstream caches such as web browsers (or another layer of Fastly, if you are shielding).
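To control both layers, you can set the two values together. The sketch below uses illustrative durations (one hour at the edge, one minute downstream); pick values that suit your content.

```vcl
# In vcl_fetch
set beresp.ttl = 3600s;                       # How long Fastly keeps the object
set beresp.http.Cache-Control = "max-age=60"; # What downstream caches are told
```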

Be aware of default catch-alls for TTL and backend

By default, Fastly configurations include a line in the vcl_recv subroutine that sets the backend to use for the request, and another in vcl_fetch that sets a default service-specific TTL. Often, customers will use UI configuration objects, or VCL snippets, to change these values under certain circumstances, and should take care that the overall logic still makes sense. It's quite easy to end up with something like:

if (req.url ~ "^/some-path") {
  set req.backend = F_alternative_origin;
}
set req.backend = F_normal_origin;

The conditional here is redundant: the backend always ends up set to F_normal_origin because the final assignment is unconditional and runs after the override. To view your service's full "generated VCL", click "Show VCL" in the configuration UI.
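One way to keep the override effective is to make sure the default assignment runs first and the conditional runs after it. A minimal sketch, using the same backend names as above:

```vcl
# Set the default first...
set req.backend = F_normal_origin;

# ...then override it for the paths that need a different origin
if (req.url ~ "^/some-path") {
  set req.backend = F_alternative_origin;
}
```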

Don't assume custom headers are trustworthy

A common security issue with configurations happens when customers use a custom header to store some form of validation state, but fail to validate that the header didn't come from the client:

if (req.http.Paywall-State == "allow") { # Not safe!
  return(lookup);
} else if (req.http.Paywall-State == "deny") {
  error 403;
} else {
  # Perform a paywall API call, set header, and restart
}

In this setup, an end user can bypass your paywall with a browser extension that adds Paywall-State: allow to all requests made to your domain.

Additionally, some headers are set by Fastly but have variable levels of protection from client modification. For example, CDN-Loop and Fastly-FF are headers set by Fastly when requests pass through our data centers, making data visible for logging and analysis, and to prevent request forwarding loops within the platform. Modifying either of these is not permitted in VCL, but if an inbound request already has a value in one of these headers, it will be preserved. Therefore, using these headers to determine whether the request has already passed through Fastly is not secure.

if (!req.http.Fastly-FF) {
  call do_authentication;
  # (but we'll skip this if the end-user knows
  # to send a Fastly-FF header themselves!)
}

In the case of Fastly-FF, the signature is validated automatically and exposed as fastly.ff.visits_this_service, a count of the number of times the current request has been handled so far by the current Fastly service configuration. Additionally, your service may be using the restart statement to return control to the start of the VCL flow, so we should also account for that possibility and not wipe out any headers that you've set before the restart:

if (fastly.ff.visits_this_service == 0 && req.restarts == 0) {
  # Here you are guaranteed to be dealing with a request
  # for the first time, and there is no way for an end user
  # to avoid hitting this condition, so it's a good place to
  # perform one-time validation, and to ensure you start
  # processing in a clean state by unsetting headers that
  # should not be in the inbound request.
  call perform_authentication;
  unset req.http.My_Custom_Header;
}

With all this said, you might equally decide it's simpler, safer and more maintainable to run the same logic regardless of whether the request is being handled for the first time or not.

Make sure things happen only once

In Fastly configurations, there are several reasons why VCL code might run more than once for the same request. Primarily these repeats are caused by the restart command and our shielding feature. It's therefore very common for problems to be caused by not anticipating that a service configuration will run twice. If your configuration doesn't use shielding, this doesn't necessarily matter, but it's still worth being "shield safe" in case you decide to turn it on in the future.

Case 1: Modifying requests

Imagine that you have an Amazon S3 bucket as a backend, and you need to turn a request for /styles/main.css into /my-bucket-name/styles/main.css when sending it to the backend. You might do this:

sub vcl_miss { ... }
set bereq.url = "/my-bucket-name" + bereq.url;

This will work just fine until you turn on shielding, at which point you will get 403 or 404 errors because the logic runs both on the edge and shield POP, and the path requested from S3 will be /my-bucket-name/my-bucket-name/styles/main.css.

To prevent this, use the req.backend.is_origin variable to determine whether the request to origin is going outside of Fastly:

sub vcl_miss { ... }
if (req.backend.is_origin) {
  set bereq.url = "/my-bucket-name" + bereq.url;
}

Or, in some cases, it might be clearer and easier to maintain your code if you simply make the operation idempotent (i.e., if you run it twice, nothing happens the second time):

sub vcl_miss { ... }
if (bereq.url !~ "^/my-bucket-name/") {
  set bereq.url = "/my-bucket-name" + bereq.url;
}

Case 2: Restarts

When you want to restart a request, you might do this:

sub vcl_deliver { ... }
if (resp.status == 502) {
  restart;
}

This will restart the request if the status code is 502 ("Bad gateway"). But if the restarted request again receives a 502 response, it will again restart, and may eventually cause a "Max restarts limit reached" error. This is another case where you'd want to ensure that this logic only runs once.

VCL provides the req.restarts variable which records the number of times the request has been restarted:

sub vcl_deliver { ... }
if (resp.status == 502 && req.restarts == 0) {
  restart;
}

HINT: If your service also restarts for other reasons, checking the req.restarts variable might not work for you, so another way to solve this would be to set a flag indicating that the restart has happened:

sub vcl_deliver { ... }
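# Restart on a 502, but only if this request has not already
# been restarted for that reason
if (resp.status == 502 && !req.http.restarted-for-502) {
  set req.http.restarted-for-502 = "1";
  restart;
}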

This technique adds a header to the request, which will also be transmitted in any origin requests unless removed in vcl_miss and vcl_pass.
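The cleanup mentioned above might look like this sketch, which strips the flag from the backend request so that the origin never sees it. The same statement would go in both subroutines.

```vcl
# In vcl_miss and in vcl_pass: remove the internal flag header
# before the request is sent to the origin
unset bereq.http.restarted-for-502;
```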

Case 3: Modifying responses

It's common to manipulate HTTP response headers in the vcl_deliver subroutine. For example, you may want to add a Set-Cookie header or perhaps rewrite a Cache-Control header:

sub vcl_deliver { ... }
add resp.http.Set-Cookie = "foo=bar; path=/; max-age=3600";
set resp.http.Cache-Control = "no-store, private";

If shielding is enabled, these headers are added not just in the response to the client, but also in the response from the shield POP to edge POP. This may cause undesired behavior; in this particular example, the existence of a Set-Cookie header and the private directive in the Cache-Control header will most likely prevent the response from being cached by the edge POP (assuming a default VCL configuration).

To avoid this, you can use the fastly.ff.visits_this_service variable:

sub vcl_deliver { ... }
# Only run this on the first Fastly node
# (i.e., the deliver node of the edge POP)
if (fastly.ff.visits_this_service == 0) {
  add resp.http.Set-Cookie = "foo=bar; path=/; max-age=3600";
  set resp.http.Cache-Control = "no-store, private";
}

In summary, consider where you want something to run, and guard the code appropriately:

  • If it's most important that the code runs only if the request is about to exit Fastly and be sent to your backend, use req.backend.is_origin.
  • If it's most important that a modification is not applied twice to the same request, such as path prefixing, then use a conditional to make it idempotent.
  • If it's most important that a restart is triggered only once, then use req.restarts (or a flag header) to avoid a restart loop.
  • If it's most important that the code only runs on the first Fastly node, then use fastly.ff.visits_this_service.