Integration with backend technologies

Most requests to a Fastly service elicit a response that has been fetched from a third-party backend server. Fastly interacts with thousands of varied backend technologies and supports any backend that is an HTTP/1.1 compliant web server. Backends can be configured using the CLI, our API, the web interface, or VCL code.
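
For example, if you manage your service with custom VCL, a backend can be declared directly in VCL. The following is a minimal sketch, not a definitive configuration; the backend name, hostname, port, and timeout values are placeholders to adapt to your own origin:

Fastly VCL

backend my_origin {
  .host = "origin.example.com";      # Placeholder: your origin's hostname
  .port = "443";
  .ssl = true;
  .connect_timeout = 1s;
  .first_byte_timeout = 15s;
  .between_bytes_timeout = 10s;
}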

Our support for backends in Fastly services is entirely standards-based and non-proprietary, but in recognition of their popularity, we've assembled some best practices for the most common backend technologies. These fall into a few categories:

  • Static bucket storage: Services such as Amazon S3 or Google Cloud Storage are popular and relatively inexpensive ways to connect Fastly to a set of static resources to deliver on your public web domain.
  • Serverless platforms: It's common to use Heroku's platform-as-a-service (or equivalent products such as Google App Engine or DigitalOcean App Platform) to host backends. Serverless functions (such as Google Cloud Functions or AWS Lambda, known as 'functions as a service') are also common choices to provide backends.
  • Traditional web servers: Whether you use your own hardware, or a virtualized infrastructure provider such as AWS's EC2 or Google Compute Engine, you will need to install and run your own operating system and web server, such as Apache, nginx or Microsoft IIS.

HINT: We maintain an add-on in the Heroku Elements marketplace that can be used to provision a Fastly service for your Heroku app, with no need to register separately for Fastly, and that also allows you to pay for Fastly via your Heroku bill. However, using the add-on is not required. For example, it may be beneficial to create your Fastly services independently of Heroku if you want to attach more than one Heroku backend to the same Fastly service. For other serverless providers, you must always create a Fastly account separate from your account with the hosting provider.

There are numerous configuration patterns for Fastly services that make integration with backends of various types possible, easier, or more elegant. Here we cover some of the most common patterns that are used with backends.

Overriding the Host header

By default, the Host header supplied with the inbound request is copied to any request Fastly makes to a backend, but the IP address to which the request is sent is resolved from the backend's hostname. Thus, if you have a public website at www.example.com that points to Fastly, and a backend at origin.example.com that points to your origin server, then the web server running on your origin must be correctly configured to handle requests that carry a Host: www.example.com header.

This is often not possible with static bucket hosts and serverless platforms, and even for your own servers it is frequently convenient to match the Host header of the backend request to the backend server's DNS name.

To set a host override for a backend, use the override_host field when creating the backend in the API or CLI. You can also do this when creating a backend in the web interface.
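
If you manage your service with custom VCL, a similar effect can be achieved by setting the Host header on the backend request yourself. The following is a minimal sketch, assuming a backend hostname of origin.example.com (a placeholder):

sub vcl_miss { ... }
Fastly VCL

if (req.backend.is_origin) {
  # Send the backend's own hostname rather than the client's Host header
  set bereq.http.host = "origin.example.com";  # Placeholder: your backend's hostname
}

The same logic would typically also be included in vcl_pass so that requests that bypass the cache are treated consistently.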

Static bucket providers

Most bucket providers require a Host header that identifies the bucket, and often the region in which the bucket is hosted:

Service                           Host header
Amazon S3                         {BUCKET}.s3.{REGION}.amazonaws.com
Alibaba Object Storage Service    {BUCKET}.{REGION}.aliyuncs.com
Backblaze (S3 compat mode)        {BUCKET}.s3.{REGION}.backblazeb2.com
DigitalOcean Spaces               {SPACE}.{REGION}.digitaloceanspaces.com
Google Cloud Storage              {BUCKET}.storage.googleapis.com
Microsoft Azure Blob Storage      {STORAGE_ACCOUNT_NAME}.blob.core.windows.net
Wasabi Hot Cloud Storage          {BUCKET}.s3.{REGION}.wasabisys.com

Serverless platforms

Most PaaS providers require that requests carry a Host header with the hostname of your app, not the public domain of your Fastly service.

Service    Host header
Heroku     {app-name}.herokuapp.com

Modifying the request path

In some cases, you may need to modify the path of the request URL before it is passed to a backend. There are a few possible reasons for this, all specific to static bucket providers:

  • Bucket selection: Where the bucket provider requires the URL path to be prefixed with the bucket name.
  • Directory indexes: Some providers do not support automatically loading directory index files for directory-like paths. For example, the path /foo/ may return an "Object not found" error, even though /foo/index.html exists in the same bucket. If your provider doesn't support automatic directory indexes, you can add the appropriate index filename to the path.

The following providers require path modifications to select the right bucket:

Service                Path modification
Backblaze (B2 mode)    /file/{BUCKET}/{PATH}

HINT: Some bucket providers may allow the path to be used to select a bucket, but if they also support selecting a bucket via the Host header, we recommend choosing that method.

Path modifications are best performed in vcl_miss, which has access to the bereq object, to avoid mutating the client request:

sub vcl_miss { ... }
Fastly VCL
declare local var.bucketPathPrefix STRING;
set var.bucketPathPrefix = "/file/YOUR_BUCKET_NAME";  # For many providers this can be an empty string
if (req.method == "GET" && req.backend.is_origin) {
  set bereq.url = var.bucketPathPrefix req.url if(req.url ~ "/$", "index.html", "");
}

Care should be taken to do path modifications only once, especially when shielding. To ensure that the modification only affects the request just before it is sent to the origin, check the value of the req.backend.is_origin variable. Also note that the req.url variable in VCL already contains a leading /, so there is no need to append an additional delimiter to the prefix.

Redirecting for directory indexes

Some static bucket providers do not support automatically redirecting a directory request that doesn't end with a /. For example, a request for /foo, where the bucket contains a /foo/index.html object, will often return an "Object not found" 404 error. If you wish, you can configure Fastly so that in such cases we retry the origin, theorizing that 'foo' might be a directory, and redirect the client to it if we find an object there. To implement this, add the following VCL to vcl_deliver:

sub vcl_deliver { ... }
Fastly VCL
if (resp.status == 404 && req.url !~ "/$" && !req.http.restart-for-dir) {
  set req.http.restart-for-dir = "1";
  set req.url = req.url "/";
  restart;
}
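
The block above only retries the request internally. To complete the behavior described (sending the client to the slash-terminated URL once the retried request succeeds), you could add a second block such as the following sketch, which reuses the restart-for-dir marker header set above:

sub vcl_deliver { ... }
Fastly VCL

if (req.http.restart-for-dir && resp.status == 200) {
  # The retried request found a directory index; redirect the client to the
  # canonical slash-terminated URL (req.url already has the / appended)
  set resp.status = 301;
  set resp.response = "Moved Permanently";
  set resp.http.Location = req.url;
}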

Customizing error pages

When a backend is not working properly or a request is made for a non-existent URL, the backend may return an error response such as a 404, 500, or 503, the content of which you may not be able to control (or predict in advance). If you wish, you can replace these bad responses with a custom, branded error page of your choice. You can encode these error pages directly into your Fastly configuration or, if your service has a static bucket origin, you could use an object from your static bucket to replace the platform provider's error page.

The VCL below is a minimal sketch of how both mechanisms might be combined for 5xx responses; the error-object path (/errors/503.html) and the marker header name are illustrative placeholders, not values required by any provider:
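
sub vcl_fetch { ... }
Fastly VCL

if (beresp.status >= 500 && beresp.status < 600) {
  if (!req.http.tried-error-object && req.restarts < 3) {
    # Mechanism 1: retry, fetching a branded error object from the bucket instead
    set req.http.tried-error-object = "1";
    set req.url = "/errors/503.html";    # Placeholder path to an error object in your bucket
    restart;
  }
  # Mechanism 2: if the error object also fails, trigger a synthetic error page
  error 600;
}

sub vcl_error { ... }
Fastly VCL

# Serves the synthetic page for the custom 600 status raised above, and also
# when Fastly itself generates a 503 (e.g. the backend is unreachable)
if (obj.status == 600 || obj.status == 503) {
  set obj.status = 503;
  set obj.response = "Service Unavailable";
  set obj.http.Content-Type = "text/html";
  synthetic {"<!DOCTYPE html><html><body><h1>Sorry, something went wrong.</h1></body></html>"};
  return(deliver);
}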

HINT: Some static bucket providers will allow you to designate a particular object in your bucket to serve in the event that an object is not found. If they don't support that, this is a good way to implement the same behavior using Fastly and get support for a range of other error scenarios at the same time.

IMPORTANT: If you are implementing directory redirects and custom error pages, ensure the directory redirect happens first.

Setting cache lifetime (TTL)

In general, it makes sense for the server that generates a response to attach a caching policy to it (e.g., by adding a Cache-Control response header). This allows the server to apply precise control over caching behavior without having to apply blanket policies that may not be suitable in all cases. However, if you do prefer to apply caching policies based on patterns in the URL or content-type, or indeed a blanket policy for all resources, you can use your Fastly configuration to set the TTL. See cache freshness for more details.
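
For example, a blanket policy based on the response's Content-Type could be applied in vcl_fetch. The following is a sketch only; the TTL values are arbitrary placeholders:

sub vcl_fetch { ... }
Fastly VCL

# Only apply a default policy when the origin hasn't expressed one itself
if (!beresp.http.Cache-Control && !beresp.http.Surrogate-Control && !beresp.http.Expires) {
  if (beresp.http.Content-Type ~ "^image/") {
    set beresp.ttl = 86400s;   # Cache images for a day
  } else {
    set beresp.ttl = 300s;     # Everything else for five minutes
  }
}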

Static bucket providers

Static bucket providers often allow caching headers to be configured as part of the metadata of the objects in your bucket. Ideally, use this feature to tell Fastly how long you want to keep objects in cache. For example, when uploading objects to Google Cloud Storage, use the gsutil command:

$ gsutil -h "Content-Type:text/html" -h "Cache-Control:public, max-age=3600" cp -r images gs://bucket/images

Setting caching metadata in this way, at the object level, allows for precise control over caching behavior, but you can often also configure a single cache policy to apply to all objects in the bucket.

HINT: If your bucket provider can trigger events when objects in your bucket change, and you can attach a serverless function to those events, consider using that mechanism to purge the Fastly cache when your objects are updated or deleted. This allows you to set a very long cache lifetime across the board, and benefit from a higher cache hit ratio and corresponding increased performance. We wrote about how to do this for Google Cloud Platform on our blog.

Web servers

If using your own hardware, or an infrastructure provider on which you install your own web server (such as AWS's EC2 or Google Compute Engine), you will have a great deal more flexibility than with a static bucket host, and somewhat more than with a platform-as-a-service provider. The most important thing to consider when using your own web server installation is the caching headers that you set on responses that you serve to Fastly, most commonly Cache-Control and Surrogate-Control.

  • Apache: Consider making use of the mod_expires module. For example, to cache GIF images for 75 minutes after the image was last accessed, add the following to a directory .htaccess file or to the global Apache config file:

    ExpiresActive On
    ExpiresByType image/gif "access plus 1 hours 15 minutes"
  • NGINX: Add the expires directive:

    location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ {
      expires 1h;
    }

    Alternatively, if you need more flexibility in modifying headers, you can try the HttpHeadersMore module.

Programming languages and frameworks

Setting caching policy in your own application software is generally the most flexible and powerful way to apply the correct cache rules for a particular response. Here are some examples in a few common languages and frameworks.

  • ExpressJS (NodeJS): Use the set method of the response to create the header in either a route handler or app-level middleware:
    app.use(function (req, res, next) {
      res.set('Cache-Control', 'public, max-age=300')
      next()
    })
  • PHP: The header() function can be used to add response headers to an HTTP response:
    header('Cache-Control: max-age=3600');
  • Django (Python): Response headers can be set on an HttpResponse object using dictionary-style syntax:
    response = HttpResponse()
    response['Cache-Control'] = 'max-age=3600'
  • Sinatra (Ruby): Use the cache_control helper:
    get '/' do
      cache_control :public, :max_age => 36000
      "hello"
    end

Removing metadata

Some hosting providers, particularly static bucket providers, include additional headers when serving objects over HTTP. You may want to remove these before you serve the file to your end users. This is best done in vcl_fetch, where changes to the object can be made before it is written to the cache:

sub vcl_fetch { ... }
Fastly VCL
# Proprietary to Google Cloud
unset beresp.http.x-goog-generation;
unset beresp.http.x-goog-hash;
unset beresp.http.x-goog-metageneration;
unset beresp.http.x-goog-storage-class;
unset beresp.http.x-goog-stored-content-encoding;
unset beresp.http.x-goog-stored-content-length;
unset beresp.http.x-guploader-uploadid;
unset beresp.http.x-goog-meta-goog-reserved-file-mtime;
# Proprietary to Amazon Web Services
unset beresp.http.x-amz-delete-marker;
unset beresp.http.x-amz-id-2;
unset beresp.http.x-amz-request-id;
unset beresp.http.x-amz-version-id;
# Proprietary to Microsoft Azure
unset beresp.http.x-ms-creation-time;
unset beresp.http.x-ms-tag-count;
unset beresp.http.x-ms-content-crc64;
unset beresp.http.x-ms-blob-sequence-number;
unset beresp.http.x-ms-blob-type;
unset beresp.http.x-ms-copy-completion-time;
unset beresp.http.x-ms-copy-status-description;
unset beresp.http.x-ms-copy-id;
unset beresp.http.x-ms-copy-progress;
unset beresp.http.x-ms-copy-source;
unset beresp.http.x-ms-copy-status;
unset beresp.http.x-ms-lease-duration;
unset beresp.http.x-ms-lease-state;
unset beresp.http.x-ms-lease-status;
unset beresp.http.x-ms-request-id;
unset beresp.http.x-ms-version;
unset beresp.http.x-ms-blob-committed-block-count;
unset beresp.http.x-ms-server-encrypted;
unset beresp.http.x-ms-encryption-key-sha256;
unset beresp.http.x-ms-encryption-scope;
unset beresp.http.x-ms-blob-content-md5;
unset beresp.http.x-ms-client-request-id;
unset beresp.http.x-ms-last-access-time;
# Standard headers that may reveal origin provider
unset beresp.http.server;

Ensuring backend traffic comes only from Fastly

Putting Fastly in front of your backends offers many resilience, security and performance benefits, but those benefits may not be realized if it is also possible to send traffic to the backend directly. Depending on the capabilities of your backend, there are various solutions.

IP restriction

We publish a list of the IP addresses that make up the Fastly IP space.

Restricting access to requests coming only from Fastly IPs is not by itself an effective way to protect your origin because all Fastly customers share the same IP addresses when making requests to origin servers. However, since IP restriction can often be deployed at an earlier point in request processing, it may be useful to combine this with one of the other solutions detailed in this section.

Shared secret

A simple way to restrict access to your origin is to set a shared secret into a custom header in the Fastly configuration:

sub vcl_recv { ... }
Fastly VCL
set req.http.Edge-Auth = "some-pre-shared-secret-string";

To make this solution work, you must configure your backend server to reject requests that don't contain the secret header. This is an effective but fragile solution: if a single request is accidentally routed somewhere other than your origin, the secret will be leaked and can then be used by a bad actor to make any number of requests of any kind to your origin.

Per-request signature

Consider constructing a one-time, time-limited signature within your Fastly service and verifying it in your origin application:

sub vcl_miss { ... }
Fastly VCL
declare local var.edge_auth_secret STRING;
set var.edge_auth_secret = table.lookup(config, "edge_auth_secret"); # Consider using a private edge dictionary
# Should be called in both vcl_miss and vcl_pass subroutines
if (!bereq.http.Edge-Auth) {
  declare local var.data STRING;
  set var.data = strftime({"%s"}, now) + "," + server.datacenter;
  set bereq.http.Edge-Auth = var.data + "," + digest.hmac_sha256(var.edge_auth_secret, var.data);
}

This is slightly harder to verify than a constant string, but if a request leaks and a signature is compromised, it provides only short-term access to make a single kind of request.

Amazon AWS Signature version 4

Static bucket providers like Amazon S3 cannot be programmed to support arbitrary signature algorithms like the one above, but they do support a type of signature for authentication to protected buckets and individually protected objects.

Although Amazon's signature was created for its S3 service, it is widely supported as a compatibility convenience by many other bucket providers including Backblaze (in S3 compat mode), DigitalOcean Spaces, Google Cloud Storage, and Wasabi Hot Cloud Storage. See the AWS documentation for more details. The v4 signature can be implemented in VCL using the digest.awsv4_hmac function.
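
As a rough guide, the following sketch signs simple GET requests with no query string and an empty payload; the access key, secret, bucket, and region values are placeholders, and you should treat AWS's signing documentation as authoritative for the canonical request format:

sub vcl_miss { ... }
Fastly VCL

declare local var.awsAccessKey STRING;
declare local var.awsSecretKey STRING;
declare local var.awsBucket STRING;
declare local var.awsRegion STRING;
declare local var.dateStamp STRING;
declare local var.canonicalRequest STRING;
declare local var.scope STRING;
declare local var.stringToSign STRING;
declare local var.signature STRING;

set var.awsAccessKey = "YOUR_ACCESS_KEY_ID";      # Placeholder
set var.awsSecretKey = "YOUR_SECRET_ACCESS_KEY";  # Placeholder: consider a private edge dictionary
set var.awsBucket = "YOUR_BUCKET_NAME";           # Placeholder
set var.awsRegion = "YOUR_REGION";                # Placeholder, e.g. us-east-1

if (req.method == "GET" && req.backend.is_origin) {
  # Headers included in the signature: host, x-amz-content-sha256 (hash of an
  # empty payload for GET requests) and x-amz-date
  set bereq.http.host = var.awsBucket ".s3." var.awsRegion ".amazonaws.com";
  set bereq.http.x-amz-content-sha256 = regsub(digest.hash_sha256(""), "^0x", "");
  set bereq.http.x-amz-date = strftime({"%Y%m%dT%H%M%SZ"}, now);
  set var.dateStamp = strftime({"%Y%m%d"}, now);

  # Canonical request, assuming no query string
  set var.canonicalRequest = "GET" LF
    bereq.url.path LF
    LF
    "host:" bereq.http.host LF
    "x-amz-content-sha256:" bereq.http.x-amz-content-sha256 LF
    "x-amz-date:" bereq.http.x-amz-date LF
    LF
    "host;x-amz-content-sha256;x-amz-date" LF
    bereq.http.x-amz-content-sha256;

  # String to sign, then the signature itself
  set var.scope = var.dateStamp "/" var.awsRegion "/s3/aws4_request";
  set var.stringToSign = "AWS4-HMAC-SHA256" LF
    bereq.http.x-amz-date LF
    var.scope LF
    regsub(digest.hash_sha256(var.canonicalRequest), "^0x", "");
  set var.signature = regsub(digest.awsv4_hmac(
    var.awsSecretKey, var.dateStamp, var.awsRegion, "s3", var.stringToSign), "^0x", "");

  set bereq.http.Authorization = "AWS4-HMAC-SHA256 "
    "Credential=" var.awsAccessKey "/" var.scope ", "
    "SignedHeaders=host;x-amz-content-sha256;x-amz-date, "
    "Signature=" var.signature;
}

A complete implementation would also need to include any query string in the canonical request and handle non-GET methods.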

Azure Blob Storage

Microsoft Azure Blob storage uses a signature that is similar, but not identical, to the AWS signature v4. See the Azure documentation for more details.

Alibaba Object Storage Service signature

Alibaba's OSS allows object authentication using a similar signature mechanism:

sub vcl_miss { ... }
Fastly VCL
declare local var.ali_bucket STRING;
declare local var.ali_region STRING;
declare local var.ali_access_key_id STRING;
declare local var.ali_access_key_secret STRING;
declare local var.ali_expires INTEGER;
declare local var.ali_canon STRING;
declare local var.ali_sig STRING;
set var.ali_bucket = "test123";
set var.ali_region = "oss-cn-beijing";
set var.ali_access_key_id = "decafbad";
set var.ali_access_key_secret = "deadbeef";
set var.ali_expires = std.atoi(now.sec);
set var.ali_expires += 60;
set req.http.Host = var.ali_bucket "." + var.ali_region + ".aliyuncs.com";
set req.http.Date = var.ali_expires;
set var.ali_canon = if(req.method == "HEAD", "GET", req.method) LF LF LF
  req.http.Date LF "/" var.ali_bucket req.url.path;
set var.ali_sig = digest.hmac_sha1_base64(var.ali_access_key_secret, var.ali_canon);
set req.url = req.url.path;
set req.url = querystring.set(req.url, "OSSAccessKeyId", var.ali_access_key_id);
set req.url = querystring.set(req.url, "Signature", var.ali_sig);
set req.url = querystring.set(req.url, "Expires", var.ali_expires);