Service chaining

If one Fastly service is configured to be the backend for another (different) Fastly service, this setup is known as service chaining. This is conceptually similar to shielding except that instead of being processed by the same service configuration in multiple POPs, requests are processed by multiple service configurations in the same POP.

Service chaining illustration

Scenarios involving service chaining are complex, especially when combined with clustering and shielding. Simpler solutions are often appropriate, but service chaining remains a useful tool in a variety of cases:

  • Routing: Where traffic to a domain (such as www.example.com) maps to more than one Fastly service, a service chain can select the correct service based on additional request characteristics (e.g., URL path prefix or cookies).
  • Segmenting developer access: Where different elements of your Fastly configuration need to be accessible to different groups of engineers, a service chain allows each service to have different permissions.
  • Cache sharing: Each Fastly service has a separate cache. Service chaining may be useful if you want requests that are handled by different services to have access to the same cache.
  • ESI generation: Where Edge Side Include (ESI) tags appear in a response generated at the edge (e.g., using synthetic), they are not parsed and processed within that service. Placing another service downstream allows the ESI directives in edge-generated content to be processed at the edge.
  • C@E/VCL handoff: Customers with existing VCL services that are migrating to the Compute@Edge platform, or who want to use features of both platforms, may choose to put one type of service in a chain with the other type.
  • Internal redirection: If a request is received on a domain belonging to service A, but should be redirected to a domain belonging to service B, chaining to service B is an alternative to instructing the client to redirect their request.
  • Unintentional: Some third-party services intended to be deployed as part of your application architecture are themselves Fastly customers, so if you are also a Fastly customer you might unknowingly end up chaining your Fastly service to another Fastly service in a different customer account. This is unlikely to be a problem unless it triggers loop detection.
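
For example, the routing use case above might be implemented in the first service with VCL along these lines (a minimal sketch; the backend names F_blog_service and F_store_service are hypothetical placeholders for backends pointing at the chained services):

```vcl
sub vcl_recv {
  # Hypothetical routing logic in the first service: send blog
  # traffic to one chained Fastly service and everything else to
  # another, based on the URL path prefix.
  if (req.url.path ~ "^/blog/") {
    set req.backend = F_blog_service;
  } else {
    set req.backend = F_store_service;
  }
}
```

Each of these backends would be defined with a host header override for the relevant chained service's domain, as described below.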

Where services are chained, the service that initially handles a request from an end user is the first service. When this service passes traffic to another service, that's the second service within the service chain. The second service will normally then pass the request to a customer origin - but may also (rarely) chain to a third service.

Enabling service chaining

Fastly selects the service that will process a request by using the Host header. The service that should receive public traffic (the 'first' service) and the service that should receive requests from the first one (the 'second' service) need to be assigned different, publicly resolvable domains. For example, the first service may be handling the public domain www.example.com and chain to a service attached to chained-service.example.com.

To make one service the backend for another, follow these steps:

  1. Configure your DNS settings to point both domains to the correct Fastly CNAME.
  2. In the first service:
    • Add a domain using the hostname you want to use for public traffic (e.g., www.example.com).
    • Add a backend using the hostname of the second service (e.g., chained-service.example.com).
    • Configure the backend to use a host header override (override_host in the backend API or the advanced panel of the web interface) and set it to the same hostname you used to define the backend (e.g., chained-service.example.com).
  3. In the second service:
    • Add a domain using the hostname that you have assigned (e.g., chained-service.example.com).
    • Add a backend using the hostname of your own origin infrastructure (e.g., aws-elb-2.example.com).

In addition to specifying a host override on the backend definition, it's also possible to do so in VCL, by setting the value of bereq.http.host in both vcl_miss and vcl_pass:

set bereq.http.host = "chained-service.example.com";

IMPORTANT: Unless the second service's domain resolves to a service-pinned IP, the host override is essential. If you skip it, you will end up triggering a loop. See loop detection.

A single service can be chained to more than one other service. Likewise, a single service can act as the backend in more than one service chain. In fact, most service chaining scenarios involve a many-to-one relationship between services.

Service chaining and shielding

When a request is made from the first service to the second one, Fastly will route the request internally within the cache server, so that there is no network latency (unless local route bypass is enabled). This means that while service chaining provides the benefit of composing distinct packages of edge logic, it does not consolidate requests into a single POP. Achieving that requires using shielding alongside service chaining.

The routing of requests when both chaining and shielding are active is complex and multiple strategies are possible.

The most straightforward way to combine shielding with service chaining is to enable shielding only on the second service's backend definition (or the last service in a chain in the rare instance of more than two services). This creates a chain-first strategy, in which the request will move from the first to the second service within the edge POP, and then (if the shield location is different from the edge POP for that request) the request will be transferred to the shield POP and processed by the second service again.

Alternatively, enabling shielding on the first service instead of the second one creates a shield-first strategy, in which the request is processed by the first service in the edge POP, then transferred to the shield POP, where it is processed again by the first service and then by the second service.

These strategies can be visualized like this:

Shielding configurations

The effects of choosing one of these strategies over the other include:

  • Cache performance: A chain-first strategy may increase the probability of a cache hit at the first Fastly POP, if the two services are configured to cache different resources.
  • Cost: In most cases Fastly services are billed based on Fastly egress, so the two strategies may have different billing implications for your Fastly services.
  • Fragmentation: A shield-first strategy results in fewer copies of cached objects since the second service runs in only one POP.

Shielding and chaining in one operation

With custom VCL, it is possible to tell Fastly to switch to a different service as part of a shielding request. This means that when using a shield-first strategy, the number of copies of cache objects can be further reduced by avoiding the need to pass the request through the first service at the shield POP.

This strategy also results in each request being processed by each service in the chain exactly once, which can simplify the logic in each service configuration and make the overall system easier to reason about.

Shielding configurations

To switch services while shielding:

  1. Set up the two services in a normal shield-first strategy, as outlined above (i.e., enable shielding on the first service).
  2. In vcl_miss and vcl_pass, set bereq.http.host to the hostname of your second service when a request is being transferred to the shield POP.

For example, the VCL code for the vcl_miss and vcl_pass subroutines could be:

sub vcl_miss {
  if (req.backend.is_shield) {
    set bereq.http.host = "chained-service.example.com";
  }
  # ...
}

WARNING: Shielding to a different service must target the destination service using a Host header, and is incompatible with IP-to-service pinning.

Chaining with shielding in Compute@Edge

Currently, Compute@Edge does not support shielding, making service chaining a useful mechanism for adding shielding to C@E services. For a shield-first strategy, the C@E service must be the second service; for a chain-first strategy, it must be the first. In both cases, the shielding is performed by the VCL service.

IMPORTANT: When using VCL as a first service forwarding requests to a Compute@Edge second service (a shield-first strategy), enable local route bypass to ensure that the request will be handled by a server that is Compute@Edge-capable.

Preventing direct access to chained services

While the first service is intended to receive direct traffic from end user clients, there is (by default) nothing to stop end users making requests directly to the second service as well. However, you may want the second service to only ever receive traffic from the first service.

A simple but insecure way to do this is to set a shared secret in a custom header in the first service and check that it is present in the second one:

First service

sub vcl_recv {
  set req.http.Edge-Auth = "some-pre-shared-secret-string";
  # ...
}

Second service

sub vcl_recv {
  if (req.http.Edge-Auth != "some-pre-shared-secret-string") {
    error 403;
  }
  # ...
}

While this technique will prevent clients from inadvertently accessing the second service, it is possible for the client to intentionally set the necessary header in order to masquerade as a request forwarded from your first Fastly service. Secrets that are constants can also be easily leaked if a request is ever forwarded to the wrong host.


For a secure solution, construct a one-time, time-limited signature in the first service, and verify it in the second service.

First service

sub vcl_miss {
  # This logic should run in both the vcl_miss and vcl_pass subroutines
  declare local var.edge_auth_secret STRING;
  set var.edge_auth_secret = table.lookup(config, "edge_auth_secret"); # Consider using a private edge dictionary
  if (!bereq.http.Edge-Auth) {
    declare local var.data STRING;
    set var.data = strftime({"%s"}, now) + "," + server.datacenter;
    set bereq.http.Edge-Auth = var.data + "," + digest.hmac_sha256(var.edge_auth_secret, var.data);
  }
}

Second service

sub vcl_recv {
  declare local var.edge_auth_secret STRING;
  set var.edge_auth_secret = table.lookup(config, "edge_auth_secret"); # Consider using a private edge dictionary
  if (
    req.http.Edge-Auth ~ "^(([0-9]+),[^,]+),(0x[0-9a-f]{64})$" &&
    digest.secure_is_equal(digest.hmac_sha256(var.edge_auth_secret, re.group.1), re.group.3)
  ) {
    declare local var.time TIME;
    set var.time = std.time(re.group.2, std.integer2time(-1));
    # Verify the timestamp is not off by more than 2 seconds
    if (!(time.is_after(var.time, time.sub(now, 2s))
          && time.is_after(time.add(now, 2s), var.time))) {
      error 403; # Expired
    }
  } else {
    error 403; # Invalid
  }
}

It's a good idea to store the secret outside your VCL, in a private edge dictionary. In the example above, the code assumes the existence of a dictionary called config, with an item called edge_auth_secret, which contains the string you want to use as the HMAC secret to construct and verify the signature.
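
As a sketch of what that table might look like if declared inline (for illustration only; in a real deployment the secret would live in a private, write-only edge dictionary managed via the API, so that it never appears in your VCL):

```vcl
table config {
  # Hypothetical inline value; use a private edge dictionary in practice
  "edge_auth_secret": "replace-with-a-long-random-secret",
}
```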

Advanced chaining

Bypassing local routing

By default, Fastly cache servers will internally route and handle any request to a Fastly IP within the same machine, except for shielding requests. This means that a request from a first service to a second one, unless it is going inter-POP as part of a shielding request, will be handled within the same server.

This is normally a good approach, because it means the handoff from one service to another incurs no latency. However, in some situations, it may be preferable to resolve the second service's domain publicly and route to the resulting cache server:

  • VCL to Compute@Edge chaining requires local routing bypass because not every cache server that handles VCL services has the capability to handle Compute@Edge (however, every C@E-capable machine can also handle VCL services, so chaining in the opposite direction can use the default, local routing).
  • Hotspots may arise where clustering in the first service focuses requests on one server. While this is normal clustering behavior which increases cache efficiency, requests passed from the first service to a second one will end up distributed across the available servers in the POP based on the distribution of objects in the cache, rather than using our normal load balancing strategy. This is rarely a problem but if it causes a performance degradation in your service, consider bypassing local routing.

IMPORTANT: Local route bypass is a protected feature which must be explicitly allowed on your service by a Fastly employee before the route bypass setting will take effect. Contact support@fastly.com to make a request.

To enable local route bypass, set .bypass_local_route_table = true in the backend declaration in VCL. For example:

backend F_example_com {
  .bypass_local_route_table = true;              # <-- Local routing bypass
  .always_use_host_header = true;                # <-- Override host
  .host_header = "chained-service.example.com";  # <-- Override host
  .host = "chained-service.example.com";
  .port = "443";
  # ... other normal backend properties ...
}

The bypass_local_route_table option is not available in the web interface or API, so backends that require this feature must be defined in VCL. Since the backend will be defined in VCL, take care to ensure that it also has the always_use_host_header and host_header options set, which implement the host header override required for service chaining and would otherwise be set using the API or web interface as part of a standard service chaining setup.

Compute@Edge to VCL chaining

Compute@Edge services currently offer a fetch API that performs a backend fetch through the Fastly edge cache, and stores cacheable responses. There is no way to adjust cache rules for objects received from a backend before they are inserted into the cache within a Compute@Edge service. As a result, if you need to process received objects before caching them, or to set custom cache TTLs, a solution is to place a VCL service in a chain with a Compute@Edge one.

In this scenario it is usually advisable to configure the Compute@Edge service to pass all backend fetches, ignoring the cache within the Compute@Edge service in order to delegate caching concerns to the VCL service.

Rust

req.set_pass();

While setting the Compute@Edge request to pass will normally provide the desired behavior, in edge cases it may be necessary to configure the C@E service to skip the cache layer entirely. This behavior can only be enabled with a flag by a Fastly employee, and can be requested by contacting support@fastly.com.

Chaining more than two services

In general, there should rarely be any reason to chain more than two Fastly services together. Currently the only use case we encourage this pattern for is to create a VCL to Compute@Edge to VCL "sandwich", in order to make use of features in both the VCL and Compute@Edge platforms, on both sides of the compute logic.

Purging caches

When services are chained, a request to origin might result in objects being saved into caches belonging to multiple services. One way to avoid the challenges created by this is to ensure that only one of the services participating in the chain stores responses in cache.

However, if multiple services within the chain are using their cache, then each service must be purged individually in order to entirely remove the object from the edge cache, and they must be purged in downstream order, i.e. from the origin to the client. This is because if the first service in a chain purges an object, and then receives a request for that resource, it is possible that it will pull a cached version of the object from the second service in the chain, and write it to its own cache as a fresh entry.

If any of the services in the chain has a unique way of calculating the cache key (vcl_hash in VCL services) that is not present in the other services in the chain, or if any service in the chain manipulates the request in a way that would change its cache key before passing it on to the next service, then the object to be purged may not be labelled the same way in each cache. Consider using purge all or surrogate keys to make it easier to refer to objects consistently regardless of which service they are being purged from.
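
For instance, each VCL service in the chain could attach the same surrogate key to responses, so that purging one key removes the object from every service regardless of how each cache key is computed (a minimal sketch; the path-based key scheme shown is a hypothetical example):

```vcl
sub vcl_fetch {
  # Tag the object with a key derived from the request path, so the
  # same surrogate key purge works in every service in the chain.
  set beresp.http.Surrogate-Key = "path-" + req.url.path;
  # ...
}
```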

Troubleshooting

Service chaining can be a complex solution. Here are some of the more common problems that can arise:

Loop detection

When Fastly receives a request that we believe is going around in a loop, we will terminate processing of the request and return a 503 error with the response status Loop detected. If you receive this error, follow these steps to resolve the problem:

  1. Ensure that the first service is overriding the Host header. If the first service does not change the Host header before forwarding the request to the second service, then the request will be processed by the first service again, because both services' domains resolve to Fastly.

  2. Check that the number of services and Fastly hops in the service chain is less than the limits enforced by Fastly. The limits are counted separately for VCL and Compute@Edge platforms:

    • VCL: max 11 hops and 3 unique services
    • Compute@Edge: max 5 hops and 3 unique services

    HINT: A hop is counted when a cache server starts to process a request for a service, so for example, a single VCL service with clustering and shielding enabled may experience up to 4 hops before the request is forwarded to origin.

Transiting more than two POPs

Shielding is intended to consolidate requests into a POP close to your infrastructure, and often hugely improves performance, but there is typically no benefit to passing a request through more than two Fastly POPs. If a response shows three or more POP identifiers in the X-Served-By or Fastly-Debug-Path headers, and you are using service chaining, then it may be that the chained services are both configured to shield, but in different locations.

In general, enable shielding on only one of the services in a service chain. If you want to enable shielding on all of them (perhaps because the second service also receives direct end-user traffic) then make sure that all the shielding configurations are set to the same shield POP.

Cached content appearing despite purging

In chained services, cached content can be hard to remove, because it is saved into multiple independent caches. If you experience cached content and no origin traffic even after purging all the services in a service chain, ensure that the services are purged in the correct order. See purging caches for more details.