Purging describes the act of explicitly removing content from the Fastly edge cache, rather than allowing it to expire or to be evicted. Once content has been purged, subsequent requests for that content cannot be satisfied from the edge cache and, in most cases, will trigger a request to an origin server.

Single object purges and surrogate key purges are completed across the entire global edge cache in around 150ms. Purge-all operations take longer, and are comparable to the time required to deploy a service configuration, around 10-20 seconds.

Purging use cases

Purging may be used in a variety of ways:

  • As an occasional, manual exercise to remove content that has been inadvertently served with the wrong cache settings (or which should not be public at all). Typically this is done via the web interface or fastly purge CLI command.
  • Incorporated into your application's deployment process such that, when you release a new version of your site, the edge cache is purged. Typically this makes use of the purge API or the fastly purge CLI command.
  • As a way to purge content when it changes in your content management system. This approach is particularly useful when combined with setting a very long TTL in a Cache-Control header so that traffic to origin is minimized, but end users are still assured of up-to-date content.

HINT: Our blog has more on purging content when it changes in the origin server: The rise of event-driven content (or how to cache more at the edge).

Surrogate Key tagging

While it is possible to purge both a single item (identified by URL), and also to purge an entire Fastly service in a single request, it is often useful to be able to perform a single purge operation for a group of content that shares a common trait; for example, all pages that mention a specific product. This is possible using surrogate key purges.

To tag content in preparation for purging it by key, add a Surrogate-Key header to the response when it's served from your origin server to Fastly:

HTTP/1.1 200 OK
Surrogate-Key: template-product product-id-724253 product-id-242129 offers
Content-Type: text/html

Surrogate keys can also be added to objects using edge logic before they are written to cache:

sub vcl_fetch {
  # ... existing vcl_fetch logic ...
  set beresp.http.Surrogate-Key = "template-product product-id-724253 product-id-242129 offers";
}

Fastly automatically removes any Surrogate-Key headers present on a response before delivering it to the end user (unless Fastly-Debug is included in the request).

Surrogate keys are subject to size limitations. Individual tokens may not exceed 1KB in length and Surrogate-Key header values (comprising one or more space-separated tokens) may not exceed 16KB in length. If either the token limit or the header value limit is reached while parsing a Surrogate-Key header, the token currently being parsed and all tokens following it within the same header will be ignored.
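
As a rough pre-flight check, the limits above could be tested before a response leaves your origin. The following shell sketch is illustrative only: the helper name check_surrogate_key is hypothetical, and it assumes that "1KB" and "16KB" mean 1024 and 16384 bytes, which this guide does not state explicitly.

```shell
# Assumption: "1KB" and "16KB" are read here as 1024 and 16384 bytes.
MAX_TOKEN_BYTES=1024
MAX_HEADER_BYTES=16384

# Hypothetical helper: prints "ok" when a Surrogate-Key value fits both
# limits, otherwise prints which limit was exceeded and returns 1.
# The body runs in a subshell so that set -f and LC_ALL do not leak out.
check_surrogate_key() (
  LC_ALL=C   # count bytes rather than characters in ${#...}
  set -f     # disable globbing so a token such as "*" is not expanded
  value=$1
  if [ "${#value}" -gt "$MAX_HEADER_BYTES" ]; then
    echo "header value too long: ${#value} bytes"
    return 1
  fi
  for token in $value; do   # unquoted on purpose: split on whitespace
    if [ "${#token}" -gt "$MAX_TOKEN_BYTES" ]; then
      echo "token too long: ${#token} bytes"
      return 1
    fi
  done
  echo "ok"
)
```

For example, check_surrogate_key "template-product offers" prints ok, while a value containing a 1025-byte token is reported as too long.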

Invalidation, eviction, and purging

Content stored in the Fastly cache is ephemeral by design; our systems manage the content that we store so that we maximize the proportion of requests that can be successfully served from the edge cache. When an object is in cache and available to be used, it is "fresh". Purging causes an object to be "invalidated", which means although it may still exist on disk, it cannot be used. "Eviction" describes the automatic invalidation of content from the cache by Fastly, as part of our cache management process.

The amount of time that objects remain fresh in the cache is affected by multiple things:

  • The instructions you give us via HTTP headers like Cache-Control, or in edge code. See freshness rules.
  • The size and popularity of the object. Larger objects, and those accessed less frequently, are more susceptible to being evicted from cache before the expiry of their cache lifetime.
  • Whether the cache server storing the object is the primary storage node for that object. See clustering.
  • Whether PCI compliance mode is enabled. Enabling PCI mode prevents Fastly from storing content on disk. It can still be cached in memory but is likely to be evicted sooner.

Content that is no longer fresh, either because it has reached the end of its allotted lifetime (TTL) or because it has been invalidated by a purge, may become stale.

Performing purges

There are three different types of purge, each accessible via a variety of interfaces to suit your use case.

Single purge

A single purge invalidates a single object from all cache servers in the Fastly edge cloud. This can be done via the web interface or by making an HTTP PURGE request directly to the URL you wish to purge. This is listed in the API reference as an API endpoint, but the request should be sent to the domain associated with your Fastly service, not to api.fastly.com. For example, using cURL:

$ curl -X PURGE http://www.example.com/path/to/object-to-purge

Single purges may also be executed using the fastly purge command in the fastly CLI.

WARNING: Single purges are unauthenticated by default. To require a valid Fastly API token for single purges, configure your service to add a Fastly-Purge-Requires-Auth header to the client request:

sub vcl_recv {
  # ... existing vcl_recv logic ...
  set req.http.Fastly-Purge-Requires-Auth = "1";
}

Adding authentication to single purges will disable the ability to purge single objects via the web interface.
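
Once authentication is required, the purge request needs to carry an API token. A minimal sketch, assuming the token is available in an environment variable named FASTLY_API_TOKEN (a name chosen here for illustration) and is sent in the Fastly-Key request header; the function name purge_url is hypothetical:

```shell
# Assumption: FASTLY_API_TOKEN holds a valid Fastly API token that is
# allowed to purge this service.
# usage: purge_url URL
purge_url() {
  # The PURGE request goes to the service's own domain, not api.fastly.com.
  curl -sS -X PURGE -H "Fastly-Key: ${FASTLY_API_TOKEN}" "$1"
}
```

For example: purge_url http://www.example.com/path/to/object-to-purge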

Single purges take around 150ms to complete and support soft purging as an option.

Surrogate key purge

A surrogate key purge invalidates all objects with a specified token in their Surrogate-Key header from all cache servers in the Fastly edge cloud. This purge is available via the web interface, via the API, or the fastly purge command in the CLI.

Surrogate key purges take around 150ms to complete, and support soft purging as an option. Surrogate key purges can also be done in bulk.
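
As an illustration of the API route, a surrogate key purge can be sketched as a POST to the purge endpoint of api.fastly.com. The function name purge_key, the FASTLY_API_TOKEN variable, and the optional "soft" argument are conventions invented for this example; check the API reference for the authoritative request format.

```shell
# Assumption: FASTLY_API_TOKEN holds a valid Fastly API token.
# usage: purge_key SERVICE_ID KEY [soft]
purge_key() {
  if [ "${3:-}" = "soft" ]; then
    # Soft purge: mark matching objects as stale instead of invalidating them.
    curl -sS -X POST -H "Fastly-Key: ${FASTLY_API_TOKEN}" \
      -H "Fastly-Soft-Purge: 1" \
      "https://api.fastly.com/service/$1/purge/$2"
  else
    curl -sS -X POST -H "Fastly-Key: ${FASTLY_API_TOKEN}" \
      "https://api.fastly.com/service/$1/purge/$2"
  fi
}
```

For example, purge_key "$SERVICE_ID" product-id-724253 would invalidate every object tagged with that key, where SERVICE_ID is a placeholder for your own service ID.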

Purge all

The purge-all operation invalidates the entire cache for a single Fastly service. This purge is available via the web interface, via the API, or the fastly purge command in the CLI.

WARNING: Purging a large amount of content from a high traffic service is likely to result in a rapid increase in traffic to origin.

Purge-all operations take around 15-20 seconds to complete. Purge-all is not compatible with soft purge or bulk purge.
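
For completeness, a purge-all through the API might look like the sketch below. The function name purge_all and the FASTLY_API_TOKEN variable are assumptions made for this example. Given the warning above, a wrapper like this is probably best kept out of automated jobs that run frequently.

```shell
# Assumption: FASTLY_API_TOKEN holds a valid Fastly API token.
# usage: purge_all SERVICE_ID
purge_all() {
  # Invalidates the entire cache for the given service.
  curl -sS -X POST -H "Fastly-Key: ${FASTLY_API_TOKEN}" \
    "https://api.fastly.com/service/$1/purge_all"
}
```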

Soft vs hard purging

By default, purge operations will cause content to be invalidated, making it immediately unusable for future requests. It may not be immediately removed from disk but will be inaccessible and will in time be overwritten by new content. This is a hard purge.

In contrast, a soft purge marks the content as stale. This means that it will not automatically be available to serve in response to a request, but remains available to use in some circumstances:

  • If the stale content has a validator header (ETag or Last-Modified), and Fastly is able to successfully revalidate it with a conditional GET request to origin.
  • If the stale content has a stale-while-revalidate period and has been stale for less than that amount of time.
  • If the stale content has a stale-if-error period, has been stale for less than that amount of time, and the origin is sick or not responding.
  • If you instruct Fastly to use the stale content at runtime in edge code (such as VCL's return(deliver_stale)).

For more information on staleness and revalidation see our dedicated concept guide.

Soft purging is available for single purges and surrogate key purges, but not for purge-all. Options to enable it are available in all purge interfaces:

  • In the web interface, select the 'Soft purge' option.
  • When using the API, send a Fastly-Soft-Purge: 1 HTTP header.
    $ curl -X PURGE -H "Fastly-Soft-Purge:1" http://www.example.com/path/to/object-to-purge
  • In the CLI pass the --soft flag to the fastly purge CLI command.
    $ fastly purge --soft --url=http://www.example.com/path/to/object-to-purge

Soft purging, especially when combined with stale-while-revalidate, is a great way to reduce origin traffic spikes and to provide more reliable performance for end users. For more information about the benefits of stale content, see the blog post "Prevent application and network instability by serving stale content".

Bulk purges

A bulk purge accepts more than one content identifier in a single request and purges them all at once. Bulk purge is available only for surrogate key purges, via the API and the --file flag on the fastly purge CLI command. Bulk purges can purge multiple keys, but still operate on a single service.
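
One way to sketch a bulk purge through the API is to send all of the keys, space-separated, in a Surrogate-Key request header. The function name purge_keys and the FASTLY_API_TOKEN variable are hypothetical; consult the API reference for the supported request forms and any per-request key limit.

```shell
# Assumption: FASTLY_API_TOKEN holds a valid Fastly API token.
# usage: purge_keys SERVICE_ID KEY [KEY...]
purge_keys() {
  service_id=$1
  shift
  # "$*" joins the remaining arguments with single spaces.
  curl -sS -X POST -H "Fastly-Key: ${FASTLY_API_TOKEN}" \
    -H "Surrogate-Key: $*" \
    "https://api.fastly.com/service/${service_id}/purge"
}
```

For example, purge_keys "$SERVICE_ID" template-product offers purges both keys in one request against a single service.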

Advanced and best practices

The following sections describe special considerations that may apply to your service depending on your use case.


Shielding

If your service uses shielding, then a request from an end user may traverse two Fastly POPs before being forwarded to your origin server. This can cause a number of issues that prevent purging from working as expected.

Race conditions

Since the order in which caches are purged is not deterministic, it is possible that a request might reach a purged edge POP first, and be forwarded to a shield POP that has not yet been purged. The shield POP will deliver the cached content to the edge POP, which will cache it, resulting in the pre-purge content remaining in Fastly cache.

One solution to this race condition problem is simply to purge twice. For purge-all operations, the two purges should be around 30 seconds apart and, for single object and surrogate key purges, around 2 seconds apart.
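
The purge-twice approach for a surrogate key could be scripted roughly as follows. This is a sketch under assumptions: the function name purge_key_twice and the FASTLY_API_TOKEN variable are invented for illustration, and the default delay simply reuses the "around 2 seconds" figure given above.

```shell
# Assumption: FASTLY_API_TOKEN holds a valid Fastly API token.
# usage: purge_key_twice SERVICE_ID KEY [DELAY_SECONDS]
purge_key_twice() {
  url="https://api.fastly.com/service/$1/purge/$2"
  # First purge. Stop here if curl itself fails (for example, no network);
  # HTTP error statuses are not detected by this simple check.
  curl -sS -X POST -H "Fastly-Key: ${FASTLY_API_TOKEN}" "$url" || return 1
  # Give the first purge time to reach every POP, including the shield.
  sleep "${3:-2}"
  # Second purge removes anything re-cached from a not-yet-purged shield.
  curl -sS -X POST -H "Fastly-Key: ${FASTLY_API_TOKEN}" "$url"
}
```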

However, a more elegant solution is possible with the addition of some edge logic. A purge-all operation will cause the cache generation value to change. This can be read in VCL using req.vcl.generation and in Compute@Edge using the FASTLY_CACHE_GENERATION environment variable. This value can be added to a response at the shield POP and then compared with the local value at the edge POP. If they differ, the edge POP should not cache the response. The following example shows how this can be implemented in VCL:

The cache generation approach applies only to purge-all. For surrogate key and single purges, purging twice is the only mitigation, but these purges take far less time to propagate and as a result encounter the problem much more rarely.

Stale content becoming fresh

If you use soft purging, a request for a recently purged object will encounter stale content in the cache and, if that content supports stale-while-revalidate, it will be served immediately to the client while Fastly sends a revalidation request upstream. With shielding enabled, the upstream server can be another Fastly cache, which also may have a stale copy of the object that also supports stale-while-revalidate. In this scenario, the shield POP will serve the stale content to the edge POP. For expired content, this is not a problem, because the content's Age header will show it to be already older than its max-age, and the edge POP will decline to cache it. For soft purged content, however, the edge POP may recache the object for the remainder of its max-age-defined TTL as fresh.

To learn more about this effect and how to mitigate it see staleness and revalidation.


Variants

The Vary header is used on responses to indicate that the response can only be used for requests with a specific header value. Commonly this is used to differentiate compressed responses from uncompressed ones by varying on Accept-Encoding, but it can also be used in a variety of other situations, including caching logged-in and logged-out variants of a page separately.

In the case of single purges, the purge targets objects by cache key, and will therefore invalidate all variants of the object.

Surrogate key purges target objects by the surrogate key, not the cache key, so will invalidate only variants that have the key. In general, distinct variants of the same cache object usually have the same surrogate keys, but this is not a requirement. If an object has multiple variants in cache and only some of them match the key you have requested to purge, only those variants will be invalidated.

Purge-all invalidates all content in the service, so implicitly all variants of all objects are purged.

Versioned URLs

As well as explicitly purging content from the Fastly cache, an effective means of delivering an updated version of content is to publish it at a new URL.

When an HTML page is updated, its URL typically does not change, to avoid breaking inbound links and optimize SEO. However, assets that are loaded by the page, such as videos, images, scripts and stylesheets, are typically only linked to from pages that you control and don't need to maintain a constant URL. Therefore, when a page is updated, it is often possible to change the URLs of all the assets loaded by the page. The page you are reading right now uses this technique.

This is an optimal caching strategy because Fastly will send requests related to the updated page directly to origin to get a new version of the assets, whereas requests from old versions of the page will continue to benefit from assets cached at the edge.

Versioned URLs can take many forms. Here are some examples:

  • /script.js?v=2021-01-01T00:00:00: Using the time when the site was updated is a simple mechanism, but if you don't update the entire site in one atomic operation, you may create cache fragmentation.
  • /script.js?v=12: Using a version number works well if your site has a single overall version number.
  • /script.195bb21f.js: Putting a hash of the content in the filename is an effective stateless solution and favored by many web frameworks.
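
To make the content-hash form concrete, a build step could derive the versioned filename from the file's contents. The sketch below is one possible approach, not the method used by any particular framework: the function name versioned_name is hypothetical, it assumes the sha256sum utility is installed, and it assumes the file has an extension.

```shell
# Assumptions: sha256sum (GNU coreutils) is available; FILE has an extension.
# usage: versioned_name FILE   (prints e.g. script.<8 hex chars>.js)
versioned_name() {
  hash=$(sha256sum "$1" | cut -c1-8)   # first 8 hex digits of the digest
  base=${1%.*}                         # path without the final extension
  ext=${1##*.}                         # the final extension
  echo "${base}.${hash}.${ext}"
}
```

Because the name depends only on the file's bytes, an unchanged file keeps its URL across deployments and a changed file gets a new one.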

All these URL paths would be considered separate objects by Fastly. Serve these kinds of assets from your origin server with a long cache TTL, and consider setting the immutable cache directive:

Cache-Control: max-age=31556952, immutable

HINT: When using versioned URLs for assets, avoid performing purge-all operations, since this will also remove all versioned assets. Keeping older versions of versioned assets in the cache can improve user experience during deployments, especially if the older versions are not retained on your origin servers. Instead, consider using a surrogate key purge.

Surrogate key techniques

While we offer three types of purge, the surrogate key purge is the most powerful, flexible, and performant. It is, on average, 100x faster than a purge-all, supports soft purging, and unlike single purge, offers a way to purge multiple keys in one operation.

With the addition of some edge code, it is possible to purge all objects (the equivalent of a purge-all) or purge a single URL (the equivalent of a single purge) using surrogate key purge simply by ensuring that all objects share a single common key and that each has a key representing its URL or cache digest.

HINT: If using versioned URLs as described above, implementing a custom 'purge all' using surrogate keys offers the opportunity to not purge versioned assets - for example, by not adding the "all" surrogate key to responses that include an immutable Cache-Control directive.

With this code in place you can replace a purge-all with a surrogate key purge for the key "all", and you can replace any individual single purge with a surrogate key purge for a key matching the URL path of the object you want to purge (e.g., "/products/t-shirt").


Logging purges

Purge-all operations are logged automatically in the event log. Surrogate key purges and single purges are not recorded by default.

If your service has a logging endpoint connected, you can log single purges to that endpoint by emitting log events in edge code. For example, this code would log a querystring-formatted log line with the service ID when a purge is received.

sub vcl_recv {
  # Single purge requests reach vcl_recv with the method FASTLYPURGE
  if (req.method == "FASTLYPURGE") {
    log "syslog " req.service_id " my-log-endpoint :: event=purge&service=" req.service_id;
  }
}

There is no way to log surrogate key purges.

Limitations and constraints

  • Purge-all operations contribute to global API rate limits.
  • Surrogate key and single purges are not counted as part of API rate limits but are separately limited to an average of 100,000 purges per hour, per customer. In the case of surrogate key purges, each key targeted counts as one purge regardless of how many objects are tagged with that key.
  • Upon creation, API tokens are propagated across the Fastly network. If you create an API token and then immediately attempt to use it for purging, you may encounter a 401 (Unauthorized) response. Token propagation may take up to 90 seconds.
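
A script that creates a token and purges straight away may want to tolerate the propagation delay described in the last point. The following is a sketch only: the function name purge_key_with_retry, the FASTLY_API_TOKEN variable, and the 10-second retry interval are choices made for this example, and the 90-second ceiling is taken from the figure above.

```shell
# Assumption: FASTLY_API_TOKEN holds a freshly created Fastly API token.
# usage: purge_key_with_retry SERVICE_ID KEY
# Prints the final HTTP status; succeeds only if that status is 200.
purge_key_with_retry() {
  waited=0
  while :; do
    # -w prints only the HTTP status code; the response body is discarded.
    status=$(curl -sS -o /dev/null -w '%{http_code}' -X POST \
      -H "Fastly-Key: ${FASTLY_API_TOKEN}" \
      "https://api.fastly.com/service/$1/purge/$2")
    [ "$status" != "401" ] && break   # anything other than 401: stop retrying
    [ "$waited" -ge 90 ] && break     # token should have propagated by now
    sleep 10
    waited=$((waited + 10))
  done
  echo "$status"
  [ "$status" = "200" ]
}
```

A 401 that persists beyond the 90-second window more likely indicates an invalid or under-privileged token than a propagation delay.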

Purge mechanism

When you send a purge request to Fastly, it is received and initially processed by a cache server in a POP close to your location.

Single purges and surrogate key purges are distributed from that machine out to the entire Fastly cache network using a variant of a gossip protocol, and take around 150ms to reach every cache server in the network.

Purge-all requests are different. Every Fastly service includes a hash function that generates the cache key for a given request. In addition to request properties such as the URL, query string and request method, this function also includes a unique identifier for the service, and a number we call the 'cache generation'. In response to a purge-all, Fastly will recompile your service configuration, incrementing the cache generation, and redeploy it to the network. This means that two identical requests received before and after the purge has completed will result in different cache keys. By this mechanism the previous cache generation is placed 'out of reach' and the space it occupies will be reclaimed by the normal cache management process.
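
The effect of the cache generation on the cache key can be illustrated with a toy model. This is not Fastly's actual hash function: the input format, the use of SHA-256, and the function name cache_key are all assumptions made purely to show why incrementing the generation puts old entries out of reach.

```shell
# Toy model only. The real hash function and input encoding are not
# described in this guide. Assumes sha256sum is available.
# usage: cache_key SERVICE_ID GENERATION METHOD URL
cache_key() {
  printf '%s|%s|%s|%s' "$1" "$2" "$3" "$4" | sha256sum | cut -c1-16
}
```

In this model, cache_key SVC 7 GET /products/t-shirt and cache_key SVC 8 GET /products/t-shirt produce different keys, so an identical request made after a purge-all no longer finds the object stored under generation 7.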

The current cache generation is readable in edge code as req.vcl.generation in VCL and the FASTLY_CACHE_GENERATION environment variable in Compute@Edge.


Reverting purges

Because purge-all works by changing a property of the service, not by modifying cached content itself, it is possible to revert a purge-all, by decrementing the cache generation. This operation can only be executed by a Fastly employee. If you purge-all by accident you can request a revert by reaching out to support@fastly.com. Reverting a purge does not guarantee that all previously cached content will return. The sooner a purge is reverted, the more of the cached content will be restored.

It is not possible to revert a single purge or a surrogate key purge.

Diagnosing purge failures

If a purge appears to have no effect, consider the following:

Single purges:

  • Hash algorithm: If you have changed your service's hashing algorithm (vcl_hash in VCL services), take care to ensure that your purge request is causing the same lookup in cache as the request that caused the content to be cached.
  • Rewriting URLs: If your service configuration rewrites URLs, or changes any other property of the request in a way that affects the cache key, you may not be targeting it correctly.
  • Shielding: If your service has shielding enabled, ensure that the request forwarded to the shield POP has the same request properties as the one originally received at the edge POP (i.e., a POP acting as an edge in a shielding configuration should avoid modifying the request).
  • Stale content: If your purge was done as a soft purge, you may continue to be served the pre-purge content for some time after the purge is completed. To understand why stale content is being served see Staleness and revalidation.

Surrogate key purges:

  • Capitalization: Surrogate keys are case sensitive. Ensure the case of the key specified in your purge request matches that on the content.
  • Checking keys present: To determine which keys are present on a piece of cached content, request the content URL with an additional Fastly-Debug: 1 header. The response will include the Surrogate-Key header, helping you to understand which keys would cause the content to be purged.
  • Stale content: If your purge was done as a soft purge, you may continue to be served the pre-purge content for some time after the purge is completed. To understand why stale content is being served see Staleness and revalidation.
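
The key check described above can be sketched as a small helper that requests a URL with Fastly-Debug: 1 and prints only the Surrogate-Key response header. The function name show_surrogate_keys is hypothetical.

```shell
# usage: show_surrogate_keys URL
show_surrogate_keys() {
  # -D - writes the response headers to stdout; -o /dev/null drops the body.
  # grep -i matches the header name regardless of case (HTTP/2 lowercases it).
  curl -sS -o /dev/null -D - -H "Fastly-Debug: 1" "$1" | grep -i '^surrogate-key:'
}
```

If the command prints nothing, the cached object carries no surrogate keys, so no surrogate key purge will match it.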