Cache Tags

DatoCMS Cache Tags help optimize your website or app's caching. They let developers tag web pages with unique identifiers, so that when content in the CMS is updated, these tags trigger an immediate and precise cache invalidation only for the pages that actually include that content and therefore need to be regenerated.

The main benefits include:

  • Visitors can instantly view the most updated version of the content, while maintaining the benefits of completely static and cached content.

  • Hosting expenses and DatoCMS resource usage can be dramatically reduced thanks to a precise caching mode that does not rely on time-based invalidation methods, or a total invalidation of the entire site when anything changes.

  • Developers are relieved of the duty of managing cache invalidation, a task that DatoCMS takes care of instead.

For a more comprehensive understanding of DatoCMS cache tags and the problem they solve, we recommend reading the feature's announcement, which provides some additional background.

How does it work?

Implementing cache tags on your app is a three-step process:

  1. Add a new X-Cache-Tags header to your existing GraphQL requests;

  2. Tag your frontend pages with cache tags received from DatoCMS;

  3. Implement a specific endpoint to invalidate the tags that DatoCMS sends you via webhook.

All three steps are designed to be quite straightforward to implement, allowing you to benefit from the advantages this method offers in a very short time. Let's look at them in detail.

Step 1: Retrieve cache tags

Every response from the Content Delivery API can return a list of cache tags associated with both the query and its results. To access these cache tags, add the following header to your existing GraphQL POST requests:

X-Cache-Tags: true

With this header included (and the --include flag added to show the HTTP response headers), a curl request looks like this:

$ curl 'https://graphql.datocms.com/' \
-H 'Authorization: YOUR-API-TOKEN' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-H 'X-Cache-Tags: true' \
--include \
--data-binary '{ "query": "query { allPosts { title } }" }'

X-Cache-Tags is one of many headers you can use to shape the behavior of the Content Delivery API. Refer to the related section for more information on the other headers available on the Content Delivery API endpoint.

The response (omitting what's not related to cache tags) will include a new X-Cache-Tags header:

HTTP/2 200
...
X-Cache-Tags: BQD?* 2.a*q f7e N*r;L 6-KZ@ t#k[uP t#k[ub t#k[uU
...
{
"data": {
"allPosts": [ ... ]
}
}

The X-Cache-Tags header that appears in the response contains a space-separated list of strings: each string is a cache tag, carefully generated to cover all possible invalidation scenarios.
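As a rough illustration of this step, the sketch below builds the same request as the curl example and splits the response header into individual tags. The helper names buildCacheTagsRequest and parseCacheTags are hypothetical and not part of any DatoCMS library; the header names and values mirror the curl example above.

```typescript
// Sketch only: both helper names are hypothetical, not part of a DatoCMS SDK.

interface CacheTagsRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the options for a GraphQL POST request that asks for cache tags,
// using the same headers as the curl example.
function buildCacheTagsRequest(apiToken: string, query: string): CacheTagsRequest {
  return {
    method: "POST",
    headers: {
      Authorization: apiToken,
      "Content-Type": "application/json",
      Accept: "application/json",
      "X-Cache-Tags": "true",
    },
    body: JSON.stringify({ query }),
  };
}

// Splits the space-separated X-Cache-Tags response header into a list of tags.
// A missing or empty header yields an empty list.
function parseCacheTags(headerValue: string | null | undefined): string[] {
  if (!headerValue) return [];
  return headerValue.split(" ").filter((tag) => tag.length > 0);
}
```

The object returned by buildCacheTagsRequest could be passed as the second argument to fetch, and the value of response.headers.get("x-cache-tags") handed to parseCacheTags.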

Cache tags are not readable, and that's a good thing!

DatoCMS provides cache tags that are intentionally opaque, to prevent misinterpretation and misuse on your end. Cache invalidation is a complicated process in which it is easy to make errors and overlook edge cases. Our cache tags help us handle these complexities for you. Their opaque nature also gives us the flexibility to improve our tagging strategies in the future without requiring changes on your frontend.

Step 2: Apply the tags to your website pages

This step strongly depends on both the frontend framework and hosting solution you use. However, the fundamental concept is that each artifact that your website produces (such as HTML pages, API responses, etc.) that uses content coming from DatoCMS, should be marked with the cache tags provided in the GraphQL response.

Read the section Integrating DatoCMS cache tags on your project below for more details.

Step 3: Implement the "Invalidate cache tag" webhook

After tagging your frontend artifacts, we need a method to invalidate them when necessary. In implementing a caching mechanism, this is traditionally the most complex step to tackle.

Fortunately, DatoCMS handles the complex job of tracking every possible alteration in your schema, text, images, and videos for you. When any change happens, DatoCMS can immediately send a list of tags that need invalidation to your frontend through a single webhook.

Within your Project Settings, create a new webhook. Choose the "Invalidate" event of the "Content Delivery API Cache Tags" entity as the trigger.

The webhook will send requests in this JSON format:

POST /your/invalidation/endpoint HTTP/1.1
Content-Type: application/json

{
"entity_type": "cda_cache_tags",
"event_type": "invalidate",
"entity": {
"id": "cda_cache_tags",
"type": "cda_cache_tags",
"attributes": {
"tags": ["N*r;L", "6-KZ@", "t#k[uP"]
}
},
"related_entities": []
}

The final step is to implement the endpoint that will receive incoming requests from the webhook. The task of this endpoint will be to execute cache invalidation based on the received cache tags.
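A minimal sketch of the payload handling such an endpoint might perform is shown below. It assumes the body has the shape of the JSON above; the function names are hypothetical, the purge callback stands in for whatever invalidation call your framework or CDN provides, and the choice of status codes is an assumption rather than something DatoCMS prescribes.

```typescript
// Sketch only: names are hypothetical; only the fields used here are typed.
interface InvalidatePayload {
  entity_type: "cda_cache_tags";
  event_type: "invalidate";
  entity: { attributes: { tags: string[] } };
}

// Runtime check that an unknown value looks like the webhook body shown above.
function isInvalidatePayload(value: unknown): value is InvalidatePayload {
  if (typeof value !== "object" || value === null) return false;
  const payload = value as Record<string, unknown>;
  if (payload["entity_type"] !== "cda_cache_tags") return false;
  if (payload["event_type"] !== "invalidate") return false;
  const entity = payload["entity"] as { attributes?: { tags?: unknown } } | null | undefined;
  const tags = entity?.attributes?.tags;
  return Array.isArray(tags) && tags.every((tag) => typeof tag === "string");
}

// Returns the tags to invalidate, or an empty list for anything unexpected.
function extractTagsToInvalidate(rawBody: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return [];
  }
  return isInvalidatePayload(parsed) ? parsed.entity.attributes.tags : [];
}

// Framework-agnostic handler: purges the received tags and returns an HTTP
// status code (assumed convention: 400 for an unusable body, 200 otherwise).
async function handleInvalidation(
  rawBody: string,
  purge: (tags: string[]) => Promise<void>,
): Promise<number> {
  const tags = extractTagsToInvalidate(rawBody);
  if (tags.length === 0) return 400;
  await purge(tags);
  return 200;
}
```

Keeping the parsing separate from the purge call makes it possible to reuse the same validation regardless of how the actual invalidation is performed.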

Just like Step 2, the ways in which you can perform cache invalidation through tags greatly depend on the frontend framework and hosting solution you use. In some instances, it's an API call, whereas some frameworks offer specific helper functions. Read the next section to learn more.

Integrating DatoCMS cache tags on your project

Depending on the chosen stack, the practical implementation of cache tags can vary significantly, and in some instances, it may not be entirely feasible.

To aid you in navigating the possibilities, we can differentiate between two main paradigms:

Case 1: Origin server + CDN

This first paradigm is more versatile and relies on well-known web standards.

If your website or application can define custom HTTP headers in the response on a per-page basis, then regardless of the specific language or framework used, you can use DatoCMS Cache Tags by placing a CDN that supports Tag-Based Cache Invalidation in front of your website.

What is Tag-Based Cache Invalidation?

Tag-based cache invalidation is a method where keywords (tags) can be assigned to cached pages. This technique is provided by all the major content delivery services such as Netlify, Fastly, Bunny and Cloudflare. In a nutshell:

  • Assign Tags: When your application delivers a page, it can specify a series of tags in a specific response header (the header's name depends on the CDN). These tags serve as labels that represent the content within that page.

  • Caching: The response is stored in the CDN cache with its primary cache key (the URL) plus the associated tags.

  • Purging: If any content linked to a particular tag is updated, instead of searching through all cached pages, the CDN can quickly identify and remove all items associated with that specific tag.
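The three steps above can be modeled with a small in-memory cache. This is only a conceptual sketch of tag-based invalidation, not how any particular CDN is implemented; the TaggedCache class and its method names are made up for illustration.

```typescript
// Conceptual sketch of tag-based invalidation; not a real CDN implementation.
class TaggedCache {
  private entries = new Map<string, string>(); // URL -> cached body
  private urlsByTag = new Map<string, Set<string>>(); // tag -> URLs carrying it

  // Assign tags + caching: store the body under its URL and index it by tag.
  // If a URL is stored again with different tags, old index entries remain,
  // which can only cause extra purges, never a missed one.
  set(url: string, body: string, tags: string[]): void {
    this.entries.set(url, body);
    for (const tag of tags) {
      let urls = this.urlsByTag.get(tag);
      if (!urls) {
        urls = new Set<string>();
        this.urlsByTag.set(tag, urls);
      }
      urls.add(url);
    }
  }

  get(url: string): string | undefined {
    return this.entries.get(url);
  }

  // Purging: remove every cached entry associated with the tag, without
  // scanning the whole cache. Returns how many entries were removed.
  purgeTag(tag: string): number {
    const urls = this.urlsByTag.get(tag);
    if (!urls) return 0;
    let removed = 0;
    urls.forEach((url) => {
      if (this.entries.delete(url)) removed++;
    });
    this.urlsByTag.delete(tag);
    return removed;
  }
}
```

With a homepage and a post page both tagged N*r;L, purging that tag removes both entries while pages tagged only with other tags stay cached.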

It's important to know that different services use different names for the same underlying concept. For example, Fastly refers to cache tags as "Surrogate Keys". The header with which your application can declare the tags to the CDN also varies depending on the service. With Netlify and Cloudflare, the name is Cache-Tag, while Bunny refers to it as CDN-Tag. What we call "cache invalidation" in this documentation, other services refer to as "cache purge".

Make sure to refer to the specific documentation of your CDN to know the details, format, and any potential limitations.
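To make the per-CDN differences concrete, here is a hedged sketch of a helper that turns the tags received from DatoCMS into a response header. The function name buildTagHeader is hypothetical. The example call uses Fastly's Surrogate-Key header, whose value is a space-separated list; for other CDNs, both the header name and the delimiter are things to confirm in their documentation. Because the tag alphabet includes characters such as the comma, the helper refuses tags that contain the chosen delimiter instead of silently producing a broken header.

```typescript
// Sketch only: buildTagHeader is a hypothetical helper, not a CDN or DatoCMS API.
// Returns a [headerName, headerValue] pair ready to be set on a page response.
function buildTagHeader(
  headerName: string,
  separator: string,
  tags: string[],
): [string, string] {
  // Drop duplicates while preserving the original order.
  const unique = tags.filter((tag, index) => tags.indexOf(tag) === index);

  // A tag containing the delimiter would be split into bogus tags by the CDN.
  const clashing = unique.filter((tag) => tag.indexOf(separator) !== -1);
  if (clashing.length > 0) {
    throw new Error(
      "Tags contain the separator and need transcoding first: " + clashing.join(" "),
    );
  }

  return [headerName, unique.join(separator)];
}

// Example: Fastly reads space-separated surrogate keys from Surrogate-Key.
const [fastlyName, fastlyValue] = buildTagHeader("Surrogate-Key", " ", [
  "N*r;L",
  "6-KZ@",
  "N*r;L",
]);
```

If your CDN's delimiter can appear inside a tag, one option is to transcode each tag into a safer alphabet before building the header, and to apply the same transcoding to the tags received by the invalidation webhook.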

A practical example: Remix + Fastly ✨

To illustrate a combination of tools that fit into this category, we have put together a tutorial on implementing DatoCMS Cache Tags with Remix as the framework and Fastly as the cache-tags-capable CDN on top of the Remix app.

Case 2: Framework-centric approach

Some frameworks are designed to shield the developer from the HTTP and architectural complexities associated with tag-based caching. By using platform-specific adapters, they aim to handle all the implementation details for you. Developers get a more abstract level of control over tag-based caching, in the form of helper functions that work across different hosting environments.

A practical example: Next.js ✨

A prominent example in this category is Next.js, whose fetch() and revalidateTag() functions are the building blocks for using cache tags, together with the framework's internal logic.

We have covered in detail how to implement DatoCMS Cache Tags on a Next.js project in the relevant section of our documentation.

What will be the final cache hit ratio?

It is very difficult to answer this question precisely, as it is connected to a large number of factors including the type of site traffic, the frequency of content updates, the content present in your pages, the GraphQL queries you execute, and the reliability of the cache in the selected framework and hosting.

Sometimes, it's simple to guess which pages will be invalidated when a content change occurs: for instance, if a blog's homepage showcases the latest posts, it's clear that adding a new post on DatoCMS will invalidate the homepage. Another straightforward example: let's say you have a query that pulls content for your website's navigation bar. Any page including that navigation bar needs to be invalidated when the content returned by that query changes.

Other cases are less obvious: suppose that a post can belong to one or more categories. Which pages are invalidated when an editor changes a post's categories?

So, without being able to predict the actual result in terms of hit ratio, it is certainly possible to say this:

  • Regardless of the frequency of invalidation, DatoCMS Cache Tags will still achieve a better result than redeploying the entire website or invalidating all pages for each individual content change.

  • The benefits of cache tags increase as the number of pages on a website grows.

Encoding of cache tags

Cache tags are meant to be opaque to the user, which means you don't have to know the meaning conveyed by each tag to use it. However, it may be useful to know how the tags are encoded, so that you can make sure they work properly across your tech stack.

Each tag is a string encoded using an alphabet of 83 symbols, which allows them to be considerably shorter than they would be if the encoding used a smaller set of symbols. Here is the list of all the symbols you can find in a tag:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~

As you can see, we omitted the characters usually used to mark the beginning and end of strings, such as single and double quotes. Since we employ both uppercase and lowercase symbols, case sensitivity is important. Therefore, if your system isn't case-sensitive, some transcoding may be necessary.
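As one possible way to handle a system that isn't case-sensitive, the sketch below checks that a tag only uses the 83 symbols listed above and transcodes it to a hexadecimal key that survives case folding. The function names and the hex scheme are assumptions chosen for illustration, not something DatoCMS prescribes; any reversible, case-insensitive encoding would do.

```typescript
// The 83-symbol alphabet listed above.
const TAG_ALPHABET =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~";

// True when the tag is non-empty and uses only symbols from the alphabet.
function isWellFormedTag(tag: string): boolean {
  if (tag.length === 0) return false;
  for (let i = 0; i < tag.length; i++) {
    if (TAG_ALPHABET.indexOf(tag.charAt(i)) === -1) return false;
  }
  return true;
}

// Hypothetical transcoding: two hex digits per character, so tags that
// differ only by letter case map to different keys.
function toCaseInsensitiveKey(tag: string): string {
  let key = "";
  for (let i = 0; i < tag.length; i++) {
    const hex = tag.charCodeAt(i).toString(16);
    key += hex.length < 2 ? "0" + hex : hex;
  }
  return key;
}

// Inverse of toCaseInsensitiveKey; parseInt accepts either hex letter case.
function fromCaseInsensitiveKey(key: string): string {
  let tag = "";
  for (let i = 0; i < key.length; i += 2) {
    tag += String.fromCharCode(parseInt(key.substring(i, i + 2), 16));
  }
  return tag;
}
```

The trade-off is that the transcoded key is twice as long as the original tag, and the same transcoding has to be applied both when tagging pages and when handling the tags received by the invalidation webhook.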