The DatoCMS Blog

Introducing DatoCMS Cache Tags: Effortless, surgical page regeneration on content change

Posted on July 15th, 2024 by Stefano Verna

Today, we're so excited to announce a groundbreaking improvement for our users. DatoCMS Cache Tags enable any web project to achieve the perfect balance of performance, efficiency, and real-time updates.

Sophisticated caching techniques have been around for a while, but they have always been incredibly complex to implement. With DatoCMS Cache Tags, we're empowering projects and teams of all sizes to finally leverage best-in-class, surgical caching, with zero effort on their end.

This means:

  • You can achieve top-notch performance and response times thanks to completely static and cached content.

  • At the same time, visitors can access the latest version of any page, seconds after the changes have been published.

  • As a direct effect of fully cached content and minimal cache invalidation, hosting expenses and DatoCMS resource usage can be dramatically reduced.

Say goodbye to costly and inefficient time-based invalidation methods — or worse, complete site invalidation/rebuilds whenever changes are made: with DatoCMS, you can achieve significantly better results.

How do DatoCMS Cache Tags work?

Starting today, every response from the Content Delivery API can expose its own list of associated cache tags:

The X-Cache-Tags header lists all the tags associated with a specific GraphQL request to the CDA.

Cache invalidation is complex and error-prone, and it is easy to overlook edge cases. These tags have been carefully designed and evaluated over several years to manage all possible invalidation situations. They are purposely opaque, to avoid misunderstanding and accidental misuse on your part.

Your task is simply to take these tags and apply them to the final page. Depending on your framework or hosting solution, this often involves just a couple of lines of code (more details below).
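As a rough illustration of "take these tags and apply them to the final page", here is a minimal TypeScript sketch. It assumes the X-Cache-Tags value is a whitespace-separated list of opaque tags, and it leaves out the actual request to the CDA. The names HeaderReader, parseCacheTags and buildPageHeaders are hypothetical helpers, not part of any DatoCMS SDK, and the outgoing header name and separator depend on your CDN.

```typescript
// Minimal view of a headers object; the Fetch API `Headers` class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Assumption: X-Cache-Tags carries a whitespace-separated list of opaque tags.
function parseCacheTags(headers: HeaderReader): string[] {
  const raw = headers.get("X-Cache-Tags") ?? "";
  // Drop empty fragments so a missing or blank header yields an empty list.
  return raw.split(/\s+/).filter((tag) => tag.length > 0);
}

// Hypothetical helper: builds the tag header to attach to the rendered page.
// Both the header name and the separator are CDN-specific, so they are parameters.
function buildPageHeaders(
  tags: string[],
  tagHeaderName: string,
  separator: string,
): Record<string, string> {
  return { [tagHeaderName]: tags.join(separator) };
}
```

For instance, Cloudflare reads a comma-separated Cache-Tag header and Fastly reads a space-separated Surrogate-Key header, so the same tag list would be passed with a different header name and separator for each.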

Automatic Tag Invalidation

DatoCMS manages the complex task of tracking every potential change in your schema, text, images, and videos. When any change happens, it will instantly send a list of tags that require invalidation to your frontend via a single webhook:

An example of a webhook you'll receive from DatoCMS, informing you of the tags that need invalidation.

In your frontend, you only need to implement a single API route to receive such requests. Simply pass these tags to the invalidateTags() function provided by your framework or hosting solution.
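The core of such an API route can be sketched in a few lines of TypeScript. The payload shape used here (tags under entity.attributes.tags) is an assumption, so check it against the webhook body you actually receive. The invalidateTag callback stands in for whatever your framework or hosting solution provides, and authentication of the incoming request (for example, a shared secret) is deliberately left out.

```typescript
// Assumed payload shape (not confirmed by this post): the tags to invalidate
// live under entity.attributes.tags.
interface CacheTagsWebhookBody {
  entity: { attributes: { tags: string[] } };
}

// Narrow an unknown JSON body to the assumed shape; returns null if it does not match.
function extractTags(body: unknown): string[] | null {
  const tags: unknown = (body as Partial<CacheTagsWebhookBody> | null)?.entity
    ?.attributes?.tags;
  if (!Array.isArray(tags)) return null;
  return tags.every((tag) => typeof tag === "string") ? (tags as string[]) : null;
}

// Framework-agnostic handler: validates the body, then forwards each tag to
// the invalidation function supplied by the caller.
function handleInvalidationWebhook(
  body: unknown,
  invalidateTag: (tag: string) => void,
): { status: number; invalidated: string[] } {
  const tags = extractTags(body);
  if (tags === null) return { status: 400, invalidated: [] };
  for (const tag of tags) invalidateTag(tag);
  return { status: 200, invalidated: tags };
}
```

In a Next.js project, for example, the callback would typically wrap revalidateTag from next/cache; on other platforms it would call the CDN's purge-by-tag API.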

Done. ✅

Production-ready, effective immediately

We've been using cache tags for years in DatoCMS to handle billions of responses from our own Content Delivery API, consistently achieving a cache hit-ratio of over 90% — very often surpassing this value by a significant margin.

After years of testing and verification, we can confidently state that this technology is ready to be shared. Now it's time for your web projects to reap the same benefits.

The holy grail of any web experience: Static, Pre-Rendered Content

The ultimate goal for web experiences is to serve static, pre-rendered content directly on the edge. This approach ensures optimal performance and efficiency.

The foundational technologies for this are already within reach. The core concept is to generate the page the first time it's visited — or, if it's crucial, during deployment — and cache the result for subsequent visitors, directly on the edge.

Time-Based Invalidation: A common but inefficient solution

The most commonly used solution is time-based invalidation, where a "validity time" is set (say, 60 seconds), after which the cached page is purged and regenerated upon the next visit.

Although very simple to implement, this technique suffers from a significant problem: websites with a lot of content rarely update every page every minute. In fact, most pages stay the same for months. The product pages of an e-commerce site are a perfect example: the PLPs (Product Listing Pages) change quite frequently, but the PDPs (Product Detail Pages) remain unchanged for very long periods.

Indiscriminately refreshing every page every 60 seconds is a monumental waste of resources and money.
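To put a number on that waste, here is a back-of-the-envelope sketch in TypeScript. The figures are purely hypothetical, and the model is deliberately crude: it assumes every page is requested at least once per validity window, and that a changed page needs only one useful regeneration per day.

```typescript
// Upper bound on regenerations per day under time-based invalidation,
// assuming every page is requested at least once per validity window.
function maxDailyRegenerations(pageCount: number, validitySeconds: number): number {
  const secondsPerDay = 24 * 60 * 60;
  return pageCount * Math.floor(secondsPerDay / validitySeconds);
}

// Share of those regenerations that rebuild a page whose content did not change,
// counting at most one useful regeneration per changed page.
function wastedShare(
  pageCount: number,
  validitySeconds: number,
  changedPagesPerDay: number,
): number {
  const total = maxDailyRegenerations(pageCount, validitySeconds);
  if (total === 0) return 0;
  return 1 - Math.min(changedPagesPerDay, total) / total;
}
```

With a hypothetical site of 10,000 pages and a 60-second validity time, maxDailyRegenerations(10000, 60) gives 14,400,000 regenerations per day in the worst case. If only 50 of those pages actually change on a given day, practically all of that work rebuilds identical output.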

Why Time-Based Invalidation falls short

Increasing the validity time — whether to an hour or a week — is not a viable solution. If the validity time is too long, any publishing error would remain online too long, and content editors would face delays seeing their changes reflected online.

Time-based invalidation is almost never the right answer; it's a compromise that never fully satisfies the need for timely updates and efficient resource use.

Tag-Based Cache Invalidation: A superior approach

Tag-based cache invalidation is a superior alternative. Once a page is generated, it stays valid indefinitely and continually serves visitors until explicitly invalidated. This method avoids unnecessary regenerations, while allowing instant updates when needed.

What is Tag-Based Cache invalidation?

Tag-based cache invalidation is a feature supported by all major content delivery services (e.g. Netlify, Fastly, Bunny, Cloudflare, Vercel via Next.js), where keywords (tags) can be assigned to cached pages. When the content changes, it is easy to invalidate or remove all cached pages associated with a particular tag. In a nutshell:

  • Assign Tags: When your application delivers a page, it can specify a series of tags in a specific response header (the header's name depends on the CDN). These tags serve as labels that represent the content within that page.

  • Caching: The response is stored in the CDN cache with its primary cache key — the URL — plus the associated tags.

  • Purging: If any content linked to a particular tag is updated, instead of searching through all cached pages, the CDN can quickly identify and remove all items associated with that specific tag.

Let's take an example. A single piece of content, like an article, can often be found on several pages of a website. It's most visible on its dedicated page, but it's also likely to appear on the paginated archive, the author’s detail page, and so forth. By instructing the CDN to tag all these pages with, for example, #article-1337, the moment the article is updated in the CMS, all these pages can be instantly refreshed with a single API call to the CDN.
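The assign, cache and purge steps can be modelled with a tiny in-memory cache. This TypeScript sketch is only a toy model of what a CDN does internally, not how any particular provider implements it. The TaggedCache class and its method names are made up for illustration.

```typescript
// Toy model of a CDN cache that supports purging by tag.
class TaggedCache {
  private pages = new Map<string, string>(); // URL -> cached body
  private urlsByTag = new Map<string, Set<string>>(); // tag -> URLs carrying it

  // Store a rendered page under its URL and index it by each of its tags.
  // Simplification: re-storing a URL with different tags does not clean up
  // the index entries left by its previous tags.
  set(url: string, body: string, tags: string[]): void {
    this.pages.set(url, body);
    for (const tag of tags) {
      const urls = this.urlsByTag.get(tag) ?? new Set<string>();
      urls.add(url);
      this.urlsByTag.set(tag, urls);
    }
  }

  // Return the cached body, or undefined on a cache miss.
  get(url: string): string | undefined {
    return this.pages.get(url);
  }

  // Remove every page indexed under the tag; returns the affected URLs.
  purgeTag(tag: string): string[] {
    const urls = Array.from(this.urlsByTag.get(tag) ?? []);
    for (const url of urls) this.pages.delete(url);
    this.urlsByTag.delete(tag);
    return urls;
  }
}
```

If the article page, the archive page and the author page are all stored with the tag article-1337, a single purgeTag("article-1337") call evicts all three, while pages that do not carry that tag stay cached.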

So, tag-based cache invalidation simplifies the process of updating cached content by grouping related items together under tags for efficient and targeted purging when changes occur.

The complexity of managing Explicit Cache Invalidation

"So why," you might wonder, "is time-based invalidation still the go-to despite its glaring inefficiency? Why not opt for a superior solution that could slash costs, boost performance, and offer real-time updates for our users?"

Because managing explicit cache invalidation is incredibly complex, especially if you want to do it accurately, reliably, and consistently.

It’s the developers who are tasked with the challenging job of meticulously identifying which specific “content elements” populate each page, and which data queries contribute to its result, so that each page can be tagged accurately. They're the ones on the hook to monitor every event on the CMS, to stay informed about which resources get updated and when.

Building this kind of system is no walk in the park. It's a maze of complex scenarios you need to navigate without a single misstep, or the end result becomes entirely unpredictable. You might find yourself over-tagging, lowering the cache hit rate to such an extent that everything turns almost dynamic. On the other hand, you may tag too little, running the risk of providing outdated content without even realizing it.

How confident are you, or your developer team, in building such a system? More importantly, does your team have the resources and bandwidth to embark on such an endeavor?

DatoCMS Cache Tags: The solution

With DatoCMS Cache Tags, you can leverage the power of tag-based invalidation, without the complexity. Thanks to our internal experience spanning years, we've made it simple and reliable, allowing you to focus on your content while we take care of all the intricacies of cache management.

Get started with DatoCMS Cache Tags!

To help you begin using this technology right away, we've prepared all the essential resources:

If you have any questions or need further assistance, as always, join our Community Forum, where we'll provide all the answers you need.

Ready to simplify your cache management while turbocharging it at the same time? Get started with DatoCMS Cache Tags today, and enjoy the benefits without the complexity. Happy coding! 🚀