Caching at the Edge – CDNs for everyone – Why?




Presented together with Fabian Franz!

On Github wimleers / talk-caching-at-the-edge-cdns-for-everyone

Caching at the Edge

CDNs for everyone

Motivation

The real challenge: making the entire web fast

wimleers.com/article/performance-calendar-2013-making-the-entire-web-fast

Silos are fast, the long tail is slow.

Let's fix that.

Why?

Why do this: a quick reminder

The edge?

Your origin usually lives in one data center. Not all your end users live close to that data center. The further away the end user is, the slower your site will be: this is latency. The edge consists of servers very close to your end users. CDNs are an easy way to host content at the edge: without CDNs, everybody would have to set up servers close to all of their end users all over the world.
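
As a rough back-of-the-envelope sketch of why distance matters: assuming light covers about 200 km per millisecond in optical fibre (an approximation; real networks add routing and processing delays on top), the physical lower bound on one round trip can be estimated as below. The constant and function name are made up for illustration.

```php
<?php
// Assumption: signals travel through optical fibre at roughly 200 km per millisecond.
const FIBRE_KM_PER_MS = 200.0;

/**
 * Returns the physical lower bound, in milliseconds, for one round trip
 * between an end user and a server that is $distanceKm away.
 */
function minRoundTripMs(float $distanceKm): float {
  // The request travels there and the response travels back: twice the distance.
  return 2 * $distanceKm / FIBRE_KM_PER_MS;
}

printf("Origin 10000 km away: at least %.0f ms per round trip\n", minRoundTripMs(10000));
printf("Edge 100 km away: at least %.0f ms per round trip\n", minRoundTripMs(100));
```

A page load needs several round trips, so this lower bound is paid more than once per page.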

Using a CDN

assets ⟷ HTML

There are essentially two ways of using a CDN:

  • Use it just for assets; the HTML is still served from your origin. The advantage is that this is easy: you don't have to deal with problems like cache invalidation or authenticated user traffic.
  • Use it for HTML, i.e. for serving the actual pages of your site. The benefits are greater, but it also requires a lot of complex setup.

Why?

Because it is closer to the end user (less latency)

This talk is about using a CDN to cache the HTML.

Drupal 7

Invalidation: simple

max-age = 300

… but stale content!
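
A minimal sketch of why a fixed max-age leads to stale content, assuming the CDN follows the usual HTTP freshness rule (a copy is fresh while its age is below max-age). The function names and timestamps are hypothetical.

```php
<?php
/**
 * Builds the Cache-Control header value for a publicly cacheable response.
 */
function cacheControlHeader(int $maxAge): string {
  return sprintf('public, max-age=%d', $maxAge);
}

/**
 * Returns TRUE while a cached copy may be served without contacting the origin.
 */
function isFresh(int $storedAt, int $maxAge, int $now): bool {
  return ($now - $storedAt) < $maxAge;
}

$storedAt = 1000; // Hypothetical time (in seconds) at which the CDN stored the page.
$editedAt = 1060; // Hypothetical time at which an editor changed the content.

echo cacheControlHeader(300), "\n";
// One minute after the edit the old copy is still considered fresh,
// so visitors keep getting the outdated page.
var_dump(isFresh($storedAt, 300, $editedAt + 60));
// Only once the full 300 seconds have passed does the CDN go back to the origin.
var_dump(isFresh($storedAt, 300, $storedAt + 300));
```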

Invalidation: advanced

Purge affected URLs

… but brittle!

This shows the Acquia Purge module in action, which automates the purging of affected URLs. But this is not a comprehensive solution either: some content may still be missed.
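
To illustrate why URL-based purging is brittle, here is a toy model. It is not how the Acquia Purge module works internally; the CDN is just an array, and `affectedUrls()` stands in for whatever hypothetical logic guesses where a node appears.

```php
<?php
/**
 * Hypothetical guess at the URLs on which a node appears.
 * It knows about the node page and the front page, but nothing else.
 */
function affectedUrls(int $nid): array {
  return ['/node/' . $nid, '/'];
}

/**
 * Returns the CDN contents with the given URLs removed.
 */
function purgeUrls(array $cdn, array $urls): array {
  return array_diff_key($cdn, array_flip($urls));
}

// Toy CDN cache: URL => cached HTML. Node 5 is shown on all three pages.
$cdn = [
  '/node/5' => '<h1>Old title</h1>',
  '/' => '<li>Old title</li>',
  '/taxonomy/term/23' => '<li>Old title</li>',
];

$cdn = purgeUrls($cdn, affectedUrls(5));

// The term listing was not in the guessed list, so it stays cached and stale.
print_r(array_keys($cdn));
```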

In practice

  • Combination of both
  • Allow content editors to purge manually
  • … site-specific hacks
In practice, the combination of both ensures that max-age handles the URLs that we forgot to purge. When content editors want to see the updated or new content immediately, they're often given the ability to purge manually. And beyond that, many sites have their own hacks/work-arounds that match their specific needs.

Authenticated users

Separate domain ⟷ passthrough

Nothing is cached!

Since we cannot reliably know which parts of the page are personalized, or to what extent (per role, per permission or even per user?), and we definitely cannot know this for parts of the page generated by contributed modules, we have very few options. Hence 99% of Drupal 7 sites that use a CDN for their HTML choose to not cache it at all.

Drupal 8

Better architecture.

No more hacks.

The goal was simple: remove the need for all these hacks.
invalidation ➞ cache tags
authenticated users ➞ cache contexts

Two new concepts, two solved problems. Fixing the problems with architecture. We don't want stale content. We don't want to determine all the URLs that a piece of content appears on.

cache tags ≈ data dependencies
cache contexts ≈ request context dependencies

More details later; for now this is all you need to know. For invalidation, we have cache tags, which reflect data dependencies: when data changes, the corresponding cache tags are invalidated, which ensures correct invalidation on the CDN. For authenticated users (or really, personalized pages in general) we have cache contexts, which reflect request context dependencies. This ensures the CDN can still cache pages that are not actually personalized; in advanced setups, it can even cache personalized parts of the page right in the CDN.
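
A minimal sketch of tag-based invalidation, assuming an edge cache that remembers which cache tags each stored page carries. `TaggedEdgeCache` is a hypothetical toy class, not a Drupal or CDN API; it only shows why nobody has to work out the affected URLs any more.

```php
<?php
/**
 * Toy edge cache that records the cache tags of every page it stores.
 */
final class TaggedEdgeCache {
  /** @var array<string, array{body: string, tags: string[]}> */
  private array $items = [];

  public function set(string $url, string $body, array $tags): void {
    $this->items[$url] = ['body' => $body, 'tags' => $tags];
  }

  public function get(string $url): ?string {
    return $this->items[$url]['body'] ?? NULL;
  }

  /**
   * Drops every page that carries at least one of the given tags.
   */
  public function invalidateTags(array $tags): void {
    foreach ($this->items as $url => $item) {
      if (array_intersect($tags, $item['tags'])) {
        unset($this->items[$url]);
      }
    }
  }
}

$cache = new TaggedEdgeCache();
$cache->set('/node/5', 'Article', ['node:5', 'user:3']);
$cache->set('/', 'Front page listing', ['node:5', 'node:6']);
$cache->set('/about', 'About', ['node:9']);

// Node 5 was edited: one invalidation removes every page it appears on.
$cache->invalidateTags(['node:5']);

var_dump($cache->get('/node/5'), $cache->get('/'), $cache->get('/about'));
```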

Demo 1: anon traffic

Demo!

Trivial.

<10 €/month.

… yet serve millions of pages/month.

Hopefully some hosting providers will step up and provide this as a nicely integrated service!

And now… the demo you've all been waiting for!

Demo 2: auth traffic

Demo!

Standardized solution for a complex problem.

Within reach for 99%

Unmatched

Some hosting providers will step up

So, Drupal 8 can be super fast!

It can be cached at the edge.

Without the pain of other CMSes (including Drupal 7).

It just requires all of us to think of a few things while developing custom modules or contributed modules.

The thought process

The theory of how we make Drupal fast

Dependencies, dependencies, dependencies!

  • Drupal 7 didn't track any dependencies
  • e.g. drupal_add_css(), drupal_add_js() …
  • ⇒ impossible to cache
  • ⇒ #attached asset libraries solve that

Dependencies, dependencies, dependencies!

  • Drupal 7 didn't track any dependencies
  • e.g. url()'s output depended on:
    • <front> configuration
    • HTTPS configuration
    • clean URL configuration
    • current site in multisite
    • current host name
    • path processing
      • negotiated interface language
      • negotiated URL language
  • ⇒ impossible to cache and invalidate correctly
  • … yet many of us did it anyway!

Correct invalidation in Drupal 8

  • Cache tags (data dependencies)
  • Cache contexts (context dependencies)
  • Cache max-age (time dependencies)

+

Cacheability bubbled during rendering!
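
Bubbling can be sketched as follows: the cacheability of a parent is its own metadata merged with that of everything rendered inside it. This is a self-contained toy model, not Drupal's renderer; the function names and the sample render array are assumptions. Tags and contexts are unioned, and the shortest max-age wins.

```php
<?php
// Mirrors the meaning of Drupal's Cache::PERMANENT: the item never expires.
const PERMANENT = -1;

/** Merges two max-age values: the shorter lifetime wins, PERMANENT is "infinite". */
function mergeMaxAge(int $a, int $b): int {
  if ($a === PERMANENT) {
    return $b;
  }
  if ($b === PERMANENT) {
    return $a;
  }
  return min($a, $b);
}

/** Merges two lists of strings, dropping duplicates. */
function mergeList(array $a, array $b): array {
  return array_values(array_unique(array_merge($a, $b)));
}

/** Returns the cacheability of an element merged with that of all its children. */
function bubble(array $element): array {
  $defaults = ['tags' => [], 'contexts' => [], 'max-age' => PERMANENT];
  $merged = ($element['#cache'] ?? []) + $defaults;
  foreach ($element as $key => $child) {
    // Keys starting with "#" are properties; everything else is a child.
    if (!is_array($child) || (is_string($key) && strpos($key, '#') === 0)) {
      continue;
    }
    $bubbled = bubble($child);
    $merged['tags'] = mergeList($merged['tags'], $bubbled['tags']);
    $merged['contexts'] = mergeList($merged['contexts'], $bubbled['contexts']);
    $merged['max-age'] = mergeMaxAge($merged['max-age'], $bubbled['max-age']);
  }
  return $merged;
}

// Hypothetical page: an article plus a sidebar that contains an author block.
$page = [
  '#cache' => ['contexts' => ['url']],
  'article' => ['#cache' => ['tags' => ['node:5'], 'contexts' => ['user.permissions']]],
  'sidebar' => [
    '#cache' => ['tags' => ['taxonomy_term:23'], 'max-age' => 300],
    'author' => ['#cache' => ['tags' => ['user:3', 'node:5']]],
  ],
];

print_r(bubble($page));
```

The page ends up depending on everything that was rendered inside it, which is what lets a CDN invalidate or vary the whole response correctly.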

In practice

Try to make this thought process a habit:

1.

I'm rendering something. That means I must think of cacheability!

2.

Is this something that's expensive to render, and therefore is worth caching?

↪︎ If "yes": cache keys

$build['#cache']['keys'] = ['node', 5, 'teaser'];

3.

Does the representation of the thing I'm rendering vary per combination of permissions, per URL, per interface language, per … something?

↪︎ If "yes": cache contexts

$build['#cache']['contexts'][] = 'user.permissions';
$build['#cache']['contexts'][] = 'url';

~ HTTP's Vary header
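
One way to picture cache contexts: the cache keys identify the thing, and the current value of each context selects the variation. The sketch below is a simplified model with an assumed ID format and hypothetical context values, not Drupal's actual cache ID generation.

```php
<?php
/**
 * Builds a cache ID from cache keys plus the current value of each cache context.
 *
 * @param array $keys           For example ['node', 5, 'teaser'].
 * @param string[] $contexts    For example ['user.permissions', 'url'].
 * @param array $requestContext Current value per context name (hypothetical values).
 */
function cacheId(array $keys, array $contexts, array $requestContext): string {
  $parts = array_map('strval', $keys);
  // Sort so that the order in which contexts were added does not matter.
  sort($contexts);
  foreach ($contexts as $context) {
    $parts[] = sprintf('[%s]=%s', $context, $requestContext[$context]);
  }
  return implode(':', $parts);
}

$keys = ['node', 5, 'teaser'];
$contexts = ['user.permissions', 'url'];

// Two hypothetical requests for the same URL by users with different permissions.
$editor = ['url' => '/node/5', 'user.permissions' => 'hash-editor'];
$visitor = ['url' => '/node/5', 'user.permissions' => 'hash-visitor'];

echo cacheId($keys, $contexts, $editor), "\n";
echo cacheId($keys, $contexts, $visitor), "\n";
```

Users who share the same set of permissions map to the same ID and therefore share one cached variation.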

4.

What causes the representation of the thing I'm rendering to become outdated?

↪︎ If "yes": cache tags

$build['#cache']['tags'][] = 'node:5';
$build['#cache']['tags'][] = 'user:3';
$build['#cache']['tags'][] = 'taxonomy_term:23';

5.

When does the representation of the thing I'm rendering become outdated?

↪︎ If "yes": cache max-age

$build['#cache']['max-age'] = Cache::PERMANENT;

~ HTTP's Cache-Control: max-age header

To make it easier:

Renderer::addCacheableDependency($build, $dependency)

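$site_config = $this->config->get('system.site');

$build = [
  // t() takes its placeholder values as an array keyed by placeholder name.
  '#markup' => t('Welcome to @site!', ['@site' => $site_config->get('name')]),
];
// Adds the config object's cache tags, contexts and max-age to $build.
$this->renderer->addCacheableDependency($build, $site_config);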

If Drupal pages were ships…

(Drupal rendering a page ~ building a ship)

… then this could be Drupal 8…

Assembled from components. Clear dependencies.

… and this would be Drupal 7

Assembled from seemingly random pieces. But it is a boat!

BigPipe, hybrid render strategies

Come see our other session tomorrow!

Making Drupal fly - The fastest Drupal ever is here!

Peek at the future

Service workers

w3.org/TR/service-workers/

html5rocks.com/en/tutorials/service-worker/introduction/

A client-side reverse proxy!

Logic defined in JavaScript!

  • Cache pages on the client
  • Send cache tag invalidations to clients
  • Pages cached on the client invalidated

Zero latency

You can't get any closer to the user!
caniuse.com/serviceworkers

…allowed Fabian & me to focus on this!

Acquia employs me full time and has allowed me to work on this and related things full time for two years. Acquia has sponsored Fabian's work that culminated in the demo he did.

Questions?

wimleers.com/talk/caching-at-the-edge-cdns-for-everyone

Docs

d.o/developing/api/8/cache + /tags + /contexts

Caching at the Edge CDNs for everyone Fabian Franz (Tag1 Consulting — @fabianfranz) Wim Leers (Acquia — @wimleers — wimleers.com)