Advanced HTTP Caching

Martin Breest, Spreadshirt

Transcript of Advanced HTTP Caching

Page 1: Advanced HTTP Caching

Spreadshirt

Advanced HTTP Caching

Martin Breest

Page 2: Advanced HTTP Caching

Agenda

• Recap HTTP Caching Basics
  - Expiration
  - Revalidation
  - Variation

• Advanced HTTP Caching
  - Decomposition
  - Stale Content Delivery
  - Purging
  - Caching User Data

Page 3: Advanced HTTP Caching

Recap HTTP Caching Basics

Page 4: Advanced HTTP Caching

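Expiration with Cache-Control

Participants: Browser, Origin Server (resource /index.html)

Request (browser to origin server):
    GET /index.html HTTP/1.1

Response (origin server to browser):
    HTTP/1.1 200 OK
    Date: Fri, 16 Sep 2016 12:15:00 GMT
    Cache-Control: max-age=2700

The browser stores the response as a cached representation and may reuse it for 2700 seconds (45 minutes).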
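The expiration rule above can be sketched as a small freshness check. This is a minimal illustration, not how any particular browser implements it; the function names `maxAgeSeconds` and `isFresh` are made up for this example, and the age is simply computed from the Date header.

```javascript
// Hypothetical helper: extract max-age (in seconds) from a Cache-Control value.
function maxAgeSeconds(cacheControl) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl || '');
  return match ? Number(match[1]) : 0;
}

// Hypothetical helper: a stored response is fresh while its age is below max-age.
// The age is approximated as "now minus the Date header" (an assumption of this sketch).
function isFresh(dateHeader, cacheControl, now) {
  const ageSeconds = (now.getTime() - Date.parse(dateHeader)) / 1000;
  return ageSeconds < maxAgeSeconds(cacheControl);
}
```

With the response from the slide, the cached representation counts as fresh until 13:00:00 GMT and as stale from then on.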

Page 5: Advanced HTTP Caching

Revalidation with Last-Modified and/or ETag

Participants: Browser, Origin Server (resource /index.html)

First request (browser to origin server):
    GET /index.html HTTP/1.1

Response (stored by the browser as a cached representation):
    HTTP/1.1 200 OK
    Date: Fri, 16 Sep 2016 12:15:00 GMT
    Cache-Control: max-age=2700
    Last-Modified: Thu, 15 Sep 2016 12:00:00 GMT

45 minutes later, the browser revalidates its cached representation:
    GET /index.html HTTP/1.1
    If-Modified-Since: Thu, 15 Sep 2016 12:00:00 GMT

Response (origin server to browser):
    HTTP/1.1 304 Not Modified
    Date: Fri, 16 Sep 2016 13:00:00 GMT
    Cache-Control: max-age=2700
    Last-Modified: Thu, 15 Sep 2016 12:00:00 GMT
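The origin side of this revalidation can be sketched as follows. It is a simplified illustration that only handles If-Modified-Since (not ETag); the function name `revalidate` and the shape of its arguments are assumptions of this example.

```javascript
// Hypothetical origin-side check: answer 304 if the resource has not changed
// since the date the client sent in If-Modified-Since, otherwise 200.
// `requestHeaders` uses lower-case header names; `lastModified` is a Date.
function revalidate(requestHeaders, lastModified) {
  const header = requestHeaders['if-modified-since'];
  const since = header ? Date.parse(header) : NaN;
  // HTTP dates have one-second resolution, so compare whole seconds.
  const modifiedSeconds = Math.floor(lastModified.getTime() / 1000);
  const headers = { 'Last-Modified': lastModified.toUTCString() };
  if (!Number.isNaN(since) && modifiedSeconds <= Math.floor(since / 1000)) {
    return { status: 304, headers: headers };
  }
  return { status: 200, headers: headers };
}
```

A 304 response carries no body, which is what saves bandwidth compared to a full 200 response.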

Page 6: Advanced HTTP Caching

Variation with Vary

Participants: Browser, Intermediate Cache, Origin Server (resource /index.html)

Request (browser to intermediate cache, forwarded to origin server):
    GET /index.html HTTP/1.1
    Accept-Encoding: gzip

Response (origin server to intermediate cache, passed on to browser):
    HTTP/1.1 200 OK
    Vary: Accept-Encoding

The intermediate cache stores one cached representation per value of the Accept-Encoding request header, for example one for "Accept-Encoding: gzip" and one for an empty Accept-Encoding.
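One way to picture variation is as an extended cache key. The sketch below is an assumption about how a cache could build such a key, not a description of any specific product; `cacheKey` is a hypothetical name and request headers are expected with lower-case names.

```javascript
// Hypothetical cache key: URL plus the values of all request headers named in Vary.
function cacheKey(url, varyHeader, requestHeaders) {
  const names = (varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0)
    .sort();
  // A missing request header counts as an empty value.
  const parts = names.map((name) => name + '=' + (requestHeaders[name] || ''));
  return [url].concat(parts).join('|');
}
```

Requests with "Accept-Encoding: gzip" and requests without Accept-Encoding end up with different keys, so each gets its own cached representation.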

Page 7: Advanced HTTP Caching

Advanced HTTP Caching

Page 8: Advanced HTTP Caching

Can this page be cached?

Page 9: Advanced HTTP Caching

Hard to say, right?

Page 10: Advanced HTTP Caching

Problem 1 - User data

Login, basket and wish list are user-specific and make it difficult to cache the page.

Page 11: Advanced HTTP Caching

Problem 2 - Many moving parts

• CMS Header
• Breadcrumb
• Article Detail
• Related Articles
• ProductType Details
• Design Details
• Related Designs
• Related Tags
• CMS Footer

The page contains many different parts from different sources with different TTLs.

Page 12: Advanced HTTP Caching

Decomposition

Page 13: Advanced HTTP Caching

Problem description

Request:
    GET /regenbogenbaer-A103766209 HTTP/1.1

• Hard to determine a good cache time
• No reusable, cacheable parts on the edge node
• The full page always needs to be fetched from the source (latency)
• The page gets delivered by one service that glues everything together (which might be a bottleneck)
• JavaScript is costly on certain mobile devices
• Use of JavaScript can create SEO problems

Page 14: Advanced HTTP Caching

Solution: Divide and conquer

Decompose the page into more manageable parts without using JavaScript.

Page 15: Advanced HTTP Caching

Decomposing a page

The page requested with
    GET /regenbogenbaer-A103766209 HTTP/1.1
is split into a template and several parts, each with its own request:

    template: GET /regenbogenbaer-A103766209 HTTP/1.1 (… prefix ... and … suffix ... of the page)
    part:     GET /cms/header HTTP/1.1
    part:     GET /breadcrumb/t-shirts HTTP/1.1
    part:     GET /relatedArticles/103766209 HTTP/1.1

Page 16: Advanced HTTP Caching

Glueing it together again - Edge Side Includes (ESI)

Template returned for GET /regenbogenbaer-A103766209 HTTP/1.1:

    <html>
      <head> … </head>
      <body>
        <esi:include src="/cms/header"/>
        <esi:include src="/breadcrumb/t-shirts"/>
        <div> ... the article html ... </div>
        <esi:include src="/relatedArticles/103766209"/>
        ...
      </body>
    </html>

Processing of GET /regenbogenbaer-A103766209 HTTP/1.1:

    1. Load template: GET /regenbogenbaer-A103766209 HTTP/1.1 (… prefix ... and … suffix ...)
    2. Include part:  GET /cms/header HTTP/1.1
    3. Include part:  GET /breadcrumb/t-shirts HTTP/1.1
    4. Include part:  GET /relatedArticles/103766209 HTTP/1.1

see https://www.w3.org/TR/esi-lang
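The include step can be sketched in a few lines. This is only an illustration of the idea, assuming self-closing include tags with a double-quoted src attribute; real ESI processors handle more syntax. `processEsi` and `fetchPart` are hypothetical names, with `fetchPart` standing in for the cache's request to each part URL.

```javascript
// Hypothetical sketch: replace every <esi:include src="..."/> in the template
// with the content that fetchPart(src) returns for that part URL.
function processEsi(template, fetchPart) {
  const includeTag = /<esi:include\s+src="([^"]*)"\s*\/>/g;
  // The parts are resolved one after the other, in document order.
  return template.replace(includeTag, (tag, src) => fetchPart(src));
}
```

Because each part has its own URL, `fetchPart` could serve it from the cache with its own TTL, independent of the template.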

Page 17: Advanced HTTP Caching

Many cache/CDN providers implement ESI

• Varnish Cache supports a minimal ESI set (include only)

• Fastly CDN supports ESI (include, comment, remove)

• AKAMAI CDN supports full ESI set

Page 18: Advanced HTTP Caching

Pros & Cons

Pros
• Template and parts are separate resources with their own URL and response headers
• Parts can be reused
• Parts can be purged individually
• Cache times can be configured separately
• Template and parts get cached at the edge (low latency)

Cons
• ESI includes are usually executed sequentially
• Error handling is a problem
• JavaScript and CSS for parts are not combined

Page 19: Advanced HTTP Caching

Stale Content Delivery

Page 20: Advanced HTTP Caching

Problem description

Participants: Browser, Intermediate Cache, Origin Server (resource /index.html)

Problem 1: Full server processing time on revalidation
Origin server response time determines cache response time.

    Browser to intermediate cache:        GET /index.html HTTP/1.1
    Intermediate cache to origin server:  GET /index.html HTTP/1.1
    Origin server to intermediate cache:  HTTP/1.1 200 OK
    Intermediate cache to browser:        HTTP/1.1 200 OK

Problem 2: Error on revalidation if origin is down
If the origin is down, the cache delivers errors as well.

    Browser to intermediate cache:        GET /index.html HTTP/1.1
    Intermediate cache to origin server:  GET /index.html HTTP/1.1
    Origin server to intermediate cache:  HTTP/1.1 503 Service Unavailable
    Intermediate cache to browser:        HTTP/1.1 503 Service Unavailable

Page 21: Advanced HTTP Caching

Fresh vs. stale

Timeline of a cached response representation:

• T_Origin: Received response and added response representation to cache
• TTL (fresh): Time until response representation can be served from cache
• Grace (stale): Time until response representation that requires revalidation can be served from cache
• Keep: Time representation might stay in cache
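The timeline can be read as a simple classification by age. The sketch below assumes that the grace period follows the TTL and the keep period follows the grace period; the function name `classify` and the state names are made up for this example.

```javascript
// Hypothetical classification of a cached representation by its age in seconds.
// Assumption: the periods follow each other as TTL, then Grace, then Keep.
function classify(ageSeconds, ttl, grace, keep) {
  if (ageSeconds < ttl) return 'fresh';                  // serve from cache
  if (ageSeconds < ttl + grace) return 'stale';          // may be served, needs revalidation
  if (ageSeconds < ttl + grace + keep) return 'kept';    // still stored, not served as is
  return 'expired';                                      // can be dropped
}
```

For example, with a TTL of 600 seconds and a grace period of 600 seconds, a representation that is 15 minutes old is stale but can still be delivered.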

Page 22: Advanced HTTP Caching

Solution: Stale now is better than fresh later or temporarily down

Deliver stale content temporarily to bridge cache refresh and origin outages.

Page 23: Advanced HTTP Caching

Deliver stale content on revalidation with stale-while-revalidate

Cache 1 minute in the browser, 10 minutes in the intermediate cache, and allow delivering stale content for 10 minutes.

First request:

    Browser to intermediate cache (e.g. Varnish):
        GET /index.html HTTP/1.1
    Intermediate cache to origin server:
        GET /index.html HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60, s-maxage=600, stale-while-revalidate=600
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60

15 minutes later:

    Browser to intermediate cache (e.g. Varnish):
        GET /index.html HTTP/1.1
    Intermediate cache to browser (stale content):
        HTTP/1.1 200 OK
        Cache-Control: max-age=60
    Intermediate cache to origin server (asynchronous):
        GET /index.html HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60, s-maxage=600, stale-while-revalidate=600

The intermediate cache spawns an asynchronous request process and returns stale content for the request that triggered it.

see https://tools.ietf.org/html/rfc5861
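The decision a shared cache makes for such a response can be sketched as follows. This is a simplified reading of the directives, not the logic of Varnish or any other product; `parseCacheControl`, `sharedCacheAction` and the returned action names are hypothetical.

```javascript
// Hypothetical parser: turn a Cache-Control value into a map of directives.
function parseCacheControl(header) {
  const directives = {};
  for (const part of (header || '').split(',')) {
    const [name, value] = part.trim().split('=');
    if (name) {
      directives[name.toLowerCase()] = value === undefined ? true : Number(value);
    }
  }
  return directives;
}

// Hypothetical decision for a shared cache holding a response of the given age.
function sharedCacheAction(header, ageSeconds) {
  const d = parseCacheControl(header);
  // For shared caches s-maxage takes precedence over max-age.
  const freshFor = typeof d['s-maxage'] === 'number' ? d['s-maxage']
    : typeof d['max-age'] === 'number' ? d['max-age'] : 0;
  const staleFor = typeof d['stale-while-revalidate'] === 'number'
    ? d['stale-while-revalidate'] : 0;
  if (ageSeconds < freshFor) return 'serve-fresh';
  if (ageSeconds < freshFor + staleFor) return 'serve-stale-and-revalidate-in-background';
  return 'fetch-from-origin';
}
```

With the header from the slide, a request arriving 15 minutes (900 seconds) after the response was stored falls into the stale-while-revalidate window.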

Page 24: Advanced HTTP Caching

Deliver stale content on origin problems with stale-if-error

Cache 1 minute in the browser, 10 minutes in the intermediate cache, and allow delivering stale content on origin error for 10 minutes.

First request:

    Browser to intermediate cache (e.g. Varnish):
        GET /index.html HTTP/1.1
    Intermediate cache to origin server:
        GET /index.html HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60, s-maxage=600, stale-if-error=600
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60

15 minutes later:

    Browser to intermediate cache (e.g. Varnish):
        GET /index.html HTTP/1.1
    Intermediate cache to origin server:
        GET /index.html HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 503 Service Unavailable
    Intermediate cache to browser (stale content):
        HTTP/1.1 200 OK
        Cache-Control: max-age=60

Because of the stale-if-error configuration, the intermediate cache returns stale content instead of the error.

see https://tools.ietf.org/html/rfc5861
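The stale-if-error fallback can be sketched as a check that runs after the origin request has failed. The shape of the `cached` object and the function names are assumptions of this example; the list of status codes follows RFC 5861, which names 500, 502, 503 and 504 as errors.

```javascript
// Hypothetical helper: extract stale-if-error (in seconds) from a Cache-Control value.
function staleIfErrorSeconds(cacheControl) {
  const match = /(?:^|,)\s*stale-if-error=(\d+)/i.exec(cacheControl || '');
  return match ? Number(match[1]) : 0;
}

// Hypothetical fallback. `cached` is { storedAt, freshFor, cacheControl, body },
// with storedAt, freshFor and nowSeconds given in seconds.
function respondAfterOriginFailure(cached, originStatus, nowSeconds) {
  const isError = [500, 502, 503, 504].includes(originStatus);
  const staleFor = nowSeconds - (cached.storedAt + cached.freshFor);
  if (isError && staleFor < staleIfErrorSeconds(cached.cacheControl)) {
    // Inside the stale-if-error window: hide the error behind stale content.
    return { status: 200, body: cached.body, stale: true };
  }
  return { status: originStatus, body: '', stale: false };
}
```

Once the stale-if-error window has passed as well, the error from the origin reaches the browser.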

Page 25: Advanced HTTP Caching

Different implementations per cache/CDN

• Varnish cache
  - Supports stale-while-revalidate and stale-if-error
• Fastly CDN
  - Uses Varnish
  - Supports stale-while-revalidate and stale-if-error
• AKAMAI CDN
  - Supports similar behavior to stale-while-revalidate via the “Cache Prefreshing” feature, although this is an active refresh
  - Supports similar behavior to stale-if-error by setting the “Force Revalidation of Stale Objects” configuration to the “Serve stale if unable to validate” value

Page 26: Advanced HTTP Caching

Pros & Cons

Pros
• Decouple browser requests from actual revalidation with origin
• Improve response times in general
• Bridge origin outages
• Improve overall resilience

Cons
• Might deliver stale content to browser (clients)
• Might not notice errors when they occur

Page 27: Advanced HTTP Caching

Purging

Page 28: Advanced HTTP Caching

Problem description

• We actually choose a short cache time, because we do not know when a modification occurs
• In most cases the requested page does not change
• This creates a useless base load on our system
• It would be better to inform the cache about page changes proactively

Example with a short browser cache time (max-age=60) and a short intermediate cache time (s-maxage=600):

Start:
    Request:
        GET /regenbogenbaer-A103766209 HTTP/1.1
    Response:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60, s-maxage=600
        ETag: "6e35-240-2672fbbc"

10 minutes later:
    Request:
        GET /regenbogenbaer-A103766209 HTTP/1.1
        If-None-Match: "6e35-240-2672fbbc"
    Response:
        HTTP/1.1 304 Not Modified
        Cache-Control: max-age=60, s-maxage=600
        ETag: "6e35-240-2672fbbc"

10 minutes later:
    Request:
        GET /regenbogenbaer-A103766209 HTTP/1.1
        If-None-Match: "6e35-240-2672fbbc"
    Response:
        HTTP/1.1 304 Not Modified
        Cache-Control: max-age=60, s-maxage=600
        ETag: "6e35-240-2672fbbc"

Page 29: Advanced HTTP Caching

Solution: Don’t call me, I call you

Invert the expiration mechanism by replacing pull with push.

Page 30: Advanced HTTP Caching

Tag content and purge individually if required

Participants: Browser, Intermediate Cache (e.g. Varnish with XKEY), Origin Server

Initial request (long intermediate cache time, content tags):

    Browser to intermediate cache:
        GET /regenbogenbaer-A103766209 HTTP/1.1
    Intermediate cache to origin server:
        GET /regenbogenbaer-A103766209 HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60, s-maxage=86400
        XKey: a103766209;
        XKey: articlePage;
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60

2 hours later:

    Browser to intermediate cache:
        GET /regenbogenbaer-A103766209 HTTP/1.1
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: max-age=60

Purge content on actual content modification (purge tag):

    Origin server to intermediate cache:
        XKEY / HTTP/1.1
        XKey-Purge: a103766209;
    Intermediate cache to origin server:
        HTTP/1.1 200 OK
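The idea of tagging content and purging by tag can be sketched with a small in-memory cache. This is an illustration of the concept only and not how the Varnish XKey module is implemented; the class `TaggedCache` and its methods are hypothetical.

```javascript
// Hypothetical in-memory cache that remembers which URLs carry which tag.
class TaggedCache {
  constructor() {
    this.entries = new Map();   // url -> body
    this.urlsByTag = new Map(); // tag -> Set of urls
  }

  // Store a response body together with its content tags.
  store(url, body, tags) {
    this.entries.set(url, body);
    for (const tag of tags) {
      if (!this.urlsByTag.has(tag)) this.urlsByTag.set(tag, new Set());
      this.urlsByTag.get(tag).add(url);
    }
  }

  get(url) {
    return this.entries.get(url);
  }

  // Remove every entry that carries the tag; returns how many URLs were tagged.
  purgeTag(tag) {
    const urls = this.urlsByTag.get(tag) || new Set();
    for (const url of urls) this.entries.delete(url);
    this.urlsByTag.delete(tag);
    return urls.size;
  }
}
```

Purging the tag a103766209 removes only the pages for that article, while purging articlePage would remove all article pages at once.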

Page 31: Advanced HTTP Caching

Different implementations per cache/CDN

• Varnish cache
  - Supports purging on content tags via the XKey module and XKey header
  - Instant purge time
• Fastly CDN
  - Uses Varnish and supports it as well via the Surrogate-Key header
  - ~500ms purge time
• AKAMAI CDN
  - Has announced support for content tags via the Edge-Content-Tag header and purging based on that via FastPurge in Q1/2017
  - ~5sec purge time

Page 32: Advanced HTTP Caching

Invalidation is better than removal

• Purge usually has two modes: invalidation and removal
  - Varnish XKey supports that with purge and softpurge
  - Fastly CDN supports it with purge and softpurge as well
  - AKAMAI CDN’s FastPurge supports removal and invalidation mode on staging and production environment
• Removal physically removes content from the cache (one useful use case is a removal for legal reasons)
• Invalidation sets the TTL to 0 and marks content for revalidation
• Invalidation is usually the preferred solution
  - Invalidation request might lead to overload on origin
  - Cache can still serve stale content even if origin is down
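The difference between the two purge modes can be sketched on a plain Map. All names here (`makeEntry`, `remove`, `invalidate`, `lookup`) are hypothetical, and times are plain numbers in seconds.

```javascript
// Hypothetical cache entry with an absolute expiry time in seconds.
function makeEntry(body, ttlSeconds, nowSeconds) {
  return { body: body, expiresAt: nowSeconds + ttlSeconds };
}

// Removal: the content is physically gone afterwards.
function remove(cache, url) {
  cache.delete(url);
}

// Invalidation: the TTL drops to 0, but the body stays available as stale content.
function invalidate(cache, url, nowSeconds) {
  const entry = cache.get(url);
  if (entry) entry.expiresAt = nowSeconds;
}

// Lookup reports whether the entry is missing, fresh or stale.
function lookup(cache, url, nowSeconds) {
  const entry = cache.get(url);
  if (!entry) return { state: 'miss' };
  const state = nowSeconds < entry.expiresAt ? 'fresh' : 'stale';
  return { state: state, body: entry.body };
}
```

After an invalidation the cache can still answer with the stale body while it revalidates, whereas after a removal the next request has to go to the origin.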

Page 33: Advanced HTTP Caching

Pros & Cons

Pros
• Cache times can be increased to much higher TTLs
• Cache hit rates improve
• Response times improve as most responses can be served from cache
• Scalability improves as most content can be served from cache and traffic peaks can be handled by cache
• Base load on the service due to continuous “polling” gets reduced
• More control over cache state

Cons
• Need to implement a scalable purge service
• Complexity might increase

Page 34: Advanced HTTP Caching

Caching User Data

Page 35: Advanced HTTP Caching

Problem description

Request:
    GET /regenbogenbaer-A103766209 HTTP/1.1

• User-specific data usually makes pages uncacheable
• A common workaround is to use JavaScript to include user-specific parts on the client side
• The problem is that this requires JavaScript
• Most user-specific parts would actually be cacheable with a high hit rate
• ESI actually allows including user-specific parts

Page 36: Advanced HTTP Caching

Solution: Make user data cacheable through smart variation

Use the Vary header and a versioning mechanism to make user-specific data cacheable.

Page 37: Advanced HTTP Caching

Login and logout create or remove the security session and cookie

Participants: Browser, Intermediate Cache (e.g. Varnish), Origin Server

Login creates the cookie (resource /auth/login):

    Browser to intermediate cache:
        POST /auth/login HTTP/1.1
    Intermediate cache to origin server:
        POST /auth/login HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Set-Cookie: sprd_auth_token=12345678;
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Set-Cookie: sprd_auth_token=12345678;

Logout removes the cookie (resource /auth/logout):

    Browser to intermediate cache:
        POST /auth/logout HTTP/1.1
    Intermediate cache to origin server:
        POST /auth/logout HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Set-Cookie: sprd_auth_token=; Expires=Tuesday, 13-Dec-16 09:00:00 GMT
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Set-Cookie: sprd_auth_token=; Expires=Tuesday, 13-Dec-16 09:14:57 GMT

Page 38: Advanced HTTP Caching

Making login state cacheable with Varnish

Participants: Browser, Intermediate Cache (e.g. Varnish), Origin Server (resource /auth/loginstate)

    Browser to intermediate cache:
        GET /auth/loginstate HTTP/1.1
        Cookie: sprd_auth_token=123; ….
    Intermediate cache to origin server:
        GET /auth/loginstate HTTP/1.1
        Cookie: sprd_auth_token=123;
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: no-cache, s-maxage=600
        Vary: Cookie
        XKey: session123;
        ETag: "123"
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: private, no-cache
        ETag: "123"

The intermediate cache keeps one cached representation per Cookie value, for example one for "Cookie: sprd_auth_token=123" and one for an empty Cookie.

Decision on the origin server: does the cookie sprd_auth_token exist?
• yes: login state for user
• no: no login

Page 39: Advanced HTTP Caching

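Node.js implementation

    // Assumes an Express router with cookie parsing enabled;
    // authService and sessionTag are application-specific helpers.
    router.get('/auth/loginstate.html', function (req, res, next) {
      // Not cacheable in the browser, 10 minutes in the intermediate cache
      res.setHeader('Cache-Control', 'no-cache, s-maxage=600');
      // One cached representation per (filtered) Cookie header
      res.setHeader('Vary', 'Cookie');
      var session = authService.getSession(req.cookies.sprd_auth_token);
      if (session) {
        // Tag the response so it can be purged per session
        res.setHeader('XKey', sessionTag(session.sessionId));
        res.render('loginstate', session);
      } else {
        res.render('nologin');
      }
    });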

Page 40: Advanced HTTP Caching

Varnish VCL configuration

    # Request-side cookie handling, assumed to live in vcl_recv
    sub vcl_recv {
      if (req.url !~ "/auth/") {
        // remove cookie for everything but auth context
        unset req.http.Cookie;
      } else {
        // filter sprd_auth_token
        if (req.http.Cookie) {
          set req.http.Cookie = ";" + req.http.Cookie;
          set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
          set req.http.Cookie = regsuball(req.http.Cookie, ";(sprd_auth_token)=", "; \1=");
          set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
          set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
          if (req.http.Cookie == "") {
            unset req.http.Cookie;
          }
        }
      }
    }
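The effect of the cookie filtering in the VCL can be illustrated in JavaScript. This is a re-statement of the idea for readability, not code that runs inside Varnish; `filterCookies` is a hypothetical helper.

```javascript
// Hypothetical helper: keep only the allowed cookies from a Cookie header value.
// Returns undefined if nothing is left, which corresponds to removing the header.
function filterCookies(cookieHeader, allowedNames) {
  const kept = (cookieHeader || '')
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => allowedNames.includes(pair.split('=')[0]));
  return kept.length > 0 ? kept.join('; ') : undefined;
}
```

Filtering matters because of Vary: Cookie. If unrelated cookies stayed in the header, nearly every request would produce its own cached representation and the hit rate would collapse.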

Page 41: Advanced HTTP Caching

Caching baskets works in a similar way using versioning

Participants: Browser, Intermediate Cache (e.g. Varnish), Origin Server (resource /basket)

Create basket (start with version 1 on basket creation):

    Browser to intermediate cache:
        POST /basket HTTP/1.1
    Intermediate cache to origin server:
        POST /basket HTTP/1.1
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Set-Cookie: basket_id=12345678/v1;
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Set-Cookie: basket_id=12345678/v1;

Update basket (update to next version on basket modification):

    Browser to intermediate cache:
        POST /basket HTTP/1.1
        Cookie: basket_id=12345678/v1;
    Intermediate cache to origin server:
        POST /basket HTTP/1.1
        Cookie: basket_id=12345678/v1;
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Set-Cookie: basket_id=12345678/v2;
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Set-Cookie: basket_id=12345678/v2;

Page 42: Advanced HTTP Caching

Fetch different cached baskets

Participants: Browser, Intermediate Cache (e.g. Varnish), Origin Server (resource /basket)

Get first version (cache basket version for 10 minutes in Varnish but not in browser):

    Browser to intermediate cache:
        GET /basket HTTP/1.1
        Cookie: basket_id=12345678/v1; …
    Intermediate cache to origin server:
        GET /basket HTTP/1.1
        Cookie: basket_id=12345678/v1;
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: no-cache, s-maxage=600
        Vary: Cookie
        ETag: "12345678/v1"
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: private, no-cache
        ETag: "12345678/v1"

Get second version (cache new basket version for 10 minutes in Varnish but not in browser):

    Browser to intermediate cache:
        GET /basket HTTP/1.1
        Cookie: basket_id=12345678/v2; …
    Intermediate cache to origin server:
        GET /basket HTTP/1.1
        Cookie: basket_id=12345678/v2;
    Origin server to intermediate cache:
        HTTP/1.1 200 OK
        Cache-Control: no-cache, s-maxage=600
        Vary: Cookie
        ETag: "12345678/v2"
    Intermediate cache to browser:
        HTTP/1.1 200 OK
        Cache-Control: private, no-cache
        ETag: "12345678/v2"
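The versioning scheme can be sketched with two small helpers for the cookie value. The format "id/vN" is taken from the slides; the function names `newBasketCookie` and `nextBasketCookie` are hypothetical.

```javascript
// Hypothetical helper: cookie value for a freshly created basket (version 1).
function newBasketCookie(basketId) {
  return basketId + '/v1';
}

// Hypothetical helper: cookie value after a basket modification (next version).
function nextBasketCookie(cookieValue) {
  const match = /^([^/]+)\/v(\d+)$/.exec(cookieValue);
  if (!match) {
    throw new Error('unexpected basket_id format: ' + cookieValue);
  }
  return match[1] + '/v' + (Number(match[2]) + 1);
}
```

Because the version is part of the Cookie value and the response varies on Cookie, each basket version gets its own cached representation. An outdated version is never requested again once the browser holds the new cookie, so no purge is needed.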

Page 43: Advanced HTTP Caching

Different implementations per cache/CDN

• Varnish Cache
  - Supports Vary header
  - Implementation will require custom VCL
• Fastly CDN
  - Uses Varnish
  - Supports Vary header
  - Implementation will require custom VCL
• AKAMAI CDN
  - Does not support Vary header
  - But allows configuring custom cache id modifications (cid)

Page 44: Advanced HTTP Caching

Pros & Cons

Pros
• Allows caching pages that contain user-specific data without using JavaScript

Cons
• Might cache confidential data and make it available to the public via the CDN

Page 45: Advanced HTTP Caching

Conclusion

Page 46: Advanced HTTP Caching

Conclusion

• Everything can be made cacheable
• Edge Side Includes (ESI) allow decomposing pages into templates and reusable, cacheable parts
• Stale content delivery improves response times and helps handle origin outages
• Purging allows further increasing cache times and proactively removing items from the cache
• Caching user data makes it possible to cache even pages with user-specific data without using JavaScript

Page 47: Advanced HTTP Caching

Q&A