Almost Everything You Always Wanted to Know About HTTP Cache

(but were afraid to ask)

Szymon Kulec

@Scooletz

http://blog.scooletz.com

Outline

RFC 2616 and beyond

All the verbs

in HTTP

  • GET
  • PUT
  • POST
  • DELETE
  • ...

All the verbs

using cache

  • GET
  • PUT
  • POST
  • DELETE
  • ...

GET headers example

HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 14:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, no-cache, public
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:10:12 GMT
ETag: "3e85-410-3596fbbc"
Content-Length: 1000
Content-Type: text/html

Cache models

  • Expiration
  • Validation

Expiration (cache control)

directive            meaning
max-age=[seconds]    how long, in seconds, the cached response is considered fresh
public               response may be stored by any cache, including shared proxies
private              response is cacheable, but only by the end client
no-cache             must revalidate with the server every time
no-store             don't store it!

Expiration examples

headers                           meaning
max-age=3600, private             keep up to 1h, but only on the client (no proxy)
max-age=3600, no-cache, public    keep up to 1h, but validate every time the resource is needed
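
A minimal sketch of how a cache might read these directives. The function names `parse_cache_control` and `describe` are made up for illustration, and the parser is deliberately naive: it assumes no directive carries a quoted argument containing a comma.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (public, no-cache, ...) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def describe(value):
    """Summarise who may store a response and whether it must be revalidated."""
    d = parse_cache_control(value)
    max_age = d.get("max-age")
    return {
        "storable": "no-store" not in d,
        "shared_caches_allowed": "private" not in d and "no-store" not in d,
        "revalidate_every_time": "no-cache" in d,
        "max_age_seconds": int(max_age) if max_age is not None else None,
    }


# The two examples from the table above.
print(describe("max-age=3600, private"))
print(describe("max-age=3600, no-cache, public"))
```

The first example comes out as storable on the client only; the second may sit in a proxy, but has to be revalidated before every reuse.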

Validation

header               sender    meaning
Last-Modified        server    date of last modification
If-Modified-Since    client    date from the cached version
ETag                 server    logical version of the resource
If-None-Match        client    logical version from the cached version
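
The date-based pair can be sketched as a single comparison. `is_modified_since` is a hypothetical helper, and the sketch assumes both header values are well-formed HTTP dates in GMT.

```python
from email.utils import parsedate_to_datetime


def is_modified_since(last_modified, if_modified_since):
    """Return True when the resource changed after the client's cached copy."""
    resource_time = parsedate_to_datetime(last_modified)
    cached_time = parsedate_to_datetime(if_modified_since)
    return resource_time > cached_time


# Same timestamp on both sides: the server may answer 304 Not Modified.
print(is_modified_since("Mon, 29 Jun 1998 02:10:12 GMT",
                        "Mon, 29 Jun 1998 02:10:12 GMT"))
```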

Validation (examples)

#    sender    data
1    client    GET
2    server    body + ETag: 12312
3    client    GET + If-None-Match: 12312
4    server    304: Not Modified
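
The four steps can be simulated without any network. `handle_get` is a hypothetical server-side function; as a simplification it compares a single ETag by exact string match and ignores `*`, lists of tags, and weak validators.

```python
def handle_get(request_headers, body, etag):
    """Return (status, headers, body) for a GET against one resource version."""
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# Steps 1-2: first request has no validator; the server sends body + ETag.
status, headers, payload = handle_get({}, b"hello", '"12312"')
cached = {"etag": headers["ETag"], "body": payload}

# Steps 3-4: the repeat request carries the validator and gets 304 back.
status2, _, payload2 = handle_get({"If-None-Match": cached["etag"]}, b"hello", '"12312"')
print(status, status2)
```

On a 304 the client keeps serving `cached["body"]`, so only headers cross the wire.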

GET headers example (again!)

HTTP/1.1 200 OK
Date: Fri, 30 Oct 1998 14:19:41 GMT
Server: Apache/1.3.3 (Unix)
Cache-Control: max-age=3600, no-cache, public
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:10:12 GMT
ETag: "3e85-410-3596fbbc"
Content-Length: 1000
Content-Type: text/html
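
Read with both models in mind, this response has an `Expires` equal to its `Date`, yet `max-age` takes precedence when both are present. A rough sketch of that rule, with `freshness_lifetime` as a hypothetical helper that ignores `s-maxage` and heuristic freshness:

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Seconds a response stays fresh; max-age wins over Expires when both exist."""
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return int(arg)
    if "Expires" in headers and "Date" in headers:
        delta = (parsedate_to_datetime(headers["Expires"])
                 - parsedate_to_datetime(headers["Date"]))
        return max(0, int(delta.total_seconds()))
    return None  # no explicit expiration information


example = {
    "Date": "Fri, 30 Oct 1998 14:19:41 GMT",
    "Cache-Control": "max-age=3600, no-cache, public",
    "Expires": "Fri, 30 Oct 1998 14:19:41 GMT",
}
print(freshness_lifetime(example))  # 3600

without_cc = {k: v for k, v in example.items() if k != "Cache-Control"}
print(freshness_lifetime(without_cc))  # 0
```

The `no-cache` directive still forces revalidation on every use, whatever the lifetime says.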

Who uses HTTP cache for data, and why?

Why

  • HTTP is a standard
  • plenty of tools: proxies, caching servers
  • leave performance and scaling problems to others

EventStore

  • a db for event sourced apps
  • events are only appended
  • immutable events -> cache forever
  • streams of events -> mutable but can be versioned
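
The last two bullets suggest two different header policies. The sketch below is an illustration only, not Event Store's actual code: `cache_headers` is a made-up function, and one year as a stand-in for "forever" is an assumption.

```python
ONE_YEAR = 365 * 24 * 3600  # assumed stand-in for "cache forever"


def cache_headers(kind, version=None):
    """Pick caching headers for an immutable event or a mutable stream."""
    if kind == "event":
        # An event never changes, so any cache may keep it for a long time.
        return {"Cache-Control": f"max-age={ONE_YEAR}, public"}
    if kind == "stream":
        # A stream grows, so always revalidate, using its version as the ETag.
        return {
            "Cache-Control": "max-age=0, no-cache, must-revalidate",
            "ETag": f'"{version}"',
        }
    raise ValueError(f"unknown resource kind: {kind}")


print(cache_headers("event"))
print(cache_headers("stream", version=1))
```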

EventStore

		
GET http://127.0.0.1:2113/streams/mystream

Cache-Control: max-age=0, no-cache, must-revalidate
ETag: "1;248368668"
{
  "entries": [
    {
      "title": "1@mystream",
      "id": "http://127.0.0.1:2113/streams/mystream/1"
    },
    {
      "title": "0@mystream",
      "id": "http://127.0.0.1:2113/streams/mystream/0"
    }
  ]
}

RavenDB

  • a document database
  • uses HTTP for communication
  • ETags for client side caching (and more)

RavenDB

		
// one can easily obtain etag of the given document
var eTag = (Guid)session.Advanced.GetEtagFor(objectToSave);

// optimistic concurrency with retrieved tag value
session.Store(objectToSave, eTag);
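
The idea behind storing with an ETag can be sketched with an in-memory store. This is not RavenDB's API: `DocumentStore` and `ConcurrencyError` here are hypothetical Python classes written only to show the check.

```python
import uuid


class ConcurrencyError(Exception):
    """Raised when the caller's ETag no longer matches the stored document."""


class DocumentStore:
    def __init__(self):
        self._docs = {}  # key -> (etag, document)

    def load(self, key):
        return self._docs[key]

    def store(self, key, document, expected_etag=None):
        current = self._docs.get(key)
        current_etag = current[0] if current else None
        if current_etag != expected_etag:
            # Someone else wrote in between: refuse instead of overwriting.
            raise ConcurrencyError(f"{key} changed since it was read")
        new_etag = uuid.uuid4().hex
        self._docs[key] = (new_etag, document)
        return new_etag


store = DocumentStore()
etag1 = store.store("users/1", {"name": "Ann"})
etag2 = store.store("users/1", {"name": "Anna"}, expected_etag=etag1)

try:
    store.store("users/1", {"name": "Bob"}, expected_etag=etag1)  # stale tag
except ConcurrencyError as error:
    print("rejected:", error)
```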
						

Demo

Security

JSON Hijacking

  • the browser sends a domain's cookies with every request it issues to that domain
  • the malicious site embeds a request to that domain
  • the response is executed as a script and fails with an error, but the attacker's setter is called just before that

JSON Hijacking

<html>
...
<body>
    <script type="text/javascript">
        Object.prototype.__defineSetter__('Id', function(obj){alert(obj);});
    </script>
    <script src="http://yourbank.com/using/get/forJSON"></script>
</body>
</html>
						

GET url & business ids

  • What if a GET URL contains business-critical data, such as a credit card number?
  • It may leave a trace in the browser history even if it is not cached
  • POSTing queries for sensitive data is the way to go (visit your bank's site and check)
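
A small sketch of the difference, building the two requests without sending them. The host `bank.example`, the `card` parameter, and the card number (a widely used test value) are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

query = {"card": "4111111111111111"}

# GET: the sensitive value becomes part of the URL itself.
as_get = Request("https://bank.example/lookup?" + urlencode(query))

# POST: the URL stays clean and the value travels in the request body.
as_post = Request(
    "https://bank.example/lookup",
    data=urlencode(query).encode("ascii"),
    method="POST",
)

print(as_get.get_method(), as_get.full_url)
print(as_post.get_method(), as_post.full_url)
```

Only the first URL would carry the number into history, logs, and caches keyed by URL.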

Think about security

Summary

If you're into web app and service development, you should know about HTTP cache and probably use it.

Summary

There are only two hard things in Computer Science: cache invalidation and naming things.

Phil Karlton

Better summary

There is only one hard thing in Computer Science: naming things.

me

Questions?

Szymon Kulec @Scooletz http://blog.scooletz.com