Overview
- HTTP Caching
- Varnish
- Profiling with xhprof
- HTTP 2.0
HTTP Caching
- Client-side cache
- Controlled via headers set by the server
- Rules for end and intermediate caches defined by:
  - Cache-Control
  - Expires
  - ETag
- Vary headers specify which request headers are used to vary responses, e.g.:
  - "Vary: Cookie" says the user's cookie could change response
- Client-side caching works well for longer usage sessions and frequent users
- Mobile networks may cache more aggressively than restrictive rules allow - can cause issues!
Varnish
- Makes websites fast
- Primarily an HTTP accelerator, also a reverse proxy and load balancer
- Provides additional functionality such as Edge Side Includes (ESI) and forced compression
- Helps with resilience:
  - Grace mode serves cached, but stale, content if the backend is down
  - Saint mode blacklists a backend for a given URL for a period of time if errors occur
  - Combining both allows servers to recover from high load whilst reducing end user impact
- Whole raft of tools to help tune and monitor - even in production environments
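A minimal sketch of how grace and saint mode might be switched on, assuming Varnish 3.x syntax and a backend with a health probe defined (req.backend.healthy depends on one). The durations and the choice of status 500 are illustrative assumptions, not recommendations.

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        // backend is up: accept objects up to 30s past their TTL
        set req.grace = 30s;
    } else {
        // backend is down: keep serving stale objects for up to an hour
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    // keep objects for an hour past their TTL so grace has something to serve
    set beresp.grace = 1h;

    if (beresp.status == 500) {
        // saint mode: avoid this backend for this URL for 10s, then retry
        set beresp.saintmode = 10s;
        return (restart);
    }
}
```

If no other backend is available after the restart, Varnish can fall back to a stale copy that is still within its grace period.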
How Varnish works
- Stores caches in memory, with clever virtual memory paging
- Avoids writing to the disk (except for virtual memory)
- Extremely fast as a result
- Configured through a compiled language: Varnish Configuration Language (VCL)
- VCL defines exactly how to handle backends, allows you (almost) unlimited control
- Varnish is most effective when configured for a particular application
An Example Architecture
- Group of backends usually hooked up to one Varnish instance
- Varnish handles rudimentary load balancing between backends
- Client makes request to Varnish, which then decides which backend to use
- Varnish may serve straight from cache or forward the request to the backend
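The architecture above could be expressed roughly as follows, assuming Varnish 3.x syntax. The backend names (web1, web2), the director name (pool), the addresses and the probe settings are all hypothetical placeholders.

```vcl
backend web1 {
    .host = "192.0.2.10";
    .port = "8080";
    // health probe so Varnish knows when to skip this backend
    .probe = { .url = "/"; .interval = 5s; .timeout = 1s; .window = 5; .threshold = 3; }
}

backend web2 {
    .host = "192.0.2.11";
    .port = "8080";
    .probe = { .url = "/"; .interval = 5s; .timeout = 1s; .window = 5; .threshold = 3; }
}

// rudimentary load balancing: alternate between the two backends
director pool round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

sub vcl_recv {
    // send every request through the director
    set req.backend = pool;
}
```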
VCL and Hashing
- Cached responses are identified by a hash
- This hash comes from the request which generated the response
- By default, hashes only use the Host and URL of the request
    GET /funny-cat-pictures HTTP/1.1
    Host: www.theinternet.com
is different from...
    GET /funny-cat-pictures/ HTTP/1.1
    Host: theinternet.com
Obviously this isn't quite enough...
VCL
- C-like syntax
- Translated to C and compiled to native code, which makes it fast
- Very powerful, allows complete control of both handling requests and responses
- Small standard library of functionality, including regular expressions
- Most common uses:
  - Rewrite incoming requests to make them more homogeneous
  - Rewrite responses
  - Define how to handle backend errors
  - Define load balancing strategy
  - Enable ESI
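As an illustration of the response-side uses, here is a small sketch in Varnish 3.x syntax that enables ESI processing and forced compression. It assumes the backend emits esi:include tags in its HTML; the Content-Type patterns are illustrative choices.

```vcl
sub vcl_fetch {
    if (beresp.http.Content-Type ~ "^text/html") {
        // parse the response body for ESI tags
        set beresp.do_esi = true;
    }
    if (beresp.http.Content-Type ~ "^text/") {
        // compress text responses before storing them in the cache
        set beresp.do_gzip = true;
    }
}
```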
VCL Functions
Some standard hooks to attach to:
- vcl_recv: handles incoming requests
- vcl_hash: called when hashing the request to identify if it's in the cache or not
- vcl_fetch: called after a response has been retrieved from a backend server
- vcl_hit, vcl_miss: handle cache hits and misses, respectively
- vcl_error: handles errors from backend servers or Varnish itself
Useful functions:
- hash_data() adds some arbitrary data into the hash
- regsub()/regsuball() allow you to use RegEx to rewrite things, commonly headers
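A short sketch of hash_data() in use, assuming Varnish 3.x and assuming vcl_recv has already trimmed the Cookie header down to the cookies that really change the response.

```vcl
sub vcl_hash {
    // add the (already cleaned) Cookie header to the cache key
    if (req.http.Cookie) {
        hash_data(req.http.Cookie);
    }
    // no return statement: the built-in vcl_hash runs afterwards and
    // adds the URL and Host as usual
}
```

Note that the built-in vcl_recv passes any request that carries a Cookie header, so this only takes effect if your own vcl_recv ends with return (lookup) for those requests.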
VCL example
Stuff you'd usually do in Apache configs
sub vcl_recv {
    // normalize host name by removing a leading "www."
    set req.http.Host = regsub(req.http.Host, "^www\.", "");
    // match media by URL extension (illustrative list); a request rarely carries a Content-Type
    if (req.url ~ "\.(jpg|jpeg|png|gif|mp3|ogg|mp4)(\?.*)?$") {
        // disable compression for pre-compressed files
        remove req.http.Accept-Encoding;
        // also remove cookies from them altogether
        remove req.http.Cookie;
    }
}
Another VCL example
Rewrite Cookie header to only include 'location' and 'options' cookies
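sub vcl_recv {
    if (req.http.Cookie) {
        // prefix with ";" so every cookie starts with the same delimiter
        set req.http.Cookie = ";" + req.http.Cookie;
        // strip the spaces after each delimiter
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        // mark the cookies to keep by re-inserting a space before them
        set req.http.Cookie = regsuball(req.http.Cookie, ";(location|options)=", "; \1=");
        // drop every cookie that was not marked
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        // trim leftover delimiters and spaces at both ends
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    }
    if (req.http.Cookie == "") {
        remove req.http.Cookie;
    }
}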
Why Rewrite Requests At All?
- You might want to ignore certain cookies but not all
- Ordering of cookies changes between clients even when the content is identical; each variant is treated as a different string and thus a different hash
- Other headers may change the response (Accept-Language etc.)
- Misbehaving backends or applications - headers not always ideal
- Power to override cache policy set by the backend
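Two of these points can be sketched together in Varnish 3.x syntax: collapsing Accept-Language to a small set of values, and overriding the backend's cache policy for static files. The supported languages (en, fr), the /static/ prefix and the one-hour TTL are hypothetical.

```vcl
sub vcl_recv {
    // collapse the many possible Accept-Language strings to two values
    if (req.http.Accept-Language ~ "^fr") {
        set req.http.Accept-Language = "fr";
    } else {
        set req.http.Accept-Language = "en";
    }
}

sub vcl_fetch {
    if (req.url ~ "^/static/") {
        // ignore whatever the backend said: no cookies, cache for an hour
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1h;
    }
}
```

The language normalization only reduces cache variants if the backend responds with "Vary: Accept-Language"; without it, every language shares one cached copy.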
Tools
varnishlog
- View requests as they come through
- Turn on the firehose!
- Can filter using command line options (or sed)
- Cycled log kept by Varnish in memory - never written to disk
- Can be a good debugging tool: see what happens to requests as they come through
- Can even be used on live systems (although it may be like The Matrix)
- Realtime
... glorious tools
varnishstat
- Numbers! See what Varnish is up to
- Monitor cache hits, misses, cache entries...
- Good for measuring and comparing performance, especially if load/stress testing
- Realtime
... more tools
varnishtop
- Collates similar log entries
- A combination of detail from varnishlog with metrics from varnishstat
- Similar set of filters to varnishlog
- Realtime
... tools everywhere
varnishreplay
- Replays traffic from logs
- Good for warming caches or testing
- Can be used in conjunction with varnishstat for comparing VCL configs
Profiling
- Observing how a program runs
- See function calls, memory usage, CPU usage
- Identify bottlenecks and optimisation targets
- Only used in development or testing environments as it makes things very slow
- In PHP-land, Xdebug is the most common profiler
xhprof
- Written by Facebook
- More performant than Xdebug
- Hierarchical profiler, shows call graphs (unlike Xdebug)
- Available on your local Sandbox
xhprof
Time for a live demo! This can't possibly go wrong...
HTTP 2.0
- Currently going through the standardisation process; should be submitted in late 2014
- The chosen approach is based directly on Google's SPDY protocol
- SPDY multiplexes requests over a single connection, improving on HTTP 1.1 pipelining and batching
- Also enforces header compression
- Largely keeps HTTP 1.1 semantics (methods, headers, status codes); the wire format is what changes
- Browsers already support it (well, not IE)