Web Performance – Geoff Adams






FM K&L Knowledge Sharing - HTTP Performance presentation

On GitHub: geoffadams/presentation-http-performance

Web Performance

Geoff Adams

@geoffadams

github.com/geoffadams

Overview

  • HTTP Caching
  • Varnish
  • Profiling with xhprof
  • HTTP 2.0

HTTP Caching

HTTP Caching

  • Client-side cache
  • Controlled via headers set by the server
  • Rules for end and intermediate caches defined by:
    • Cache-Control
    • Expires
    • ETag
  • The Vary header lists the request headers that can change the response, e.g.:
    • "Vary: Cookie" says the response may differ depending on the request's Cookie header
  • Works well for long sessions and frequent users, who keep revisiting resources they have already cached
  • Mobile networks may cache more aggressively than restrictive headers allow, which can cause stale-content issues!

Varnish

Varnish

  • Makes websites fast
  • Primarily an HTTP accelerator, also a reverse proxy and load balancer
  • Provides additional functionality such as Edge Side Includes (ESI) and forced compression
  • Helps with resilience:
    • Grace mode serves cached, but stale, content if backend is down
    • Saint mode blacklists backend for a period of time if errors occur
    • The combination of both allows servers to recover from high load whilst reducing end user impact
  • Whole raft of tools to help tune and monitor - even in production environments
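
A minimal sketch of how grace and saint mode might be switched on in VCL (Varnish's configuration language, covered later in the deck). It assumes Varnish 3 syntax and a backend with a health probe defined; the 30s, 1h and 10s durations are illustrative values, not recommendations.

```vcl
sub vcl_recv {
    // Grace: accept stale objects for longer when the backend looks unhealthy
    // (req.backend.healthy is only meaningful if the backend has a probe)
    if (req.backend.healthy) {
        set req.grace = 30s;
    } else {
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    // Keep objects past their TTL so grace mode has something to serve
    set beresp.grace = 1h;

    // Saint mode: after a server error, avoid this backend for this
    // object for 10s and retry the request elsewhere
    if (beresp.status == 500 || beresp.status == 503) {
        set beresp.saintmode = 10s;
        return (restart);
    }
}
```

If no other backend is available after the restart, Varnish can fall back to a graced (stale) copy, which is the combination the bullets above describe.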

How Varnish works

  • Stores cached objects in memory, relying on the operating system's virtual memory paging
  • Avoids writing to the disk itself (the OS only touches disk when paging virtual memory)
  • Extremely fast as a result
  • Configured through a compiled language: Varnish Configuration Language (VCL)
  • VCL defines exactly how requests and backends are handled, giving you (almost) unlimited control
  • Varnish is most effective when configured for a particular application

An Example Architecture

  • Group of backends usually hooked up to one Varnish instance
  • Varnish handles rudimentary load balancing between backends
  • Client makes request to Varnish, which then decides which backend to use
  • Varnish may serve straight from cache or forward the request to the backend
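
As a rough illustration of this architecture, the sketch below declares two backends and a round-robin director in Varnish 3 syntax. The backend names, IP addresses, port and probe settings are all hypothetical placeholders.

```vcl
// Two hypothetical application servers, each with a health probe
backend web1 {
    .host = "192.0.2.10";
    .port = "8080";
    .probe = { .url = "/"; .interval = 5s; .timeout = 1s; .window = 5; .threshold = 3; }
}

backend web2 {
    .host = "192.0.2.11";
    .port = "8080";
    .probe = { .url = "/"; .interval = 5s; .timeout = 1s; .window = 5; .threshold = 3; }
}

// Rudimentary load balancing: alternate between healthy backends
director pool round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

sub vcl_recv {
    // Every incoming request is routed through the director
    set req.backend = pool;
}
```

Requests that hit the cache never reach the director's backends at all; only misses and passes are forwarded.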

VCL and Hashing

  • Cached responses are identified by a hash
  • This hash comes from the request which generated the response
  • By default, hashes only use the Host and URL of the request
                    GET /funny-cat-pictures HTTP/1.1
                    Host: www.theinternet.com
                

is different from...

                    GET /funny-cat-pictures/ HTTP/1.1
                    Host: theinternet.com
                

Obviously this isn't quite enough...

VCL

  • C-like syntax
  • Actually translated to C and compiled into native code loaded by Varnish, which makes it fast
  • Very powerful, allows complete control of both handling requests and responses
  • Small standard library of functionality, including regular expressions
  • Most common uses:
    • Rewrite incoming requests to make them more homogeneous
    • Rewrite responses
    • Define how to handle backend errors
    • Define load balancing strategy
    • Enable ESI processing
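
For the last item, a minimal sketch of enabling ESI processing and forced compression, assuming Varnish 3 syntax. The Content-Type checks are an assumption about which responses are worth processing.

```vcl
sub vcl_fetch {
    // Only parse HTML responses for <esi:include> tags
    if (beresp.http.Content-Type ~ "text/html") {
        set beresp.do_esi = true;
    }

    // Compress text responses before storing them in the cache
    if (beresp.http.Content-Type ~ "^text/") {
        set beresp.do_gzip = true;
    }
}
```

Limiting ESI parsing to HTML avoids scanning binary responses that cannot contain include tags.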

VCL Functions

Some standard hooks to attach to:

  • vcl_recv: handles incoming requests
  • vcl_hash: called when hashing the request to identify if it's in the cache or not
  • vcl_fetch: called after a response has been retrieved from a backend server
  • vcl_hit, vcl_miss: handle cache hits and misses, respectively
  • vcl_error: handles errors from backend servers or Varnish itself

Useful functions:

  • hash_data() adds some arbitrary data into the hash
  • regsub()/regsuball() allow you to use RegEx to rewrite things, commonly headers
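
A small sketch of hash_data() in use, assuming Varnish 3 syntax. The X-Device header and the User-Agent pattern are hypothetical, chosen only to show how extra data can be mixed into the hash.

```vcl
sub vcl_recv {
    // Hypothetical: reduce the User-Agent to a coarse device class
    if (req.http.User-Agent ~ "(?i)mobile") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

sub vcl_hash {
    // Cache mobile and desktop variants of the same URL separately
    hash_data(req.http.X-Device);

    // No return here, so the built-in vcl_hash still runs
    // afterwards and adds the URL and Host as usual
}
```

Hashing the coarse device class instead of the raw User-Agent keeps the number of cached variants per URL at two.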

VCL example

Stuff you'd usually do in Apache configs

sub vcl_recv {
    // normalize host name by removing a leading "www."
    set req.http.Host = regsub(req.http.Host, "^www\.", "");

    // requests rarely carry a Content-Type, so match media files by URL
    // extension instead (the extension list is illustrative)
    if (req.url ~ "(?i)\.(jpe?g|png|gif|mp3|mp4)(\?.*)?$") {
        // disable compression for pre-compressed files
        remove req.http.Accept-Encoding;

        // also remove cookies from them altogether
        remove req.http.Cookie;
    }
}
                

Another VCL example

Rewrite Cookie header to only include 'location' and 'options' cookies

sub vcl_recv {
  if (req.http.Cookie) {
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    set req.http.Cookie = regsuball(req.http.Cookie, ";(location|options)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
  }

  if (req.http.Cookie == "") {
    remove req.http.Cookie;
  }
}
                

Why Rewrite Requests At All?

  • You might want to ignore certain cookies but not all
  • Cookie ordering varies between clients even when the contents are identical; the header strings differ, so anything derived from them (such as a hash) differs too
  • Other headers may change the response (Accept-Language etc)
  • Misbehaving backends or applications may send headers that are not ideal for caching
  • Power to override cache policy set by the backend
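
Two of these points can be sketched together, assuming Varnish 3 syntax: collapsing Accept-Language to a small set of values, and overriding the backend's cache policy for static assets. The supported languages, file extensions and one-hour TTL are illustrative assumptions.

```vcl
sub vcl_recv {
    // Collapse the many Accept-Language variants to the languages
    // the site (hypothetically) supports: French, otherwise English
    if (req.http.Accept-Language ~ "^fr") {
        set req.http.Accept-Language = "fr";
    } else {
        set req.http.Accept-Language = "en";
    }
}

sub vcl_fetch {
    // Override the backend's policy for static assets: drop any
    // cookie it tries to set and cache the response for an hour
    if (req.url ~ "\.(css|js|png|jpg|gif)(\?.*)?$") {
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1h;
    }
}
```

With the header normalised, a backend that sends "Vary: Accept-Language" produces at most two cached variants per URL.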

Tools

varnishlog

  • View requests as they come through
  • Turn on the firehose!
  • Can filter using command line options (or sed)
  • Cycled log kept by Varnish in memory - never written to disk
  • Can be a good debugging tool: see what happens to requests as they come through
  • Can even use it in production (although it may be like The Matrix)
  • Realtime

... glorious tools

varnishstat

  • Numbers! See what Varnish is up to
  • Monitor cache hits, misses, cache entries...
  • Good for measuring and comparing performance, especially if load/stress testing
  • Realtime

... more tools

varnishtop

  • Collates similar log entries
  • A combination of detail from varnishlog with metrics from varnishstat
  • Similar set of filters to varnishlog
  • Realtime

... tools everywhere

varnishreplay

  • Replays traffic from logs
  • Good for warming caches or testing
  • Can be used in conjunction with varnishstat for comparing VCL configs

Profiling with xhprof

Profiling

  • Observing how a program runs
  • See function calls, memory usage, CPU usage
  • Identify bottlenecks and optimisation targets
  • Only used in development or testing environments as it makes things very slow
  • In PHP-land, XDebug is the most common profiler

xhprof

  • Written by Facebook
  • Lower profiling overhead than XDebug
  • Hierarchical profiler, shows call graphs (unlike XDebug)
  • Available on your local Sandbox

xhprof

Time for a live demo! This can't possibly go wrong...

The Future

HTTP 2.0

Currently going through the standardisation process, should be submitted in late 2014

Approach being chosen is a straight copy of Google's SPDY protocol

SPDY multiplexes many requests over a single connection, improving on HTTP 1.1 request pipelining and batching

Also enforces header compression

Largely keeps HTTP 1.1 semantics (methods, headers, status codes) and changes how they are sent over the wire

Browsers already support SPDY (well, not IE)

Questions?