Web Scalability
Review of a scaling web architecture
Darren Wurf
Scale-up vs scale-out
Scale-up: Buy bigger machines
Scale-out: Buy more machines
Scale-up is easier if you are lucky enough to be able to do it
resource sliders (CPU/memory)
Scale-up not always possible (resource contention, HA, c20k)
Scale-out must be designed into the product
HTTP: the stateless protocol
- A stateless architecture is easily scalable
- Cookies are used to maintain state between requests
- Sites with a login/password are typically stateful (session credentials)
"Stateless": the client makes a request; the web server responds and forgets
* Examples of state: shopping cart, member area, Facebook wall
HTTP: the stateless protocol
Requests could go to any web server
Client doesn't know or care which server handled the request
* Next request could go to a different server
Sessions and caching
- Multiple web application servers need to share session data
- Store authentication data in a central server (eg LDAP/DB)
- Cache session data on the web server (memcached)
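For a Django application, the combination "sessions stored centrally, cached in memcached" might look like the settings fragment below. This is a sketch, not a configuration taken from any real deployment: the memcached address is a placeholder, and the `PyMemcacheCache` backend assumes a recent Django release (older releases shipped different memcached backends).

```python
# settings.py (fragment) - a sketch; host and port are placeholders.

CACHES = {
    "default": {
        # Assumption: Django 3.2+ with the pymemcache client installed.
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
    }
}

# Write-through session cache: reads hit memcached first and fall back
# to the database, so any web server can pick up any session.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
```

With `cached_db`, losing the memcached instance costs performance but not sessions, because the database remains the authoritative copy.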
Other caching
- Client-side response caching: Last-Modified / ETag
- Server-side response caching: memcached
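Client-side caching with ETags can be sketched in a few lines: the server tags each response with a hash of its body, and answers 304 Not Modified when the client sends that tag back in If-None-Match. The function names here are made up for illustration, and a real framework would normally handle this for you.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Return an ETag: the quoted hex SHA-256 digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _strip_weak(tag: str) -> str:
    """Drop the W/ prefix so weak and strong tags compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, payload) for a GET request.

    if_none_match is the raw If-None-Match header value, or None.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [_strip_weak(t) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            # The client's copy is current: send headers only.
            return 304, headers, b""
    return 200, headers, body
```

The first request gets a 200 with the full body; a repeat request carrying the returned ETag gets an empty 304, which saves bandwidth but still costs a round trip.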
High availability
- Stateless architectures can also be made highly available
- Remove all single points of failure
Reference architecture
Rackspace Cloud reference architecture, from Rackspace's documentation (CC BY-NC-ND)
Web application tier can scale out; failures do not take down the system
Highly available load balancer and database are provided by Rackspace
Case Study: LatLonGO
Product developed by my employer
Utility GIS data on your tablet
* For field construction / repair crews
LatLonGO: Overview
- Render layers in corporate GIS
- Serve data (usually as deltas) to tablets in the field
- Office workers use web interface
Utilities want to be able to find assets in the field
Solves the problem of "secure, disconnected access"
* I don't care about the tablet aspect at all
LatLonGO: Web interface
- Started as a "preview" for uploaded layers
- Clients liked it
- Web interface must scale to 100s of concurrent users
Must be highly available
100s of concurrent users == 1000s of requests/sec
LatLonGO: Technology stack
- Nginx
- Django
- Some C++ magic
Exposed via ctypes
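For readers who have not used ctypes, the pattern looks roughly like this. The actual LatLonGO library is not shown in this talk, so the sketch calls `strlen` from the C standard library as a stand-in; a C++ library would need its functions exported with `extern "C"` to be callable the same way. It assumes a POSIX system.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library (stand-in for a native library).
# On POSIX, CDLL(None) also works: it exposes symbols already loaded
# into the process, so a failed lookup is not fatal there.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Call the native strlen on a bytes object."""
    return libc.strlen(data)
```

Declaring `argtypes` and `restype` is optional but worthwhile: without them ctypes assumes `int` return values, which silently truncates pointers and 64-bit sizes.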
LatLonGO: Bottleneck
Django
tiles
Django can be easily scaled to 100s of req/s
LatLonGO: Scaling up
- Expose C++ code as FastCGI
- Cache sessions in memcached
- Cache tiles in memcached (optional)
Problem: tile access is not authenticated
The nginx auth_request module delegates authentication to a subrequest
With some configuration trickery, the authentication check can be delegated to memcached
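The decision that auth_request delegates can be sketched as a small function. nginx allows the original request when the subrequest returns a 2xx status and denies it on 401 or 403. In this sketch a plain dict stands in for memcached, and the `session:` key prefix is a made-up convention, not the one LatLonGO uses; `sessionid` is Django's default session cookie name.

```python
from http.cookies import SimpleCookie

SESSION_COOKIE = "sessionid"  # Django's default session cookie name


def auth_status(cookie_header, session_cache):
    """Return the status an auth subrequest would answer with.

    cookie_header: raw Cookie header value, or None if absent.
    session_cache: mapping standing in for memcached (assumption).
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(SESSION_COOKIE)
    if morsel is None:
        return 401  # no session cookie: deny
    # Hypothetical key layout: "session:<session id>".
    key = "session:" + morsel.value
    # A cache hit means a logged-in session exists: allow (2xx).
    return 200 if key in session_cache else 401
```

Because the check is only a key lookup, it can run without touching Django at all, which is what keeps authenticated tile requests cheap.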
LatLonGO: Scaling out
- HTTP is stateless
- Django stores session details centrally (DB)
- FastCGI/C++ is stateless
- memcached is stateless
- Copy-paste servers!
We can use a load balancer or round-robin DNS to distribute load
For consistent updates, use an IP-hashing load balancer
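IP hashing means the load balancer derives the backend from the client's address, so a given client keeps landing on the same server. A minimal sketch of the idea follows; the server names are invented, and real load balancers use their own hash functions.

```python
import hashlib


def pick_server(client_ip: str, servers: list) -> str:
    """Map a client IP to one backend, deterministically."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical backend pool.
BACKENDS = ["web1.example.internal", "web2.example.internal",
            "web3.example.internal"]
```

Calling `pick_server("203.0.113.7", BACKENDS)` twice returns the same backend both times. The trade-off of this simple modulo scheme is that adding or removing a server remaps most clients.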
LatLonGO: High availability
- Load balancer
- Django database
- LDAP
- Layer/tile storage