Large-Scale Web Applications

Lecture Notes for CS 142
Spring 2012
John Ousterhout

Additional reading for this topic: none.
Scale of Web applications: 1000x anything previously built.
Load-balancing routers: replicate Web front ends.
- DNS (Domain Name System) load balancing:
  - Specify multiple targets for a given name
  - DNS servers rotate among those targets
- HTTP redirection (HotMail, now LiveMail):
  - Front-end machine accepts initial connections
  - Redirects them among an array of back-end machines.
- Load-balancing switch ("Layer 4-7 Switch"):
  - All incoming packets pass through one switch, which dispatches them to one of many servers. Once a TCP connection is established, the load balancer sends all packets for that connection to the same server.
  - In some cases the switches are smart enough to inspect session cookies, so that the same session always goes to the same server.
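A minimal sketch of two of the dispatch policies above: rotation among targets (as in DNS load balancing) and sticky dispatch keyed on a session cookie. The class names, addresses, and the hash-based choice of server are assumptions made for illustration; real DNS servers and Layer 4-7 switches implement these policies in their own ways.

```python
import hashlib
import itertools


class RoundRobinBalancer:
    """Rotates among a fixed list of targets, one per lookup."""

    def __init__(self, targets):
        self._cycle = itertools.cycle(list(targets))

    def pick(self):
        return next(self._cycle)


class StickyBalancer:
    """Maps each session ID to one server, so a session stays put."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, session_id):
        # Hash the session ID so the same ID always yields the same index.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


# Hypothetical addresses and server names.
rr = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first_four = [rr.pick() for _ in range(4)]

sticky = StickyBalancer(["web1", "web2", "web3"])
same_server_twice = sticky.pick("session-abc") == sticky.pick("session-abc")
```

Note that the simple modulo mapping reassigns most sessions whenever the server list changes, which is one reason the session-data question below matters.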
How to handle session data?
- Different requests may go to different servers
- Individual servers may crash
- Session data needs to move from server to server as necessary.
- Solution #1: keep all session data in shared storage:
  - File system
  - Database
  - May be expensive to retrieve for each request
- Solution #2: keep session data in cookies
  - No server storage required
  - Cookies limit the amount of data that can be stored
- Solution #3: cache session data in last server that used it
  - Store server map in shared storage
  - If a future request goes to a different server, use the map to find the server holding the session data and retrieve the data from that server.
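Solution #3 can be sketched roughly as follows. This is a single-process simulation under stated assumptions: the `Cluster` and `Server` classes are hypothetical, a plain dictionary stands in for the shared storage that holds the server map, and a direct method call stands in for the network transfer between servers.

```python
class Server:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.local_sessions = {}  # session_id -> session data cached here

    def handle(self, session_id):
        """Return the session data, fetching it from the previous server if needed."""
        if session_id not in self.local_sessions:
            owner_name = self.cluster.server_map.get(session_id)
            if owner_name is not None and owner_name != self.name:
                # Another server used this session last: take its copy.
                owner = self.cluster.servers[owner_name]
                data = owner.local_sessions.pop(session_id, {})
            else:
                # New session, or no cached copy exists anywhere.
                data = {}
            self.local_sessions[session_id] = data
            # Record in the shared map that this server now holds the session.
            self.cluster.server_map[session_id] = self.name
        return self.local_sessions[session_id]


class Cluster:
    def __init__(self, names):
        self.server_map = {}  # stands in for shared storage
        self.servers = {name: Server(name, self) for name in names}


cluster = Cluster(["a", "b"])
cluster.servers["a"].handle("s1")["user"] = "alice"  # first request lands on a
data_on_b = cluster.servers["b"].handle("s1")        # next request lands on b
```

The sketch does not address the crash case: if the server holding the only copy of a session fails, that session data is lost unless it is also stored somewhere else.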
Scaling the storage system:
- Almost all Web applications start off using relational databases.
- A single database instance doesn't scale very far.
- Applications must partition data among multiple independent databases, which adds complexity.
  - Example: Facebook had 4000 MySQL servers by 2009
- Memcache: main-memory caching system
  - Key-value store (both keys and values are arbitrary blobs)
  - Used to cache results of recent database queries
  - Much faster than databases: 500-microsecond access time, vs. tens of milliseconds
  - Problems:
    - Writes must still go to the DBMS, so no performance improvement for them
    - Must manage consistency in software (e.g., flush relevant memcache data when database gets modified)
  - Example: Facebook had 2000 memcache servers by 2009
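The caching pattern and its consistency problem can be sketched as below. This is an illustration only: two dictionaries stand in for the DBMS and for memcache, and the `CachedStore` class is hypothetical rather than part of any memcache client library.

```python
class CachedStore:
    def __init__(self):
        self.db = {}       # stands in for the DBMS
        self.cache = {}    # stands in for memcache
        self.db_reads = 0  # counts how often a read falls through to the DBMS

    def read(self, key):
        # Fast path: the value was cached by an earlier read.
        if key in self.cache:
            return self.cache[key]
        # Slow path: query the database, then cache the result.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes always go to the DBMS, so they see no speedup.
        self.db[key] = value
        # Flush the stale cache entry so later reads see the new value.
        self.cache.pop(key, None)


store = CachedStore()
store.write("user:1", "alice")
first = store.read("user:1")    # misses the cache, reads the database
second = store.read("user:1")   # served from the cache
reads_before_update = store.db_reads
store.write("user:1", "bob")    # invalidates the cached entry
third = store.read("user:1")    # misses again and sees the new value
```

If `write` skipped the flush, the third read would return the stale cached value; that is the software-managed consistency burden noted above.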
Because of scalability problems, we are seeing many new approaches to storage:
- RAMCloud: new storage system under development in a research project here at Stanford:
  - Store all data in DRAM permanently
  - Aggregate thousands of servers in a datacenter
  - Use disk to back up data for high durability and availability
  - 32-256 GB per server
  - 100-500 TB per system
  - Very high performance:
    - 5-10 microsecond access time
    - 1 million operations/second/server
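A back-of-envelope check of how the per-server and per-system figures above relate. The specific pairings (64 GB servers for 100 TB, 256 GB servers for 500 TB) are assumptions chosen from within the quoted ranges, 1 TB is taken as 1000 GB, and any space overhead is ignored.

```python
def servers_needed(total_tb, per_server_gb):
    """Smallest number of servers whose combined DRAM holds total_tb."""
    total_gb = total_tb * 1000
    return -(-total_gb // per_server_gb)  # integer ceiling division


def aggregate_ops_per_second(servers, ops_per_server=1_000_000):
    """Total throughput if every server sustains the per-server rate."""
    return servers * ops_per_server


small = servers_needed(100, 64)    # 100 TB on 64 GB servers
large = servers_needed(500, 256)   # 500 TB on 256 GB servers
small_ops = aggregate_ops_per_second(small)
```

Both cases come out between 1,500 and 2,000 servers, which is consistent with "thousands of servers in a datacenter".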
Scaling issues make it difficult to create new Web applications:
- Initially, can't afford expensive systems for managing large scale.
- But an application can suddenly become very popular ("flash crowd"); this can be disastrous if the application can't scale quickly.
- Can take weeks or months to buy and install new servers.
- Must become expert in datacenter management.
- Each 10x growth in application scale typically requires new application-specific techniques.

Cloud computing:
- Separate scalability issues from application development.
- Specialized providers offer scalable infrastructure.
- Just pay for what you need.
Example #1: Amazon Web Services
- Elastic Compute Cloud (EC2): rent CPUs in an Amazon datacenter for < $0.10/CPU/hour
- Scale up and down by hundreds of CPUs almost instantly
- Simple Storage Service (S3): stores blobs of data inexpensively (about $0.10/GB/month).
- AWS provides low-level facilities; users still have to worry about various management issues ("how do I know it's time to allocate more CPUs?")
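One simple answer to "how do I know it's time to allocate more CPUs?" is a utilization target: size the fleet so that average CPU load sits near a chosen percentage. The sketch below is a hypothetical policy, not an AWS API; the function names, the 60% target, and the limits are assumptions, and the price of 10 cents per CPU-hour is the upper bound quoted above.

```python
def desired_instances(current, cpu_percent, target_percent=60,
                      minimum=1, maximum=100):
    """Instance count that would bring average CPU load near the target."""
    # Total work is proportional to current * cpu_percent; divide by the
    # target load per instance and round up (integer ceiling division).
    needed = -(-(current * cpu_percent) // target_percent)
    return max(minimum, min(maximum, needed))


def hourly_cost_cents(instances, cents_per_cpu_hour=10):
    """Hourly rental cost, assuming one CPU per instance."""
    return instances * cents_per_cpu_hour


scale_up = desired_instances(current=10, cpu_percent=90)    # overloaded
scale_down = desired_instances(current=10, cpu_percent=30)  # mostly idle
capped = desired_instances(current=80, cpu_percent=95)      # hits the maximum
cost_after_scale_up = hourly_cost_cents(scale_up)
```

A real policy would also need to smooth out short load spikes and allow for the time new instances take to start.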
Example #2: Google AppEngine
- Much higher level interface:
  - You provide pieces of Python or Java code and the URLs associated with each piece of code.
  - Google does the rest:
    - Allocate machines to run your code
    - Arrange for name mappings so that HTTP requests find their way to your code
    - Scale machine allocations up and down automatically as load changes
  - AppEngine also includes a scalable storage system
- More constrained environment
  - Must use Python or Java
  - Must use specialized Google storage system
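The idea of "pieces of code plus the URLs associated with each piece" can be illustrated with a generic WSGI-style application. This is not the AppEngine API: the `ROUTES` table, the handler functions, and the `call` helper are hypothetical names, and the sketch shows only the URL-to-code mapping the developer supplies, with the platform assumed to handle everything else.

```python
def home(environ):
    return "Welcome"


def profile(environ):
    return "Profile page"


# The part the developer provides: each URL and the code that serves it.
ROUTES = {"/": home, "/profile": profile}


def application(environ, start_response):
    """WSGI entry point: look up the handler for the requested path."""
    handler = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    body = handler(environ).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call(path):
    """Invoke the application directly, without a web server."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application({"PATH_INFO": path}, start_response))
    return captured["status"], body


profile_result = call("/profile")
missing_result = call("/nope")
```

In a hosted environment the platform, not the developer, decides how many machines run `application` and how requests reach them.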

In the future we are going to see more systems like AWS and AppEngine, with more and more convenient high-level interfaces.

CS 142: Web Applications (Spring 2012)
