Large-Scale Web Applications

Lecture Notes for CS 142
Spring 2012
John Ousterhout

  • Additional reading for this topic: none.
  • Scale of Web applications: roughly 1000x larger than anything previously built.
  • Load-balancing routers: replicate Web front ends.
    • DNS (Domain Name System) load balancing:
      • Specify multiple targets for a given name
      • DNS servers rotate among those targets
    • HTTP redirection (HotMail, now LiveMail):
      • Front-end machine accepts initial connections
      • Redirects them among an array of back-end machines.
    • Load-balancing switch ("Layer 4-7 Switch"):
      • All incoming packets pass through one switch, which dispatches each new connection to one of many servers. Once a TCP connection is established, the load balancer sends all packets for that connection to the same server.
      • In some cases the switches are smart enough to inspect session cookies, so that the same session always goes to the same server.
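The two dispatch policies above (rotating among targets, and keeping a session on one server) can be sketched roughly as follows. This is a minimal illustration, not how any particular DNS server or switch is implemented: the server addresses, the function names, and the choice of hashing the session cookie are all assumptions made for the example.

```python
import hashlib
from itertools import cycle

# Hypothetical back-end addresses, for illustration only.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Endless rotation over the targets, similar in spirit to DNS rotation.
_rotation = cycle(SERVERS)


def pick_round_robin():
    """Return the next server in the rotation."""
    return next(_rotation)


def pick_by_session(session_id, servers=SERVERS):
    """Map a session cookie value to a server so the same session
    always lands on the same server (assumed policy: hash the cookie)."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note that simple modulo hashing reassigns many sessions whenever the server list changes, which is one reason session data has to be able to move between servers.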
  • How to handle session data?
    • Different requests may go to different servers
    • Individual servers may crash
    • Session data must be able to move from server to server as needed.
    • Solution #1: keep all session data in shared storage:
      • File system
      • Database
      • May be expensive to retrieve for each request
    • Solution #2: keep session data in cookies
      • No server storage required
      • Cookies limit the amount of data that can be stored
    • Solution #3: cache session data in last server that used it
      • Store server map in shared storage
      • If a future request goes to a different server, use the map to find the server holding the session data, then retrieve the data from that server.
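Solution #3 can be sketched as a toy model in which plain dictionaries stand in for the shared server map and for each server's in-memory session cache. The class and method names are hypothetical, and a real system would also need locking and a network transfer between servers; this only shows the lookup-and-migrate logic.

```python
class Cluster:
    """Toy model: a shared server map plus per-server session caches."""

    def __init__(self, server_names):
        # Shared storage: session id -> name of the server holding its data.
        self.server_map = {}
        # Local caches: server name -> {session id -> session data}.
        self.caches = {name: {} for name in server_names}

    def get_session(self, server, session_id):
        """Return the session data on `server`, fetching it from the
        previous holder if the request arrived at a different server."""
        local = self.caches[server]
        if session_id in local:
            return local[session_id]
        owner = self.server_map.get(session_id)
        if owner is not None and owner != server:
            # Retrieve the data from the previous server; it gives up its copy.
            data = self.caches[owner].pop(session_id, {})
        else:
            # New session, or the recorded owner no longer has the data.
            data = {}
        local[session_id] = data
        self.server_map[session_id] = server
        return data
```

For example, a session created through `get_session("a", "s1")` and later requested through `get_session("b", "s1")` ends up cached on server `b`, with the map updated to match.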
  • Scaling the storage system:
    • Almost all Web applications start off using relational databases.
    • A single database instance doesn't scale very far.
    • Applications must partition data among multiple independent databases, which adds complexity.
      • Example: Facebook had 4000 MySQL servers by 2009
    • Memcache: main-memory caching system
      • Key-value store (both keys and values are arbitrary blobs)
      • Used to cache results of recent database queries
      • Much faster than databases: 500-microsecond access time vs. tens of milliseconds
      • Problems:
        • Writes must still go to the DBMS, so no performance improvement for them
        • Must manage consistency in software (e.g., flush relevant memcache data when the database is modified)
      • Example: Facebook had 2000 memcache servers by 2009
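The memcache usage pattern described above (cache query results, send every write to the DBMS, flush the cached entry on modification) can be sketched as below. Dictionaries stand in for both memcache and the database, and `CachedStore` and its methods are hypothetical names; this is not the memcached client API.

```python
class CachedStore:
    """Toy cache in front of a database, with flush-on-write."""

    def __init__(self):
        self.db = {}        # stand-in for the DBMS
        self.cache = {}     # stand-in for memcache
        self.db_reads = 0   # counts how often a read had to hit the database

    def read(self, key):
        # Fast path: serve the value from the cache if it is there.
        if key in self.cache:
            return self.cache[key]
        # Slow path: query the database, then remember the result.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes always go to the database, so the cache does not speed them up.
        self.db[key] = value
        # Flush the cached copy so later reads do not see stale data.
        self.cache.pop(key, None)
```

Repeated reads of the same key touch the database only once until the next write to that key, which is where the speedup comes from.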
  • Because of scalability problems, we are seeing many new approaches to storage:
    • RAMCloud: new storage system under development in a research project here at Stanford:
      • Store all data in DRAM permanently
      • Aggregate thousands of servers in a datacenter
      • Use disk to back up data for high durability and availability
      • 32-256 GB per server
      • 100-500 TB per system
      • Very high performance:
        • 5-10 microsecond access time
        • 1 million operations/second/server
  • Scaling issues make it difficult to create new Web applications:
    • Initially, can't afford expensive systems for managing large scale.
    • But an application can suddenly become very popular ("flash crowd"); this can be disastrous if the application can't scale quickly.
    • Can take weeks or months to buy and install new servers.
    • Must become expert in datacenter management.
    • Each 10x growth in application scale typically requires new application-specific techniques.
  • Cloud computing:
    • Separate scalability issues from application development.
    • Specialized providers offer scalable infrastructure.
    • Just pay for what you need.
  • Example #1: Amazon Web Services
    • Elastic Compute Cloud (EC2): rent CPUs in an Amazon datacenter for < $0.10/CPU/hour
    • Scale up and down by hundreds of CPUs almost instantly
    • Simple Storage Service (S3): stores blobs of data inexpensively (about $0.10/GB/month).
    • AWS provides low-level facilities; users still have to worry about various management issues ("how do I know it's time to allocate more CPUs?")
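One simple way to answer "how do I know it's time to allocate more CPUs?" is a utilization target: keep enough servers that average utilization stays near a chosen level. The sketch below is only an illustration of that idea. The function name, the 60% target, and the use of integer percentages are assumptions, and nothing here calls an AWS API.

```python
def desired_servers(current, utilization_pct, target_pct=60, minimum=1):
    """Return how many servers would bring average utilization to about
    target_pct, given `current` servers running at utilization_pct."""
    if current < 1:
        raise ValueError("current must be at least 1")
    if utilization_pct < 0 or target_pct <= 0:
        raise ValueError("percentages must be non-negative, target positive")
    # Total load in "server-percent" units, divided by the target, rounded up.
    total_load = current * utilization_pct
    needed = -(-total_load // target_pct)  # integer ceiling division
    return max(minimum, needed)
```

With these assumed numbers, 10 servers at 90% utilization would grow to 15, and 10 servers at 30% would shrink to 5.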
  • Example #2: Google AppEngine
    • Much higher level interface:
      • You provide pieces of Python or Java code and the URLs associated with each piece.
      • Google does the rest:
        • Allocate machines to run your code
        • Arrange for name mappings so that HTTP requests find their way to your code
        • Scale machine allocations up and down automatically as load changes
      • AppEngine also includes a scalable storage system
    • More constrained environment
      • Must use Python or Java
      • Must use specialized Google storage system
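The idea of pairing URLs with pieces of code can be sketched as a small dispatch table. This is a generic illustration in plain Python, not the AppEngine API: the handler functions, the `ROUTES` table, and `dispatch` are all hypothetical names.

```python
def home(path):
    """Hypothetical handler for the front page."""
    return "Welcome"


def profile(path):
    """Hypothetical handler for a profile page."""
    return "Profile page"


# The developer supplies this mapping from URL paths to code.
ROUTES = {
    "/": home,
    "/profile": profile,
}


def dispatch(path):
    """Find the handler registered for `path` and run it.
    Returns a (status code, body) pair."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, "Not Found"
    return 200, handler(path)
```

In a hosted environment, the provider would take over everything around this table: choosing machines, routing requests to them, and adding or removing machines as load changes.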
  • In the future we are going to see more systems like AWS and AppEngine, with more and more convenient high-level interfaces.