Datacenters

Lecture Notes for CS 142
Fall 2010
John Ousterhout

  • Readings for this topic:
    • YouTube video on Google datacenter: http://www.youtube.com/watch?v=bs3Et540-_s
  • Evolution of datacenters:
    • 1960s, 1970s: a few very large time-shared computers.
    • 1980s, 1990s: heterogeneous collections of many smaller machines.
    • Today and into the future:
      • Large numbers of nearly identical machines.
      • Individual applications use thousands of machines simultaneously.
  • Typical specs for a new datacenter today:
    • 50,000-200,000 machines
    • 15-30 MW power
    • $0.5B construction cost
    • Onsite staff (security, administration): 15
  • Typical organization of a datacenter:
    • Individual machine:
      • 4-8 cores
      • DRAM: 4-16 GB @ 100ns access time
      • Disk: 2 TB @ 10ms access time
    • Rack:
      • 50 machines
      • DRAM: 200-800 GB total @ 300 microseconds access time (remote access over the network)
      • Disk: 100 TB total @ 10ms access time
    • Row/cluster:
      • 30+ racks (1500 machines, 6000-12000 cores)
      • DRAM: 6-24 TB total @ 500 microseconds access time (remote access over the network)
      • Disk: 3 PB total @ 10ms access time
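
The rack and row figures above follow from the per-machine figures by multiplication. A minimal sketch of that arithmetic, using only the numbers in these notes (the function and constant names are made up for illustration):

```python
# Per-machine figures from the notes (low and high ends of each range).
MACHINES_PER_RACK = 50
RACKS_PER_ROW = 30
CORES_PER_MACHINE = (4, 8)
DRAM_GB_PER_MACHINE = (4, 16)
DISK_TB_PER_MACHINE = 2


def aggregate(machines):
    """Return total cores, DRAM (GB), and disk (TB) for a group of machines."""
    return {
        "machines": machines,
        "cores": tuple(c * machines for c in CORES_PER_MACHINE),
        "dram_gb": tuple(d * machines for d in DRAM_GB_PER_MACHINE),
        "disk_tb": DISK_TB_PER_MACHINE * machines,
    }


rack = aggregate(MACHINES_PER_RACK)
row = aggregate(MACHINES_PER_RACK * RACKS_PER_ROW)

print("rack:", rack)
print("row: ", row)
```

Capacity scales linearly with the number of machines, but access time does not: DRAM on another machine is reached through the network, which is why the rack and row DRAM latencies are thousands of times the local 100ns.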
  • New modular structure for datacenters: cargo containers.
  • Networking within a datacenter:
    • Hierarchically organized:
      • Top-of-rack switch
      • End-of-row router
      • Core router
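    • Ideal: "full bisection bandwidth" (any half of the machines can communicate with the other half at full network speed).
    • In practice today: 100x oversubscription (the upper levels of the hierarchy can carry only about 1/100 of the traffic the machines could generate).
    • Oversubscription assumes that applications have locality (most traffic stays within a rack), but this is hard to achieve in practice.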
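
To make the oversubscription figure concrete, here is a rough sketch of the arithmetic. The 50 machines per rack and the 100x ratio come from these notes; the 1 Gb/s host links and 10 Gb/s rack uplink are hypothetical values chosen only for illustration, as are the function names.

```python
def oversubscription(hosts, host_gbps, uplink_gbps):
    """Ratio of the bandwidth the hosts could offer to the uplink capacity above them."""
    return (hosts * host_gbps) / uplink_gbps


def worst_case_host_gbps(host_gbps, ratio):
    """Per-host bandwidth if every host sends across the oversubscribed link at once."""
    return host_gbps / ratio


# Hypothetical rack: 50 machines (from the notes) with assumed 1 Gb/s links,
# sharing an assumed 10 Gb/s uplink from the top-of-rack switch.
tor_ratio = oversubscription(hosts=50, host_gbps=1.0, uplink_gbps=10.0)

# With the 100x end-to-end figure from the notes and the same assumed 1 Gb/s link.
per_host = worst_case_host_gbps(host_gbps=1.0, ratio=100)

print(f"top-of-rack oversubscription: {tor_ratio:.0f}x")
print(f"worst-case cross-datacenter bandwidth per host: {per_host * 1000:.0f} Mb/s")
```

Under these assumptions a machine that can send 1 Gb/s within its rack gets only about 10 Mb/s when all machines talk across the core at once, which is why the design depends on locality.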
  • Power Usage Effectiveness (PUE):
    • Ratio of (Total Facility Power) / (Power Used by Servers and Network Equipment)
    • Typical ratios: 1.7-2.0.
    • Best-known (Google): 1.15
    • Anything above 1.0 is overhead: power lost in distribution or spent on cooling.
    • Power is about 25% of monthly operating cost
    • Locate new datacenters near cheap power?
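
The PUE ratios above translate directly into the fraction of facility power that never reaches a server. A small sketch of that conversion (the function name is illustrative; the PUE values are the ones quoted in these notes):

```python
def overhead_fraction(pue):
    """Fraction of total facility power that is not used by servers or network gear.

    PUE = total / it_power, so it_power / total = 1 / PUE and the rest is overhead.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 - 1.0 / pue


for pue in (2.0, 1.7, 1.15):
    print(f"PUE {pue}: {overhead_fraction(pue):.0%} of facility power is overhead")
```

At a PUE of 2.0 half of the facility's power is overhead; at 1.15 the overhead drops to roughly 13%.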
  • Fault-tolerance:
    • At the scale of new datacenters, things are breaking constantly.
    • Every aspect of the datacenter must be able to tolerate failures.
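
A back-of-the-envelope sketch of why failures are constant at this scale. The machine counts come from these notes; the 3-year mean time between failures is a hypothetical value chosen only for illustration, and the model assumes independent failures at a constant rate.

```python
DAYS_PER_YEAR = 365

# Hypothetical value, not from the notes: each machine fails once every 3 years on average.
ASSUMED_MTBF_YEARS = 3


def expected_failures_per_day(machines, mtbf_years):
    """Expected machine failures per day, assuming independent failures
    at a constant rate of one failure per machine every mtbf_years."""
    return machines / (mtbf_years * DAYS_PER_YEAR)


for machines in (50_000, 200_000):
    rate = expected_failures_per_day(machines, ASSUMED_MTBF_YEARS)
    print(f"{machines} machines: about {rate:.0f} failures per day")
```

Even if each machine is individually reliable, tens of thousands of them produce dozens of failures every day, so recovery has to be automatic rather than an exceptional event.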