Architecture of Internet Datacenters

How many of you would like to attend a course on the Architecture of Internet Data Centers? This course came out of the RAD Lab, the Berkeley group that wrote the "Above the Clouds" paper.


Well, in Fall 2007, UC Berkeley (my alma mater) offered the following course for graduate students.

CS 294-14: Architecture of Internet Datacenters (RADLab Research Seminar 2.0)

Instructor: Randy H. Katz
Time: MW 2:30-4:00 PM
Place: 310 Soda
Units: 3 (2-4, but you had better sign up for 3!)

Course Description

Internet Datacenters have recently emerged as a significant new computing platform, designed to provide high capacity processing for large numbers of web clients. Major web properties like Google have designed their own building-scale computer facilities, integrating processing, storage, internal and external networking, along with integral power and cooling infrastructures. The resulting datacenters typically deploy 100,000 to 1,000,000 computers within a single facility.

In this research seminar, we will read and discuss the very recent literature on the design and implementation of processor clusters, virtual machines, virtual storage, and datacenter networking organization. Architectural approaches to deal with failures, effective sharing of processing/storage/network resources, and efficient management of power across the systems stack will be considered. Some class meetings will be dedicated to meeting with and discussing issues with industrial leaders from Google, IBM, Cisco, and Network Appliance.

Here are the first two weeks.

Week 1: Course Organization, Overview, and Technology Trends

  • Monday, August 27
    1. [Randy] Randy H. Katz, “Internet-scale Computing: The Berkeley RADLab Perspective,” IWQoS 2007, Evanston, IL, (June 2007). [pdf]
    2. [Randy] Stephen Alan Herrod, VMware, “The Future of Virtualization Technology,” ISCA 2006. [pdf]
  • Wednesday, August 29
    1. [Randy] Raj Yavatkar, Intel, “Platforms Design Challenges with Many Cores,” HPCA-12, 2006. [pdf]
    2. [Randy] Renato Recio, IBM, “System IO Network Evolution: Closing the Requirement Gaps,” HPCA-12, 2006. [pdf]
    3. [Randy] Steve Kleiman, NetApp, “Trends in Managing Data at the Petabyte Scale,” FAST 2007, San Jose, CA, (February 2007). [pdf]
Week 2: Applications Software Infrastructure
  • Monday, September 3: Labor Day Holiday
  • Wednesday, September 5
    1. [Matei] S. Ghemawat, H. Gobioff, S.-T. Leung, “The Google File System,” Proc. SOSP’03, 2003. [pdf] [Notes].
    2. [Kuang] J. Dean, S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” Proc. OSDI’04, pages 137–150, (December 2004). [pdf] [Notes].
    3. [Michael] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, R. E. Gruber, “Bigtable: A Distributed Storage System for Structured Data,” Proc. OSDI'06, 2006. [pdf]
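The MapReduce paper in the reading list above describes a deliberately simple programming model: a Map function emits key/value pairs, the runtime groups them by key, and a Reduce function combines each group. A toy, single-process sketch of those three phases, using the paper's canonical word-count example (not Google's implementation, just an illustration of the model), might look like this:

```python
from collections import defaultdict

def map_phase(doc):
    # Emit (word, 1) for every word in a document, like the paper's Map function.
    for word in doc.split():
        yield (word, 1)

def shuffle(pairs):
    # Group intermediate values by key, as the MapReduce runtime does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Sum all counts for one key, like the paper's Reduce function.
    return (key, sum(values))

def word_count(docs):
    # Run map over every document, shuffle, then reduce each key group.
    pairs = [pair for doc in docs for pair in map_phase(doc)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

For example, `word_count(["to be or", "not to be"])` yields `{'to': 2, 'be': 2, 'or': 1, 'not': 1}`. The real system distributes the map and reduce tasks across thousands of machines and handles failures, which is exactly what the seminar readings cover.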

    1. [Randy] Intel and Sun White Papers on Multicore Architectures [Notes]
      • Intel, "Intel Multi-Core Processors: Making the Move to Quad-Core and Beyond." [pdf]
      • Intel, "Inside Intel Core Microarchitecture: Setting New Standards for Energy-Efficient Performance." [pdf]
      • Intel, "Preparing for Peta-scale." [pdf]
      • Harlan McGhan, "Niagara 2 Opens the Flood Gates," Microprocessor Report, 11/6/2006. [pdf]
    2. [Ari] L. A. Barroso, J. Dean, U. Hölzle, “Web Search for a Planet: The Google Cluster Architecture,” IEEE Micro, 23(2):22–28, March/April 2003. [pdf] [Notes]
    3. [Henry] L. A. Barroso, “The Price of Performance: An Economic Case for Chip Multiprocessing,” ACM Queue, 3(7), September 2005. [html] [pdf].

What I found most interesting were the student project proposals.

Student Project Proposal Presentations