One of my software buddies sent me a video link, pumped that the presentation discussed the power of immutable distributed systems. When I watched the presentation, I saw it was by ex-Microsoft software architect Pat Helland. In 2010, Pat moved to the Bing team to work on back-end infrastructure supporting the search environment. As Pat described it at the time:
Last Fall, I switched to work on Bing Infrastructure and have been very, very busy (and having a wonderful time). The projects I’m working on include COSMOS and Autopilot. COSMOS is a petabyte store (working towards being an exabyte store) which runs over tens of thousands of inexpensive computers. In addition to reliable storage, COSMOS supports Dryad-based computation with application development in SCOPE, which is a SQL-like language. Some public papers include: SCOPE and COSMOS, and Partitioning and Parallel Plans in SCOPE and COSMOS. The Autopilot team in OSD (Online Services Division which includes Bing) makes hardware selections for our ever-increasing bunch of servers, networking, systems support, automatic deployment and load balancing. See Autopilot. I have been having a blast working with the team in Bellevue and a team in Beijing with lots of talented people.
FYI, Pat now works as a software architect at Salesforce, and this video got me to reconnect with him through LinkedIn.
Pat is a guy who could definitely design a DCIM system. Below is the presentation my developer friend got pumped about. I watched it too, and I agree that Pat describes the ideas it takes to build a system for complex data environments.
Warning: this video can be hard to watch if you don't already think about software design and believe that immutability changes everything. Other great points are "normalization is for sissies" and "accountants don't have erasers."
It has been probably 5 years since I last chatted with Pat, and back then we discussed data centers. He was just getting started studying data centers, and he gave a presentation on green data centers in 2008.
Green Computing through Sharing
Reducing both Cost AND Carbon
Data centers consumed 1.5% of the total electricity in the US in 2006 and are on track to double as a percentage every five years. It is about 2% of the US total in 2008. Western Europe’s use is increasing at a slightly faster rate (from a slightly lower base percentage). The consumption of electricity within data centers is of significant financial and environmental importance.
Where the heck is all this power going? Why is the electrical load increasing so much? What can be done about it?
This talk will examine both traditional and emerging data center designs. We will start by examining how a data center is laid out, constructed, and managed. We will show two emerging trends: the change to designing data centers for the optimization of power and the emergence of new economies of scale in data centers which is contributing to the drive towards cloud computing. Microsoft is actively moving to compete in the space of cloud computing as we are seeing at the PDC (Professional Developers Conference) a few weeks before TechEd EMEA Developer.
Next, we will examine the sources of waste in the system today and examine why so many of our resources are underutilized. Because we are reluctant to share computing resources, they are left idle much of the time. Why is this currently the dominant choice? What can be done in the design of applications, systems, and data centers to make them more green (both carbon and cash)? What can developers do to make a difference?
It was a pleasure chatting with Pat 5 years ago, and I look forward to connecting with him again and discussing how immutability changes everything. :-)