An Application Architect who Cares about Energy Efficiency

Pat Helland has posted about his weekend at Foo Camp.

What is Foo Camp? Pat gives an entertaining summary.

A Weekend at Foo Camp

Well...  I was lucky enough to get invited to Foo Camp (which was last weekend) and I figure "What the heck!  Let's do it!".  

Foo Camp is held at the O'Reilly headquarters in Sebastopol, CA which is 1 to 2 hours (or more depending on traffic) north of San Francisco.  It is an invitation only event whose name stands for "Friends Of O'Reilly" and involves about 300 diverse and interesting individuals from different walks of the computer industry (and related industries).  Tim O'Reilly hosts it at their headquarters and supplies very nice buffet food, showers, rest rooms, and meeting rooms.  I was informed that the best way to enjoy the event is to camp there which means either pitching a tent on their lawn or finding an available space in a meeting room or hallway to throw a sleeping bag.  It was an option to get a hotel room in town.  Now... I haven't camped in about 20 years so this required some thinking...

I concluded I had two options:

  1. Stay in a hotel room and ensure I remained sober enough to drive at the end of the evening, OR
  2. Buy a tent at REI in Santa Rosa and then hit the high-end liquor store in Santa Rosa to buy enough whiskey to lubricate a serious subset of the 300 attendees.

Naturally, I chose the second option and bought six bottles (some of my favorites and others I hadn't tried). 

It turns out James Hamilton and Pat Helland are friends and are putting their heads together on energy-efficient software. In the photo, James is on the left, Pat is in the center, and Jesse Robbins is on the right.

[Photo: James Hamilton, Pat Helland, and Jesse Robbins]

But back to energy-efficient software. Here are Pat's comments on the sessions.

Sometimes, it was hard to select between the various cool sessions.   I remember the following ones:

  • Data center power -- James Hamilton (my friend from Microsoft) and Jeff Hammerbacher who leads the Facebook data team (but will be leaving soon).  Both James and Jeff were filled with information about running large and dynamic data centers.   The power issues for data centers have been on my mind the last few years and I find that James is a wonderful font of knowledge.   I most definitely love that he is at Microsoft and my friend... I plan to come pepper him with additional questions in the months to come.   Jeff, also, has tons of knowledge from supporting the data needs of Facebook as it has undergone its explosive growth.  This was a fun and invigorating discussion in which I met an attendee, Roger Magoulas who is a research director at O'Reilly.   I have a feeling that there will be opportunities for me to work with Roger, too.
  • Parallel Programming -- Kerry Hammil of Microsoft Research.   We had a fun discussion of the difficulties of getting applications (and, indeed, their libraries and OSes) to be parallel.  There were about 25 great and interesting people participating in this group and, not surprisingly, I participated, too.     This was such a lively discussion for me that it ended up in the hallway and we skipped the next session.

Here is an application architect who cares about data center power and parallel (multicore) programming. I hope we see more people like Pat show up.

Jesse Robbins in the above picture has an interesting background.

  • Technical Program Manager at Amazon.com
  • Manager - IT Operations at Amazon.com
  • Task Force Leader at World Shelters Task Force 1
  • Systems Engineer at Amazon.com
  • Firefighter/EMT (Intern) at Palo Alto Fire Department

Microsoft's James Hamilton Presents "Where Does Power go in Data Centers and How to get it Back?"

James Hamilton has a blog post about his attending O'Reilly's Foo Camp and his presentation on "Where does Power go in Data Centers and How to get it Back?"

The title for my session was Where Does the Power go in Data Centers and How to get it Back?  I didn’t show slides but much of what we covered is posted at: http://mvdirona.com/jrh/TalksAndPapers/JamesRH_DCPowerSavingsFooCamp08.ppt.  In the session, we talked through how contemporary large data centers work first looking at power distribution. We tracked the power from the feed to the substation at 115,000 volts through numerous conversions before arriving at the CPU at 1.2 volts. We then talked about power saving server design techniques.  And then the mechanical systems used to get the heat back out.  In each section we discussed what could be done to improve the design and how much could be saved.

Our conclusion from the session was that power savings of nearly 4x were both possible and affordable using only current technology.  For those who participated in the session, thanks for your contribution and for your help. It was fun.
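
To make the "numerous conversions" point concrete, here is a minimal sketch of how losses compound along a power distribution chain. The stage names and efficiency values are illustrative assumptions only; they are not taken from James's talk or slides. The idea is simply that each stage passes on a fraction of what it receives, so the fraction delivered is the product of the stage efficiencies.

```python
from math import prod

# Hypothetical per-stage efficiencies, chosen only for illustration.
# They are NOT figures from the talk or the slides.
STAGES = [
    ("substation transformer", 0.995),
    ("UPS", 0.94),
    ("power distribution unit", 0.98),
    ("server power supply", 0.90),
    ("on-board voltage regulator", 0.85),
]


def delivered_fraction(stages):
    """Fraction of the input power that survives every conversion stage."""
    return prod(efficiency for _, efficiency in stages)


def loss_per_stage(stages, input_watts):
    """Return a list of (stage name, watts lost in that stage), in order."""
    losses = []
    remaining = input_watts
    for name, efficiency in stages:
        output = remaining * efficiency
        losses.append((name, remaining - output))
        remaining = output
    return losses


if __name__ == "__main__":
    print(f"Delivered to the CPU: {delivered_fraction(STAGES):.1%} of input power")
    for name, lost in loss_per_stage(STAGES, 1000.0):
        print(f"{name}: {lost:.1f} W lost per 1000 W drawn")
```

Swapping in measured efficiencies for a real facility would show which stage is worth attacking first. This sketch covers only the distribution chain, not the cooling side of the discussion.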

James concludes that a power savings of nearly 4x is possible. If you are curious how he gets there, look at his slides. One of his ideas that flies in the face of high-density computing is Thin Slice Computing.


Microsoft Research Paper on SSD Performance and Design Tradeoffs

StorageMojo has a post about Microsoft Research’s paper on design tradeoffs for SSD performance.

Design Tradeoffs for SSD Performance

July 15th, 2008 by Robin Harris in Architecture, Future Tech, SSD/Flash Disk

A new Usenix paper looks at NAND flash SSD performance. From a team at Microsoft Research and the University of Wisconsin, including Ted Wobber who worked on last year’s A Design for High-Performance Flash Disks [see Flash chance for the StorageMojo take on that excellent paper - a post Ted was kind enough to review and comment on].

Design Tradeoffs for SSD Performance (by Nitin Agrawal, Vijayan Prabhakaran, Ted Wobber, John D. Davis, Mark Manasse and Rina Panigrahy) makes a deep dive in flash translation layer (FTL) issues. As the authors note, flash vendors keep their FTL designs secret, so the team developed a NAND flash simulator to look at how design choices affected performance.

What they found
They ran several workloads on their trace-based simulator, including TPC-C, Exchange and some file system benchmarks. They found several critical issues in SSD design.

  • Data placement Needed for wear leveling and load balancing.
  • Parallelism Single flash chips aren’t very fast so they need to work together.
  • Write ordering Small random writes are a killer.
  • Workload management You can optimize for sequential or random workloads, but managing both well is hard.
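
To illustrate why small random writes hurt, here is a toy model of write amplification. It assumes a naive block-mapped flash translation layer that must reprogram a whole block whenever any page in it changes, and that merges consecutive writes landing in the same block. This is a deliberately simplified assumption for illustration; it is not the paper's simulator, and the block size is a made-up parameter.

```python
PAGES_PER_BLOCK = 64  # assumed block size in pages, for illustration only


def physical_page_writes(logical_pages, pages_per_block=PAGES_PER_BLOCK):
    """Count pages programmed by a naive block-mapped FTL.

    Consecutive writes that fall in the same block are merged into one
    block rewrite, and each rewrite programs every page in the block.
    """
    rewrites = 0
    current_block = None
    for page in logical_pages:
        block = page // pages_per_block
        if block != current_block:
            rewrites += 1
            current_block = block
    return rewrites * pages_per_block


def write_amplification(logical_pages, pages_per_block=PAGES_PER_BLOCK):
    """Physical pages programmed per logical page written."""
    if not logical_pages:
        raise ValueError("need at least one logical write")
    physical = physical_page_writes(logical_pages, pages_per_block)
    return physical / len(logical_pages)


if __name__ == "__main__":
    sequential = list(range(128))      # fills two blocks back to back
    scattered = [0, 64, 128, 192]      # one page in each of four blocks
    print("sequential:", write_amplification(sequential))
    print("scattered: ", write_amplification(scattered))
```

In this model the sequential stream programs one physical page per logical page, while the scattered stream programs a full block for every single page written. Real controllers use finer-grained mapping and cleaning policies to soften this, which is where the design tradeoffs come in.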

As StorageMojo says in closing, you need to read the Microsoft Research paper to get a full understanding.

The StorageMojo take
This paper is too rich in detail to summarize well. If understanding SSD controller design is important there is no substitute for a careful read.

The net is that engineers have many options in configuring and managing flash devices inside a solid state disk. The interaction of these design choices with applications is likely to remain a fruitful area of study for years to come.

Expect to see many performance oddities as new solid state disk designs are released. This is a different world than disk drives. There is much innovation and much to learn.

A macro longer-term trade-off is the extent to which SSD vendors should attempt to alter operating system behavior to better match SSDs. In the short term designers must conform to today’s disk I/O oriented operating systems. In the long term however, there must be major opportunities to tweak operating systems to enhance solid-state disk performance.

For this reason SSDs may find their best short term market to be inside storage arrays where array vendors have complete control over the interface to the array software. This will be no small advantage as array vendors struggle to remain relevant in a world where high performance solid state disks have the potential to replace midsize arrays.

James Hamilton also has a post referring to Spansion’s Flash Memory announcement.


Microsoft Grabs Leadership Positioning in NetworkWorld Article

After 2 days at the National Data Center Energy Efficiency Workshop and Energy Star, I was looking for a way to summarize some of the issues covered for the 150 attendees.  NetworkWorld reported on the workshop, and did the work for me.  So, let me highlight some parts.

Good incentives boost data-center energy efficiency

By Nancy Gohring , IDG News Service , 07/09/2008

A Microsoft executive shared techniques the company has used, including new kinds of employee incentive programs and internally created automation tools, to reduce the energy consumption of its growing data centers.

The methods he described could help other companies that use or operate data centers reduce costs, said experts who also spoke at the data-center efficiency strategy conference put on by the U.S. Department of Energy and the Environmental Protection Agency in Redmond, Washington, on Tuesday.

While there are plenty of technology solutions for improving data-center energy efficiency, not many companies are using them, said Christian Belady, principal power and cooling architect at Microsoft. "It boils down to a behavioral problem, not necessarily a technology problem," he said.

Microsoft decided to change the incentives for workers as a way to encourage them to use the most energy-efficient techniques. Traditionally, the various business groups within the company were charged for using the company's data centers based on the amount of floor space required to stack the servers that their services used. That spurred a drive within the business units to minimize the space they used, often through the use of extremely dense servers. Those servers, however, sucked power and required more cooling, Belady said.

Now, Microsoft charges business units based on the amount of energy consumed by the servers that host their services. "We moved from cost as a function of space to cost being a function of power," he said.

That shift made individual business units conscious of the number of DIMMs (dual in-line memory modules) they had at their disposal, for example. "Now those DIMMs are costing you power, and you're getting a year-over-year chargeback for those DIMMs," he said. Such charges make the business units less likely to require more memory than their services actually need, he said.
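
The change in incentives is easy to see in a small sketch comparing the two chargeback models. Every name and rate here is a hypothetical assumption made for illustration; none of it describes Microsoft's actual billing.

```python
from dataclasses import dataclass

# Hypothetical rates, for illustration only.
RATE_PER_RACK_UNIT = 50.0   # dollars per rack unit per month
RATE_PER_KWH = 0.10         # dollars per kilowatt-hour
HOURS_PER_MONTH = 730


@dataclass
class Deployment:
    name: str
    rack_units: int     # space occupied in the data center
    avg_watts: float    # average measured power draw


def space_charge(deployment):
    """Monthly charge when cost is a function of space."""
    return deployment.rack_units * RATE_PER_RACK_UNIT


def power_charge(deployment):
    """Monthly charge when cost is a function of energy consumed."""
    kilowatt_hours = deployment.avg_watts / 1000 * HOURS_PER_MONTH
    return kilowatt_hours * RATE_PER_KWH


if __name__ == "__main__":
    dense = Deployment("dense servers", rack_units=10, avg_watts=8000.0)
    sparse = Deployment("lower-density servers", rack_units=20, avg_watts=5000.0)
    for d in (dense, sparse):
        print(f"{d.name}: space ${space_charge(d):.2f}, power ${power_charge(d):.2f}")
```

With these made-up numbers, the dense deployment is the cheaper one when billed by space and the more expensive one when billed by power, which is the behavioral flip Belady describes.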

Other industry speakers are quoted.

Incentives are also changing at utility companies in ways that can benefit enterprises. "As a facility manager my incentive isn't to sell you more electricity, but to give you the tools to be more efficient," said Francois Rongere, segment supervisor with PG&E's high technology energy-efficiency team. "My bonus is based on how much savings you have done in my territory."

Those energy savings often translate into real money from the utility. Ray Pfeifer, who works with the Silicon Valley Leadership Group, was recently involved in a series of experiments with companies to try to quantify how certain changes to their data centers affected energy usage. He said that many of the implementations were 100 percent funded by utilities, which often offer incentives to companies for investments that can cut their energy usage.

While that may be good news to the enterprise, the utility incentives only show how behind the curve businesses are in general, said Brill. "It's appalling to me that we have to have utilities offering incentives to do what's good business sense," he said.

Thanks to Nancy Gohring for writing a good article.


Working at Google vs. Microsoft

Here is an article about developers who have chosen Microsoft over Google, including some who left Microsoft for Google and then came back. I read it looking for hints as to why Google needs 200 employees for data centers vs. Microsoft’s 50. I would bet there are ex-Google data center employees working at Microsoft, but it is almost impossible to find them, let alone get them to share their insights.

Working at Google vs. Working at Microsoft

I have a theory that Google's big problem is that the company hasn't realized that it isn't a startup anymore

By: Dare Obasanjo

Jul. 2, 2008 09:15 AM

Dare Obasanjo's Blog

Recently I've been bumping into more and more people who've either left Google to come to Microsoft or got offers from both companies and picked Microsoft over Google. I believe this is part of a larger trend especially since I've seen lots of people who left the company for "greener pastures" return in the past year (at least 8 people I know personally have rejoined). However in this blog post I'll stick to talking about people who've chosen Microsoft over Google.

Interesting parts.

Google software business is divided between producing the "eye candy" - web properties that are designed to amuse and attract people - and the infrastructure required to support them. Some of the web properties are useful (some extremely useful - search), but most of them primarily help people waste time online (blogger, youtube, orkut, etc)

This orientation towards cool, but not necessarily useful or essential software really affects the way the software engineering is done. Everything is pretty much run by the engineering - PMs and testers are conspicuously absent from the process. While they do exist in theory, there are too few of them to matter.

On one hand, there are beneficial effects - it is easy to ship software quickly…On the other hand, I was using Google software - a lot of it - in the last year, and slick as it is, there's just too much of it that is regularly broken. It seems like every week 10% of all the features are broken in one or the other browser. And it's a different 10% every week - the old bugs are getting fixed, the new ones introduced. This across Blogger, Gmail, Google Docs, Maps, and more

The culture part is very important here - you can spend more time fixing bugs, you can introduce processes to improve things, but it is very, very hard to change the culture. And the culture at Google values "coolness" tremendously, and the quality of service not as much. At least in the places where I worked.

After reading this, maybe Google really does need 200 people per data center to fix all the problems.

The article closes with:

The fact that Google is having problems retaining employees isn't news, Fortune wrote an article about it just a few months ago. The technology press makes it seem like people are ditching Google for hot startups like FriendFeed and Facebook. However the truth is more nuanced than that. Now that Google is just another big software company, lots of people are comparing it to other big software companies like Microsoft and finding it lacking.
