Why is it important for Google and others to uncloak? Star Trek's Gene Roddenberry provides a view on human nature.
Gene Roddenberry said in various interviews that "our heroes don't sneak around", meaning that the Federation made a conscious decision not to develop cloaking technology.
We have all read about Google's PUE data center announcement, and I was waiting for the news to die down. Then Google's PR group offered me the chance to discuss their PUE announcements in more detail. Being a curious guy, I said sure. One-on-one discussions are always useful.
Well, they must have really wanted something more to be written, because they set up a meeting with Google's Sr. VP of Operations, Urs Hoelzle, to discuss the details of Google's data center PUE.
Senior Vice President, Operations & Google Fellow
Urs Hölzle served as the company's first vice president of engineering and led the development of Google's technical infrastructure. His current responsibilities include the design and operation of the servers, networks and datacenters that power Google.
In 1996, Urs received a CAREER award from the National Science Foundation for his work on high-performance implementations of object-oriented languages. He was also a leading contributor to DARPA's National Compiler Infrastructure project. Urs has served on program committees for major conferences in the field of programming language implementation, and is the author of numerous scientific papers and U.S. patents.
At the start of the conversation, Urs expressed his concern that his group's PUE calculations be credible and not be viewed as a marketing exercise. As Google's #10 employee, Urs can hopefully say what he thinks, and he knows marketing hype hurts credibility in the long term.
So, to test his premise of wanting to be credible, I told him it doesn't make sense that Google said it couldn't accurately measure data centers under 5 MW.
It is worth noting that we only show data for facilities with an actual IT load above 5MW, to eliminate any inaccuracies that can occur when measuring small values. This section is aimed at data center experts, but we have tried to make it accessible to a general technical audience as well.
Urs said the 5 MW figure was a mistake in how it was communicated. There are no Google-built data centers below 5 MW. The data centers under 5 MW are all in colocation facilities, and Google has the same difficulty we all have in getting PUE numbers, let alone accurate ones, from colocation facilities. Urs also brought up an example of a measurement issue in small facilities: concrete floors can act as a heat sink, contributing to inaccurate measurements. Add issues like this to the unknowns of a colocation facility's instrumentation, and the accuracy Google has in the data centers it built is far better than in its colocation facilities. Yes, he does have high PUE numbers in his colocation facilities, and as the leases expire the plan is to move the capacity into Google-built data centers. What percentage is in colocation facilities is a question I forgot to ask, but I won't hold my breath for an answer to that question.
Urs made the point of providing one year of PUE data as shown below. The PUE performance is across a range of environmental and load conditions, measured year round. As expected the PUE #'s are higher during the summer and lower in the winter given Google's use of economizers.
But what I find more credible is a chart like the one below, which Microsoft published in their PUE blog entry. So for any of you sharing PUE data, please provide more data. PUE is a dynamic number and is expected to fluctuate over time.
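To illustrate why a single PUE number hides a lot, here is a minimal Python sketch that computes PUE per period and then two different yearly summaries. The monthly readings are made-up illustrative values, not data from Google or Microsoft, and the names `readings`, `simple_mean`, and `energy_weighted` are my own.

```python
# Hypothetical energy readings in MWh: (month, total facility energy, IT energy).
# These numbers are invented for illustration only.
readings = [
    ("Jan", 1100.0, 1000.0),
    ("Apr", 1150.0, 1000.0),
    ("Jul", 1560.0, 1300.0),  # hotter month, and a heavier IT load
    ("Oct", 1140.0, 1000.0),
]

# PUE for each period = total facility energy / IT equipment energy.
monthly = {month: total / it for month, total, it in readings}

# Summary 1: plain average of the monthly PUE values.
simple_mean = sum(monthly.values()) / len(monthly)

# Summary 2: energy-weighted PUE, total facility energy over total IT energy.
energy_weighted = sum(t for _, t, _ in readings) / sum(i for _, _, i in readings)

for month, value in monthly.items():
    print(f"{month}: PUE {value:.3f}")
print(f"Simple mean of monthly PUE: {simple_mean:.4f}")
print(f"Energy-weighted PUE:        {energy_weighted:.4f}")
```

With these made-up readings the two summaries differ, because the month with the highest PUE also carries the most load. That is one reason a published chart over time tells you more than a single average.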
Next we discussed Google's PUE of 1.15, which many have questioned. Urs explained that this is not possible in a general-purpose data center; this PUE was achieved by designing the facility specifically for the Google HW in it. He said 0.05 of the PUE savings came from designing the data center specifically for Google's server design. As an example, Google designed the server fans to remove heat efficiently and not waste energy moving excess air. Urs pointed out that many server designs waste energy this way, and avoiding that waste is important to Google.
It is ironic that if you chose to remove all fans from the server HW and make them part of the cooling infrastructure, your PUE would go up. Or can you count these fans as part of the IT load, since they are a replacement for server HW? This is another detail of Google's approach I am curious to understand.
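The fan irony is easy to see with arithmetic. This is a small Python sketch using the standard definition of PUE (total facility power divided by IT equipment power). The kW figures are hypothetical, chosen only so the starting point lands on 1.15, and the sketch assumes the fans draw the same power wherever they are counted.

```python
def pue(it_power_kw: float, overhead_power_kw: float) -> float:
    """PUE = total facility power / IT equipment power."""
    if it_power_kw <= 0:
        raise ValueError("IT power must be positive")
    return (it_power_kw + overhead_power_kw) / it_power_kw

# Hypothetical numbers, not Google's.
server_power_kw = 950.0       # servers, excluding their fans
fan_power_kw = 50.0           # fans (assumed to draw the same power in either location)
facility_overhead_kw = 150.0  # cooling, power distribution losses, lighting, etc.

# Case 1: fans live inside the servers, so they count as IT load.
fans_in_servers = pue(server_power_kw + fan_power_kw, facility_overhead_kw)

# Case 2: fans move into the cooling infrastructure, so they count as overhead.
fans_in_facility = pue(server_power_kw, facility_overhead_kw + fan_power_kw)

print(f"Fans counted as IT load:  PUE {fans_in_servers:.3f}")
print(f"Fans counted as overhead: PUE {fans_in_facility:.3f}")
```

The total power drawn is identical in both cases; only the accounting changes. That is why how you draw the line between IT load and infrastructure matters when comparing PUE numbers.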
We then moved on to issues of standardization for data center equipment. A specific example Urs pointed out is a generator rated at 2 MW peak capacity. Yet it is not recommended to run the generator at peak capacity. What is the rated capacity in operation, 80%? So, if you do run the generator at 80%, what is the life expectancy? What loads were run on the generator to provide maintenance recommendations? All these are different for each vendor, requiring Google to come up with its own standards to measure equipment equally and in a way that represents how the equipment will run in their environment. This lack of standards burdens all companies.
Given I blogged about Sun's PUE efforts as well, Urs discussed some of his concerns about how Sun calculated PUE. This discussion went on for a while. Jumping to the end of it, I threw out an idea: do we need PUE auditing/compliance tests? Should there be an independent company, like auditors, that certifies PUE results? It would check the accuracy of measurements and calculations with access to operational data, then certify that the measurement methods behind a public PUE disclosure are sound. Urs said he'll connect me with someone at Google to discuss this idea, and I am going to talk to some others I know who could create this service.
I could go on for longer, but given this is a blog entry, I know the longer it goes the less you will read. And, after my meeting with Urs, I have a lot more questions for Google.
This is just the beginning.
Google is uncloaking and we are starting to see details.
The game is changing. Financial institutions used to be the people who set the direction for data centers: five nines of reliability, high redundancy, physical security, etc. Who cares about energy efficiency? We make lots of money with our highly reliable data center infrastructure. Uhh, we did make lots of money.
The new paradigm is how well you use power. The new rules are going to be made by those who use power best in providing information services. And Google coming out to tell its story is the start of a new battle for mind share. A Green/Sustainable/low-impact data center strategy is required to provide pervasive Internet Services. The last thing Google wants is to have its growth limited by the availability of power.
This is the start of a new era in which those who share their knowledge will lead the industry with best practices, gaining an advantage in brand awareness. Look at how much coverage Google got with its announcement.