Google's Urs Hölzle explains why beefier cores are better than wimpy cores

The Register covers a new paper by Google's Urs Hölzle.

Google ops czar condemns multi-core extremists

Sea of 'wimpy' cores will sink you

By Cade Metz in San Francisco

Posted in Servers, 17th September 2010 07:04 GMT

Google is the modern data poster-child for parallel computing. It's famous for splintering enormous calculations into tiny pieces that can then be processed across an epic network of machines. But when it comes to spreading workloads across multi-core processors, the company has called for a certain amount of restraint.

With a paper (PDF) soon to be published in IEEE Micro, the IEEE magazine of chip and silicon design, Google Senior Vice President of Operations Urs Hölzle – one of the brains overseeing the web giant's famous back-end – warns against the use of multi-core processors that take parallelization too far. Chips that spread workloads across more energy-efficient but slower cores, he says, may not be preferable to chips with faster but power-hungry cores.

The paper is here and only two pages long.  Thinking about what motivated Urs to write it, I suspect it was frustration that too many people focus on the number of cores available to solve a problem without considering what happens to the overall system when you attack it with a bunch of wimpy cores vs. brawny cores.

We classify multicore systems as brawny-core systems, whose single-core performance is fairly high, or wimpy-core systems, whose single-core performance is low. The latter are more power efficient. Typically, CPU power decreases by approximately O(k²) when CPU frequency decreases by k, and decreasing DRAM access speeds with core speeds can save additional power.
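
To make that trade-off concrete, here is a minimal sketch (my own illustration with assumed numbers, not figures from the paper) of what the roughly O(k²) scaling implies: slow a core down by a factor of k and its power drops by about k², so energy per task falls by about k, but a serial task takes k times longer to complete.

    # Rough sketch with assumed numbers (not from the paper): compare one
    # brawny core against the same core slowed down by a factor of k.
    def wimpy_vs_brawny(k, brawny_power_w=100.0, brawny_task_s=1.0):
        wimpy_power_w = brawny_power_w / k**2   # power falls roughly as k^2
        wimpy_task_s = brawny_task_s * k        # serial work runs k times slower
        return {
            "brawny_energy_j": brawny_power_w * brawny_task_s,
            "wimpy_energy_j": wimpy_power_w * wimpy_task_s,  # about brawny / k
            "wimpy_latency_s": wimpy_task_s,
        }

    for k in (1, 2, 4):
        print(k, wimpy_vs_brawny(k))
    # Energy per task drops by about k, but latency grows by k -- the cost
    # the rest of the paper argues shows up at the system level.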

Urs, as usual, uses excellent presentation skills to make his point in three areas.

First, the more threads handling a parallelized request, the larger the overall response time. Often all parallel tasks must finish before a request is completed, and thus the overall response time becomes the maximum response time of any subtask, and more subtasks will push further into the long tail of subtask response times. With 10 subtasks, a one-in-a-thousand chance of suboptimal process scheduling will affect 1 percent of requests (recall that the request time is the maximum of all subrequests), but with 1,000 subtasks it will affect virtually all requests.
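
A quick back-of-envelope check of that fan-out math (my own arithmetic, using the paper's one-in-a-thousand figure): a request is slow if any of its n subtasks hits the rare scheduling hiccup, because the request waits for its slowest subtask.

    # Probability that a fan-out request is affected, assuming each of n
    # subtasks independently hits a 1-in-1,000 scheduling hiccup.
    p_slow_subtask = 0.001

    for n in (10, 100, 1000):
        p_slow_request = 1 - (1 - p_slow_subtask) ** n
        print(f"{n:>5} subtasks -> {p_slow_request:.1%} of requests hit the tail")
    # 10 subtasks   -> ~1%, matching the paper's example
    # 1000 subtasks -> ~63% and climbing fast toward "virtually all"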

In addition, a larger number of smaller systems can increase the overall cluster cost if fixed non-CPU costs can’t be scaled down accordingly. The cost of basic infrastructure (enclosures, cables, disks, power supplies, network ports, cables, and so on) must be shared across multiple wimpy-core servers, or these costs might offset any savings. More problematically, DRAM costs might increase if processes have a significant DRAM footprint that’s unrelated to throughput. For example, the kernel and system processes consume more aggregate memory, and applications can use memory-resident data structures (say, a dictionary mapping words to their synonyms) that might need to be loaded into memory on multiple wimpy-core machines instead of a single brawny-core machine.

Third, smaller servers can also lead to lower utilization. Consider the task of allocating a set of applications across a pool of servers as a bin-packing problem—each of the servers is a bin, and we try to fit as many applications as possible into each bin. Clearly that task is harder when the bins are small, because many applications might not completely fill a server and yet use too much of its CPU or RAM to allow a second application to coexist on the same server. Thus, larger bins (combined with resource containers or virtual machines to achieve performance isolation between individual applications) might offer a lower total cost to run a given workload.
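
The bin-packing intuition is easy to see with a toy example (illustrative numbers of my own, not Google's): pack the same set of application CPU demands onto servers using first-fit decreasing and compare how much capacity is stranded when the bins are small versus large.

    # Toy bin-packing sketch: first-fit decreasing over assumed CPU demands.
    def first_fit_decreasing(demands, bin_size):
        bins = []                               # remaining capacity per server
        for d in sorted(demands, reverse=True):
            for i, free in enumerate(bins):
                if d <= free:
                    bins[i] -= d
                    break
            else:
                bins.append(bin_size - d)       # open a new server
        used_capacity = len(bins) * bin_size
        return len(bins), 1 - sum(bins) / used_capacity

    apps = [6, 5, 5, 5, 4, 3]                   # CPU demand per application (assumed)
    for size in (8, 16, 32):                    # wimpy vs. brawny server sizes
        servers, utilization = first_fit_decreasing(apps, size)
        print(f"bin size {size:>2}: {servers} servers, {utilization:.0%} utilized")
    # With these numbers the small bins strand more capacity (about 70%
    # utilization) than the larger bins (closer to 90%).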

How many data center operations VPs could write this paper?  One.  :-)

Keep the number of cores in mind for a green data center: smaller, energy-efficient processors may not be the most efficient overall.

Intel acquires McAfee, defining the relationship between Security and energy-efficient performance

Intel announced the purchase of McAfee.

SANTA CLARA, Calif., Aug. 19, 2010 – Intel Corporation has entered into a definitive agreement to acquire McAfee, Inc., through the purchase of all of the company’s common stock at $48 per share in cash, for approximately $7.68 billion. Both boards of directors have unanimously approved the deal, which is expected to close after McAfee shareholder approval, regulatory clearances and other customary conditions specified in the agreement.

Most will focus on this as the reason for Intel's acquisition.

The acquisition reflects that security is now a fundamental component of online computing. Today’s security approach does not fully address the billions of new Internet-ready devices connecting, including mobile and wireless devices, TVs, cars, medical devices and ATM machines as well as the accompanying surge in cyber threats. Providing protection to a diverse online world requires a fundamentally new approach involving software, hardware and services.

What caught my eye, though, is this statement.

Inside Intel, the company has elevated the priority of security to be on par with its strategic focus areas in energy-efficient performance and Internet connectivity.

With a quote from Intel CEO Paul Otellini:

“With the rapid expansion of growth across a vast array of Internet-connected devices, more and more of the elements of our lives have moved online,” said Paul Otellini, Intel president and CEO. “In the past, energy-efficient performance and connectivity have defined computing requirements. Looking forward, security will join those as a third pillar of what people demand from all computing experiences.

What Intel has identified is the relationship between Security and Energy-Efficient Performance.  How you approach Security can have a big impact on power consumption for a green data center.  PUE is commonly discussed to explain the power and cooling overhead for IT, but the power consumed by security infrastructure gets far less attention.

What is the power consumed by security infrastructure?  10%?  20%?  50%?

How many systems cannot be consolidated because of security issues?

Security issues contribute to the fiefdoms in data centers.

What is the energy consumption of your security decisions in the data center?
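
There is no universal answer, but you can start with a back-of-envelope estimate. Here is a minimal sketch with made-up numbers (my illustration, not measured data) of how security gear shows up in the facility's energy bill once the PUE overhead is included:

    # Back-of-envelope estimate with assumed numbers (not measured data).
    it_load_kw = 1000.0          # total IT equipment load (assumed)
    security_gear_kw = 120.0     # firewalls, IDS/IPS, scanning appliances (assumed)
    pue = 1.8                    # facility power / IT power (assumed)

    security_share_of_it = security_gear_kw / it_load_kw
    facility_kw = it_load_kw * pue
    security_facility_kw = security_gear_kw * pue   # cooling/distribution overhead applies too

    print(f"Security gear is {security_share_of_it:.0%} of the IT load, "
          f"but draws {security_facility_kw:.0f} kW of the facility's "
          f"{facility_kw:.0f} kW once cooling and distribution losses are included.")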

I posted about Security's relationship to being Green back in Apr 2008.

Security is The Opposing Force of Green, demonstration - techniques to remove hard drive data

I was having a brainstorming session with another smart guy; I don't want to name him, because the idea is too controversial.  We were discussing Green Ideas and we stumbled on the issue of Security being un-Green.

Why? Security at its simplest level creates friction in processes to make things more difficult, and this takes more energy, effort, and other resources.  The enemies of your Green IT efforts will be your Security group, as they will not want to compromise their security policies.

Now, I am not arguing for no security.  It is a requirement of any system, but how much security creates an environmental cost that is not sustainable?

A day of intense meetings, asking what is the future of data centers & evolutionary economics, a view out of the window

Today was a long day.  I was up at 6:15 a.m. to catch a bus from Lake Oswego, OR to downtown Portland, then to Portland Airport to meet a cloud computing operations director and introduce him to some ideas about adapting to change by adopting evolutionary economics.

Evolutionary economics deals with the study of the processes that transform the economy, for firms, institutions, industries, employment, production, trade and growth, through the actions of diverse agents drawing on experience and interactions, using evolutionary methodology.

The data center is ready for a transformation.  Cloud computing is helping to push things in a direction, but there is much more beyond cloud computing.

The thought experiment we went through was: what happens if the data center industry adopts an information-sharing methodology instead of information hoarding, accelerating change in the industry and asking tough questions about what problems a data center should be solving?  Being open to discovering new ways to look at the problems, and to asking new questions, drives more innovation.

Here is a bit more explanation of evolutionary economics.

Ideas are articulated in language and thus transported into the social domain. Generic ideas, in particular, can bring about cognitive and behavioral processes, and in this respect they are practical and associated with the notion of ‘productive knowledge’. It is generic ideas that evolve and form causal powers underlying the change. Evolutionary economics is essentially about changes in generic knowledge, and involves transition between actualized generic ideas. Actual phenomena, being manifestations of ideas, are seen as ‘carriers of knowledge’.

Three analytical concepts corresponding to ontological axiomatics are thus:

  • (1) carriers of knowledge,
  • (2) generic ideas as components of a process, and
  • (3) evolutionary-formative causality.

After a long day of intense thinking, I am riding the train back to Seattle.  I'm so glad I didn't drive, so I can get some rest, look out the window, and take some time to reflect.

I'll wait for the cloud computing operations director to write his own blog entry, but that may take a while, as his head is probably just as tired as mine.

Can China build Green Data Centers?

I am having conversations with an entrepreneur in China who is working on the Green Data Center idea there.  All the big data center operators have been to China to look for data center sites, and I would expect most cannot find the right site for their operations for a variety of reasons.  Building data centers will be difficult with a short-term approach where you only want to build one building.  What makes more sense is to take small incremental steps with continuous build-out in China and other areas of Asia Pacific.

I went to Beijing, Shanghai, and Hong Kong over a dozen times when I was working at Microsoft and Apple, as well as Japan, Taiwan, Korea, and Singapore.  I saw many different sides of the country working with hardware suppliers, internal development groups, and software entrepreneurs.

Google's recent pullout from China can be interpreted in many ways, and there are some interesting assumptions I can make based on some key people I used to work with who are coincidentally now working in Google Asia.  These ideas are much too complicated and subtle to try to write up in a blog entry.

So, back to the question: can China build Green Data Centers?  Ideally China would have a few big US companies building data centers there that Chinese engineers could learn from.  But as far as I know no one has done this, even though a lot of them have evaluated sites.  That makes things difficult, but it also creates an opportunity: China doesn't have data center people who have been doing the same thing for the last 20 years and want to build data centers the same way they did in the past.

China can build smaller data centers, using geo-redundancy as part of the design.  The power may not exist yet, but China is building power generation faster than anyone else.  So the question isn't what power is available; it's what power will be available.  See this Economist article.

Electricity and development in China

Lights and action

China is parlaying its hunger for power into yet more economic clout

Apr 29th 2010 | HONG KONG | From The Economist print edition

AFTER a brief blip caused by the global economic slowdown, the electricity business in China is back to normal: in other words, it is buzzing. On April 26th Huaneng Power, the country’s biggest utility, began work on a nuclear reactor on the island of Hainan. The week before, the firm had announced that its power output had risen by 40% during the first quarter. The day before that, Datang International Power, the second-largest utility, had said its output was up by 33%. Surges of this magnitude, unimaginable in most countries, are commonplace in China.

China’s endless power-plant construction boom has accounted for 80% of the world’s new generating capacity in recent years and will continue to do so for many years to come, says Edwin Chen of Credit Suisse, an investment bank. Capacity added this year alone will exceed the installed total of Brazil, Italy and Britain, and come close to that of Germany and France. By 2012 China should produce more power annually than America, the current leader.

The US government hasn't treated the data center industry as a strategic industry deserving special treatment.  China will.

Many data centers are designed and built to maximize profits for the vendors; data centers are among the most profitable construction projects.  The silos in Real Estate, Facilities, Data Center Ops, IT Ops, Finance, and SW are ripe for over-specification of features that have little business value in the holistic view but look right from a limited perspective.  The top data center people know this, which is why they have broken down the silos and integrated the functionality under one manager.  Look to Google's Urs Hölzle as the epitome of owning the data center stack, including SW infrastructure.

It would be a bit of irony if China's data center strategy targeted Urs and his thinking as the customer, asking what he wants in data center infrastructure.  Google wants cheap, reliable, cleaner power; multiple fiber paths; and government support for the data center build-out.  In the US we hear about tax incentives, and these are proof the local community wants the data center construction.

An example of the opportunity would be to work with Sinohydro on a China data center strategy.  Here is a perspective you'll enjoy reading on China's hydroelectric build-out.

China: Not the Rogue Dam Builder We Feared It would Be?

Hydropower accounts for the overwhelming share of China’s alternative energy mix, but is perhaps also one of the more controversial alternative energy options due to the ecological and social impacts of dam construction.  This guest post by Peter Bosshard, policy director of International Rivers Network, examines China’s growing pains in its increasing role as an exporter of hydropower technology and expertise.

A few years ago, Chinese dam builders and financiers appeared on the global hydropower market with a bang. China Exim Bank and companies such as Sinohydro started to take on large, destructive projects in countries like Burma and Sudan, which had before been shunned by the international community. Their emergence threatened to roll back progress regarding human rights and the environment which civil society had achieved over many years. However, new evidence suggests that Chinese dam builders and financiers are trying to become good corporate citizens rather than rogue players on the global market. Here is a progress report.

Could you partner with China to build data centers around the world where dams are being built?  The power generation is one part, and fiber is next.  Government support fits in easily, as governments were involved in the hydro construction.

One of Google's crown jewels is its data center designs.  Is part of the reason Google pulled out of China the issues they would run into if they built a data center there?

Long Now, Long View, Long Lived Data Center, a 10,000 year clock - a 10,000 year data center?

I am currently thinking of rules for the ontology in data center designs.  Translated, I am trying to figure out the principles, components, and relationships for the Open Source Data Center Initiative. 

This is a complex topic to try to explain, but I found an interesting project, the Long Now, started by a bunch of really smart people: Jeff Bezos, Esther Dyson, Mitch Kapor, Peter Schwartz, and Stewart Brand.  Here is a video discussing the idea of a 10,000 year clock.

 

But what I found interesting was their long-term approach and transparency, which we will be using in the Open Source Data Center Initiative.  And I am now thinking a Long View is part of what we have as principles.

Here are the principles of the Long Now Clock that make a lot of sense to use in data center design.


These are the principles that Danny Hillis used in the initial stages of designing a 10,000 Year Clock. We have found these are generally good principles for designing anything to last a long time.

Longevity
With occasional maintenance, the clock should reasonably be expected to display the correct time for the next 10,000 years.

Maintainability
The clock should be maintainable with bronze-age technology.

Transparency
It should be possible to determine operational principles of the clock by close inspection.

Evolvability
It should be possible to improve the clock with time.

Scalability
It should be possible to build working models of the clock from table-top to monumental size using the same design.

Some rules that follow from the design principles:

Longevity:
  • Go slow
  • Avoid sliding friction (gears)
  • Avoid ticking
  • Stay clean
  • Stay dry
  • Expect bad weather
  • Expect earthquakes
  • Expect non-malicious human interaction
  • Don't tempt thieves

Maintainability and transparency:
  • Use familiar materials
  • Allow inspection
  • Rehearse motions
  • Make it easy to build spare parts
  • Expect restarts
  • Include the manual

Scalability and Evolvability:
  • Make all parts similar size
  • Separate functions
  • Provide simple interfaces

Why think about a 10,000 year clock?  Because thinking about slowness teaches us things we don't have time to learn when we think only of speed.

Hurry Up and Wait

The Slow Issue > Jennifer Leonard on January 5, 2010 at 6:30 am PST

We asked some of the world’s most prominent futurists to explain why slowness might be as important to the future as speed.

Julian Bleecker
Julian Bleecker, a designer, technologist, and co-founder of the Near Future Laboratory, devises “design-to-think experiments” that focus on interactions away from conventional computer settings. “When sitting at a screen and keyboard, everything is tuned to be as fast as possible,” he says. “It’s about diminishing time to nothing.”
So he asks, “Can we make design where time is inescapable and not be brought to zero? Would it be interesting if time were stretched, or had weight?” To test this idea, Bleecker built a Slow Messaging Device, which automatically delayed electronic (as in, e-mail) messages. Especially meaningful messages took an especially long time to arrive.

Read more: http://www.good.is/post/hurry-up-and-wait#ixzz0jmOEcLg4

The biggest unknown problems in data centers are the things we didn't think were going to happen.  This leaves the door open to over-engineering, increasing cost, brittleness of systems, and delays.  Taking a Long View of what the future might look like can help you see things you normally wouldn't.
