Identify the Power Hog in your data center, AOL uses the bronze pig award

Mike Manos has a post on eliminating the cruft in the data center.

Attacking the Cruft

Today the Uptime Institute announced that AOL won the Server Roundup Award.  The achievement has gotten some press already (At Computerworld, PCWorld, and related sites) and I cannot begin to tell you how proud I am of my teams.   One of the more personal transitions and journeys I have made since my experience scaling the Microsoft environments from tens of thousands of servers to hundreds of thousands of servers has been truly understanding the complexity facing a problem most larger established IT departments have been dealing with for years.  In some respects, scaling infrastructure, while incredibly challenging and hard, is in large part a uni-directional problem space.   You are faced with growth and more growth followed by even more growth.  All sorts of interesting things break when you get to big scale. Processes, methodologies, technologies, all quickly fall to the wayside as you climb ever up the ladder of scale.

What I like is the Power Hog part.

Power Hog – An effort to audit our data center facilities, equipment, and the like looking for inefficient servers, installations, and /or technology and migrating them to new more efficient platforms or our AOL Cloud infrastructure.  You knew you were in trouble when you had a trophy of a bronze pig appear on your desk or office and that you were marked.
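
A rough sketch of the kind of audit a Power Hog program implies: pull per-server power and workload numbers and rank machines by watts spent per unit of useful work. The inventory records, field names, and thresholds below are hypothetical illustrations, not AOL's actual tooling.

```python
# Hypothetical power-hog audit: rank servers by watts per unit of useful work.
# A real audit would pull these numbers from DCIM/monitoring systems; the
# records below are invented for illustration.
servers = [
    {"name": "web-legacy-01", "watts": 450, "requests_per_sec": 120},
    {"name": "web-cloud-07",  "watts": 250, "requests_per_sec": 900},
    {"name": "db-old-03",     "watts": 600, "requests_per_sec": 40},
]

def watts_per_request(server):
    # Guard against idle boxes reporting zero work.
    return server["watts"] / max(server["requests_per_sec"], 1)

# Worst offenders first: candidates for decommissioning or migration to a
# more efficient platform (or, at AOL, for the pig trophy).
for s in sorted(servers, key=watts_per_request, reverse=True):
    print(f"{s['name']}: {watts_per_request(s):.2f} W per request/sec")
```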


MSN data center construction video

Many data center construction projects have on-site video cameras filming the work.  I've seen plenty, but normally these videos are only for the project team.

If you want to see one, here is a Microsoft MSN data center video that probably shouldn't be posted publicly.  Check it out before it gets pulled down.  It is only 2 minutes.

1000 Genomes, 200+TB of data available in AWS to run compute jobs

Normally when you think of running a compute project in AWS, you first need to move your data in and then compute against it.  AWS now hosts the 1000 Genomes Project, with over 200 TB of data available to run compute jobs against without having to move the data yourself.

The 1000 Genomes Project

We're very pleased to welcome the 1000 Genomes Project data to Amazon S3.

The original human genome project was a huge undertaking. It aimed to identify every letter of our genetic code, 3 billion DNA bases in total, to help guide our understanding of human biology. The project ran for over a decade, cost billions of dollars and became the cornerstone of modern genomics. The techniques and tools developed for the human genome were also put into practice in sequencing other species, from the mouse to the gorilla, from the hedgehog to the platypus. By comparing the genetic code between species, researchers can identify biologically interesting genetic regions for all species, including us.

This is a lot of data.

The data is vast (the current set weighs in at over 200Tb), so hosting the data on S3 which is closely located to the computational resources of EC2 means that anyone with an AWS account can start using it in their research, from anywhere with internet access, at any scale, whilst only paying for the compute power they need, as and when they use it. This enables researchers from laboratories of all sizes to start exploring and working with the data straight away. The Cloud BioLinux AMIs are ready to roll with the necessary tools and packages, and are a great place to get going.

Making the data available via a bucket in S3 also means that customers can crunch the information using Hadoop via Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
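
As a small illustration of working with the data in place, here is a sketch that lists a few objects straight from the public S3 bucket using boto3. The bucket name comes from the AWS 1000 Genomes page; treat the details (region, anonymous access) as assumptions to verify.

```python
# Minimal sketch: browse the public 1000 Genomes bucket on S3 without
# downloading the full 200+ TB. Bucket name and anonymous access are
# assumptions based on the AWS dataset page.
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# The dataset is public, so unsigned (anonymous) requests are enough.
s3 = boto3.client(
    "s3",
    region_name="us-east-1",
    config=Config(signature_version=UNSIGNED),
)

# List a handful of keys to see how the data is laid out.
response = s3.list_objects_v2(Bucket="1000genomes", MaxKeys=10)
for obj in response.get("Contents", []):
    print(f"{obj['Key']}  ({obj['Size']} bytes)")
```

An EC2 instance in the same region can then stream just the objects it needs into a Hadoop or Cloud BioLinux job instead of copying the whole archive.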

It is interesting to think that AWS is hosting data that is too expensive for people to move around.

More information can be found here: http://aws.amazon.com/1000genomes/

If you want to get the data yourself, here it is.

Other Sources

The 1000 Genomes project data are also freely accessible through the 1000 Genomes website, and from each of the two institutions that work together as the project Data Coordination Centre (DCC).

Is the end of Coal Power coming to the USA? EPA proposes new rules

MSNBC reports on the EPA's new rules for coal power plants.

End of coal power plants? EPA proposes new rules


By msnbc.com staff and news services

The Obama administration on Tuesday proposed the first-ever standards to cut carbon dioxide emissions from new power plants -- a move welcomed by environmentalists but criticized by some utilities as well as Republicans, who are expected to use it as election campaign fodder.

The difficulty for coal power plants is that they would need to meet the same emissions levels as natural gas plants.

While the proposed rules do not dictate which fuels a plant can burn, they would require any new coal plants essentially to halve carbon dioxide emissions to match those of plants fired by natural gas.

The pessimistic view comes from the coal industry.

Steve Miller, CEO and President of the American Coalition for Clean Coal Electricity, a group of coal-burning electricity producers, took a more dismal view, saying it "will make it impossible to build any new coal-fueled power plants and could cause the premature closure of many more coal-fueled power plants operating today."

Other opponents of the long-delayed EPA proposal say it will limit sources for electricity by making coal prohibitively expensive.

The NRDC and American Lung Association cheered the new rules.

Frances Beinecke, president of the Natural Resources Defense Council, called it a "historic step ... toward protecting the most vulnerable among us — including the elderly and our children — from smog worsened by carbon-fueled climate change."

The American Lung Association agreed. "Scientists warn that the buildup of carbon pollution will create warmer temperatures which will increase the risk of unhealthful smog levels," said board chairman Albert Rizzo. "More smog means more childhood asthma attacks and complications for those with lung disease."

Do you get your electricity from coal?  What will happen to your electricity prices in the future?

 

Using the Situation Awareness Principle to Green the Data Center, Google continues the march from 1.16 to 1.14 PUE

Google posts its latest PUE achievement of 1.14.

Measuring to improve: comprehensive, real-world data center efficiency numbers

March 26, 2012 at 9:00 AM
To paraphrase Lord Kelvin, if you don’t measure you can’t improve. Our data center operations team lives by this credo, and we take every opportunity to measure the performance of our facilities. In the same way that you might examine your electricity bill and then tweak the thermostat, we constantly track our energy consumption and use that data to make improvements to our infrastructure. As a result, our data centers use 50 percent less energy than the typical data center.
...

Google's Joe Kava uses the Lord Kelvin principle of "if you don't measure you can't improve."  But I think a more apt lens for the complexity of greening a data center is situation awareness.

Situation awareness

From Wikipedia, the free encyclopedia

Situation awareness is the perception of environmental elements with respect to time and/or space, the comprehension of their meaning, and the projection of their status after some variable has changed, such as time. It is also a field of study concerned with perception of the environment critical to decision-makers in complex, dynamic areas from aviation, air traffic control, power plant operations, military command and control, and emergency services such as fire fighting and policing; to more ordinary but nevertheless complex tasks such as driving an automobile or bicycle.

Situation awareness involves being aware of what is happening in the vicinity to understand how information, events, and one's own actions will impact goals and objectives, both immediately and in the near future. Lacking or inadequate situation awareness has been identified as one of the primary factors in accidents attributed to human error.[1] Thus, situation awareness is especially important in work domains where the information flow can be quite high and poor decisions may lead to serious consequences (e.g., piloting an airplane, functioning as a soldier, or treating critically ill or injured patients).

Having complete, accurate and up-to-the-minute SA is essential where technological and situational complexity on the human decision-maker are a concern. Situation awareness has been recognized as a critical, yet often elusive, foundation for successful decision-making across a broad range of complex and dynamic systems, including aviation and air traffic control,[2] emergency response and military command and control operations,[3] and offshore oil and nuclear power plant management.[4]

Situation awareness, more than Lord Kelvin's principle, has you thinking about the bigger picture and about knowledge: Am I doing the right thing?  How did I get here, and can I repeat it?

Situation assessment

Endsley (1995b, p. 36) argues that "it is important to distinguish the term situation awareness, as a state of knowledge, from the processes used to achieve that state. These processes, which may vary widely among individuals and contexts, will be referred to as situation assessment or the process of achieving, acquiring, or maintaining SA." Thus, in brief, situation awareness is viewed as "a state of knowledge," and situation assessment as "the processes" used to achieve that knowledge. Note that SA is not only produced by the processes of situation assessment, it also drives those same processes in a recurrent fashion. For example, one's current awareness can determine what one pays attention to next and how one interprets the information perceived (Endsley, 2000).

Google has shared the high-level concepts for achieving a lower PUE; a small sketch of the PUE calculation follows the excerpt below.

1. Measure PUE

You can't manage what you don’t measure, so characterize your data center's efficiency performance by measuring energy use. We use a ratio called PUE - Power Usage Effectiveness - to help us reduce energy used for non-computing, like cooling and power distribution. To effectively use PUE it's important to measure often - we sample at least once per second. It’s even more important to capture energy data over the entire year - seasonal weather variations have a notable effect on PUE.

2. Manage airflow

Good air flow management is fundamental to efficient data center operation. Start with minimizing hot and cold air mixing by using well-designed containment. Eliminate hot spots and be sure to use blanking plates for any unpopulated slots in your rack. We've found a little analysis can pay big dividends. For example, thermal modeling using computational fluid dynamics (CFD) can help you quickly characterize and optimize air flow for your facility without many disruptive reorganizations of your computing room. Also be sure to size your cooling load to your expected IT equipment, and if you are building extra capacity, be sure your cooling approach is energy proportional

...
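
As a toy illustration of point 1 above, here is a sketch of the PUE calculation from frequent power samples; the meter-reading helpers are hypothetical stand-ins for real facility and PDU/UPS instrumentation.

```python
# Minimal sketch of PUE from power samples: PUE = total facility power / IT power.
# The read_*() helpers are hypothetical stand-ins for real meter queries.
import random

def read_facility_kw():
    # Stand-in for the utility meter feeding the whole building.
    return 1140.0 + random.uniform(-10, 10)

def read_it_kw():
    # Stand-in for the summed PDU/UPS output feeding the racks.
    return 1000.0 + random.uniform(-5, 5)

def pue():
    return read_facility_kw() / read_it_kw()  # 1.0 would be the ideal

# Sample frequently (Google samples at least once per second) and average
# over long windows so seasonal variation shows up in the trend.
samples = [pue() for _ in range(3600)]
print(f"average PUE over this window: {sum(samples) / len(samples):.2f}")
```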

What does Google do to determine where it should spend its resources?  At some point there is a diminishing or even negative return, where an improvement costs more than it can save.  On the other hand, at Google's scale, what may be a small saving for most can be huge for them.
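
To make the marginal-return point concrete, here is a hypothetical simple-payback calculation; every number in it is invented.

```python
# Hypothetical simple-payback estimate for an efficiency project; all
# numbers are invented for illustration.
project_cost_usd = 50_000            # e.g. an airflow containment retrofit
it_load_kw = 1_000                   # average IT load (1 MW)
pue_before, pue_after = 1.16, 1.14
electricity_usd_per_kwh = 0.07
hours_per_year = 8_760

kwh_saved = it_load_kw * (pue_before - pue_after) * hours_per_year
savings_per_year_usd = kwh_saved * electricity_usd_per_kwh

print(f"annual savings: ${savings_per_year_usd:,.0f}")
print(f"simple payback: {project_cost_usd / savings_per_year_usd:.1f} years")
```

With these made-up numbers a 0.02 PUE improvement at a 1 MW IT load pays back in a few years; multiply the load by a few hundred and the same improvement is worth millions per year, which is why small gains still matter at Google's scale.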

Our 2011 numbers and more are available for closer examination on our data center site. We’ve learned a lot through building and operating our data centers, so we’ve also shared our best practices. These include steps like raising the temperature on the server floor and using the natural environment to cool the data center, whether it’s outside air or recycled water.

The really interesting thing to know is what Google has tried and found not to work.  As any good engineer knows, many times you learn more from failures than from successes.

Cover Image: November 2009 Scientific American Magazine

How You Learn More from Success Than Failure

The brain may not learn from its mistakes after all

Have you ever bowled a string of strikes that seems like it came out of nowhere? There might be more to such streaks than pure luck, according to a study that offers new clues as to how the brain learns from positive and negative experiences.

I think good engineers have learned to rewire their brains to keep learning from failure, unlike most people.

“Success has a much greater influence on the brain than failure,” says Massachusetts Institute of Technology neuroscientist Earl Miller, who led the research. He believes the findings apply to many aspects of daily life in which failures are left unpunished but achievements are rewarded in one way or another—such as when your teammates cheer your strikes at the bowling lane. The pleasurable feeling that comes with the successes is brought about by a surge in the neurotransmitter dopamine. By telling brain cells when they have struck gold, the chemical apparently signals them to keep doing whatever they did that led to success. As for failures, Miller says, we might do well to pay more attention to them, consciously encouraging our brain to learn a little more from failure than it would by default.