History of Google's Evolution of Massive Scale Data Processing

To figure out what to do now and in the future, it is useful to look at history. One question I am currently immersed in is what the platform for future systems should look like, and one part of that is data processing. I found a great video from Tyler Akidau, linked below. But let's start with the end. What is the evolution? Here are the technologies that Tyler covers and what they did.

Screen Shot 2018-03-14 at 10.16.52.png

Here is the list in a timeline view.

Screen Shot 2018-03-14 at 10.23.50.png

What can you do with this data processing approach? Here are 6 ways to process data.

Screen Shot 2018-03-14 at 10.28.14.png

 

The YouTube video is below.

If you want something to read, there are two long posts on streaming: 101 and 102.

The Good and Bad Ways that Teams Interact

Running Internet services requires many teams to interact. What makes this harder is that there is no clear set of rules to support good ways for the teams to interact. The ways tend to develop organically. Speaking of organic, I was looking at a presentation John Boyd created called "Organic Design for Command and Control." Slide 10 had the positive and negative interactions of command and control. This made me think of how teams interact. The slide deck is here, and the slide I was looking at is this one.

I've been staring at this slide for a while, so let me translate it into how it applies to teams. If more managers spent time reducing the negatives and increasing the positives, their teams would work so much better.

Having the Best doesn't necessarily work if you don't have the knowledge that supports it

F1 Racing is the most technically advanced racing out there. More money and more technology are thrown at winning than in any other form of racing. Back in the early 90s, when I was working at Microsoft, a bunch of us would get together at somebody’s house at 6 a.m. on Sunday morning to watch the European F1 races. One guy was so into F1 that he quit Microsoft and joined the Ferrari race team to work on the computer systems in the cars.

McLaren racing dumped Mercedes engines for Honda in the 2015 season, and part of the reason is that McLaren wanted the source code for the engine systems.

"A modern grand prix engine at this moment in time is not just about sheer power; it's about how you harvest the energy, store the energy and effectively if you don't have control of that process - meaning access to source code - then you are not going to be able to stabilise your car in the entry to corners, for instance, and you lose lots of lap time. So even though you have the same brand of engine you do not have the ability to optimise the engine."

I have not followed F1 for a while, but 2015 might be when I start following again. Here is a video Honda released on their 2015 engine. Honda has bet on one team, McLaren, to win, which means they’ll be sharing everything they can to get the most performance out of their engine.

Two Things that will Make Your Data Center AI Projects Hard to Execute - Data & Culture

It was predictable that, once Google shared its use of Machine Learning in a mathematical model of a mechanical system, others would say they can do it too. DCK has a post on Romonet and Vigilent, two other companies that use AI concepts in data centers.

Google made headlines when it revealed that it is using machine learning to optimize its data center performance. But the search giant isn’t the first company to harness artificial intelligence to fine-tune its server infrastructure. In fact, Google’s effort is only the latest in a series of initiatives to create an electronic “data center brain” that can analyze IT infrastructure.

...

One company that has welcomed the attention around Google’s announcement is Romonet, the UK-based maker of data center management tools.

...

 Vigilent, which uses machine learning to provide real-time optimization of cooling within server rooms.

Google has been using Machine Learning for a long time and uses it for many other things like their Google Prediction API.

What is the Google Prediction API?

Google's cloud-based machine learning tools can help analyze your data to add the following features to your applications:

Customer sentiment analysis

Spam detection
Message routing decisions

Upsell opportunity analysis
Document and email classification

Diagnostics
Churn analysis

Suspicious activity identification
Recommendation systems

And much more...

Here is a YouTube video from 2011 in which Google tells developers how to use this API.

Learn how to recommend the unexpected, automate the repetitive, and distill the essential using machine learning. This session will show you how you can easily add smarts to your apps with the Prediction API, and how to create apps that rapidly adapt to new data.

So you are all pumped up to get AI in your data center. But here are two things you need to be aware of that can make your projects harder to execute.

First, the quality of your data. Everyone has heard "garbage in, garbage out." When you create machine learning systems, the accuracy of the data can be critical. Google’s Jim Gao, their data center “boy genius,” discusses one example.

 Catching Erroneous Meter Readings

In Q2 2011, Google announced that it would include natural gas as part of ongoing efforts to calculate PUE in a holistic and transparent manner [9]. This required installing automated natural gas meters at each of Google’s DCs. However, local variations in the type of gas meter used caused confusion regarding erroneous measurement units. For example, some meters reported 1 pulse per 1000 scf of natural gas, whereas others reported a 1:1 or 1:100 ratio. The local DC operations teams detected the anomalies when the realtime, actual PUE values exceeded the predicted PUE values by 0.02 - 0.1 during periods of natural gas usage.
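The check described in the excerpt, comparing actual PUE against predicted PUE, can be sketched in a few lines. This is only an illustration, not Google's code: the `PueSample` type, the `flag_suspect_samples` helper, and the sample values are all made up, and the 0.02 threshold is simply the low end of the range quoted above.

```python
from dataclasses import dataclass


@dataclass
class PueSample:
    label: str        # hypothetical identifier for the reading
    actual: float     # PUE computed from live meter data
    predicted: float  # PUE predicted by the model


def flag_suspect_samples(samples, threshold=0.02):
    """Return samples whose actual PUE exceeds the prediction by at least `threshold`."""
    return [s for s in samples if s.actual - s.predicted >= threshold]


# Made-up readings for illustration only.
samples = [
    PueSample("t0", actual=1.120, predicted=1.115),
    PueSample("t1", actual=1.180, predicted=1.120),
    PueSample("t2", actual=1.110, predicted=1.112),
]
suspects = flag_suspect_samples(samples)
```

A flagged sample does not say what is wrong, only that a meter or a unit conversion feeding the PUE calculation deserves a second look.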

Going through all your data inputs to make sure the data is clean is painful.  Google used 70% of its data to train the model and 30% to validate the model.  Are you that disciplined?  Do you have a mechanical engineer on staff who can review the accuracy of your mathematical model?
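For readers who have not done a 70/30 split before, here is a minimal sketch of the idea in plain Python. The function name, the fixed seed, and the random shuffle are my assumptions; the source only gives the 70% and 30% proportions, not how the rows were chosen.

```python
import random


def split_train_validate(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of `rows` and split it into (train, validate) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 row ids instead of real sensor records.
rows = list(range(100))
train, validate = split_train_validate(rows)
```

The discipline is in never letting the validation rows leak into training, so the error you measure reflects data the model has not seen.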

Second, the culture in your company is an intangible to many. But if you have been around enough data center operations staff, you know their habits and methods are not intangible. They are real, and they are what make so many things happen. Go back to Google’s Jim Gao. He had access to a wealth of subject matter expertise on machine learning and other AI methods within Google. He had help deploying the models from Google staff. And he had the support of the VP of data centers and the local data center operations teams.

 I would like to thank Tal Shaked for his insights on neural network design and implementation. Alejandro Lameda Lopez and Winnie Lam have been instrumental in model deployment on live Google data centers. Finally, this project would not have been possible without the advice and technical support from Joe Kava, as well as the local data center operations teams.

Think about these issues of data quality and the culture in your data center before you attempt an AI project. If you dig into automation projects, they are rarely as easy as people thought they would be.

Google's Data Center Machine Learning enables shaving Electricity Peak Demand Charges

A week ago I was able to interview Google’s Joe Kava, VP of Data Centers, regarding Better Data Centers through Machine Learning. The media coverage is good, and almost everyone focuses on the potential for lower power consumption.

Google has put its neural network technology to work on the dull but worthy problem of minimizing the power consumption of its gargantuan data centers.

One of the topics I was able to discuss with Joe is the idea that accurate prediction of PUE and a mathematical model of the mechanical systems enable Google to focus on the peak demand during the billing period to reduce overall charges. The above quote says power consumption is dull. So what is focusing on peak power demand? Crazy. Or it means you understand a variable cost of running your data center. :-)

Understanding Peak Demand Charges

How you get billed is complicated and varies widely depending on your specific contract, but it’s important for you to understand your tariff. Without knowing exactly how you're billed for energy, it's difficult to prioritize which energy savings measures will have the biggest impact.

...

In many cases, electricity use is metered (and you are charged) in two ways by your utility: first, based on your total consumption in a given month, and second, your demand, based on the highest capacity you required during the given billing period, typically a 15-minute interval during that billing cycle.

To use an analogy, think about consumption as the number that registers on your car’s odometer – to tell you how far you’ve driven – and demand as what is captured on your speedometer at the moment when you hit your max speed. Consumption is your overall electricity use, and demand is your peak intensity, or maximum “speed.”

National Grid does a great job explaining this: "The price we pay for anything we buy contains the cost of the product plus profit, plus the cost of making the product available for sale, or overhead.” They suggest that demand is akin to an overhead expense and note that “this is in contrast to charges…customers pay for the electricity itself, or the ‘cost of product,’ largely made up of fuel costs incurred in the actual generation of energy. Both consumption and demand charges are part of every electricity consumer’s service bill.”

When you think about the ROI of reducing your energy consumption, the business people should understand both the overall consumption and the peak demand of their operations. Unfortunately, it is all too common for people to focus only on the $/kWh.
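To make the two billing components concrete, here is a rough sketch that computes an energy charge and a demand charge from 15-minute interval readings. The function name, the rates, and the readings are all hypothetical, and real tariffs are more complicated than this two-part model.

```python
def monthly_bill(interval_kw, energy_rate_per_kwh, demand_rate_per_kw,
                 interval_hours=0.25):
    """Split a bill into an energy charge (the odometer) and a demand charge (the speedometer).

    interval_kw: average power draw in kW for each metering interval.
    interval_hours: interval length in hours (0.25 = 15 minutes).
    """
    consumption_kwh = sum(kw * interval_hours for kw in interval_kw)
    peak_demand_kw = max(interval_kw)
    energy_charge = consumption_kwh * energy_rate_per_kwh
    demand_charge = peak_demand_kw * demand_rate_per_kw
    return {
        "consumption_kwh": consumption_kwh,
        "peak_demand_kw": peak_demand_kw,
        "energy_charge": energy_charge,
        "demand_charge": demand_charge,
        "total": energy_charge + demand_charge,
    }


# Hypothetical one-hour slice: a single 15-minute spike sets the peak.
readings_kw = [800.0, 800.0, 1200.0, 800.0]
bill = monthly_bill(readings_kw, energy_rate_per_kwh=0.10, demand_rate_per_kw=10.0)
```

In this toy example the one spike barely moves the kWh total, yet it alone determines the demand charge, which is why looking only at $/kWh misses part of the bill.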

Google can look at the peak power consumption and see if there are ways the PUE could be improved to reduce the peak power for the billing period.


Here are tips that can help you shave peak demand.

Depending on your rate structure, peak demand charges can represent up to 30% of your utility bill. Certain industries, like manufacturing and heavy industrials, typically experience much higher peaks in demand due largely to the start-up of energy-intensive equipment, making it even more imperative to find ways to reduce this charge – but regardless of your industry, taking steps to reduce demand charges will save money.

...

Consider no or low-cost energy efficiency adjustments you can make immediately. When you start up your operations in the morning, don't just flip the switch on all of your high intensity equipment. Consider a staged start-up: turn on one piece of equipment at a time, create a schedule where the heaviest intensity equipment doesn’t all operate at full tilt simultaneously, and think about what equipment can be run at a lower intensity without adverse effect. You may use more kWh – resulting in greater energy consumption or a higher “energy odometer” reading as discussed above – but you'll ultimately save on demand charges and your energy bill overall will be lower.
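The staged start-up idea can be illustrated with a toy model. Everything here is an assumption made for illustration: three identical machines, each drawing 300 kW in the interval it starts and 100 kW afterwards, all of which must be running by the last interval.

```python
def load_profile(machines, start_intervals, n_intervals):
    """Total kW per interval for machines that start at the given intervals.

    machines: list of (startup_kw, running_kw) tuples.
    start_intervals: interval index at which each machine is switched on.
    """
    profile = [0.0] * n_intervals
    for (startup_kw, running_kw), start in zip(machines, start_intervals):
        for t in range(start, n_intervals):
            profile[t] += startup_kw if t == start else running_kw
    return profile


# Hypothetical equipment: (startup_kw, running_kw) for each machine.
machines = [(300.0, 100.0)] * 3

# Flip every switch at once, just before production begins.
simultaneous = load_profile(machines, [2, 2, 2], n_intervals=4)

# Start one machine per interval, beginning earlier.
staged = load_profile(machines, [0, 1, 2], n_intervals=4)

peak_simultaneous = max(simultaneous)
peak_staged = max(staged)
```

In this toy model the staged schedule runs equipment longer, so it uses more kWh, but its peak interval is much lower, which matches the trade-off the quoted tip describes.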