Machine Learning (ML) in Google’s Data Center, Jeff Dean shares details

Jeff Dean is one of Google’s amazing staff who works on data centers. He posted a presentation on ML that is here. Who is Jeff Dean? Here is a Business Insider article on Jeff. If you want a good laugh, check out the jokes on Jeff Dean’s capabilities. I’ve been lucky to have a few conversations with Jeff and to watch him up close, which helps when reading the ML presentation.

Below is a small fraction of what is in Jeff’s presentation. It is going to take me a while to digest it, and luckily I shared the presentation with one of my friends who has been getting into ML architecture and we are both looking at ML systems. 

Part of Jeff’s presentation is the application of ML in the data center. 

FullSizeRender.jpg

This slide doesn’t show up until about three quarters of the way through the presentation, and to show you how important it is, it shows up again in Jeff’s conclusion slide.

 

FullSizeRender.jpg

So now that you have seen the end slide, what is Jeff trying to do? It is kind of simple: he wants computational power beyond the limits of Intel processors. Urs Hoelzle wrote a paper on the need for brawny cores rather than the move toward wimpy cores. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36448.pdf

FullSizeRender.jpg

So what’s this look like? 

 

FullSizeRender.jpg

Look at the aisle shot. 

FullSizeRender.jpg

And here is a shot of the TPU logic board with 4 TPUs.

FullSizeRender.jpg

AWS IOT platform will be a new platform for DCIM

A couple of weeks ago I wrote about Implementing DCIM on an IOT Platform, and that the end of DCIM is coming. What I didn’t think about was how quickly things can change and that AWS would be at the center of this change.

AWS re:Invent is going on. I have never gone, but I have plenty of friends who are there and I can chat with them. What I did do yesterday was watch Andy Jassy’s opening keynote, and the best part was the last 15 minutes, where Andy covered the AWS IOT Platform. Below are the 8 parts of the platform.

Also, I placed a pre-order for an AWS DeepLens.

With the AWS Platform, the current ways of building DCIM are shown to be in the past. The future is to build DCIM on IOT. All you DCIM vendors, get ready to compete against AWS and its partners. This includes Litbit, which was trying to do a subset of AWS IOT. Luckily for DCIM vendors, AWS IOT is not targeting the DCIM market. AWS IOT is going after the industrial IOT market, which is orders of magnitude bigger. I checked out the list of vendors whose hardware can run the AWS services at the edge, and it is impressive for a launch.

AWS IOT is good enough that I am looking to add it to a lab environment to play with it more. 

AWS IoT Services

AWS IoT Core 

AWS IoT Core is a managed cloud platform that lets connected devices easily and securely interact with cloud applications and other devices. IoT Core can support billions of devices and trillions of messages, and can process and route those messages to AWS endpoints and to other devices reliably and securely.

AWS IoT Device Management 

AWS IoT Device Management is a service that makes it easy to securely onboard, organize, monitor, and remotely manage IoT devices at scale.

AWS Greengrass

AWS Greengrass is software that lets you run local compute, messaging & data caching for connected devices in a secure way. With AWS Greengrass, connected devices can run AWS Lambda functions, keep device data in sync, and communicate with other devices securely - even when not connected to the Internet.

AWS IoT Analytics

AWS IoT Analytics is a fully-managed service that makes it easy to run sophisticated analytics on massive volumes of IoT data without having to worry about all the cost and complexity typically required to build your own IoT analytics platform. It is the easiest way to run analytics on IoT data and get insights to make better and more accurate decisions for IoT applications and machine learning use cases.

Amazon FreeRTOS

Amazon FreeRTOS is an operating system for microcontrollers that makes small, low-power edge devices easy to program, deploy, secure, connect, and manage.

AWS IoT 1-Click 

AWS IoT 1-Click is a service that makes it easy for simple devices to trigger AWS Lambda functions that execute a specific action. Some examples of possible actions include calling technical support, reordering goods and services, or locking and unlocking doors and windows.

AWS IoT Button

The AWS IoT Button is a programmable button based on the Amazon Dash Button hardware. This simple Wi-Fi device is easy to configure and designed for developers to get started with AWS IoT Core, AWS Lambda, Amazon DynamoDB, Amazon SNS, and many other Amazon Web Services without writing device-specific code.

AWS IoT Device Defender

AWS IoT Device Defender is a fully managed service that helps you secure your fleet of IoT devices. AWS IoT Device Defender continuously audits the security policies associated with your devices to make sure that they aren’t deviating from security best practices. AWS IoT Device Defender also lets you monitor devices for behavior that deviates from what you have defined as appropriate behavior for each device. 

 

Are you ready for Edge Computing in 2017? I started the discussion in 2010

Tom Krazit at Geekwire posts on edge computing. https://www.geekwire.com/2017/setting-edge-cloud-experts-sketch-edge-computing-will-evolve/

“The last ten years marked a centralization of computing, in which we moved away from relying on our individual computers to process our orders toward a world in which lightweight mobile apps and web services backed by powerful cloud data centers took over.

At Structure 2017 in San Francisco on Tuesday, it was pretty clear things are moving back in the other direction.

Several of the sessions on the opening day of the venerable cloud computing conference addressed the growing certainty that computing power is moving back to intelligent connected devices on the “edge” of the network. Microsoft CEO Satya Nadella made it a key theme of his opening keynote at Microsoft Build in May, and momentum toward this shift would appear to be growing.”

Data Center Frontier posts on who the players are. https://datacenterfrontier.com/edge-computing-101/

“Edge computing is a hot topic right now, and holds the potential to alter the geography of the data center industry, as infrastructure adapts to support the Internet of Things, virtual reality and connected cars.

As these technologies develop and gain traction, a number of companies are targeting the challenges and opportunities of deploying capacity at the edge of the network. Here’s our guide to the new players on the edge, which includes both startups and established names in the data center sector.”

You can run a Google News search and find that “edge computing” comes up on a regular basis.

A few people have contacted me to tell me about their efforts. I tell them I discussed this idea years ago. Here is a post from Feb 2011 where I wrote about DC Containers at cell towers. http://www.greenm3.com/gdcblog/2011/2/8/50-lower-carbon-footprint-with-new-cell-tower.html?rq=Cell%20tower

In 2010 I realized that compute was going to move closer to devices to reduce latency and improve network performance. The practice of having Points of Presence and Content Distribution Networks was well established, but there was still an interesting opportunity to get close to devices. Since 2010 the growth of mobile devices has displaced much of the desktop and notebook use case.

The idea of edge computing is not new. It is just more popular. Edge computing is just one piece in the overall system and how it gets used takes time to figure out. I have had 7 years to think about it. We will see what others try.

Example applying abstraction to IT asset management to make it more powerful

Years and years ago I went to the IAITAM conference with friends from 3 of the big 5 data center companies (GAFAM) who all work on asset management. IAITAM is the International Association of IT Asset Managers. At the conference I realized that most of the presentations and users were focused on how to count assets, record them, and report on a regular schedule to align with financial systems. This isn't quite what I think of as asset management, but it is a critical part.

A couple of weeks ago I was watching two great presentations: Cheng Lou on "The Spectrum of Abstraction" and Barbara Liskov on "The Power of Abstraction".

JavaScript and the React community have evolved over the years through all the ups and downs. This talk goes over the tools we've come to recognize, from Angular, Ember and Grunt, all the way to Gulp, Webpack, React and beyond, and captures all these in a unifying mental framework for reasoning in terms of abstraction levels, in an attempt to make sense of what is and might be happening.
Barbara Liskov, Electrical Engineering and Computer Science, MIT, MA This lecture has been videocast from the Computer Science Department at Duke. The abstract of this lecture and a brief speaker biography is available at: http://research.csc.ncsu.edu/colloquia/seminar-post.php?id=308

So what happens if you apply the ideas that Cheng and Barbara shared to asset management? How do you apply abstraction to asset management? Start with abstraction in software engineering. https://en.wikipedia.org/wiki/Abstraction_(software_engineering)

Break down the asset management problem. One part is counting things. You could count the number of HP DL380s in an area, but since you are counting DL380s for a depreciation schedule you need to identify individual DL380s, not just the total number, so you need a unique ID system. Most would apply an asset tag, maybe a bar code. Some think they are being more advanced by using RFID tags. The flaw with this method is that if you apply the asset tags incorrectly, they fall off, or you make a data entry error in the original record creation, it can be extremely hard to catch the error and you are counting incorrectly for the life of the assets. So let's abstract the asset identification problem to be a virtual asset ID that can be automated and is near perfect in its identification method. Oh, and make it so there is a REST API to identify an asset programmatically.
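Here is a minimal sketch of what a virtual asset ID could look like. The post does not say how the ID is derived, so everything here is an assumption for illustration: the ID is computed from attributes the machine reports about itself (vendor, model, serial number), the namespace `assets.example.com` is made up, and `AssetRegistry.get` only stands in for what a hypothetical `GET /assets/{asset_id}` REST handler would return.

```python
import uuid
from dataclasses import dataclass

# Hypothetical namespace for this sketch; any fixed UUID would do.
ASSET_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "assets.example.com")


def _norm(value: str) -> str:
    """Normalize a reported attribute so ' hp ' and 'HP' compare equal."""
    return value.strip().upper()


@dataclass(frozen=True)
class Asset:
    vendor: str
    model: str
    serial: str  # serial number as reported by the machine itself


def virtual_asset_id(vendor: str, model: str, serial: str) -> str:
    """Derive a stable ID from self-reported attributes (no physical tag)."""
    key = "|".join(_norm(part) for part in (vendor, model, serial))
    return str(uuid.uuid5(ASSET_NAMESPACE, key))


class AssetRegistry:
    """In-memory stand-in for the service behind a REST API."""

    def __init__(self):
        self._assets = {}

    def register(self, vendor: str, model: str, serial: str) -> str:
        asset_id = virtual_asset_id(vendor, model, serial)
        # Registering the same machine twice overwrites the same record,
        # so a repeated report never inflates the count.
        self._assets[asset_id] = Asset(_norm(vendor), _norm(model), _norm(serial))
        return asset_id

    def get(self, asset_id: str):
        """What a hypothetical GET /assets/{asset_id} would return."""
        return self._assets.get(asset_id)

    def count(self, model: str) -> int:
        """Count individually identified assets of one model."""
        wanted = _norm(model)
        return sum(1 for a in self._assets.values() if a.model == wanted)
```

Because the ID is a deterministic function of what the hardware reports, there is no tag to fall off and no manual data entry step to get wrong, which is the property the abstraction is after.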

If you can do all the above, then change the way assets are counted so that the counting is run by a microservice. If each uniquely identified DL380 is seen on a network, then it is in a given space. If you know the network, then you know the location. If you had a perfect accounting of network cables, then you could determine location based on cables and network ports.
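The idea of deriving location from the network can be sketched roughly as two lookups. The switch names, port numbers, rack paths, and asset IDs below are all made-up sample data, and a real system would pull them from switch tables and a cable database rather than hard-coded dictionaries.

```python
from collections import Counter

# Hypothetical cable records: which rack each switch port is cabled to.
PORT_TO_RACK = {
    ("sw-a1", 1): "room1/row-a/rack-01",
    ("sw-a1", 2): "room1/row-a/rack-01",
    ("sw-a2", 1): "room1/row-a/rack-02",
}

# Hypothetical observations: the switch port each asset ID was last seen on.
ASSET_TO_PORT = {
    "asset-001": ("sw-a1", 1),
    "asset-002": ("sw-a1", 2),
    "asset-003": ("sw-a2", 1),
    "asset-004": ("sw-b9", 7),  # seen on a port with no cable record
}


def locate(asset_id):
    """Return the rack for an asset, or None if it cannot be determined."""
    port = ASSET_TO_PORT.get(asset_id)
    if port is None:
        return None  # asset never seen on the network
    return PORT_TO_RACK.get(port)  # None when the cable record is missing


def count_by_rack():
    """Count assets per rack; unresolved assets land under 'unknown'."""
    counts = Counter()
    for asset_id in ASSET_TO_PORT:
        rack = locate(asset_id)
        counts[rack if rack is not None else "unknown"] += 1
    return dict(counts)
```

The "unknown" bucket is the useful part: it shows exactly where the cable accounting is not yet perfect, instead of silently miscounting.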

Now you may say this is way too hard for your legacy environment, which is why people get stuck with asset tags and having people walk down the aisles reading bar codes. The art of abstraction can be applied to parts of the problem. If you started on Jan 1, 2017 with all new assets, then at least those can be counted with an abstraction approach. Wouldn't you like to know there are areas of the data center where you have automated asset management? I would.
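To see which areas already have automated asset management, a small coverage report is enough. This is only a sketch: the area names and the `is_automated` flag are assumptions, standing in for whatever marks an asset as tracked by the automated method rather than by a hand-scanned tag.

```python
from collections import defaultdict


def automation_coverage(assets):
    """Map each area to the fraction of its assets tracked automatically.

    `assets` is an iterable of (area, is_automated) pairs.
    """
    totals = defaultdict(int)
    automated = defaultdict(int)
    for area, is_automated in assets:
        totals[area] += 1
        if is_automated:
            automated[area] += 1
    return {area: automated[area] / totals[area] for area in totals}


def fully_automated_areas(assets):
    """Areas where every asset is tracked automatically."""
    coverage = automation_coverage(assets)
    return sorted(area for area, share in coverage.items() if share == 1.0)
```

A new hall filled after the cutover date would show up as fully automated, while legacy halls report a partial share that can grow as old assets are retired.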

This is my plan to change asset management with abstraction. I left out some details because this post would get way too long, but you get the idea.