11 price reductions over 4 years, Amazon Web Services' James Hamilton's thoughts on pace of innovation

June 11, 2011 Dave Ohara

James Hamilton is keynoting at SIGMOD Athens and his presentation description has some good ideas to think about.

Keynote 1: James Hamilton, Amazon Web Services

Internet Scale Storage

Abstract

The pace of innovation in data center design has been rapidly accelerating over the last five years, driven by the mega-service operators. I believe we have seen more infrastructure innovation in the last five years than we did in the previous fifteen. Most very large service operators have teams of experts focused on server design, data center power distribution and redundancy, mechanical designs, real estate acquisition, and network hardware and protocols. At low scale, with only a data center or two, it would be crazy to have all these full time engineers and specialists focused on infrastructural improvements and expansion. But, at high scale with tens of data centers, it would be crazy not to invest deeply in advancing the state of the art.

Looking specifically at cloud services, the cost of the infrastructure is the difference between an unsuccessful cloud service and a profitable, self-sustaining business. With continued innovation driving down infrastructure costs, investment capital is available, services can be added and improved, and value can be passed on to customers through price reductions. Amazon Web Services, for example, has had eleven price reductions in four years. I don’t recall that happening in my first twenty years working on enterprise software. It really is an exciting time in our industry.

Here is another thing to keep in mind. From the following statement, it seems Amazon Web Services does not use blades. If Amazon has determined it shouldn’t use blades, why should you?

· Datacenter Construction Costs

o Land: <2%

o Shell: 5 to 9%

o Architectural: 4 to 7%

o Mechanical & Electrical: 70 to 85%

· Summarizing the above list, we get 80% of the costs scaling with power consumption and 10 to 20% scaling with floor space. Reflect on that number and you’ll understand why I think the industry is nuts to be focusing on density. See Why Blade Servers Aren’t the Answer to All Questions for more detail on this point – I think it’s a particularly important one.
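
The percentages in the list can be added up as a quick check on the summary. This is a minimal sketch, assuming that land, shell, and architectural costs scale with floor space and that mechanical and electrical costs scale with power. That grouping is one reading of the list, and treating "<2%" as a range of 0 to 2% is also an assumption.

```python
# Construction cost shares from the list above, as (low %, high %).
# "<2%" for land is treated as the range 0 to 2 (an assumption).
COST_SHARES = {
    "land": (0, 2),
    "shell": (5, 9),
    "architectural": (4, 7),
    "mechanical_electrical": (70, 85),
}

# Assumed grouping: these items scale with floor space, the rest with power.
SPACE_ITEMS = ("land", "shell", "architectural")
POWER_ITEMS = ("mechanical_electrical",)


def share_range(items):
    """Sum the low and high ends of the shares for the given items."""
    low = sum(COST_SHARES[name][0] for name in items)
    high = sum(COST_SHARES[name][1] for name in items)
    return low, high


space_low, space_high = share_range(SPACE_ITEMS)
power_low, power_high = share_range(POWER_ITEMS)
print(f"Scales with floor space: {space_low} to {space_high}%")
print(f"Scales with power: {power_low} to {power_high}%")
```

Under these assumptions the space-related items sum to 9 to 18%, in line with the quoted 10 to 20%, and the power-related share of 70 to 85% brackets the quoted 80%.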

James has been discussing blades since 2008.

Summary so far: Blade servers allow for very high power density but they cost more than commodity, low power density servers. Why buy blades? They save space and there are legitimate reasons to locate data centers where the floor space is expensive. For those, more density is good. However, very few data center owners with expensive locations are able to credibly explain why all their servers NEED to be there. Many data centers are in poorly chosen locations driven by excessively manual procedures and the human need to see and touch that for which you paid over 100 million dollars. Put your servers where humans don’t want to be. Don’t worry, attrition won’t go up. Servers really don’t care about life style, how good the schools are, and related quality of life issues.

Here is a simple one-liner.

Density is fine but don’t pay a premium for it unless there is a measurable gain and make sure that the gain can’t be achieved by cheaper means.

Architecting for Outages, an architect posts on surviving AWS

June 9, 2011 Dave Ohara

Everyone wants to survive a data center outage, but as the AWS outage shows, not everyone does. Here is a post that summarizes best practices in software architecture for surviving an outage like the AWS one.

Retrospect on recent AWS outage and Resilient Cloud-Based Architecture

Thursday, June 9, 2011 at 8:19AM

A bit over a month ago Amazon experienced its infamous AWS outage in the US East Region. As a cloud evangelist, I was intrigued by the history of the outage as it occurred. There were great posts during and after the outage from those who went down. But more interestingly for me as an architect were the detailed posts of those who managed to survive the outage relatively unharmed, such as SimpleGeo, Netflix, SmugMug, SmugMug’s CTO, Twilio, Bizo and others.

The post lists these best practices:

The main principles, patterns and best practices are:

Design for failure

Stateless and autonomous services

Redundant hot copies spread across zones

Spread across several public cloud vendors and/or private cloud

Automation and monitoring

Avoiding ACID services and leveraging on NoSQL solutions

Load balancing
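
A few of the items above (design for failure, stateless services, redundant hot copies spread across zones) can be illustrated with a small failover loop. This is a minimal sketch, not taken from the quoted post: the zone names, `make_replica`, and `call_with_failover` are all hypothetical, and a real deployment would call actual service endpoints rather than in-process functions.

```python
class ZoneUnavailable(Exception):
    """Raised when a replica in a zone cannot serve the request."""


def make_replica(zone, healthy=True):
    """Build a hypothetical stateless handler for one zone."""
    def handle(request):
        if not healthy:
            raise ZoneUnavailable(zone)
        return f"{zone} served {request}"
    return handle


def call_with_failover(replicas, request):
    """Try each redundant replica in turn; fail only if all zones are down."""
    errors = []
    for handle in replicas:
        try:
            return handle(request)
        except ZoneUnavailable as exc:
            errors.append(str(exc))  # record the failure and move on
    raise RuntimeError(f"all zones failed: {errors}")


# Hypothetical zone names; the first zone is simulated as down.
replicas = [
    make_replica("zone-a", healthy=False),
    make_replica("zone-b"),
    make_replica("zone-c"),
]
print(call_with_failover(replicas, "GET /status"))
```

Because the handlers keep no state, any healthy zone can answer the request, so the caller sees an error only when every zone is down.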

If this seems daunting, new services are emerging to provide scalability and availability for you.

The emerging solution to this complexity is a new class of application servers that offers to take care of the high availability and scalability concerns of your application, allowing you to focus on your business logic. Forrester calls these "Elastic Application Platforms", and defines them as:

An application platform that automates elasticity of application transactions, services, and data, delivering high availability and performance using elastic resources.

Amazon’s Data Center Container "Perdix", something we haven’t seen

June 8, 2011 Dave Ohara

Yesterday I went to Amazon’s Technology Open House.

Here is about a quarter of the crowd getting food and drinks early before James Hamilton’s keynote.

In James’s presentation he has a section on Modular & Advanced Building Designs

Every day, Amazon Web Services adds enough new capacity to support all of Amazon.com’s global infrastructure through the company’s first 5 years, when it was $2.7 billion annual revenue.

James presents his latest observations on data center costs.

And waste in mechanical systems.

But here is something I didn’t expect: Amazon Perdix, Amazon’s version of a modular pre-fab container data center. The picture below has Microsoft’s design on the left and Amazon’s on the right.

James is a believer in low density, 30 servers per rack where the cost per server is $1,450 or less.

Being Faster creates more traffic than Quality, Uptime Institute one month after Symposium posts videos

June 7, 2011 Dave Ohara

Uptime Institute has posted more videos a month after their Symposium event. http://www.youtube.com/user/uptimeinstitute

After only one day, the most recent posts have single-digit traffic.

Videos that have been out for 5 days have 10 - 20 views.

One video that got up sooner and has more traffic is Matt Stansberry interviewing Gary Cook with 186 views, 187 now that I watched it.

If you are going to post content, most of the time speed beats quality.

Steve Jobs Keynote serious about data centers, compares Apple, Amazon, and Google

June 7, 2011 Dave Ohara

Steve Jobs gave his iCloud keynote http://events.apple.com.edgesuite.net/11piubpwiqubf06/event/

At minute 115:00 you can see Steve Jobs compare the cost of Apple, Amazon, and Google music cloud services.

To show that Apple is committed to iCloud, he makes the point that Apple is serious about data centers. At minute 116:00, Steve discusses Apple's third data center, in Maiden, NC.

Steve says this data center is as eco-friendly as a data center can be with modern technology.

Steve is a great showman as usual and wows people by showing the scale of the building.

He points to two dots on the roof that are actually people, getting laughs from the crowd. When was the last time you heard someone laugh when they talk about the scale of their data center?

"Full of stuff. expensive stuff." More laughs. Who would ever call millions of dollars of IT equipment stuff? You won't see Jobs calling an iPhone, iPod, or iPad stuff. Do you think he is making fun of the other stuff he doesn't make?

It's been over 20 years since I worked at WWDC as an Apple employee, and I never would have thought Steve Jobs would be talking about data centers. A lot has changed in 20 years. Wow, 20 years, and there are people I know who have been there the whole time. This video probably contained some of the first pictures they've seen of their mothership data center in Maiden, NC.