NYTimes AWS Cloud Computing Mistake Cost $240

Nicholas Carr discusses what is possible with Cloud Computing.

The new economics of computing

November 05, 2008

Are we missing the point about cloud computing?

That question has been rattling around in my mind for the last few days, as the chatter about the role of the cloud in business IT has intensified. The discussion to date has largely had a retrospective cast, focusing on the costs and benefits of shifting existing IT functions and operations from in-house data centers into the cloud. How can the cloud absorb what we're already doing? is the question that's being asked, and answering it means grappling with such fraught issues as security, reliability, interoperability, and so forth. To be sure, this is an important discussion, but I fear it obscures a bigger and ultimately more interesting question: What does the cloud allow us to do that we couldn't do before?

The history of computing has been a history of falling prices (and consequently expanding uses). But the arrival of cloud computing - which transforms computer processing, data storage, and software applications into utilities served up by central plants - marks a fundamental change in the economics of computing. It pushes down the price and expands the availability of computing in a way that effectively removes, or at least radically diminishes, capacity constraints on users. A PC suddenly becomes a terminal through which you can access and manipulate a mammoth computer that literally expands to meet your needs. What used to be hard or even impossible suddenly becomes easy.

His example is the NYTimes.

My favorite example, which is about a year old now, is both simple and revealing. In late 2007, the New York Times faced a challenge. It wanted to make available over the web its entire archive of articles, 11 million in all, dating back to 1851. It had already scanned all the articles, producing a huge, four-terabyte pile of images in TIFF format. But because TIFFs are poorly suited to online distribution, and because a single article often comprised many TIFFs, the Times needed to translate that four-terabyte pile of TIFFs into more web-friendly PDF files. That's not a particularly complicated computing chore, but it's a large computing chore, requiring a whole lot of computer processing time.

Fortunately, a software programmer at the Times, Derek Gottfrid, had been playing around with Amazon Web Services for a number of months, and he realized that Amazon's new computing utility, Elastic Compute Cloud (EC2), might offer a solution. Working alone, he uploaded the four terabytes of TIFF data into Amazon's Simple Storage Service (S3) utility, and he hacked together some code for EC2 that would, as he later described in a blog post, "pull all the parts that make up an article out of S3, generate a PDF from them and store the PDF back in S3." He then rented 100 virtual computers through EC2 and ran the data through them. In less than 24 hours, he had his 11 million PDFs, all stored neatly in S3 and ready to be served up to visitors to the Times site.

The total cost for the computing job? Gottfrid told me that the entire EC2 bill came to $240. (That's 10 cents per computer-hour times 100 computers times 24 hours; there were no bandwidth charges since all the data transfers took place within Amazon's system - from S3 to EC2 and back.)
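
Gottfrid's actual code isn't published in the post, but the shape of the job is easy to sketch. Below is a minimal, hypothetical Python illustration of the per-article step he describes (pull the TIFF parts of an article out of S3, generate a PDF, store the PDF back in S3); the boto3 and Pillow libraries and the bucket names are my own assumptions, not what the Times actually used.

```python
# Hypothetical sketch of the per-article step Gottfrid describes: pull the
# TIFF parts of one article out of S3, assemble them into a single PDF, and
# put the PDF back into S3. Buckets, keys, and libraries are illustrative only.
import io

import boto3              # AWS SDK for Python (an assumption, not the 2007-era tooling)
from PIL import Image     # Pillow, used here to stitch TIFF pages into one PDF

s3 = boto3.client("s3")
SOURCE_BUCKET = "nyt-archive-tiffs"   # hypothetical bucket of scanned pages
TARGET_BUCKET = "nyt-archive-pdfs"    # hypothetical bucket for the output


def convert_article(article_id, tiff_keys):
    """Fetch every TIFF part of one article, build a single PDF, upload it."""
    pages = []
    for key in tiff_keys:
        body = s3.get_object(Bucket=SOURCE_BUCKET, Key=key)["Body"].read()
        pages.append(Image.open(io.BytesIO(body)).convert("RGB"))

    # Pillow writes a multi-page PDF when the remaining pages are passed via
    # append_images; the whole document stays in memory.
    buffer = io.BytesIO()
    pages[0].save(buffer, format="PDF", save_all=True, append_images=pages[1:])

    s3.put_object(
        Bucket=TARGET_BUCKET,
        Key=f"{article_id}.pdf",
        Body=buffer.getvalue(),
        ContentType="application/pdf",
    )
```

Each of the 100 rented instances would simply run a loop like this over its own slice of the 11 million article IDs, which is why the job parallelizes so cleanly and finished in under a day.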

An interesting detail from the original blog post, confirmed by an Amazon Web Services evangelist, is that the NYTimes had to run the cloud computing job twice, because an error was found in the PDFs after the first run.

I then began some rough calculations and determined that if I used only four machines, it could take some time to generate all 11 million article PDFs. But thanks to the swell people at Amazon, I got access to a few more machines and churned through all 11 million articles in just under 24 hours using 100 EC2 instances, and generated another 1.5TB of data to store in S3. (In fact, it worked so well that we ran it twice, since after we were done we noticed an error in the PDFs.)

So the mistake cost the NYTimes an extra $240, since the job had to be run a second time.


Amazon Promotes Frustration Free (Greener) Packaging

How big is Green? Amazon chose to highlight Frustration Free (Greener) Packaging on its home page today.


One of the products featured is a pirate ship.


In addition, Amazon has an environment site that points to a study showing online shopping is more environmentally friendly than traditional shopping.

Economic and Environmental Implications of Online Retailing in the United States
H. Scott Matthews and Chris T. Hendrickson


Abstract


The advent of the Internet and e-commerce has brought a new way of marketing and selling many products, including books. The system-wide impacts of this shift in retail methods on cost and the environment are still unclear. While reductions in inventories and returns provide significant environmental savings, some of the major concerns of the new e-commerce business models are the energy and packaging materials used by the logistics networks for product fulfillment and delivery. In this paper, we analyze the different logistics networks and assess the environmental and cost impacts of different delivery systems. With a return (remainder) rate of 35% for best-selling books, ecommerce logistics are less costly and create lower environmental impacts, especially if private auto travel for shopping is included. Without book returns, costs and environmental effects are comparable for the two delivery methods.


Load Testing From The Cloud, A Killer App?

There are debates over the usefulness of cloud computing for the enterprise. The Amazon Web Services blog has a post on how one company is using AWS to provide load testing for other web sites.

SOASTA - Load Testing From the Cloud

I met Tom Lounibos, CEO of SOASTA, at the Palo Alto stop of the AWS Start-Up Tour. Tom gave the audience a good introduction to their CloudTest product, an on-demand load testing solution which resides on and runs from Amazon EC2.

Tom wrote to me last week to tell me that they are now able to simulate over 500,000 users hitting a single web application. Testing at this level gives system architects the power to verify the scalability of sites, servers, applications, and networks in advance of a genuine surge in traffic.

Here are a few of their most recent success stories:

  • Hallmark tested their e-card sites in preparation for the holiday season, and are ramping up testing to over 200,000 simultaneous users using CloudTest.
  • Marvel Entertainment is doing extensive cloud testing in order to get ready for the release of the sequel to Iron Man.
  • A division of Procter & Gamble is using cloud testing to get ready for new releases of their web site.

SOASTA also has a monitoring capability.

See Across Your Entire Web Application Infrastructure.

Resource Monitoring

Monitoring is the ability to monitor a resource (hardware, network, load balancer, firewall, Web server, database, application server, content management system, etc.) and capture usage information about that resource. Resource monitoring is a key component of professional Web testing. While it is crucial that a Web application functions correctly, resource usage and end-to-end response are extremely important. When there are problems, you need information about resource usage across the entire infrastructure of your Web application.

Resource information can be captured from any available resource in the Web application infrastructure—not just the Web server hosting the Web application. SOASTA CloudTest can monitor all three tiers of your Web application—the Web server, the application server, and the database server. It can also capture valuable information about other components in your network architecture—load balancers, for example.
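
SOASTA doesn't spell out how CloudTest's monitoring works under the hood, but the general idea of resource monitoring is easy to illustrate. The sketch below is a generic Python example using the psutil library (my assumption, not SOASTA's mechanism) that samples CPU, memory, and network counters on one host so the readings can be lined up against load-test results afterwards.

```python
# Generic resource-monitoring sketch (not SOASTA's implementation): sample
# CPU, memory, and network counters at a fixed interval and log them so they
# can be correlated with a load test running at the same time.
import csv
import time

import psutil  # cross-platform system metrics library (an assumed dependency)


def monitor(path="resource_log.csv", interval=5, samples=60):
    """Append one row of host metrics every `interval` seconds."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "cpu_percent", "mem_percent",
                         "bytes_sent", "bytes_recv"])
        for _ in range(samples):
            net = psutil.net_io_counters()
            writer.writerow([
                time.time(),
                psutil.cpu_percent(interval=None),   # CPU utilization since last call
                psutil.virtual_memory().percent,     # memory in use
                net.bytes_sent,                      # cumulative network counters
                net.bytes_recv,
            ])
            time.sleep(interval)


if __name__ == "__main__":
    monitor()
```

A full product would ship these samples to a central collector and cover load balancers, databases, and application servers as well, but the principle is the same.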

SOASTA says load testing is the killer app for cloud computing. I hadn't thought about it until now, but it makes a lot of sense.

"Load and performance testing is the ‘killer application’ for cloud computing," said Tom Lounibos, CEO of SOASTA. "Companies can very easily create a real-world test environment without having to invest in it. Developers have virtually unlimited and affordable access to thousands of servers, memory, storage, etc. and can, essentially on demand, simulate load and performance tests for tens of thousands of users without having to purchase the hardware."

Why SOASTA CloudTest Lab is uniquely different:

  • It’s Real World: Load and performance testing in cloud computing environments is the closest thing to running an application in production minus the worry of negatively impacting your customers. SOASTA CloudTest Lab provides you with a controlled environment to thoroughly simulate and stage a real world scenario before it goes into production.
  • It’s On Demand: No more costly investment in new hardware or worries about staffing up for support and management. Ready when you are, SOASTA CloudTest Lab serves as a virtual test lab at your service 24x7x365 and has you testing in a matter of minutes.
  • It’s Scalable: An overloaded Web site is a major problem. Being prepared and understanding the limits of your application is crucial to maintaining availability and a quality customer experience. SOASTA CloudTest Lab allows you instant access to up to 1,000 available test servers when needed, and the ability to shut them down when unused to reduce costs. In short, you pay only for what you need. Find out, without the cost and time of setting up and tearing down hundreds if not thousands of servers, whether or not your Web application can scale exponentially at a moment’s notice.
  • It’s Affordable: Load and performance testing is no longer cost prohibitive. A capability that would typically cost hundreds of thousands of dollars is now available to companies of all sizes in a matter of minutes for only a few hundred dollars.

Data Center Best Practices – Microsoft, Google, or Uptime Institute, Don’t Forget Sun, HP, Dell, and IBM

TechHermit has a post based on Uptime's latest conference.

Uptime Institute to IT, Microsoft and Google are your enemy

October 31, 2008

At the Uptime conference this last week in Dallas, Ken Brill shocked many of the attendees by throwing out some amazing statements that can only be classified as fear mongering. Perhaps the rhetoric was a first salvo in an inflammatory exchange between industry giants.

The comments in question came up in a meeting of Uptime members assembled for the event. His comments to the audience centered on how Microsoft and Google were a direct threat to everyone in attendance. Perhaps this was a result of the Microsoft Azure release last week and the existence of Google Apps. In addition, many of the comments were directed toward the PUE metric, specifically that Microsoft and Google were highlighting low PUE results of 1.2x that he called unrealistic and flawed.
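
For readers who don't follow the metric: PUE (Power Usage Effectiveness) is total facility power divided by the power delivered to the IT equipment, so a PUE of 1.2 means roughly 20 watts of cooling and power-distribution overhead for every 100 watts of compute. A quick back-of-the-envelope illustration (the numbers are made up, not Microsoft's or Google's):

```python
# PUE = total facility power / IT equipment power.
# Example: a facility drawing 6.0 MW in total to feed 5.0 MW of IT load.
total_facility_kw = 6000.0   # illustrative figure only
it_equipment_kw = 5000.0     # illustrative figure only

pue = total_facility_kw / it_equipment_kw                  # 1.2
overhead_share = 1 - it_equipment_kw / total_facility_kw   # ~0.17 of total power

print(f"PUE = {pue:.2f}; overhead is {overhead_share:.0%} of total facility power")
```

The dispute at the conference was not about the arithmetic but about whether the 1.2x figures Microsoft and Google cite are realistic.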

Now I admit to being biased, as I have many more conversations with Microsoft and Google engineers than with Uptime Institute engineers. Oh wait, I have talked to Uptime sales people, though.

Microsoft has its Power of Software blog, the Microsoft data center team presenting at conferences, and even Microsoft Research giving away data center knowledge at no charge.

Google has its data center site.

Sun has both a Blog and data center site.

HP, Dell, and IBM all have data center offerings as well.

The one company that doesn't say anything about its data centers is Amazon, even though it has a huge presence with Amazon Web Services. So don't expect Ken Brill or Uptime to throw any criticism at Amazon.

Unfortunately for Ken Brill and the Uptime Institute, they are making enemies faster than friends.

How long can Uptime Institute survive?

I know most of you out there don't care, but a few of you do, and this is an interesting case study in sharing information versus charging for information to improve the efficiency of your data center.

Right now the market perception is that Microsoft and Google are the leaders in data center innovation, and all of these companies are battling to be at the top, which is good for the whole industry.


Where the Clouds Meet The Ground

The Economist has a feature on cloud computing. An audio version is available here.

Nicholas Carr summarizes The Economist article.

The Economist tours the cloud

October 25, 2008

The new issue of The Economist features a good primer on cloud computing, written by Ludwig Siegele, which looks at trends in data centers, software, networked devices, and IT economics and speculates about the broader implications for businesses and nations. A free pdf of the entire report is also available.

Siegele notes that the hype surrounding the term "cloud computing" may have peaked already - Google searches for the phrase have fallen after a big spike in July - but that "even if the term is already passé, the cloud itself is here to stay and to grow. It follows naturally from the combination of ever cheaper and more powerful processors with ever faster and more ubiquitous networks. As a result, data centres are becoming factories for computing services on an industrial scale; software is increasingly being delivered as an online service; and wireless networks connect more and more devices to such offerings." The "precipitation from the cloud," he concludes (milking the passé metaphor one last time), "will be huge."

Part of the report is a specific feature on Data Centres.

CORPORATE IT

Where the cloud meets the ground

Oct 23rd 2008
From The Economist print edition

Data centres are quickly evolving into service factories

Illustration by Matthew Hodson

IT IS almost as easy as plugging in a laser printer. Up to 2,500 servers—in essence, souped-up personal computers—are crammed into a 40-foot (12-metre) shipping container. A truck places the container inside a bare steel-and-concrete building. Workers quickly connect it to the electric grid, the computer network and a water supply for cooling. The necessary software is downloaded automatically. Within four days all the servers are ready to dish up videos, send e-mails or crunch a firm’s customer data.

This is Microsoft’s new data centre in Northlake, a suburb of Chicago, one of the world’s most modern, biggest and most expensive, covering 500,000 square feet (46,000 square metres) and costing $500m. One day it will hold 400,000 servers. The entire first floor will be filled with 200 containers like this one. Michael Manos, the head of Microsoft’s data centres, is really excited about these containers. They solve many of the problems that tend to crop up when putting up huge data centres: how to package and transport servers cheaply, how to limit their appetite for energy and how to install them only when they are needed to avoid leaving expensive assets idle.

But containers are not the only innovation of which Mr Manos is proud. Microsoft’s data centres in Chicago and across the world are equipped with software that tells him exactly how much power each application consumes and how much carbon it emits. “We’re building a global information utility,” he says.

Engineers must have spoken with similar passion when the first moving assembly lines were installed in car factories almost a century ago, and Microsoft’s data centre in Northlake, just like Henry Ford’s first large factory in Highland Park, Michigan, may one day be seen as a symbol of a new industrial era.
