When will solid state memory servers be an option in AWS instances?

I was having another stimulating conversation in Silicon Valley last night, and one of the ideas that made sense is for solid state memory servers to become part of the cloud computing options.  It's just a matter of time.  Amazon's current instance offerings are divided along performance and memory lines.

Standard Instances

Instances of this family are well suited for most applications.

Small Instance (default)

1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage (150 GB plus 10 GB root partition)
32-bit platform
I/O Performance: Moderate

Large Instance

7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage (2×420 GB plus 10 GB root partition)
64-bit platform
I/O Performance: High

Extra Large Instance

15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage (4×420 GB plus 10 GB root partition)
64-bit platform
I/O Performance: High

High-Memory Instances

Instances of this family offer large memory sizes for high throughput applications, including database and memory caching applications.

High-Memory Double Extra Large Instance

34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High

High-Memory Quadruple Extra Large Instance

68.4 GB of memory
26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each)
1,690 GB of instance storage
64-bit platform
I/O Performance: High

High-CPU Instances

Instances of this family have proportionally more CPU resources than memory (RAM) and are well suited for compute-intensive applications.

High-CPU Medium Instance

1.7 GB of memory
5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each)
350 GB of instance storage
32-bit platform
I/O Performance: Moderate

High-CPU Extra Large Instance

7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1,690 GB of instance storage
64-bit platform
I/O Performance: High
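
For reference, choosing one of these sizes is just a parameter on instance launch. Below is a minimal sketch assuming the boto3 SDK and a placeholder AMI ID; "m1.small" is the EC2 API name for the Small Instance described above.

```python
import boto3

# Minimal sketch: launch one of the instance types listed above.
# The AMI ID is a placeholder; "m1.small" is the EC2 API name for the
# Small Instance (1.7 GB memory, 1 ECU, 32-bit).
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-12345678",   # placeholder AMI
    InstanceType="m1.small",
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```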

But as Virident's offering shows, you can get higher performance with large memory addressing if you are running MySQL or memcached, resulting in higher performance per watt, which should translate into higher performance per dollar.

GreenCloud Server for MySQL

The GreenCloud Server for MySQL delivers extreme performance improvement over industry standard servers using disk arrays or SSDs, including high-performance PCIe SSDs, on Web 2.0 workloads. Virident optimized versions of MyISAM and InnoDB storage engines directly access datasets stored in the storage class memory tier to eliminate I/O bottlenecks. GreenCloud servers sustain significantly higher query rates, dramatically lower the cost of scaling to larger datasets, and simplify the replication and sharding processes usually employed for scaling. The extreme performance additionally makes it possible to obtain new insights into data and deliver new services by running complex operations such as multi-table joins, which are beyond the reach of traditional servers.

  • 50-70x performance versus Industry Standard Servers with hybrid disk/DRAM configuration on third party benchmarks.
  • 5-7x versus the fastest PCIe-based SSD systems.
  • Binary compatible with existing InnoDB and MyISAM databases.
  • 30-35x Improvement in QPS/Watt.
  • 10-15x Improvement in QPS/$.
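
To make the workload concrete, here is a rough sketch of the kind of join-heavy query such a MySQL server would be asked to sustain, using the standard mysql-connector-python client. The connection parameters, schema, and tables are hypothetical illustrations and have nothing to do with Virident's benchmarks.

```python
import time
import mysql.connector  # pip install mysql-connector-python

# Hypothetical connection and schema; illustrates the kind of multi-table
# join the post describes as "beyond the reach of traditional servers".
cnx = mysql.connector.connect(host="localhost", user="app",
                              password="secret", database="webapp")
cur = cnx.cursor()

start = time.time()
cur.execute("""
    SELECT u.id, COUNT(o.id) AS orders, SUM(oi.price) AS revenue
    FROM users u
    JOIN orders o       ON o.user_id = u.id
    JOIN order_items oi ON oi.order_id = o.id
    GROUP BY u.id
    ORDER BY revenue DESC
    LIMIT 100
""")
rows = cur.fetchall()
print(f"{len(rows)} rows in {time.time() - start:.2f}s")
```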

GreenCloud Server for Memcached

The Virident GreenCloud Server for Memcached delivers a new standard of high-performance and cache size scaling for the popular distributed caching application. These servers can deliver 250K object gets per second with low and predictable latencies and support caches with up to 3 billion objects, increasing performance by up to 4x and the available cache memory by up to 8x versus industry standard servers. These performance and scaling benefits permit larger key spaces to be supported by a single server and decrease cache miss rates thereby reducing load on backend database servers.

  • Industry-leading performance
    ▫ Up to 250K object gets per second w/ average size of 200-300 bytes
    ▫ Supports a larger object cache – up to 3 Billion objects
  • Higher cache hit rates due to larger caches – up to 8x versus industry standard servers
    ▫ Lower the backend database load up to 50%
  • 50-70% decrease in TCO
    ▫ GreenCloud servers can replace 4 or more traditional servers in a server consolidation project
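
As a rough illustration of what an "object gets per second" figure measures, here is a tiny client-side sketch using the pymemcache library against a local memcached. The object size and key count are placeholders, and a single-client loop like this says nothing about Virident's server-side numbers.

```python
import time
from pymemcache.client.base import Client  # pip install pymemcache

# Hypothetical local memcached; measures client-observed gets per second
# for objects in the 200-300 byte range mentioned above.
client = Client(("127.0.0.1", 11211))
value = b"x" * 250                        # ~250-byte object
keys = [f"obj:{i}" for i in range(10_000)]
for k in keys:
    client.set(k, value)

start = time.time()
for k in keys:
    client.get(k)
elapsed = time.time() - start
print(f"{len(keys) / elapsed:,.0f} gets/sec (single client, single connection)")
```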

I would expect AWS is evaluating this, and it will be here by the summer.


Ex-Microsoft Security Evangelist works for AWS; shouldn't he have been transferred to Azure instead of being laid off?

A new AWS technical evangelist has a blog entry.

Hello, world!

Good day, everyone. I'm Steve Riley. In July 2009 I joined the AWS evangelism team. I spent my first few months absorbing information about all our offerings and am now getting back on the road again, speaking at various events and user groups and meeting with customers. I came from Microsoft, where I was in the telecommunications consulting practice for three years and in the Trustworthy Computing group for seven. I was a global security evangelist there and also worked closely with our chief security officer and enterprise security architect communities. I'm continuing that work here at Amazon Web Services, concentrating on enterprise deployment of cloud computing, all things cloud security, and of course the Windows Server aspects of our offerings.

I'm very excited to be part of AWS. The cloud is the future, and I look forward to meeting many of you and working together. As with all of us on the team, I'm here to help you succeed. More information in the links below.

Steve has a nice map of Amazon EC2, S3, and CloudFront from one of his presentations, available on his presentations page.

What I found interesting is that Steve Riley was laid off from Microsoft's security group, Trustworthy Computing.

Good bye, and good luck

Friends, as a part of Microsoft’s second round of restructuring, my position was eliminated yesterday and my employment with Microsoft has ended.

Shouldn't Steve have been transferred to Windows Azure instead of being laid off and then hired by Amazon Web Services?


Amazon delivers elastic cloud computing pricing, driving creative destruction of IT business models

I was at first hesitant to write another Amazon Web Services post, as I have written so many Amazon posts lately, but AWS's latest announcement of spot pricing will drive changes at multiple levels.

What AWS spot pricing has done is simple.  You can now bid for unused AWS capacity in a spot market for EC2 instances.

Amazon EC2 Spot Instances

Spot Instances are a new way to purchase and consume Amazon EC2 Instances. They allow customers to bid on unused Amazon EC2 capacity and run those instances for as long as their bid exceeds the current Spot Price. The Spot Price changes periodically based on supply and demand, and customers whose bids meet or exceed it gain access to the available Spot Instances. Spot Instances are complementary to On-Demand Instances and Reserved Instances, providing another option for obtaining compute capacity.
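
Mechanically, placing a bid is a single API call. Here is a minimal sketch using the boto3 SDK (which post-dates this announcement); the AMI ID, instance type, and bid price are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Bid $0.04/hour for one m1.small Spot Instance; the instance keeps running
# only while the current Spot Price stays at or below this maximum price.
response = ec2.request_spot_instances(
    SpotPrice="0.04",               # maximum price you are willing to pay
    InstanceCount=1,
    Type="one-time",
    LaunchSpecification={
        "ImageId": "ami-12345678",  # placeholder AMI
        "InstanceType": "m1.small",
    },
)
print(response["SpotInstanceRequests"][0]["SpotInstanceRequestId"])
```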

Amazon CTO Werner Vogels summarizes the significance.

Spot instances are a great innovation that, as far as I know, has no equivalent in the IT industry. It brings our customers a powerful new way of managing the cost for those workloads that are flexible in their execution and completion times. This new customer-managed pricing approach holds the power to make new areas of computing feasible for which the economics were previously unfavorable.

Why is this significant?  Nicholas Carr explains.

AWS: the new Chicago Edison

DECEMBER 14, 2009

The key to running a successful large-scale utility is to match capacity (ie, capital) to demand, and the key to matching capacity to demand is to manipulate demand through pricing. The worst thing for a utility, particularly in the early stages of its growth, is to have unused capacity. At the end of the nineteenth century, Samuel Insull, president of the then-tiny Chicago Edison, started the electric utility revolution when he had the counterintuitive realization that to make more money his company had to cut its prices drastically, at least for those customers whose patterns of electricity use would help the utility maximize its capacity utilization.

Amazon Web Services is emerging as the Chicago Edison of utility computing. Perhaps because its background in retailing gives it a different perspective than that of traditional IT vendors, it has left those vendors in the dust when it comes to pioneering the new network-based model of supplying computing and storage capacity.

Besides the economic benefits, what this means is that there is now a financial incentive to re-architect applications so they can be efficiently turned on and off.  These questions are normally not asked at the enterprise architect level, but the AWS user base will now be asking them.

Architecting Applications to Use Spot Instances

There are a number of best practices to keep in mind when making use of Spot Instances:

Save Your Work Frequently: Because Spot Instances can be terminated without warning, it is important to build your applications in a way that allows you to make progress even if your application is interrupted. There are many ways to accomplish this, two of which include adding checkpoints to your application and splitting your work into small increments. Using Amazon EBS volumes to store your data is one easy way to protect your data.
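
Here is a minimal checkpointing sketch along those lines, assuming boto3 and a hypothetical bucket, key, and work function; the point is simply that an interrupted Spot Instance loses at most one small increment of work.

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-checkpoint-bucket", "job-42/progress.json"  # hypothetical

def process(item):
    pass  # stand-in for one small increment of real work

def load_checkpoint():
    try:
        body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
        return json.loads(body)["next_item"]
    except s3.exceptions.NoSuchKey:
        return 0

def save_checkpoint(next_item):
    s3.put_object(Bucket=BUCKET, Key=KEY,
                  Body=json.dumps({"next_item": next_item}))

for i in range(load_checkpoint(), 100_000):
    process(i)
    if i % 100 == 0:          # checkpoint every 100 items
        save_checkpoint(i + 1)
```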

Test Your Application: When using Spot Instances, it is important to make sure that your application is fault tolerant and will correctly handle interruptions. While we attempt to cleanly terminate your instances, your application should be prepared to deal with an immediate shutdown. You can test your application by running an On-Demand Instance and then terminating it suddenly. This can help you to determine whether or not your application is sufficiently fault tolerant and is able to handle unexpected interruptions.

Track when Spot Instances Start and Terminate: The simplest way to know the current status of your Spot Instances is to monitor your Spot requests and running instances via the AWS Management Console or Amazon EC2 API.

Choose a Maximum Price for Your Request: Remember that the maximum price that you submit as part of your request is not necessarily what you will pay per hour, but is rather the maximum you would be willing to pay to keep it running. You should set a maximum price for your request that is high enough to provide whatever probability you would like that your instances run for the amount of time that you desire within a given timeframe. Use the Spot Price history via the AWS Management Console or the Amazon EC2 API to help you set a maximum price.
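
One simple way to pick that maximum price is to look at the recent Spot Price history and add some headroom. A minimal sketch with boto3; the instance type and the "recent maximum plus 10%" rule are illustrations, not AWS guidance.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Bid a bit above the highest recently observed Spot Price so that short
# price spikes do not terminate the instances.
history = ec2.describe_spot_price_history(
    InstanceTypes=["m1.small"],
    ProductDescriptions=["Linux/UNIX"],
    MaxResults=100,
)
prices = [float(p["SpotPrice"]) for p in history["SpotPriceHistory"]]
max_price = max(prices) * 1.10
print(f"Recent max: {max(prices):.4f}, bidding: {max_price:.4f}")
```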

An example of using spot instances is Pfizer's Protein Engineering group, which architected their AWS application to split work into "must do" and "nice to do" jobs.

The Protein Engineering group at Pfizer has been using AWS to model Antibody-Antigen interactions using a protein docking system. Their protocol utilizes a full stack of services including EC2, S3, SQS, SimpleDB and EC2 Spot instances (more info can be found in a recent article by BioTeam's Adam Kraut, a primary contributor to the implementation). BioTeam described this system as follows:

The most computationally intensive aspect of the protocol is an all-atom refinement of the docked complex resulting in more accurate models. This exploration of the solution space can require thousands of EC2 instances for several hours.

Here's what they do:

We have modified our pipeline to submit "must do" refinement jobs on standard EC2 instances and "nice to do" workloads to the Spot Instances. With large numbers of standard instances we want to optimize the time to complete the job. With the addition of Spot Instances to our infrastructure we can optimize for the price to complete jobs and cluster the results that we get back from spot. Not unlike volunteer computing efforts such as Rosetta@Home, we load the queue with tasks and then make decisions after we get back enough work units from the spot instances. If we're too low on the Spot bids we just explore less solution space. The more Spot Instances we acquire the more of the energy landscape we can explore.
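
A minimal sketch of that kind of split, assuming boto3 and two hypothetical SQS queues, one drained by On-Demand workers and one by Spot workers; this illustrates the pattern, not Pfizer's or BioTeam's actual pipeline.

```python
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

# Hypothetical queue URLs: one drained by On-Demand workers, one by Spot.
MUST_DO_QUEUE = "https://sqs.us-east-1.amazonaws.com/123456789012/must-do"
NICE_TO_DO_QUEUE = "https://sqs.us-east-1.amazonaws.com/123456789012/nice-to-do"

def submit(job, required):
    queue = MUST_DO_QUEUE if required else NICE_TO_DO_QUEUE
    sqs.send_message(QueueUrl=queue, MessageBody=json.dumps(job))

# "Must do" refinements go to On-Demand capacity, the rest to Spot. If the
# Spot bid is too low, the nice-to-do queue simply drains more slowly and
# less of the solution space gets explored.
for i in range(1000):
    submit({"complex_id": i}, required=(i < 200))  # first 200 are "must do"
```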

Here is their architecture:

Going back to Werner Vogels' blog post, cloud computing has three different purchasing models.

Different Purchasing Models

The three different purchasing models Amazon EC2 offers give customers maximum flexibility in managing their IT costs; On-Demand Instances are charged by the hour at a fixed rate with no commitment; with Reserved Instances you pay a low, one-time fee and in turn receive a significant discount on the hourly usage charge for that instance; and Spot Instances provide the ability to assign the maximum price you want for capacity with flexible start and end times.

  • On-Demand Instances - On-Demand Instances let you pay for compute capacity by the hour with no long-term commitments or upfront payments. You can increase or decrease your compute capacity depending on the demands of your application and only pay the specified hourly rate for the instances you use. These instances are used mostly for short term workloads and for workloads with unpredictable resource demand characteristics.
  • Reserved Instances - Reserved Instances let you make a low, one-time, upfront payment for an instance, reserve it for a one or three year term, and pay a significantly lower rate for each hour you run that instance. You are assured that your Reserved Instance will always be available in the Availability Zone in which you purchased it. These instances are used for longer running workloads with predictable resource demands.
  • Spot Instances - Spot Instances allow you to specify the maximum hourly price that you are willing to pay to run a particular instance type. We set a Spot Price for each instance type in each region, which is the price all customers will pay to run a Spot Instance for that given hour. The Spot Price fluctuates based on supply and demand for instances, but customers will never pay more than the maximum price they have specified. These instances are used for workloads with flexible completion times.
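
The choice between the first two models above is a simple break-even calculation on expected hours of use. A sketch with illustrative numbers (not actual AWS rates):

```python
# Illustrative prices only -- not actual AWS rates.
ON_DEMAND_HOURLY = 0.085     # $/hour, pay as you go
RESERVED_UPFRONT = 227.50    # one-time fee for a one-year term
RESERVED_HOURLY = 0.03       # discounted $/hour once reserved

def yearly_cost(hours, reserved):
    return (RESERVED_UPFRONT + RESERVED_HOURLY * hours) if reserved \
        else ON_DEMAND_HOURLY * hours

hours = 6000
print(f"{hours} hours/year: on-demand ${yearly_cost(hours, False):.0f}, "
      f"reserved ${yearly_cost(hours, True):.0f}")

# Break-even point: upfront fee / (on-demand rate - reserved rate)
break_even = RESERVED_UPFRONT / (ON_DEMAND_HOURLY - RESERVED_HOURLY)
print(f"Reserved wins above ~{break_even:,.0f} hours/year "
      f"({break_even / 8760:.0%} utilization)")
```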

What's next for AWS?  Users are asking for sub-hour increments.  It makes sense if you continue down the path of spot market pricing and the goal of maximizing utilization.

This is awesome. Market pricing for computer power. People have dreamed of this and now Amazon is making it happen!

Now the real question is when will AWS start charging for half hours or quarter hours?

I have projects I need to run every hour for only 15 to 20 minutes ... but they need to run every hour.


Amazon Web Services adds global physical data shipping and receiving to cloud computing services

Amazon is setting the standard for cloud computing services.  AWS just announced a beta import/export service that allows 2TB of data to be imported to or exported from AWS S3 globally.

AWS Import/Export Goes Global

AWS Import/Export is a fast and reliable alternative to sending large volumes of data across the internet. You can send us a blank storage device and we'll copy the contents of one or more Amazon S3 buckets to it before shipping it back to you. Or, you can send us a storage device full of data and we'll copy it to the S3 buckets of your choice.

Until now, this service was limited to US shipping addresses and to S3's US Standard Region. We've lifted both of those restrictions; developers the world over now have access to AWS Import/Export. Here's what's new:

  • Storage devices can now be shipped to an AWS address in the EU for use with S3's EU (Ireland) Region. At this time, devices shipped to our AWS locations in the EU must originate from and be returned to an address within the European Union.
  • Storage devices can be shipped from almost anywhere in the world to a specified AWS address in the US for data loads into and out of buckets in the US Standard Region. Previously, devices could only be shipped from and returned to addresses in the United States.

What would you use this for?

Common Uses for AWS Import/Export

AWS Import/Export makes it easy to quickly transfer large amounts of data into and out of the AWS cloud. You can use AWS Import/Export for:

  • Data Migration – If you have data you need to upload into the AWS cloud for the first time, AWS Import/Export is often much faster than transferring that data via the Internet.
  • Offsite Backup – Send full or incremental backups to Amazon S3 for reliable and redundant offsite storage.
  • Direct Data Interchange – If you regularly receive content on portable storage devices from your business associates, you can have them send it directly to AWS for import into your Amazon S3 buckets.
  • Disaster Recovery – In the event you need to quickly retrieve a large backup stored in Amazon S3, use AWS Import/Export to transfer the data to a portable storage device and deliver it to your site.

When should you consider this service?  AWS answers this as well.

When to Use AWS Import/Export

If you have large amounts of data to load and an Internet connection with limited bandwidth, the time required to prepare and ship a portable storage device to AWS can be a small percentage of the time it would take to transfer your data over the internet. If loading your data over the Internet would take a week or more, you should consider using AWS Import/Export.

Below is a table that gives guidance around common internet connection speeds on: (1) how long it will take to transfer 1TB of data over the Internet into AWS (see the middle column for this estimate); and (2) what volume of total data will require a week to transfer over the Internet into AWS, and therefore warrants consideration of AWS Import/Export (see the right-hand column). For example, if you have a 10Mbps connection and expect to utilize 80% of your network capacity for the data transfer, transferring 1TB of data over the Internet to AWS will take 13 days. The volume at which this same setup will take at least a week is 600GB, so if you have 600GB of data or more to transfer and you want it to take less than a week to get into AWS, we recommend using AWS Import/Export.

Available Internet Connection | Theoretical Min. Days to Transfer 1TB at 80% Network Utilization | When to Consider AWS Import/Export
T1 (1.544Mbps)                | 82 days                                                          | 100GB or more
10Mbps                        | 13 days                                                          | 600GB or more
T3 (44.736Mbps)               | 3 days                                                           | 2TB or more
100Mbps                       | 1 to 2 days                                                      | 5TB or more
1000Mbps                      | Less than 1 day                                                  | 60TB or more
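
The arithmetic behind the table is straightforward. A small sketch that reproduces the figures above, assuming 1TB means 2^40 bytes and 80% sustained link utilization:

```python
def days_to_transfer(size_tb, link_mbps, utilization=0.8):
    """Days to move size_tb terabytes over a link_mbps connection."""
    bits = size_tb * (2 ** 40) * 8              # 1 TB = 2^40 bytes
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86400

for name, mbps in [("T1", 1.544), ("10Mbps", 10), ("T3", 44.736),
                   ("100Mbps", 100), ("1000Mbps", 1000)]:
    print(f"{name:>8}: {days_to_transfer(1, mbps):5.1f} days per TB")
```

At 10Mbps this works out to roughly 13 days per terabyte, matching the table.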

If anyone can efficiently receive and ship items it is Amazon, and it was smart to add this capability to AWS.  We'll see how long it takes before other cloud computing providers add this service.  My bet is you'll have to wait a while, as few would have thought to set up shipping and receiving as part of their cloud computing operations.


Amazon Web Services Economics Center, comparing AWS/cloud computing vs co-location vs owned data center

Amazon Web Services has a post on the Economics of AWS.

The Economics of AWS

For the past several years, many people have claimed that cloud computing can reduce a company's costs, improve cash flow, reduce risks, and maximize revenue opportunities. Until now, prospective customers have had to do a lot of leg work to compare the costs of a flexible solution based on cloud computing to a more traditional static model. Doing a genuine "apples to apples" comparison turns out to be complex — it is easy to neglect internal costs which are hidden away as "overhead".

We want to make sure that anyone evaluating the economics of AWS has the tools and information needed to do an accurate and thorough job. To that end, today we released a pair of white papers and an Amazon EC2 Cost Comparison Calculator spreadsheet as part of our brand new AWS Economics Center. This center will contain the resources that developers and financial decision makers need in order to make an informed choice. We have had many in-depth conversations with CIO's, IT Directors, and other IT staff, and most of them have told us that their infrastructure costs are structured in a unique way and difficult to understand. Performing a truly accurate analysis will still require deep, thoughtful analysis of an enterprise's costs, but we hope that the resources and tools below will provide a good springboard for that investigation.

The AWS team has laid out the costs of AWS Cloud vs. owned IT infrastructure.


Whitepaper
The Economics of the AWS Cloud vs. Owned IT Infrastructure. This paper identifies the direct and indirect costs of running a data center. Direct costs include the level of asset utilization, hardware costs, power efficiency, redundancy overhead, security, supply chain management, and personnel. Indirect factors include the opportunity cost of building and running high-availability infrastructure instead of focusing on core businesses, achieving high reliability, and access to capital needed to build, extend, and replace IT infrastructure.

If you have ever wished for a spreadsheet to help you calculate data center costs, AWS now has one.


The Amazon EC2 Cost Comparison Calculator is a rich Excel spreadsheet that serves as a starting point for your own analysis. Designed to allow for detailed, fact-based comparison of the relative costs of hosting on Amazon EC2, hosting on dedicated in-house hardware, or hosting at a co-location facility, this detailed spreadsheet will help you to identify the major costs associated with each option. We've supplied the spreadsheet because we suspect many of our customers will want to customize the tool for their own use and the unique aspects of their own business.
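
The spreadsheet is the authoritative tool; as a toy illustration of the shape of the comparison, here is a sketch with made-up numbers for an owned server versus a comparable on-demand instance, keyed on utilization, which the whitepaper calls out as a direct cost driver.

```python
# All figures are made-up placeholders; the AWS spreadsheet is the real tool.
OWNED_SERVER_PRICE = 3000   # purchase price, amortized over 3 years
POWER_AND_COOLING = 40      # $/month
SPACE_AND_ADMIN = 60        # $/month (rack space, share of admin time)
EC2_HOURLY = 0.34           # $/hour for a comparable on-demand instance

def owned_monthly():
    return OWNED_SERVER_PRICE / 36 + POWER_AND_COOLING + SPACE_AND_ADMIN

def ec2_monthly(utilization):
    return EC2_HOURLY * 730 * utilization  # pay only for hours actually used

for u in (0.25, 0.50, 1.00):
    print(f"{u:>4.0%} utilization: owned ${owned_monthly():.0f}/mo "
          f"vs EC2 ${ec2_monthly(u):.0f}/mo")
```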

And, they launched an Economics Center.

AWS Economics Center

The AWS Economics Center provides access to information, tools, and resources to compare the costs of Amazon Web Services with IT infrastructure alternatives. Our goal is to help developers and business leaders quantify the economic benefits (and costs) of cloud computing.

Overview

Amazon Web Services (AWS) gives your business access to compute, storage, database, and other in-the-cloud IT infrastructure services on demand, charging you only for the resources you actually use. With AWS you can reduce costs, improve cash flow, minimize business risks, and maximize revenue opportunities for your business.

  • Reduce costs and improve cash flow.
    Avoid the capital expense of owning servers or operating data centers by using AWS's reliable, scalable, and elastic infrastructure platform. AWS allows you to add or remove resources as needed based on the real-time demands of your applications. You can lower IT operating costs and improve your cash flow by avoiding the upfront costs of building infrastructure and paying only for those resources you actually use.
  • Minimize your financial and business risks.
    Simplify capacity planning and minimize both the financial risk of owning too many servers and the business risk of not owning enough servers by using AWS's elastic, on-demand cloud infrastructure. Since AWS is available without contracts or long-term commitments and supports multiple programming languages and operating systems, you retain maximum flexibility. And for many businesses, the security and reliability of the AWS platform often exceeds what they could develop affordably on their own.
  • Maximize your revenue opportunities.
    Maximize your revenue opportunities with AWS by allocating more of your time and resources to activities that differentiate your business to your customers – instead of focusing on IT infrastructure "heavy lifting." Use AWS to provision IT resources on-demand within minutes so your business's applications launch in days instead of months. Use AWS as a low-cost test environment to sample new business models, execute one-time projects, or perform experiments aimed at new revenue opportunities.

Capacity vs. Usage Comparison

This last graph is the Christmas wish list for enlightened green IT thinkers: IT load that tracks demand.
