Economist article on IT systems' effect on the financial crisis

The Economist has an article on the relationship between IT systems and the financial crisis.

The article starts by pointing out that financial services spends about $500 billion globally on IT each year, according to Gartner.

Banks and information technology

Silo but deadly

Dec 3rd 2009
From The Economist print edition

Messy IT systems are a neglected aspect of the financial crisis

NO INDUSTRY spends more on information technology (IT) than financial services: about $500 billion globally, more than a fifth of the total (see chart). Many of the world’s computers, networking and storage systems live in the huge data centres run by banks. “Banks are essentially technology firms,” says Hugo Banziger, chief risk officer at Deutsche Bank. Yet the role of IT in the crisis is barely discussed.

The point of the article is that IT silos made it difficult to see a bank's overall risk.

This fragmented IT landscape made it exceedingly difficult to track a bank’s overall risk exposure before and during the crisis. Mainly as a result of the Basel 2 capital accords, many banks had put in new systems to calculate their aggregate exposure. Royal Bank of Scotland (RBS) spent more than $100m to comply with Basel 2. But in most cases the aggregate risk was only calculated once a day and some figures were not worth the pixels they were made of.

During the turmoil many banks had to carry out big fact-finding missions to see where they stood. “Answering such questions as ‘What is my exposure to this counterparty?’ should take minutes. But it often took hours, if not days,” says Peyman Mestchian, managing partner at Chartis Research, an advisory firm. Insiders at Lehman Brothers say its European arm lacked an integrated picture of its risk position in the days running up to its demise.

But is IT really the cause, or is it the people who refuse to work with other groups? IT has grown so large because users want to own the data systems; information is power. As The Economist points out, the problem was discovering issues across systems.


Due to the power of the IT industry, people focus on going faster.

But many other banks are still in firefighting mode, says Mr Mestchian. Much of the money invested in IT still goes into making things faster rather than more transparent.

The change needed in IT is to think more about the transparency of systems and how they work with other systems. This will happen as social software permeates more of IT. The old term was collaboration; now it is social software/networking.

Imagine if Twitter and Facebook worked inside a financial firm's IT systems. Could you discover issues faster?

Read more

Amazon delivers elastic cloud computing pricing, driving creative destruction of IT business models

I was at first hesitant to write another Amazon Web Services post, as I have written so many Amazon posts lately, but AWS's latest announcement of spot pricing will drive changes at multiple levels.

What AWS spot pricing does is simple: you can now bid for unused EC2 capacity in a spot market.

Amazon EC2 Spot Instances

Spot Instances are a new way to purchase and consume Amazon EC2 Instances. They allow customers to bid on unused Amazon EC2 capacity and run those instances for as long as their bid exceeds the current Spot Price. The Spot Price changes periodically based on supply and demand, and customers whose bids meet or exceed it gain access to the available Spot Instances. Spot Instances are complementary to On-Demand Instances and Reserved Instances, providing another option for obtaining compute capacity.
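
The mechanics are easy to sketch. Below is a tiny, hypothetical Python illustration of the rule described in the quote: an instance accrues hours only while the bid is at or above the Spot Price, and each hour is charged at the Spot Price rather than the bid. The price history and bid are made-up numbers, and the sketch glosses over how interrupted requests get re-fulfilled.

```python
# Toy illustration of the Spot Instance mechanic described above (not the
# real AWS billing engine): the instance keeps running while the customer's
# bid is at or above the current Spot Price, and is interrupted otherwise.

def simulate_spot(bid, hourly_spot_prices):
    """Return (hours_run, total_cost) for a single instance under a fixed bid.

    Assumes the customer pays the prevailing Spot Price (never more than the
    bid) for each full hour the instance runs.
    """
    hours_run = 0
    total_cost = 0.0
    for price in hourly_spot_prices:
        if bid >= price:
            hours_run += 1
            total_cost += price      # charged the Spot Price, not the bid
        # if bid < price, the instance is interrupted for that hour
    return hours_run, total_cost

# Hypothetical price history ($/hour) -- illustrative numbers only.
history = [0.030, 0.032, 0.045, 0.051, 0.038, 0.029, 0.027]
print(simulate_spot(bid=0.040, hourly_spot_prices=history))
# -> runs for the 5 hours where the Spot Price is at or below $0.04
```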

Amazon CTO Werner Vogels summarizes the significance.

Spot instances are a great innovation that, as far as I know, has no equivalent in the IT industry. It brings our customers a powerful new way of managing the cost for those workloads that are flexible in their execution and completion times. This new customer-managed pricing approach holds the power to make new areas of computing feasible for which the economics were previously unfavorable.

Why is this significant?  Nicholas Carr explains.

AWS: the new Chicago Edison

DECEMBER 14, 2009

The key to running a successful large-scale utility is to match capacity (ie, capital) to demand, and the key to matching capacity to demand is to manipulate demand through pricing. The worst thing for a utility, particularly in the early stages of its growth, is to have unused capacity. At the end of the nineteenth century, Samuel Insull, president of the then-tiny Chicago Edison, started the electric utility revolution when he had the counterintuitive realization that to make more money his company had to cut its prices drastically, at least for those customers whose patterns of electricity use would help the utility maximize its capacity utilization.

Amazon Web Services is emerging as the Chicago Edison of utility computing. Perhaps because its background in retailing gives it a different perspective than that of traditional IT vendors, it has left those vendors in the dust when it comes to pioneering the new network-based model of supplying computing and storage capacity.

Beyond the economic benefits, this means there is now a financial incentive to re-architect applications so they can be turned on and off efficiently. These questions are normally not asked at the enterprise-architect level, but the AWS user base will now ask them.

Architecting Applications to Use Spot Instances

There are a number of best practices to keep in mind when making use of Spot Instances:

Save Your Work Frequently: Because Spot Instances can be terminated without warning, it is important to build your applications in a way that allows you to make progress even if your application is interrupted. There are many ways to accomplish this, two of which include adding checkpoints to your application and splitting your work into small increments. Using Amazon EBS volumes to store your data is one easy way to protect your data.

Test Your Application: When using Spot Instances, it is important to make sure that your application is fault tolerant and will correctly handle interruptions. While we attempt to cleanly terminate your instances, your application should be prepared to deal with an immediate shutdown. You can test your application by running an On-Demand Instance and then terminating it suddenly. This can help you to determine whether or not your application is sufficiently fault tolerant and is able to handle unexpected interruptions.

Track when Spot Instances Start and Terminate: The simplest way to know the current status of your Spot Instances is to monitor your Spot requests and running instances via the AWS Management Console or Amazon EC2 API.

Choose a Maximum Price for Your Request: Remember that the maximum price that you submit as part of your request is not necessarily what you will pay per hour, but is rather the maximum you would be willing to pay to keep it running. You should set a maximum price for your request that is high enough to provide whatever probability you would like that your instances run for the amount of time that you desire within a given timeframe. Use the Spot Price history via the AWS Management Console or the Amazon EC2 API to help you set a maximum price.
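
The first two practices boil down to checkpointing. Here is a minimal Python sketch, under the assumption that the work can be split into small increments and that progress is written somewhere durable (an EBS volume or S3 in practice); the file name and work function are illustrative, not part of any AWS API.

```python
import json
import os

CHECKPOINT = "progress.json"   # in practice, keep this on an EBS volume or in S3

def load_checkpoint():
    """Resume from the last saved increment, or start from scratch."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"next_item": 0, "results": []}

def save_checkpoint(state):
    """Write atomically so a termination mid-write cannot corrupt the file."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CHECKPOINT)

def process(item):
    return item * item          # stand-in for a real unit of work

def run(total_items):
    state = load_checkpoint()
    for i in range(state["next_item"], total_items):
        state["results"].append(process(i))
        state["next_item"] = i + 1
        save_checkpoint(state)  # at most one increment is lost on termination

run(1000)
```

On restart after an interruption, run() resumes at next_item instead of starting over.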

An example of using Spot Instances: Pfizer’s Protein Engineering group architected their AWS application around “must do” and “nice to do” workloads.

The Protein Engineering group at Pfizer has been using AWS to model Antibody-Antigen interactions using a protein docking system. Their protocol utilizes a full stack of services including EC2, S3, SQS, SimpleDB and EC2 Spot instances (more info can be found in a recent article by BioTeam's Adam Kraut, a primary contributor to the implementation). BioTeam described this system as follows:

The most computationally intensive aspect of the protocol is an all-atom refinement of the docked complex resulting in more accurate models. This exploration of the solution space can require thousands of EC2 instances for several hours.

Here's what they do:

We have modified our pipeline to submit "must do" refinement jobs on standard EC2 instances and "nice to do" workloads to the Spot Instances. With large numbers of standard instances we want to optimize the time to complete the job. With the addition of Spot Instances to our infrastructure we can optimize for the price to complete jobs and cluster the results that we get back from spot. Not unlike volunteer computing efforts such as Rosetta@Home, we load the queue with tasks and then make decisions after we get back enough work units from the spot instances. If we're too low on the Spot bids we just explore less solution space. The more Spot Instances we acquire the more of the energy landscape we can explore.
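
A toy version of that scheduling idea looks like the sketch below. This is not Pfizer's actual pipeline (which uses SQS and the EC2 API); the job list, bid threshold, and submit functions are hypothetical stand-ins, but the split is the same: "must do" work goes to standard instances, "nice to do" work goes to Spot only while the price is below the bid.

```python
# Illustrative sketch of the "must do" / "nice to do" split described above.
# Queue contents, the bid threshold, and the submit_* functions are
# hypothetical stand-ins for SQS queues and EC2 launch calls.

MAX_SPOT_BID = 0.05   # $/hour we are willing to pay for optional refinement work

def submit_on_demand(job):
    print(f"on-demand: {job}")            # stand-in for launching a standard EC2 instance

def submit_spot(job, bid):
    print(f"spot (bid ${bid}): {job}")    # stand-in for a Spot Instance request

def schedule(jobs, current_spot_price):
    for job in jobs:
        if job["priority"] == "must do":
            submit_on_demand(job["name"])           # optimize time-to-complete
        elif current_spot_price <= MAX_SPOT_BID:
            submit_spot(job["name"], MAX_SPOT_BID)  # optimize cost-to-complete
        # else: skip -- explore less of the solution space, as in the quote

schedule(
    [{"name": "refine-complex-1", "priority": "must do"},
     {"name": "refine-complex-2", "priority": "nice to do"}],
    current_spot_price=0.04,
)
```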

Here is their architecture:

Going back to Werner Vogels's blog post, Amazon EC2 offers three different purchasing models (a rough cost comparison is sketched after the list below).

Different Purchasing Models

The three different purchasing models Amazon EC2 offers give customers maximum flexibility in managing their IT costs; On-Demand Instances are charged by the hour at a fixed rate with no commitment; with Reserved Instances you pay a low, one-time fee and in turn receive a significant discount on the hourly usage charge for that instance; and Spot Instances provide the ability to assign the maximum price you want for capacity with flexible start and end times.

  • On-Demand Instances - On-Demand Instances let you pay for compute capacity by the hour with no long-term commitments or upfront payments. You can increase or decrease your compute capacity depending on the demands of your application and only pay the specified hourly rate for the instances you use. These instances are used mostly for short term workloads and for workloads with unpredictable resource demand characteristics.
  • Reserved Instances - Reserved Instances let you make a low, one-time, upfront payment for an instance, reserve it for a one or three year term, and pay a significantly lower rate for each hour you run that instance. You are assured that your Reserved Instance will always be available in the Availability Zone in which you purchased it. These instances are used for longer running workloads with predictable resource demands.
  • Spot Instances - Spot Instances allow you to specify the maximum hourly price that you are willing to pay to run a particular instance type. We set a Spot Price for each instance type in each region, which is the price all customers will pay to run a Spot Instance for that given hour. The Spot Price fluctuates based on supply and demand for instances, but customers will never pay more than the maximum price they have specified. These instances are used for workloads with flexible completion times.
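
As a rough feel for how the three models trade off, here is a back-of-the-envelope comparison for an always-on workload. All rates and the reservation fee are hypothetical placeholders, not Amazon's actual prices, and the spot figure only applies to work that tolerates interruption.

```python
# Hypothetical hourly rates and reservation fee -- not Amazon's actual prices.
ON_DEMAND_RATE = 0.10      # $/hour, no commitment
RESERVED_FEE   = 300.00    # one-time, one-year reservation fee
RESERVED_RATE  = 0.04      # discounted $/hour for a reserved instance
AVG_SPOT_RATE  = 0.03      # assumed average Spot Price over the year

HOURS_PER_YEAR = 24 * 365

def on_demand_cost(hours):
    return hours * ON_DEMAND_RATE

def reserved_cost(hours):
    return RESERVED_FEE + hours * RESERVED_RATE

def spot_cost(hours):
    return hours * AVG_SPOT_RATE   # only valid for interruption-tolerant work

always_on = HOURS_PER_YEAR
print(f"on-demand: ${on_demand_cost(always_on):,.0f}")   # ~ $876
print(f"reserved:  ${reserved_cost(always_on):,.0f}")    # ~ $650
print(f"spot:      ${spot_cost(always_on):,.0f}")        # ~ $263, if interruptions are acceptable
```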

What’s next for AWS? Users are asking for sub-hour billing increments. It makes sense if you continue down the path of spot-market pricing and maximizing utilization.

This is awesome. Market pricing for computer power. People have dreamed of this and now Amazon is making it happen!

Now the real question is when will AWS start charging for half hours or quarter hours?

I have projects I need to run every hour for only 15 to 20 minutes ... but they need to run every hour.
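
To see why sub-hour billing matters, a trivial calculation is enough. The hourly rate below is a made-up placeholder; the point is simply that an hourly-billed job that runs only 15 to 20 minutes pays for three to four times the compute it actually uses.

```python
# Hypothetical numbers: a job that runs 20 minutes out of every hour,
# billed at a placeholder rate of $0.10 per instance-hour.
HOURLY_RATE = 0.10
MINUTES_USED_PER_HOUR = 20

hourly_billing = HOURLY_RATE                         # pay for the full hour
quarter_hour_billing = HOURLY_RATE * (15 / 60) * 2   # two quarter-hours cover 20 minutes

print(f"per run, hourly billing:       ${hourly_billing:.3f}")
print(f"per run, quarter-hour billing: ${quarter_hour_billing:.3f}")  # half the cost here
print(f"waste under hourly billing: {1 - MINUTES_USED_PER_HOUR / 60:.0%} of each hour idle")
```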

Read more

Early indicator of Google Data Center growth? $400m SE Asia Japan cable project

The Guardian reports on the announcement.

Google backs world's fastest internet cable

• Undersea line set to run 5,000 miles across southeast Asia
• £245m cable marks latest investment in net infrastructure

In little more than a decade, Google has conquered the technology industry and become one of the world's most powerful companies. Its latest undertaking, however, may be one of its most ambitious: a giant undersea cable that will significantly speed up internet access around the globe.

The Californian search engine is part of a consortium that confirmed its plans to install the new Southeast Asia Japan Cable (SJC) yesterday, the centrepiece of a $400m (£245m) project that will create the highest capacity system ever built.

Gigaom references the 2008 SJC proposal.

Google’s Underwater Ambitions Expand

By Stacey Higginbotham, December 11, 2009

The original SJC proposal

Read more

Amazon Web Services adds global physical data shipping and receiving to cloud computing services

Amazon is setting the standard for cloud computing services.  AWS just announced a beta import/export service to allow 2TB of data to be imported or exported globally from AWS S3.

AWS Import/Export Goes Global

AWS Import/Export is a fast and reliable alternative to sending large volumes of data across the internet. You can send us a blank storage device and we'll copy the contents of one or more Amazon S3 buckets to it before shipping it back to you. Or, you can send us a storage device full of data and we'll copy it to the S3 buckets of your choice.

Until now, this service was limited to US shipping addresses and to S3's US Standard Region. We've lifted both of those restrictions; developers the world over now have access to AWS Import/Export. Here's what's new:

  • Storage devices can now be shipped to an AWS address in the EU for use with S3's EU (Ireland) Region. At this time, devices shipped to our AWS locations in the EU must originate from and be returned to an address within the European Union.
  • Storage devices can be shipped from almost anywhere in the world to a specified AWS address in the US for data loads into and out of buckets in the US Standard Region. Previously, devices could only be shipped from and returned to addresses in the United States.

What would you use this for?

Common Uses for AWS Import/Export

AWS Import/Export makes it easy to quickly transfer large amounts of data into and out of the AWS cloud. You can use AWS Import/Export for:

  • Data Migration – If you have data you need to upload into the AWS cloud for the first time, AWS Import/Export is often much faster than transferring that data via the Internet.
  • Offsite Backup – Send full or incremental backups to Amazon S3 for reliable and redundant offsite storage.
  • Direct Data Interchange – If you regularly receive content on portable storage devices from your business associates, you can have them send it directly to AWS for import into your Amazon S3 buckets.
  • Disaster Recovery – In the event you need to quickly retrieve a large backup stored in Amazon S3, use AWS Import/Export to transfer the data to a portable storage device and deliver it to your site.

When should you consider this service?  AWS answers this as well.

When to Use AWS Import/Export

If you have large amounts of data to load and an Internet connection with limited bandwidth, the time required to prepare and ship a portable storage device to AWS can be a small percentage of the time it would take to transfer your data over the internet. If loading your data over the Internet would take a week or more, you should consider using AWS Import/Export.

Below is a table that gives guidance, for common internet connection speeds, on: (1) how long it will take to transfer 1TB of data over the Internet into AWS (see the middle column); and (2) what volume of data will require a week or more to transfer over the Internet into AWS, and therefore warrants consideration of AWS Import/Export (see the right-hand column). For example, if you have a 10Mbps connection and expect to utilize 80% of your network capacity for the transfer, moving 1TB of data over the Internet to AWS will take 13 days. The volume at which this same set-up takes at least a week is 600GB, so if you have 600GB or more to transfer and want it in AWS in under a week, we recommend using AWS Import/Export.

Available Internet Connection | Theoretical Min. Number of Days to Transfer 1TB at 80% Network Utilization | When to Consider AWS Import/Export?
T1 (1.544Mbps) | 82 days | 100GB or more
10Mbps | 13 days | 600GB or more
T3 (44.736Mbps) | 3 days | 2TB or more
100Mbps | 1 to 2 days | 5TB or more
1000Mbps | Less than 1 day | 60TB or more
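
As a sanity check, the short sketch below recomputes both columns of the table from the raw line rates. The unit conventions (binary terabytes for the transfer-time column, decimal gigabytes for the weekly threshold) are my assumptions about how AWS rounded; the results come out close to the published figures.

```python
# Recompute the AWS Import/Export guidance table from raw line rates.
# Assumes 80% network utilization, 1TB = 2**40 bytes for the transfer-time
# column, and decimal GB for the "week's worth of data" column.

UTILIZATION = 0.80
ONE_TB_BYTES = 2 ** 40
SECONDS_PER_WEEK = 7 * 24 * 3600

links_mbps = {"T1": 1.544, "10Mbps": 10, "T3": 44.736, "100Mbps": 100, "1000Mbps": 1000}

for name, mbps in links_mbps.items():
    bytes_per_sec = mbps * 1_000_000 / 8 * UTILIZATION
    days_for_1tb = ONE_TB_BYTES / bytes_per_sec / 86400
    week_gb = bytes_per_sec * SECONDS_PER_WEEK / 1e9
    print(f"{name:>8}: {days_for_1tb:5.1f} days per TB, ~{week_gb:,.0f} GB per week")

# T1     -> ~82 days per TB, ~93 GB/week   (table: 82 days, 100GB or more)
# 10Mbps -> ~13 days per TB, ~605 GB/week  (table: 13 days, 600GB or more)
```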

If anyone can efficiently receive and ship items, it is Amazon, and it was smart to add this capability to AWS. We'll see how long it takes other cloud computing providers to add this service. My bet is you'll have to wait a while, as few would have thought to set up shipping and receiving in their cloud computing operations.

Read more

Container data center forms: a silo cylinder or a shipping container box

CLUMEQ has a new supercomputer, repurposing a decommissioned Van de Graaff particle accelerator silo.

The Quebec site is on the campus of Université Laval inside a renovated van de Graaf silo, with an innovative cylindrical layout for the data center. This cluster will feature upwards of 12,000 processing elements. Compute racks will be distributed among three floors of concentrical rings with a total surface area of 2,700 sq.ft. with an IT capacity of approximately 600 kW.

DataCenterKnowledge picked up the news.

Wild New Design: Data Center in A Silo

December 10th, 2009 : Rich Miller


A diagram of the design of the CLUMEQ Colossus supercomputer, from a recent presentation by Marc Parizeau of CLUMEQ.

Here’s one of the most unusual data center designs we’ve seen. The CLUMEQ supercomputing center in Quebec has worked with Sun Microsystems to transform a huge silo into a data center. The cylindrical silo, which is 65 feet high and 36 feet wide with two-foot thick concrete walls, previously housed a Van de Graaf particle accelerator. When the accelerator was decommissioned, CLUMEQ decided to convert the facility into a high-performance computing (HPC) cluster known as Colossus.

Here is the YouTube video.

This idea may seem strange, but it is part of connecting the building to the IT equipment. Microsoft just did something similar, showing its Windows Azure containers with the cooling system integrated into the container.


Sun has its own page on CLUMEQ.

When supercomputing consortium CLUMEQ designed its high-performance computing (HPC) system in Quebec, it was able to house it in the silo of a former particle accelerator on the Université Laval campus. The structure's 3-level cylindrical floor plan was ideal for cooling the 56 standard-size racks, and enabled the university to retain a treasured landmark.

Background

CLUMEQ is a supercomputing consortium of universities in the province of Quebec, Canada. It includes McGill University, Université Laval, and all nine components of the Université du Québec network. CLUMEQ supports scientific research in disciplines such as climate and ecosystems modeling, high energy particle physics, cosmology, nanomaterials, supramolecular modeling, bioinformatics, biophotonics, fluid dynamics, data mining and intelligent systems.

Read more