Google Data Centers, a part of their infrastructure advantage

I was talking to a Sr Google guy at a conference and asked what he does.  His response was "I work on Google's Infrastructure."

What is infrastructure as defined by SearchDataCenter?

DEFINITION - In information technology and on the Internet, infrastructure is the physical hardware used to interconnect computers and users. Infrastructure includes the transmission media, including telephone lines, cable television lines, and satellites and antennas, and also the routers, aggregators, repeaters, and other devices that control transmission paths. Infrastructure also includes the software used to send, receive, and manage the signals that are transmitted.

In some usages, infrastructure refers to interconnecting hardware and software and not to computers and other devices that are interconnected. However, to some information technology users, infrastructure is viewed as everything that supports the flow and processing of information.

Infrastructure companies play a significant part in evolving the Internet, both in terms of where the interconnections are placed and made accessible and in terms of how much information can be carried how quickly.

But the Google guy clarified that he works on the search and services infrastructure that supports Google's services. This is interesting: Google defines infrastructure more broadly than most people do.

Which fits with a competitive advantage Google has, one that GigaOm describes as an infrastructure advantage.

Google’s Growing Infrastructure Advantage

By Stacey Higginbotham, Mar. 17, 2010, 7:50am PDT

Google’s content comprises between 6 and 10 percent of global Internet traffic, making its internal network one of the top three ISPs in the world, according to Arbor Networks. The maker of deep packet inspection equipment, which runs a survey of international ISPs, detailed Google’s traffic in a blog post Tuesday.

The original information came from here with details on Google's use of direct peering.

The graph below shows an estimate of the average percentage of Google traffic per month using direct interconnection (i.e. not using a transit provider). As before, this estimate is based on anonymous statistics from 110 providers. In 2007, Google required transit for the majority of their traffic. Today, most Google traffic (more than 60%) flows directly between Google and consumer networks.

[Graph: estimated average percentage of Google traffic per month using direct interconnection]
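
As a rough sketch of how such an estimate could be computed, here is a minimal example in Python. The provider names and traffic figures are invented for illustration; Arbor's actual study aggregates anonymous statistics from 110 providers.

```python
# Minimal sketch of estimating the share of Google traffic delivered
# via direct interconnection. Provider names and numbers below are
# hypothetical; the real study covers 110 anonymous providers.

# Per provider: (total Google traffic in Gbps, directly peered Gbps)
provider_stats = {
    "provider_a": (120.0, 90.0),
    "provider_b": (80.0, 40.0),
    "provider_c": (200.0, 130.0),
}

total_traffic = sum(total for total, _ in provider_stats.values())
direct_traffic = sum(direct for _, direct in provider_stats.values())

share = direct_traffic / total_traffic
print(f"Direct interconnection share: {share:.1%}")
# With these made-up numbers: 65.0%, consistent with the "more than
# 60%" figure reported for 2010.
```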

So, even though the data center crowd thinks of data centers as infrastructure, Google has a bigger picture.

But even building out millions of square feet of global data center space, turning up hundreds of peering sessions and co-locating at more than 60 public exchanges is not the end of the story.

Over the last year, Google deployed large numbers of Google Global Cache (GGC) servers within consumer networks around the world. Anecdotal discussions with providers suggest more than half of all large consumer networks in North America and Europe now have a rack or more of GGC servers.

So, after billions of dollars of data center construction, acquisitions, and creation of a global backbone to deliver content to consumer networks, what’s next for Google?

I am regularly surprised by how often data center discussions cover only the data center itself, not the data center as part of the overall system.

Read more

Cisco targets Data Center Containers for Federal/Defense market, claims 50% capital and 30% operating cost savings

Containers have gone through their hype phase, and now we'll see how many buyers actually show up. There is some new media coverage on Cisco's move into containers.

Cisco claims that by purchasing a portable data center—which cost around $1.2 million for a 40-foot, fully loaded model and some $600,000 for a 20-footer—an enterprise can save 50 percent in capital expenses and 30 percent in operating expenses compared with a similar-sized, permanent land-based facility. But those are very general numbers.
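
To put the claimed savings in perspective, here is a back-of-the-envelope calculation. Only the $1.2 million container price and the 50%/30% savings claims come from the article; the baseline annual operating cost is an assumption for illustration.

```python
# Back-of-the-envelope view of Cisco's claimed container savings.
# Only the $1.2M price and the 50%/30% savings figures come from
# the article; the baseline opex is an assumption.

container_capex = 1.2e6        # 40-foot, fully loaded model
capex_savings = 0.50           # claimed capital expense savings
opex_savings = 0.30            # claimed operating expense savings

# Capex of a similar-sized permanent facility implied by the claim
implied_facility_capex = container_capex / (1 - capex_savings)

assumed_facility_opex = 300e3  # assumed annual opex, permanent facility
container_opex = assumed_facility_opex * (1 - opex_savings)

print(f"Implied permanent-facility capex: ${implied_facility_capex:,.0f}")
print(f"Container annual opex vs. ${assumed_facility_opex:,.0f} baseline: "
      f"${container_opex:,.0f}")
# Implied facility capex: $2,400,000; container opex: $210,000/year.
```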

InformationWeek Gov't has coverage.

Cisco has long been selling pieces of containerized data centers to the military through systems integrators, but with the company now selling servers in addition to network equipment, it has the product line in place to get into the containerized data center business.

"We're looking at a model of building a Cisco container -- with a Cisco part number -- that will contain the unified computing platform," said Bruce Klein, Cisco's U.S. public sector senior VP.

Cisco has a PDF on containers.

And, DataCenterKnowledge points out NASA's cloud computing container was delivered by Cisco.

Cisco Containers Target Federal Market

March 15th, 2010 : Rich Miller

The data center container housing the NASA Nebula cloud computing application arrives at Ames Research Center in Mountain View, Calif.

It’s no surprise that Cisco Systems has confirmed that it is officially developing a data center container offering. In reality, Cisco (CSCO) has been busy in the container market for some time, most visibly in procuring a container for the Nebula cloud computing project at the NASA Ames Research Center in Mountain View, Calif. The Nebula “data center in a box” was built inside a FOREST container from Verari Systems filled with Cisco Systems’ Unified Computing System (UCS).

Read more

Google Android’s team adds web expert Tim Bray as "developer advocate" for open web mobile experience

The battle between Google and Apple is reaching a media high point. Tim Bray, XML co-creator, has joined Google as a developer advocate, and the media is highlighting the competitive move.

Tim Bray lands on Android team

Posted by Dana Blankenhorn @ 6:12 am

XML co-creator Tim Bray has joined the exodus from Oracle and landed at Google, as a “developer advocate” for the Android.

Bray, who, like tech titans Steve Jobs and Bill Gates, was born in 1955 (as was this humble blogger), is now expected to be a much more familiar face to reporters, contrasting what he calls Android’s open development vision with the Apple iPhone’s “sterile Disney-fied walled garden surrounded by sharp-toothed lawyers.”

Tim Bray is a well-known blogger, and he writes about his move to Google.

How? · Google and I have been a plausible match for a long time. Web-centric, check. Search, check. Open-source, check. The list goes on. We’ve talked repeatedly over the years, but the conversations all ended at the point when I said “...and I don’t want to move to the Bay Area”.

Being an experienced tech guy born in 1955, he gets the importance of energy-efficient computing.

I’m not going to stop worrying about concurrent programming, because our failure to equip developers to do it right is going to bite our asses just as hard in the mobile space as anywhere else. Maybe harder, since mobiles are power-starved by definition and current data seem to show that slower many-core CPUs give you more computing per milliwatt.
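
Bray's point can be made concrete with some rough arithmetic. The core counts, clock speeds, and power figures below are invented for the example; real chips vary widely, and power scales worse than linearly with frequency, which is exactly why many slower cores can win.

```python
# Rough illustration of "slower many-core CPUs give you more
# computing per milliwatt." All figures are invented for the example.

def perf_per_milliwatt(cores: int, ghz_per_core: float, power_mw: float) -> float:
    """Aggregate throughput (core-GHz) per milliwatt of power."""
    return cores * ghz_per_core / power_mw

# One fast core vs. four slower cores at lower total power
fast_single = perf_per_milliwatt(cores=1, ghz_per_core=2.0, power_mw=1000)
slow_many = perf_per_milliwatt(cores=4, ghz_per_core=0.8, power_mw=800)

print(f"fast single-core: {fast_single:.4f} core-GHz/mW")
print(f"slow many-core:   {slow_many:.4f} core-GHz/mW")
# 0.0020 vs. 0.0040: the many-core part delivers twice the throughput
# per milliwatt, but only if the software is concurrent enough to keep
# all the cores busy, which is Bray's concurrency worry.
```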

Combine the energy efficiency focus of the Android mobile team with Urs Hoelzle's data center team, and there are huge synergies for energy-efficient systems.

He who can use less energy for the same performance and capability has the advantage.

Read more

Pike Research forecasts 2010 to 2015 microgrid growth from 100 to 2,000

One of the biggest changes coming to the power grid is microgrids.  Pike Research has a report on microgrids.

More than 2,000 Microgrids to be Deployed by 2015

January 26, 2010

Microgrids, which are “islanded” power generation and distribution zones that can operate autonomously from the larger electrical grid, are an increasing area of focus for institutions, governments, corporations, and utilities.  According to a recent report from Pike Research, a variety of trends are converging to create significant growth potential for microgrids, and the cleantech market intelligence firm forecasts that more than 2,000 sites will be operational worldwide by 2015, up from fewer than 100 in 2010.
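
For a sense of how aggressive that forecast is, the implied compound annual growth rate works out to roughly 82% per year:

```python
# Implied compound annual growth rate of Pike Research's forecast,
# using 100 microgrids in 2010 and 2,000 in 2015 as the endpoints.
sites_2010 = 100
sites_2015 = 2000
years = 5

cagr = (sites_2015 / sites_2010) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # about 82.1% per year
```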

“The distinguishing feature of a microgrid is the ability to separate and isolate itself from the utility’s distribution system during brownouts and blackouts,” says managing director Clint Wheelock.  “This degree of localized control is compelling for many microgrid proponents during this time of increasing concern over grid reliability.”

Out of the 2,000, one microgrid will be at the Ewing Industrial Park in Columbia, MO, the site where the Open Source Data Center Initiative ideas will be tested.

There is a lot of information in the report, which you can buy here.

Key questions addressed:
  • What is a “microgrid” and what are its key components and features?
  • Why are inverters the key advance enabling microgrids to develop today despite opposition from many electric utilities?
  • What are the key market drivers at the policy level – and why does the United States have the best near-term market opportunity?
  • Why are microgrids inevitable if investments in a smart grid are accompanied by a paradigm shift from central station to distributed generation supply sources?
  • Who are the big players – and new technology vendors – in the microgrid space, and what is their key role in developing this new energy market?
Who needs this report?
  • Microgrid Developers
  • Smart Grid Hardware and Software Providers
  • Venture Capitalists
  • Communities, institutions, and corporations interested in building their own microgrid
  • Distribution Utilities worried about worker safety and market share issues
  • Policy Makers examining new business models for renewable generation

Even though we could buy a copy of the report, our first preference is to develop things from scratch with an open source approach, then publish the results.  I would assume that if we bought a copy of the report, we couldn't republish anything from it.  And any ideas we come up with could potentially be limited by the fact that we bought a research publication.

Which means we most likely will not be buying any other research, as it would limit our ability to publish.

Read more

Google Warehouse Scale Computing pattern harvested, solving current and future performance problems

The Open Source Data Center Initiative is using a Pattern based approach.

In software engineering, a design pattern is a general reusable solution to a commonly occurring problem in software design. A design pattern is not a finished design that can be transformed directly into code. It is a description or template for how to solve a problem that can be used in many different situations.

I was reading Google's Warehouse Scale Computing document, which can be daunting with its 120 pages of dense topics.  One of the points it makes, which is an example of a design pattern, applies under the following conditions.

Key pieces of Google’s services have release cycles on the order of a couple of weeks compared to months or years for desktop software products. Google’s front-end Web server binaries, for example, are released on a weekly cycle, with nearly a thousand independent code changes checked in by hundreds of developers—the core of Google’s search services has been reimplemented nearly from scratch every 2 to 3 years.

This may not sound like your environment, but it is common in agile, dynamic software development at Google, start-ups, and other leading-edge IT shops.

Agile methods generally promote a disciplined project management process that encourages frequent inspection and adaptation, a leadership philosophy that encourages teamwork, self-organization and accountability, a set of engineering best practices intended to allow for rapid delivery of high-quality software, and a business approach that aligns development with customer needs and company goals.

The old way of purchasing IT hardware to support an application's SLA is a lower priority.  The new way is to add hardware capabilities that support rapid innovation in software development.

A beneficial side effect of this aggressive software deployment environment is that hardware architects are not necessarily burdened with having to provide good performance for immutable pieces of code. Instead, architects can consider the possibility of significant software rewrites to take advantage of new hardware capabilities or devices.

BTW, here is what immutable means in software, a property that applies to many legacy systems.

In object-oriented and functional programming, an immutable object is an object whose state cannot be modified after it is created. This is in contrast to a mutable object, which can be modified after it is created.
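
A minimal Python illustration of the distinction (my example, not from Google's document):

```python
# Mutable vs. immutable objects in Python (illustrative example).

mutable = [1, 2, 3]        # lists are mutable
mutable.append(4)          # state changes in place: [1, 2, 3, 4]

immutable = (1, 2, 3)      # tuples are immutable
try:
    immutable[0] = 99      # any attempt to modify raises an error
except TypeError as err:
    print(f"Cannot modify: {err}")

# "Immutable" code in the warehouse-scale computing sense is analogous:
# legacy binaries the hardware must run well as-is, because no one is
# going to rewrite them for new hardware.
```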

Problem: How do you improve performance per watt across data center infrastructure and IT hardware?

Options:

  1. Improve data center efficiency, aka PUE.
  2. Buy more efficient IT HW.
  3. Improve HW utilization with virtualization and server consolidation.
  4. Add new hardware capabilities that support the future of software.

Solution: even though 1 - 3 are typical, the efficiencies from #4 could be considerably larger.  Some part of the data center and IT hardware should be designed for future applications vs. making future applications run on what past applications require.
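
To see why #4 can dwarf options 1 - 3, compare the gains each option can plausibly deliver. All of the gain factors below are illustrative assumptions, not measurements:

```python
# Illustrative comparison of the four options. Every gain factor here
# is an assumption for the sake of argument, not measured data.

options = {
    "1. Improve PUE (e.g., 2.0 -> 1.5)":          2.0 / 1.5,  # ~1.33x
    "2. More efficient IT hardware":               1.3,       # assumed
    "3. Virtualization / consolidation":           2.0,       # assumed
    "4. New HW + software rewrite (e.g., GPUs)":  10.0,       # assumed
}

for name, gain in options.items():
    print(f"{name}: {gain:.2f}x performance per watt")

# If the software can be rewritten for new hardware, which is the
# warehouse-scale computing paper's point, option 4's gain can exceed
# the other three combined (1.33 * 1.3 * 2.0 ≈ 3.5x).
```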

Examples of such technologies are NVIDIA GPUs, solid-state memory, startups with new hardware designs like www.tilera.com, and complete re-architecture of the data center system.

People are working on the complete re-architecture of the data center system as the performance per watt gains are huge.

How many data centers are designed for the current hardware vs the future? 50%, 75%, 90%, 95%, 98%

Should data centers be designed for a 5-year lifespan vs. 20 - 30 years to support more rapid innovation?  And then be made upgradable?

Read more