29 days, 11K servers of Google cluster data shared with researchers

Google had a crazy idea a year ago: let's share some of our cluster data with the research community. In January 2010, Google shared 7 hours of data.

Google Cluster Data



Google faces a large number of technical challenges in the evolution of its applications and infrastructure. In particular, as we increase the size of our compute clusters and scale the work that they process, many issues arise in how to schedule the diversity of work that runs on Google systems.

We have distilled these challenges into the following research topics that we feel are interesting to the academic community and important to Google:
  • Workload characterizations: How can we characterize Google workloads in a way that readily generates synthetic work that is representative of production workloads so that we can run stand alone benchmarks?
  • Predictive models of workload characteristics: What is normal and what is abnormal workload? Are there "signals" that can indicate problems in a time-frame that is possible for automated and/or manual responses?
  • New algorithms for machine assignment: How can we assign tasks to machines so that we make best use of machine resources, avoid excess resource contention on machines, and manage power efficiently?
  • Scalable management of cell work: How should we design the future cell management system to efficiently visualize work in cells, to aid in problem determination, and to provide automation of management tasks?
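
The machine-assignment question in that list is the easiest one to start poking at yourself. Here's a toy sketch of the kind of assignment policy researchers could benchmark against a trace -- not Google's scheduler, just a minimal best-fit heuristic over made-up machines and resource requests:

```python
# Toy best-fit assignment: place each task on the machine that leaves the
# least spare capacity, while respecting CPU and memory limits.
# Illustrative only -- this is not Google's cluster scheduler.

from dataclasses import dataclass, field

@dataclass
class Machine:
    name: str
    cpu: float                          # remaining normalized CPU capacity
    mem: float                          # remaining normalized memory capacity
    tasks: list = field(default_factory=list)

    def fits(self, cpu, mem):
        return self.cpu >= cpu and self.mem >= mem

    def assign(self, task_id, cpu, mem):
        self.cpu -= cpu
        self.mem -= mem
        self.tasks.append(task_id)

def best_fit(machines, tasks):
    """tasks: iterable of (task_id, cpu_request, mem_request)."""
    unplaced = []
    for task_id, cpu, mem in tasks:
        candidates = [m for m in machines if m.fits(cpu, mem)]
        if not candidates:
            unplaced.append(task_id)
            continue
        # Best fit: pick the machine with the least spare capacity after placement.
        target = min(candidates, key=lambda m: (m.cpu - cpu) + (m.mem - mem))
        target.assign(task_id, cpu, mem)
    return unplaced

if __name__ == "__main__":
    machines = [Machine("m1", 1.0, 1.0), Machine("m2", 0.5, 0.5)]
    tasks = [("job1/0", 0.4, 0.3), ("job1/1", 0.4, 0.3), ("job2/0", 0.6, 0.5)]
    leftover = best_fit(machines, tasks)
    for m in machines:
        print(m.name, "tasks:", m.tasks, "spare cpu/mem:", round(m.cpu, 2), round(m.mem, 2))
    print("unplaced:", leftover)
```

Swap in a different placement rule (first-fit, random, power-aware) and replay the same requests, and you have the skeleton of a scheduling experiment.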

Now Google has shared 29 days of data from about 11,000 servers in a Google cluster.

More Google Cluster Data



Google has a strong interest in promoting high quality systems research, and we believe that providing information about real-life workloads to the academic community can help.

In support of this we published a small (7-hour) sample of resource-usage information from a Google production cluster in 2010 (research blog on Google Cluster Data). Approximately a dozen researchers at UC Berkeley, CMU, Brown, NCSU, and elsewhere have made use of it.

Recently, we released a larger dataset. It covers a longer period of time (29 days) for a larger cell (about 11k machines) and includes significantly more information, including:

  • the original resource requests, to permit scheduling experiments
  • request constraints and machine attributes
  • machine availability and failure events
  • some of the reasons for task exits
  • (obfuscated) job and job-submitter names, to help identify repeated or related jobs
  • more types of usage information
  • CPI (cycles per instruction) and memory traffic for some of the machines
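
To give a sense of how researchers can actually work with the release, here is a minimal sketch that totals the CPU and memory requests per job from one task_events shard. The shard name and column positions are assumptions on my part -- the trace ships as gzipped CSV files with a separate schema document, so check that before relying on them:

```python
# Minimal sketch: summarize resource requests from one task_events shard of
# the cluster trace. Column indices are assumptions -- consult the schema
# file distributed with the dataset before relying on them.

import csv
import gzip
from collections import defaultdict

# Assumed column positions in task_events rows (the files have no header line).
JOB_ID, CPU_REQUEST, MEM_REQUEST = 2, 9, 10

def request_totals(path):
    """Return {job_id: [total_cpu_request, total_mem_request]} for one shard."""
    totals = defaultdict(lambda: [0.0, 0.0])
    with gzip.open(path, mode="rt", newline="") as f:
        for row in csv.reader(f):
            cpu = row[CPU_REQUEST]
            mem = row[MEM_REQUEST]
            if not cpu or not mem:          # requests can be missing/obfuscated
                continue
            totals[row[JOB_ID]][0] += float(cpu)
            totals[row[JOB_ID]][1] += float(mem)
    return totals

if __name__ == "__main__":
    # Hypothetical shard name; the real trace ships as many part-*.csv.gz files.
    totals = request_totals("task_events/part-00000-of-00500.csv.gz")
    for job, (cpu, mem) in list(totals.items())[:10]:
        print(job, round(cpu, 3), round(mem, 3))
```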

Besides the feedback from the research community, this is a great way for Google to find future hires.

Big Data Webinar, Dec 7, 2011, GigaOm and Splunk

GigaOM Pro is hosting a Big Data webinar on Dec 7, 2011, from 10 to 11 a.m. PT.

The Big Machine

How the Internet of Things Is Shaping Big Data

 

Even relatively conservative forecasts predict there will be 50 billion connected devices online by the end of the decade. Over time, the majority won’t be laptops or phones, but rather machine-to-machine connections from network infrastructure, sensors in cars, appliances, healthcare monitors and the like. They’ll produce data that needs to be combined and analyzed alongside structured data, application logs, customer info and social media streams. Already today, companies across multiple industries and government agencies are struggling to harness the sheer volume, complexity and variety of the data generated. In this webinar, we’ll look at the various kinds of machine-driven big data, how to develop an analytics and usage framework for them, and how companies can use these data to run their businesses.

Join GigaOM Pro and our sponsor Splunk for “The Big Machine: How the Internet of Things Is Shaping Big Data,” a free analyst roundtable webinar on Wednesday, December 7, 2011 at 10 a.m. PST.

I'll be on the panel along with other GigaOm analysts and Splunk's VP of Engineering, Stephen Sorkin.

Moderator

GigaOM Pro Cloud Curator, Founder, The Cloud of Data

Panelists

GigaOM Pro Analyst, Executive Director, Zettaforce
GigaOM Pro Analyst, Founder, GreenM3
VP of Engineering, Splunk

Storage vendor predicts data centers could be 25% of US power consumption by 2020. Huh??

I saw this press release and got a good laugh.

PRESS RELEASE

Dec. 5, 2011, 9:01 a.m. EST

Symform Forecasts Top 5 Cloud and Storage Predictions for 2012

New Year Will Ring in a "Storage Revolution" Amid Record Data Growth and Continued Data Center Bloat

 

 

SEATTLE, WA, Dec 05, 2011 (MARKETWIRE via COMTEX) -- The coming year will herald in a "storage revolution," according to Symform, which released its top cloud and storage predictions for 2012. Over the last year, top headlines centered on the growing popularity of cloud services, the staggering growth in data, and several high-profile data center outages. Based on these strong industry trends and insights gathered from customers, partners and industry experts, Symform predicts 2012 will be all about data -- how to store, secure, access and manage it. This will not only be a large enterprise trend but also impactful to the millions of small and medium-sized businesses (SMB), and the service providers who deliver solutions to them.

Check out this claim about energy consumption.

By 2020, Symform predicts that if left unchecked, more than 25 percent of the nation's power will be required to power data centers, unless businesses can identify new means for storing data without building additional data centers.
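
For a rough sense of why that number made me laugh, here's a back-of-envelope check. The inputs are my assumptions, not Symform's: data centers used roughly 2% of US electricity in 2010 (Koomey's estimate), and total US electricity use is on the order of 3,800 TWh a year.

```python
# Rough sanity check on the "25% of the nation's power by 2020" claim.
# Assumptions (approximate, mine rather than the press release's):
# data centers at ~2% of US electricity in 2010, US total ~3,800 TWh/year.

US_ELECTRICITY_TWH = 3800          # approximate annual US electricity use
DC_SHARE_2010 = 0.02               # ~2% of US electricity in 2010
DC_SHARE_2020_CLAIMED = 0.25       # Symform's 2020 prediction

dc_2010_twh = US_ELECTRICITY_TWH * DC_SHARE_2010
dc_2020_twh = US_ELECTRICITY_TWH * DC_SHARE_2020_CLAIMED   # holding total use flat

growth_factor = dc_2020_twh / dc_2010_twh
annual_growth = growth_factor ** (1 / 10) - 1

print(f"Implied data center load in 2020: {dc_2020_twh:.0f} TWh "
      f"(vs. ~{dc_2010_twh:.0f} TWh in 2010)")
print(f"That's a {growth_factor:.1f}x increase, ~{annual_growth:.0%} growth per year for a decade")
```

Even holding total consumption flat, the prediction implies data center energy use growing almost 30% a year, every year, for a decade. That's why I got a good laugh.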