Shifting the mindset from a Data Center to an Information Factory: the industrialization of the data center

There is a different way to think about data centers: the goal of a company is to bring in raw, unprocessed bits and turn them into higher-value bits, just as a factory brings in raw materials and transforms them into higher-value finished goods. The factory uses huge amounts of power in special buildings, with lots of equipment and custom processes to support the transformation. This is the industrialization of the data center.

Barton George has written a post on Big Data and how, in general, only 5% of data is used.

Big Data is the new Cloud

Big Data represents the next not-completely-understood got-to-have strategy.  This first dawned on me about a year ago and has continued to become clearer as the phenomenon has gained momentum.  Contributing to Big Data-mania is Hadoop, today’s weapon of choice in the taming and harnessing of  mountains of unstructured data, a project that has its own immense gravitational pull of celebrity.

So what

But what is the value of slogging through these mountains of data?  In a recent Forrester blog, Brian Hopkins lays it out very simply:

We estimate that firms effectively utilize less than 5% of available data. Why so little? The rest is simply too expensive to deal with. Big data is new because it lets firms affordably dip into that other 95%. If two companies use data with the same effectiveness but one can handle 15% of available data and one is stuck at 5%, who do you think will win?

But do you think Google, Facebook, Amazon, Twitter, and Zynga use only 5% of their data? These companies analyze all of their users and information, looking for where to make more money.

The new way of thinking is that all of that data is both market intelligence and the raw material for information factories.

Barton goes on to point out that Google, Facebook, and Yahoo are big users of Hadoop-type tools for analyzing unstructured big data.

Deal with it

Hadoop, which I mentioned above, is your first line of offense when attacking big data.  Hadoop is an open source highly scalable compute and storage platform.  It can be used to collect, tidy up and store boatloads of structured and unstructured data.  In the case of enterprises it can be combined with a data warehouse and then linked to analytics (in the case of web companies they forgo the warehouse).

And speaking of web companies, Hopkins explains:

Google, Yahoo, and Facebook used big data to deal with web scale search, content relevance, and social connections, and we see what happened to those markets. If you are not thinking about how to leverage big data to get the value from the other 95%, your competition is.

Some of you may think this is new, but it is standard practice for many. The winners think like an information factory, integrating across many different systems. The losers think of a data center as a place where their data is stored in silos to support internal organizational structures.