A Lesson from Minority Report: sometimes you want everybody to agree to be right

Two friends and I have been discussing a variety of technical and business decisions that need to be made. One rule we have adopted is that all three of us must agree on every decision. Having three decision makers is a good pattern: it ensures that a diversity of perspectives is included in the analysis, and decisions can still be made if one decision maker is not available.

Triple redundancy, though, is typically used so that as long as two systems agree, you can make a decision.

In computing, triple modular redundancy, sometimes called triple-mode redundancy,[1] (TMR) is a fault-tolerant form of N-modular redundancy, in which three systems perform a process and that result is processed by a voting system to produce a single output. If any one of the three systems fails, the other two systems can correct and mask the fault.
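The voting step described above can be sketched in a few lines. This is only a minimal illustration, assuming each of the three systems returns a hashable result; the function name `tmr_vote` is hypothetical and not taken from any library.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Return the result that at least two of the three systems agree on.

    Raises ValueError when all three results differ, because no single
    fault can be masked in that case.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no two systems agree")
    return value
```

Note that the voter masks the odd result silently; the caller never learns that one system disagreed.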

But an example of the flaw in this approach can be taken from Minority Report and its use of precogs, where zeal to reach a conclusion allows a "minority report" to be discarded.

Majority and minority reports

Each of the three precogs generates its own report or prediction. The reports of all the precogs are analyzed by a computer and, if these reports differ from one another, the computer identifies the two reports with the greatest overlap and produces a majority report, taking this as the accurate prediction of the future. But the existence of majority reports implies the existence of a minority report.
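As a rough sketch of that selection rule, assume each report is modeled as a set of predicted events (a modeling assumption for illustration, as are the function and variable names). The version below picks the two reports with the greatest overlap but returns the dissenting report instead of discarding it.

```python
from itertools import combinations


def majority_report(reports):
    """Split three reports into a majority report and a minority report.

    reports: dict mapping a source name to a set of predicted events.
    Returns (majority, dissenter, minority_only), where majority is the
    overlap of the two closest reports and minority_only holds what the
    dissenter predicted that the majority did not.
    """
    if len(reports) != 3:
        raise ValueError("expected exactly three reports")

    def overlap(pair):
        first, second = pair
        return len(reports[first] & reports[second])

    # Pick the pair of sources whose reports share the most events.
    first, second = max(combinations(reports, 2), key=overlap)
    majority = reports[first] & reports[second]
    (dissenter,) = set(reports) - {first, second}
    minority_only = reports[dissenter] - majority
    return majority, dissenter, minority_only
```

Keeping the dissenter in the return value is the design point: the majority still decides, but the disagreement stays visible for review.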

James Hamilton has a blog post on error detection. Errors could be considered the crimes in the data center. And you can falsely assume there are no errors (crimes) because there is error correction in various parts of the system.

Every couple of weeks I get questions along the lines of “should I checksum application files, given that the disk already has error correction?” or “given that TCP/IP has error correction on every communications packet, why do I need to have application level network error detection?” Another frequent question is “non-ECC mother boards are much cheaper -- do we really need ECC on memory?” The answer is always yes. At scale, error detection and correction at lower levels fails to correct or even detect some problems. Software stacks above introduce errors. Hardware introduces more errors. Firmware introduces errors. Errors creep in everywhere and absolutely nobody and nothing can be trusted.

Suppose you think like this:

This incident reminds us of the importance of never trusting anything from any component in a multi-component system. Checksum every data block and have well-designed, and well-tested failure modes for even unlikely events. Rather than have complex recovery logic for the near infinite number of faults possible, have simple, brute-force recovery paths that you can use broadly and test frequently. Remember that all hardware, all firmware, and all software have faults and introduce errors. Don’t trust anyone or anything. Have test systems that bit flips and corrupts and ensure the production system can operate through these faults – at scale, rare events are amazingly common.

Maybe you won't let the majority rule, and will listen to the minority instead. All it takes is one small system, a system in the minority, to bring down a service.
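The advice quoted above, to checksum every data block and to test with injected bit flips, can be sketched as follows. This is a minimal illustration using SHA-256 from Python's standard library; the helper names are hypothetical, and a real system would store the digest separately from the block it protects.

```python
import hashlib
import random


def store_block(data: bytes) -> tuple[bytes, str]:
    """Return the block together with its SHA-256 digest."""
    return data, hashlib.sha256(data).hexdigest()


def verify_block(data: bytes, digest: str) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return hashlib.sha256(data).hexdigest() == digest


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    """Fault injection: return a copy of data with one bit flipped."""
    corrupted = bytearray(data)
    index = rng.randrange(len(corrupted))
    corrupted[index] ^= 1 << rng.randrange(8)
    return bytes(corrupted)
```

A test harness could corrupt blocks with `flip_random_bit` and confirm that `verify_block` rejects every one of them before the data is used.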

A new way to regulate who you talk to at a Data Center event: Yellow and Red Cards to flag fouls

I was at the Open Compute Summit that Facebook hosted in NYC, and one of the data center executives was sucked into a sales conversation and pitched quite flagrantly, which interrupted our group's conversation. At an industry event where people have paid admission fees and/or exhibit fees, many salespeople think it is their right to sell to the attendees. You have little hope of getting an aggressive salesman to leave you alone.

Then it hit me after the salesman left: we should have yellow and red cards for attendees to flag flagrant behavior. I've ordered a few of these for a group of us to use in a week.

We are hosting our own event, so we can create our own rules.  We'll have fun with this idea.

Here is the record for yellow and red cards in a soccer match.

Is the Public Cloud a place of refuge from the infighting in Enterprise IT?

There are many reasons why the public cloud is popular. MSNBC has a post on how executives hate their jobs just as much as lower-level employees do.

Execs are just like you: They don't like their jobs, either


By Allison Linn

If you feel stuck in a job you don’t like, maybe you can take comfort in the fact that the big boss may well be in the same boat.

A new global survey of business executives finds that less than half like their jobs, although most don’t plan on leaving.

The Path Forward, a survey of 3,900 business executives from around the world conducted by consulting firm Accenture, found that only 42 percent said they were satisfied with their jobs. That’s down slightly from 2010.

And reading The Power of Habit reminded me of a possible reason for the displeasure: the fact that some companies are in a civil war.

Companies aren’t big happy families where everyone plays together nicely. Rather, most workplaces are made up of fiefdoms where executives compete for power and credit, often in hidden skirmishes that make their own performances appear superior and their rivals’ seem worse. Divisions compete for resources and sabotage each other to steal glory.

Companies aren’t families. They’re potential battlefields in a civil war.

Then it hit me that the Data Center is the one place where all these families (internal company teams) need to put their information. What place other than finance has the whole organization connecting to it? The finance scenario is probably easier, as it is ultimately a money issue. But enterprise IT is very complex.

If you accept that getting everyone to get along in enterprise IT is difficult, wearing, and frustrating, then maybe people just want to escape the mental anguish and feuding between groups. The lower costs and better service of a cloud environment like AWS could be side benefits when the real reason is the frustration of dealing with central enterprise IT. If you accept this as a potential reason why users have gone to the public cloud, then they are not going to be satisfied with a private cloud run by central enterprise IT.

 

AMD buys SeaMicro; SeaMicro CEO becomes GM of AMD's Data Center Server Solutions

SeaMicro has a press release announcing that AMD has agreed to acquire it.

AMD to Acquire SeaMicro: Accelerates Disruptive Server Strategy

— SeaMicro’s Low-Power, High-Bandwidth Microserver Solutions Set the Stage for AMD’s Disruptive Approach To Lead Fast-Growing Cloud Data Center Market

SUNNYVALE, Calif. — Feb. 29, 2012 – AMD (NYSE: AMD) today announced it has signed a definitive agreement to acquire SeaMicro, a pioneer in energy-efficient, high-bandwidth microservers, for approximately $334 million, of which approximately $281 million will be paid in cash. Through the acquisition of SeaMicro, AMD will be accelerating its strategy to deliver disruptive server technology to its OEM customers serving Cloud-centric data centers. With SeaMicro’s fabric technology and system-level design capabilities, AMD will be uniquely positioned to offer industry-leading server building blocks tuned for the fastest-growing workloads such as dynamic web content, social networking, search and video.

It's kind of funny to think of AMD selling Intel processors until the second half of 2012.

AMD’s server technology combined with SeaMicro technology provides customers with a range of processor choices and platforms that can help significantly reduce data center complexity, cost and energy consumption while improving performance.  AMD plans to offer the first AMD Opteron™ processor-based solutions that combine AMD and SeaMicro technology in the second half of 2012.  The company remains firmly committed to its traditional server business, and will continue to focus and invest in this area.

“By acquiring SeaMicro, we are accelerating AMD’s transformation into an agile, disruptive innovator capable of staking a data center leadership position,” said Rory Read, president and CEO, AMD.  “SeaMicro is a pioneer in low-power server technology.  The unmatched combination of AMD’s processing capabilities, SeaMicro’s system and fabric technology, and our ambidextrous technology approach uniquely positions AMD with a compelling, differentiated position to attack the fastest growing segment of the server market.”

The SeaMicro CEO has a new job.

“Cloud computing has brought a sea change to the data center -- dramatically altering the economics of compute by changing the workload and optimal characteristics of a server,” said Andrew Feldman, SeaMicro CEO, who will become general manager of AMD’s newly created Data Center Server Solutions business. “SeaMicro was founded to dramatically reduce the power consumed by servers, while increasing compute density and bandwidth.  By becoming a part of AMD, we will have access to new markets, resources, technology, and scale that will provide us with the opportunity to work tightly with our OEM partners as we fundamentally change the server market.”

Thinking about Big Data, here are 8 rules

Andreas Weigend has a blog post on the Eight Rules of Big Data.

Start with the problem, not with the data
Share data to get data
Align interests of all parties
Make it trivially easy for people to contribute, connect, collaborate
Base the equation of your business on customer centric metrics
Decompose the business into its “atoms”
Let people do what people are good at, and computers what computers are good at
Thou shalt not blame technology for barriers of institutions and society

Here is a video of Andreas presenting the Eight Rules at Walmart Labs.