Creating a better operations environment by thinking abstractly, using agency

I am watching Gabe Newell's talk to the LBJ School again. I owe much of my career change to Gabe; if it wasn't for him, I would not have left Apple to go to Microsoft in 1992. Now some of you may say that if I had stayed at Apple in 1992 I would be better off than having gone to Microsoft, but the chances of my staying at Apple after 1992 were slim.

And one of the nuggets is how Valve makes decisions on what to do to make better games. The 28-minute mark is where Gabe shares Valve's insight.

Below is a written explanation.

This idea is powerful because it is a good abstraction. I plan on using it.

Economics is really human behavior/psychology dressed up in fancy technical language. A lot of it is actually fairly common-sense, once you get past the language.

Here, specifically, “agency costs” refers to the principal-agent problem, aka, the agency problem.

In common terms, the principal-agent problem is, how do you (the principal) get someone (the agent) to do something that you want, instead of something that they want.

So if you (the principal) own a business and hire an employee (agent), how do you get the employee to do things to advance your company’s interests rather than their own? For example, you can try to reward someone with power and responsibility to motivate them to do well (that’s in your interest), but if you structure it wrong, they can play the system so that they look good and get the reward while doing things that hurt the company. An example of this would be the bonuses paid to bankers—in theory, they are supposed to reward people for doing good for the company and align their interests with that of ours, but because of the messed-up way performance is measured (and also the lack of a disincentive to not do bad things), it just ends up encouraging people to do shady and risky things that ultimately bankrupt the company while they walk away with huge bonuses.

The agency problem is everywhere. The classic textbook example is of the real estate agent. In theory, by giving them a percentage commission on the sale, you are encouraging them to get as good of a price as possible since the better the price, the better their commission. But after a certain point, they’ll decide that it’s no longer worth spending an extra month of their time to get a small percent of a somewhat better selling price, at which point, their interests (in getting the sale over with, because they have better things to do) supersede your interest (in getting the best possible price).

And it’s in politics, too, where politicians are (supposed to be) the agents of citizens. I don’t think I need to explain this one—it is clearly one area where the agency problem is... quite severe.

Anyway, a good manager is one that’s able to identify and analyze the agency problem and find a good solution to it (there usually isn’t a perfect solution—you try to find the solution with the best balance of upside vs. downside). And many people don’t have a good grasp of this (again, look at the BS that goes on in the financial markets—the principals are almost always getting screwed).
— https://www.reddit.com/r/Steam/comments/2zxcyy/gabe_newell_responds_to_email_asking_about/

Capital One Data Centers follow the pattern of the Innovators: 8 locations consolidated into 3

Capital One had a data center event not unlike a Google data center event.  The local press covered the event in Richmond, VA.  How does this look like Google?  The VA governor and the Capital One CEO are at a data center event.  

Capital One new data center

 

Capital One founder, chairman, and chief executive officer, Richard Fairbank, left, and Virginia Gov. Terry McAuliffe, right, connect symbolic cables with the help of Brian Cobb, center, managing vice president of Capital One, during a grand opening ceremony of Capital One's new data center in Chesterfield County on Wednesday, March 12, 2014.

Five years ago you would never have seen an event like this.  Now you see data center openings as PR events.

Operations had already started there. But on Wednesday, the McLean-based company held a ceremony to serve as symbolic opening for the center, located in a highly secured facility surrounded by a wooded area and tall, reinforced fencing.

The security is important because the Meadowville site is one of Capital One’s three primary data centers where the company stores and manages vast amounts of information produced by its business providing financial services to roughly 65 million customers.

The other pattern Capital One has followed, used by Google, eBay, and others for high availability, is running three main data centers.

Capital One is consolidating operations at its eight data centers that it has been operating because of numerous recent acquisitions of other banks and credit card providers.

The data operations are being consolidated into three centers — the one in Chesterfield, one in Henrico County, and one near Chicago.

The latency to the customers isn't as important as the latency between the data centers. The data centers are all located east of the Mississippi.


And just like other efficient data center operators, the headcount is low.

The center has about 50 employees now, but the company expects it will employ more than 100 eventually.

 

Do you think of Trust as a Design Pattern? It changes many things

Sitting around thinking about how to be different from the rest, I realized that focusing on creating a service where trust is a design priority changes many things.  Trust is one of those things that is valuable yet hard to develop.

GigaOm has a post on the problem of trust in perceptual computing.

Is it safe to buy that new gadget? Why trust is perceptual computing’s biggest problem

 

SUMMARY:

This year’s CES is a frustrating affair — so many cool new context-aware toys to play with, and so little reassurance from the manufacturers that their use will stay secure or private.

Thanks to Edward Snowden, the issue of trust is a hot topic.

I am really frustrated right now. I look at the slew of awesome announcements coming in from the Consumer Electronics Show in Las Vegas, and I keep thinking the same thing: “Nope, because surveillance.” Damn you, Snowden!

...

Security and privacy can no longer be afterthoughts or nice-to-haves — difficult as they are to implement in this age of embedded systems. We the consumers now know the dark flipside to these innovations, and that, manufacturers and app providers, is your problem.

So many free services are built on users not thinking about what is being done with the events in their lives.

Trust is one of those things that is hard to earn, and with all the latest technology it is more and more valuable.

Who do you trust?  Do you focus on developing more trust?

What would change in your data center with more trust?  What changes in your data center with less trust?

Data Center Capacity Infrastructure Pattern

The concept of software patterns is well established.  

In software engineering, a design pattern is a general reusable solution to a commonly occurring problem within a given context in software design. A design pattern is not a finished design that can be transformed directly into source or machine code. It is a description or template for how to solve a problem that can be used in many different situations. Patterns are formalized best practices that the programmer must implement themselves in the application.

Over 9 years ago I tried to figure out infrastructure patterns in the same way that software patterns are used.  About 2-3 years ago, I finally understood how to develop infrastructure patterns, and it has taken me the additional 2-3 years to test some of the ideas and get to the point of writing them up.

The following is going to be a riff of ideas, and I'll most likely clean it up with the help of some of my friends who are good at writing up patterns, so read the following as a rough draft.

I was talking to a software guy who now works in a data center deploying some of the more complex IT equipment.  His background is software so he gets patterns. We had a brief conversation the other day and I explained the following pattern of site design and capacity.  One of the most important things in defining a pattern is to identify what problem you want to solve.

The problem I am going to discuss is how to add data center capacity in a region like a city.

The typical method is to identify the current use.  Let's say there is a need for 100kW in a facility in a city.  The team that acquires capacity knows how difficult it can be to add space and extend an existing cage, so they decide to quadruple the requirement and look for 400kW.  To start they'll use 25% of the capacity and grow into it over a 10-year period.  They set up the lease to have one fee for reserving the capacity and another set of fees for actual use.
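
To make the economics concrete, here is a rough sketch of what that lease looks like over 10 years. The 100kW need and the 400kW reservation come from the scenario above; the per-kW fees and the demand ramp are purely illustrative assumptions, not real lease terms.

```python
# Back-of-the-envelope sketch of the "quadruple and grow into it" lease.
# The 100kW need and 400kW reservation come from the scenario above; the
# utilization ramp and per-kW fees are illustrative assumptions only.

RESERVED_KW = 400          # capacity leased up front
RESERVATION_FEE = 75       # assumed $/kW/month just to hold the capacity
USAGE_FEE = 150            # assumed $/kW/month for capacity actually drawn

def used_kw(year, start_kw=100, annual_growth=0.15):
    """Assumed demand ramp: start at 100 kW and grow ~15% per year."""
    return min(RESERVED_KW, start_kw * (1 + annual_growth) ** year)

total = 0
for year in range(10):
    kw = used_kw(year)
    monthly = RESERVED_KW * RESERVATION_FEE + kw * USAGE_FEE
    total += monthly * 12
    print(f"year {year + 1}: using {kw:5.0f} kW of {RESERVED_KW} kW "
          f"({kw / RESERVED_KW:4.0%}), ~${monthly:,.0f}/month")

print(f"10-year total: ~${total:,.0f}")
```

Whatever the actual fees are, the shape is the same: the reservation charge is paid on the full 400kW from day one while most of that capacity sits idle for years.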

The flaw in this method is the assumption that the space needs to be contiguous in one cage area, that contiguous space is a requirement.  That's logical from a real estate perspective.

Proposed method:  Pick a unit of power that is the most cost effective in a facility given the power infrastructure.  Let's say 140kW, enough to handle the 100kW requirement with 40% headroom.  The fear is that the business could rapidly need more space.  The key to picking this first space is that it should have high connectivity to other spaces in the building (not necessarily adjacent) and to other buildings that can support the growth of the company. As the business outgrows the original 140kW, the data center group has already identified other candidate spaces to add for growth.  The strategy is to have at first two spaces that are on different power, cooling, and network infrastructure, then continue to add more in a mesh of 3-5 sites.  The trade-off of adding smaller units of expansion that can be fully loaded and optimized forces an isolation of compute that can be useful.
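
Here is a minimal sketch of the proposed method under the same assumed demand ramp: start with one 140kW unit and add another, on separate power, cooling, and network infrastructure, whenever the mesh no longer has roughly 40% headroom over demand. The unit size and headroom target come from the paragraph above; the growth rate is an assumption carried over from the earlier sketch.

```python
# Sketch of the incremental-unit approach: start with one 140 kW unit and add
# another (on separate power/cooling/network infrastructure) whenever the
# mesh no longer has ~40% headroom over demand. Unit size and headroom come
# from the text; the 15%/year demand growth is an illustrative assumption.

UNIT_KW = 140
HEADROOM = 0.40

def demand_kw(year, start_kw=100, annual_growth=0.15):
    """Assumed demand ramp, same as the earlier sketch."""
    return start_kw * (1 + annual_growth) ** year

units = 1
for year in range(10):
    d = demand_kw(year)
    # Add a unit when demand plus the headroom target exceeds provisioned power.
    while d * (1 + HEADROOM) > units * UNIT_KW:
        units += 1
    print(f"year {year + 1}: demand {d:5.0f} kW -> "
          f"{units} unit(s), {units * UNIT_KW} kW provisioned in the mesh")
```

With those assumptions the mesh grows from one unit to about four over the decade, and every addition is a fully loaded, optimized unit rather than reserved space waiting to be grown into.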

For example, by the time you get to the 4th unit it is highly likely the 1st unit is in need of a hardware refresh across most of the IT gear.  As you power up the 4th unit, you can be working on decommissioning the 1st site, completely replacing the gear to support future growth.  If you had one contiguous space, it is highly likely the 1st deployments are so intertwined with the next 3 years of deployments that the upgrade process is extremely complex.  If each unit of expansion is meant to be isolated in a mesh, then the dependencies are reduced and each unit is easier to take offline.
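
A quick timeline makes the refresh argument visible. The interval between units and the refresh age below are assumptions for illustration, not figures from the pattern itself.

```python
# Rolling-refresh sketch: if a new unit comes online every few years and IT
# gear is refreshed after ~4 years, the 1st unit is due for a refresh by the
# time the 4th powers up. Both intervals are illustrative assumptions.

UNIT_INTERVAL_YEARS = 3   # assumed time between bringing new units online
REFRESH_AGE_YEARS = 4     # assumed hardware refresh age

for unit in range(1, 5):
    online = (unit - 1) * UNIT_INTERVAL_YEARS
    print(f"unit {unit}: online year {online}, "
          f"refresh due year {online + REFRESH_AGE_YEARS}")

# Unit 4 comes online in year 9; unit 1 has been due for a refresh since
# year 4, so powering up unit 4 can absorb load while unit 1 is drained.
```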

Issues: it is overly simplistic to treat data centers as if they are office space that needs to be in one building on adjacent floors.  Can you imagine if the corporate real estate group put the office groups on the 3rd, 8th, and 15th floors of a building, and another team in another building 1/2 mile away?  But guess what: with the right network infrastructure, bits going from floor to floor, or to another building, is not an issue.  

Examples:  When you look at Google, Facebook, and Microsoft's data centers, they build additional buildings to add capacity to a site.  They did not build a building 4 times bigger than what they needed and grow into it over 10 years.  Modular data centers by Dell, HP, and Compass Datacenters allow those who feel they need buildings to use this same approach.  Once you jump off the top-of-rack switch, it can make little difference whether you are going 5 ft, 500 ft, 5,000 ft, or 50,000 ft.
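
To put rough numbers behind that last claim, the distance term of network latency stays small at any of those distances. A minimal sketch using the common rule of thumb that light in fiber covers about 5 ns per meter (roughly two-thirds the speed of light); switch and serialization delays are ignored here.

```python
# One-way propagation delay for the distances mentioned above, using the
# ~5 ns/m rule of thumb for light in fiber. Switching and serialization
# delays are not included; this is only the distance term.

NS_PER_METER = 5.0
FEET_TO_METERS = 0.3048

for feet in (5, 500, 5_000, 50_000):
    meters = feet * FEET_TO_METERS
    one_way_us = meters * NS_PER_METER / 1_000
    print(f"{feet:>6,} ft ≈ {meters:8.0f} m -> ~{one_way_us:6.2f} µs one-way")
```

Even 50,000 ft adds only on the order of 75 µs one way, which is tiny next to the millisecond-scale latencies most applications tolerate.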

The Data Center World is getting smaller as it grows

One of my favorite books in high school was "Small is Beautiful."

Small Is Beautiful: A Study of Economics As If People Mattered is a collection of essays by British economist E. F. Schumacher. The phrase "Small Is Beautiful" came from a phrase by his teacher Leopold Kohr. It is often used to champion small, appropriate technologies that are believed to empower people more, in contrast with phrases such as "bigger is better".

After two weeks of being in LV and then SJ, hanging around data center people and having interesting discussions, it struck me how small the data center world is.  Yet it is growing.

With social networking and the bigger getting bigger, there is a small set of people who are driving the industry forward.  Yet there is an increasing set of people who demand data center services, including IT organizations that don't understand how the small data center world works.

I think part of the problem for a newbie to data centers is filtering through the marketing and sales positioning to get to the core of how the data center works.  The marketing folks are not taking a "Small is Beautiful" approach, where it is about taking small steps in technology to empower people to design, build, and operate data centers better than in the past.

The small is beautiful approach is an interesting one that needs to be studied more.