Do you have the bad habit of trying to be the smartest in school vs. the smartest in the real world

June 19, 2013 Dave Ohara

Hitting the road is a time to meet new people and run into old friends. I flew from SEA to SJC to go to GigaOm and start the networking. And, as usual, the networking starts as soon as I get to the airport. I run into one of my old bosses, John Frederiksen, who left Microsoft a year ago and is now VP of product management at NetApp. We chat about cloud and data centers. I had an interest in chatting about NetApp since I am moderating a panel with NetApp's CIO Cyndi Stoddard in 8 hrs.

At a hosted reception last night, I chatted with some good friends and met new people.

One characteristic I find most interesting is people who are in a learning mode. I enjoy the smart people who realize they need to try new things to learn. Here is a popular post from a Facebook page.

Robert Kiyosaki · 863,574 like this
November 4, 2011 at 7:00pm ·

In the real world, the smartest people are people who make mistakes and learn. In school, the smartest people don’t make mistakes.

Do you find you are surrounded by smart people who have the bad habits from school of showing how good their grades are and how they make no mistakes? Everybody makes mistakes. To err is human. I've been paying more attention to the mistakes I make. Do you? Do your friends?

The more you trust someone, the easier it is to admit your mistakes. If you don't trust someone, why would you discuss your mistakes? If you don't trust someone, why are you spending time with them? Life is too short to spend with people you don't trust.

Some of the best data center discussions I've ever had are when we discuss mistakes made.

Won't be blogging much this week, focused on listening, learning and networking

June 18, 2013 Dave Ohara

I am at GigaOm Structure and I find it is really hard to listen, learn, network and blog at the same time. I can time shift the blogging to later, so I am going to focus on listening to the presentations, networking like crazy, and learning as much as I can.

Here is a sample of what is covered at GigaOm Structure.

See inside Facebook’s network & explore Google’s data dreams at Structure

by Stacey Higginbotham

JUN. 17, 2013 - 6:00 AM PDT

SUMMARY:
Infrastructure nerds, it’s time to meet the accountants. At this year’s Structure conference this Wednesday and Thursday we’re focusing on the economics of cloud computing, not just for vendors, but for practitioners.

Want to understand how Facebook connects its servers? Hear from VMware’s CEO how the virtualization giant plans to build its next big business? Discover why Snapchat builds on Google App Engine as opposed to Amazon Web Services? Or maybe you want to understand if Microsoft can compete in the cloud.

Google publishes ideas discussing Good Enough approach to achieve low latency

June 18, 2013 Dave Ohara

It can be really hard to get the media to publish complex concepts, which is why companies will submit their own articles. Google's Luiz Barroso and Jeff Dean have an article on Google's data center challenge of providing low-latency performance at scale.

The Tail at Scale

By Jeffrey Dean, Luiz André Barroso
Communications of the ACM, Vol. 56 No. 2, Pages 74-80
10.1145/2408776.2408794

Systems that respond to user actions quickly (within 100ms) feel more fluid and natural to users than those that take longer.³ Improvements in Internet connectivity and the rise of warehouse-scale computing systems² have enabled Web services that provide fluid responsiveness while consulting multi-terabyte datasets spanning thousands of servers; for example, the Google search system updates query results interactively as the user types, predicting the most likely query based on the prefix typed so far, performing the search and showing the results within a few tens of milliseconds. Emerging augmented-reality devices (such as the Google Glass prototype⁷) will need associated Web services with even greater responsiveness in order to guarantee seamless interactivity.

The article may be too long for most readers, so here are two key points.

In large information-retrieval (IR) systems, speed is more than a performance metric; it is a key quality metric, as returning good results quickly is better than returning the best results slowly. Two techniques apply to such systems, as well as to other systems that inherently deal with imprecise results:

Good enough. In large IR systems, once a sufficient fraction of all the leaf servers has responded, the user may be best served by being given slightly incomplete ("good-enough") results in exchange for better end-to-end latency. The chance that a particular leaf server has the best result for the query is less than one in 1,000 queries, odds further reduced by replicating the most important documents in the corpus into multiple leaf servers. Since waiting for exceedingly slow servers might stretch service latency to unacceptable levels, Google's IR systems are tuned to occasionally respond with good-enough results when an acceptable fraction of the overall corpus has been searched, while being careful to ensure good-enough results remain rare. In general, good-enough schemes are also used to skip nonessential subsystems to improve responsiveness; for example, results from ads or spelling-correction systems are easily skipped for Web searches if they do not respond in time.
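The good-enough idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the `LeafResponse` type, the `good_enough_merge` function, the 95% coverage threshold, and the 50 ms deadline are all hypothetical, and leaf latencies are simulated numbers rather than real network calls. The root waits for every leaf up to the deadline; after the deadline it stops waiting as soon as the chosen fraction of leaves has answered.

```python
from dataclasses import dataclass


@dataclass
class LeafResponse:
    leaf_id: int
    latency_ms: float  # simulated time at which this leaf answers
    hits: list         # (score, document) pairs found by this leaf


def good_enough_merge(responses, total_leaves, min_fraction=0.95,
                      deadline_ms=50.0, top_k=3):
    """Merge leaf results, giving up on stragglers after the deadline.

    Hypothetical policy: before the deadline every answer is used; after
    it, the root returns once `min_fraction` of the leaves have answered.
    Returns (top_k hits by score, fraction of leaves included).
    Assumes total_leaves > 0.
    """
    merged = []
    answered = 0
    # Process answers in the order they would arrive.
    for response in sorted(responses, key=lambda r: r.latency_ms):
        enough = answered / total_leaves >= min_fraction
        if response.latency_ms > deadline_ms and enough:
            break  # good enough: do not wait for the slow leaves
        merged.extend(response.hits)
        answered += 1
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return merged[:top_k], answered / total_leaves
```

The second return value reports how much of the corpus was actually searched, which a caller could log to check that good-enough answers stay rare.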

Google also uses a technique that is like sticking your toe in the water to test it before jumping in. They call it a canary request.

Canary requests. Another problem that can occur in systems with very high fan-out is that a particular request exercises an untested code path, causing crashes or extremely long delays on thousands of servers simultaneously. To prevent such correlated crash scenarios, some of Google's IR systems employ a technique called "canary requests"; rather than initially send a request to thousands of leaf servers, a root server sends it first to one or two leaf servers. The remaining servers are only queried if the root gets a successful response from the canary in a reasonable period of time. If the server crashes or hangs while the canary request is outstanding, the system flags the request as potentially dangerous and prevents further execution by not sending it to the remaining leaf servers. Canary requests provide a measure of robustness to back-ends in the face of difficult-to-predict programming errors, as well as malicious denial-of-service attacks.

The canary-request phase adds only a small amount of overall latency because the system must wait for only a single server to respond, producing much less variability than if it had to wait for all servers to respond for large fan-out requests; compare the first and last rows in Table 1. Despite the slight increase in latency caused by canary requests, such requests tend to be used for every request in all of Google's large fan-out search systems due to the additional safety they provide.
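Here is a minimal Python sketch of the canary pattern described in the excerpt. It is an assumption-laden illustration, not Google's code: `fan_out_with_canary`, `DangerousRequest`, and the 0.5 second timeout are hypothetical names and values, and each leaf is modeled as a plain callable that takes a query and returns a result. The root sends the query to one leaf first and only fans out to the rest if that canary answers in time without crashing.

```python
from concurrent.futures import ThreadPoolExecutor


class DangerousRequest(Exception):
    """Raised when the canary leaf crashes or does not answer in time."""


def fan_out_with_canary(query, leaves, canary_timeout_s=0.5):
    """Send `query` to one canary leaf first, then to the remaining leaves.

    `leaves` is a list of callables (hypothetical stand-ins for leaf
    servers). Returns the results in leaf order, or raises
    DangerousRequest without contacting the remaining leaves.
    """
    if not leaves:
        return []
    canary, rest = leaves[0], leaves[1:]
    pool = ThreadPoolExecutor(max_workers=len(leaves))
    try:
        try:
            # Canary phase: wait on a single leaf only.
            first = pool.submit(canary, query).result(timeout=canary_timeout_s)
        except Exception as exc:  # a crash, or a timeout while it hangs
            raise DangerousRequest(f"canary failed for {query!r}") from exc
        # Canary succeeded: query the remaining leaves in parallel.
        futures = [pool.submit(leaf, query) for leaf in rest]
        return [first] + [future.result() for future in futures]
    finally:
        # Do not block on a hung canary; a real system would also need
        # a way to cancel the outstanding call.
        pool.shutdown(wait=False)
```

Because the canary phase waits on one server rather than all of them, the extra latency is small, which matches the excerpt's point about why the technique can be applied to every request.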

Do you care more about Top Supercomputers in China and NSA or Massive Clusters at Google, Facebook, Microsoft, and Amazon

June 18, 2013 Dave Ohara

News is out that China now holds the world record for the fastest supercomputer.

The ten fastest supercomputers on the planet, in pictures

Chinese supercomputer clocks in at 33.86 petaflops to break speed record.

by Jon Brodkin - June 17 2013, 8:45am PDT

A Chinese supercomputer known as Tianhe-2 was today named the world's fastest machine, nearly doubling the previous speed record with its performance of 33.86 petaflops. Tianhe-2's ascendance was revealed in advance and was made official today with the release of the new Top 500 supercomputer list.

The media will gladly write about who has the biggest and most powerful supercomputer.

As one of my friends who has worked on supercomputer data centers said, we realized we could reduce a lot of costs in the data center, because the supercomputer would often have weekly maintenance intervals as well as monthly and quarterly ones. Components are constantly failing, and yes, there is a degree of isolation in the failures, but you eventually need to repair them, which can mean a complete shutdown. These shutdowns are when data center maintenance can be performed.

But at Google, Facebook, Microsoft, and Amazon there is no time to shut down services. Hundreds of thousands of servers need to run all the time.

Amazon threw up a supercomputer entry in 2011, and it is still ranked 127.

List	Rank	System	Vendor	Total Cores	Rmax (TFlops)	Rpeak (TFlops)
06/2013	127	Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet	Self-made	17,024	240.1	354.1
11/2012	102	Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet	Self-made	17,024	240.1	354.1
06/2012	72	Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet	Self-made	17,024	240.1	354.1
11/2011	42	Amazon EC2 Cluster, Xeon 8C 2.60GHz, 10G Ethernet	Self-made	17,024	240.1	354.1

Can you imagine if Google, Facebook, Microsoft, or Amazon put up their clusters as an entry?

Part of the advantage companies like Google have is that they have teams of people, led by guys like Jeff Dean, who think really hard about compute clusters. Here is a presentation Dean gave 4 years ago.

Google, Facebook, Microsoft, and Amazon are solving the problem of keeping supercomputer-class performance running 24x7x365. I think this type of innovation affects us much more than who has the fastest supercomputer, which requires hundreds of hours of downtime for maintenance.

Watch out water shortages are getting worse, fracking is bidding for the water

June 18, 2013 Dave Ohara

Droughts are scattered around the country, and agriculture typically gets first priority for water. But, as RT.com reports, fracking is causing problems.

Fracking is occurring in several counties in Arkansas, Colorado, New Mexico, Oklahoma, Texas, Utah and Wyoming, which are currently suffering a severe drought, the Associated Press reports. Although the procedure requires less water than farming or overall residential uses, it contributes to the depletion of an already-scarce resource.

Some oil and gas companies manage to drain states of their water supply without spending any money, by depleting underground aquifers or rivers. But when unable to acquire the resource for free, the corporations can purchase large quantities at hefty prices.

“There is a new player for water, which is oil and gas,” Colorado farmer Kent Peppler told AP, noting that he is fallowing some of his corn fields because he can’t afford to irrigate them. “And certainly they are in a position to pay a whole lot more than we are.”

Peppler, president of the Rocky Mountain Farmers Union, said that the price of water has skyrocketed since oil companies have moved in. The Meade, Colo., farmer said he used to pay $9 to $100 per acre-foot of water at city-held auctions, but that energy companies are now buying the excess supplies for $1,200 to $2,900 per acre-foot.

NPR has a good post on Water Wars.

There are two doctrines that govern surface water rights in the U.S. — one for the West and one for the East.

'A Reasonable Right'

The riparian doctrine covers the East. "[Under] the riparian doctrine, if you live close to the river or to that water body [or] lake, you have reasonable rights to use that water," says Venki Uddameri, a professor and the director of water resources at Texas Tech University.

The Western U.S. uses the prior appropriation doctrine. "As people started exploring the West and started looking for water for agriculture and mining, there was a need to move water away from the rivers," Uddameri tells Jacki Lyden, host of weekends on All Things Considered.

People wanted a claim to water but often lived too far away from a river for the riparian doctrine to make any sense. So the prior appropriation doctrine was devised.

Uddameri explains: "It allocates rights based on who started using the water first. So if you are first in time, you are first in rights. And historically, it was based on a permitting process where you go and say you asked for the permit first, so you became the first user.

"But then there's been a shift saying not first use strictly based on who asked for the permit first, but who was actually there first," he says. "So the Indian tribes who were there first may not have asked for a permit, but there's recognition now that they were the first users of water, so they get that first appropriation."

Few data center people look at the water rights for their data center, but some of the smartest ones do. Do you?

In an arid place like the Klamath Basin, there often isn't enough water available for everyone who has a right to use it. And the person with the oldest water right gets all the water they are entitled to first.
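The "first in time, first in right" rule amounts to a simple allocation procedure, sketched below in Python. Everything here is hypothetical: the `allocate_by_priority` function, the holder names, and the numbers are made up for illustration, and real water law has many more conditions than this. Rights are served oldest first, and each holder gets their full entitlement until the water runs out.

```python
def allocate_by_priority(available_acre_feet, rights):
    """Allocate water under a simplified prior appropriation rule.

    `rights` is a list of (holder, priority_year, entitled_acre_feet)
    tuples; all values are illustrative. Older priority years are
    served first, and junior holders get whatever is left.
    """
    allocations = {}
    remaining = available_acre_feet
    for holder, _year, entitled in sorted(rights, key=lambda r: r[1]):
        granted = min(entitled, remaining)
        allocations[holder] = granted
        remaining -= granted
    return allocations


# Made-up example: 100 acre-feet available, 150 acre-feet of claims.
example_rights = [
    ("junior", 1950, 40),
    ("senior", 1870, 50),
    ("middle", 1905, 60),
]
example_allocation = allocate_by_priority(100, example_rights)
```

In this made-up dry year the senior and middle holders split the available water and the junior holder gets nothing, which is why the age of a water right matters when siting a data center.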