
    An OS that scares the Linux Vendors, CoreOS designed for a modern data center

    Being an old-time OS guy, I once made the observation “I think people would pay money to just have the drivers and kernel of the OS updated and leave the new features as options.”

    A buddy told me to check out CoreOS.  Why?  Because it has the security, service discovery, clustering and updating stuff that guys like AWS haven’t made a priority.  I was surprised at Gigaom Structure when AWS’s Werner Vogels said that security was something developers need to work on when developing their apps.  Google’s Urs Hoelzle said Google thinks there are things they can do to make building secure services easier.

    CoreOS makes security its #1 priority, along with many other things that a modern data center group wants.

    CoreOS is a server OS built from the ground up for the modern datacenter. CoreOS provides tools and guidance to ensure your platform is secure, reliable, and stays up to date.

    Small Footprint

    CoreOS utilizes 40% less RAM than typical Linux server installations. We provide a minimal, stable base for you to build your applications or platform on.

    Reliable, Fast Patching and Updates

    CoreOS machines are patched and updated frequently with system patches and new features.

    Built for Scale

    CoreOS is designed for very large scale deployments. PXE boot and diskless configurations are fully supported.

    InfoWorld posts on how CoreOS is a threat to Linux vendors.

    Indeed, by changing the very definition of the Linux distribution, CoreOS is an "existential threat" to Red Hat, Canonical, and Suse, according to some suggestions. The question for Red Hat in particular will be whether it can embrace this new way of delivering Linux while keeping its revenue model alive.


    When I pressed him on what he meant by that last sentence, he elaborated:

    CoreOS is the first cloud-native OS to emerge. It is lightweight, disposable, and tries to embed devops practices in its architecture. RHEL has always been about adding value by adding more. CoreOS creates value by giving you less [see the cattle vs. pets analogy]. If the enterprise trend is toward webscale IT, then CoreOS will become more popular with ops too.

    Project Atomic is a competitor of CoreOS.  You can probably look for more choices built around the idea of an OS service that just keeps the OS updated.  Updated with what?  Bug fixes, performance improvements, and better security.  That’s worth a lot.


    Two Ways to Save Server Power - Google (Tune to Latency) vs. Facebook (Efficient Load Balancing)

    Saving energy in the data center is about more than a low PUE.  Using 100% renewable power while wasting energy is not a good practice.  I’ve been meaning to post on what Google and Facebook have done in these areas, and have been staring at these open browser tabs for a while.

    First, in June 2014 Google shared its method of turning down the power consumption of a server as low as possible as long as it still met its latency targets.  The Register covered this method.

    Google has worked out how to save as much as 20 percent of its data-center electricity bill by reaching deep into the guts of its infrastructure and fiddling with the feverish silicon brains of its chips.

    In a paper to be presented next week at the ISCA 2014 computer architecture conference entitled "Towards Energy Proportionality for Large-Scale Latency-Critical Workloads", researchers from Google and Stanford University discuss an experimental system named "PEGASUS" that may save Google vast sums of money by helping it cut its electricity consumption.


    The Google paper is here.

    We presented PEGASUS, a feedback-based controller that implements iso-latency power management policy for large-scale, latency-critical workloads: it adjusts the power-performance settings of servers in a fine-grain manner so that the overall workload barely meets its latency constraints for user queries at any load. We demonstrated PEGASUS on a Google search cluster. We showed that it preserves SLO latency guarantees and can achieve significant power savings during periods of low or medium utilization (20% to 40% savings). We also established that overall workload latency is a better control signal for power management compared to CPU utilization. Overall, iso-latency provides a significant step forward towards the goal of energy proportionality for one of the challenging classes of large-scale, low-latency workloads.
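    The iso-latency idea can be sketched as a simple feedback loop: measure tail latency, and if it is comfortably under the SLO, lower the server power cap; if the SLO is at risk, restore full power. The sketch below is my own illustration in Python, not Google's PEGASUS code; the SLO value, wattage range, and `control_step` interface are all assumptions.

```python
# Illustrative sketch of an iso-latency power controller (not Google's
# actual PEGASUS implementation; all numbers are assumed for the example).

SLO_MS = 50.0                     # latency target for user queries (assumed)
GUARD_BAND = 0.95                 # keep measured latency just under the SLO
MIN_CAP_W, MAX_CAP_W = 60, 130    # power-cap range in watts (assumed)
STEP_W = 5                        # fine-grain adjustment per control cycle

def control_step(power_cap_w, tail_latency_ms):
    """One iteration of the feedback loop: nudge the power cap so the
    workload *barely* meets its latency constraint."""
    if tail_latency_ms > SLO_MS:
        # SLO violated: restore full power immediately.
        return MAX_CAP_W
    if tail_latency_ms < GUARD_BAND * SLO_MS:
        # Comfortable headroom: trade latency slack for power savings.
        return max(MIN_CAP_W, power_cap_w - STEP_W)
    return power_cap_w  # in the target band, hold steady

cap = MAX_CAP_W
for latency in [20, 22, 30, 48, 55, 40]:  # simulated tail latencies (ms)
    cap = control_step(cap, latency)
print(cap)  # 125
```

    Note the asymmetry: power comes back in one step when latency is at risk, but is shed gradually, which is the usual conservative choice for latency-critical services.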

    In Aug 2014 Facebook shared Autoscale, its method of using load balancing to reduce energy consumption.  Gigaom covered this idea.

    The social networking giant found that when its web servers are idle and not taking user requests, they don’t need that much compute to function, thus they only require a relatively low amount of power. As the servers handle more networking traffic, they need to use more CPU resources, which means they also need to consume more energy.

    Interestingly, Facebook found that during relatively quiet periods like midnight, while the servers consumed more energy than they would when left idle, the amount of wattage needed to keep them running was pretty close to what they need when processing a medium amount of traffic during busier hours. This means that it’s actually more efficient for Facebook to have its servers either inactive or running like they would during busier times; the servers just need to have network traffic streamed to them in such a way so that some can be left idle while the others are running at medium capacity.
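    Facebook's observation can be made concrete with a back-of-envelope calculation. The wattage figures below are my own illustrative assumptions, not Facebook's measurements; the point is only that when low-load power is close to medium-load power, packing traffic onto fewer servers wins.

```python
# Back-of-envelope comparison of two ways to serve the same light traffic.
# All wattage figures are illustrative assumptions, not Facebook's data.

IDLE_W = 60     # server power when idle (assumed)
LOW_W = 85      # power at low utilization (assumed; close to medium!)
MEDIUM_W = 90   # power at medium utilization (assumed)
SERVERS = 100

# Option A: spread traffic thin, every server runs at low utilization.
spread_power = SERVERS * LOW_W

# Option B (Autoscale-style): pack the same traffic onto half the servers
# at medium utilization and leave the rest idle.
packed_power = (SERVERS // 2) * MEDIUM_W + (SERVERS // 2) * IDLE_W

print(spread_power, packed_power)  # 8500 7500: packing saves ~12%
```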

    Facebook posts on Autoscale here.

    Overall architecture

    In each frontend cluster, Facebook uses custom load balancers to distribute workload to a pool of web servers. Following the implementation of Autoscale, the load balancer now uses an active, or “virtual,” pool of servers, which is essentially a subset of the physical server pool. Autoscale is designed to dynamically adjust the active pool size such that each active server will get at least medium-level CPU utilization regardless of the overall workload level. The servers that aren’t in the active pool don’t receive traffic.

    Figure 1: Overall structure of Autoscale

    We formulate this as a feedback loop control problem, as shown in Figure 1. The control loop starts with collecting utilization information (CPU, request queue, etc.) from all active servers. Based on this data, the Autoscale controller makes a decision on the optimal active pool size and passes the decision to our load balancers. The load balancers then distribute the workload evenly among the active servers. It repeats this process for the next control cycle.
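    The control decision described above can be sketched as: given the aggregate demand, pick the smallest active pool that keeps each active server at or above medium utilization, with some headroom. This is a sketch of the idea, not Facebook's controller; the utilization targets and function names are assumptions.

```python
# Sketch of an Autoscale-style pool-sizing decision (illustrative only;
# not Facebook's code, and the thresholds are assumed).
import math

TARGET_UTIL = 0.5  # each active server should see at least medium load
MAX_UTIL = 0.8     # planned per-server ceiling, leaving headroom for spikes

def active_pool_size(total_demand, total_servers):
    """Smallest active pool that serves total_demand (in 'fully busy
    server' equivalents) without exceeding MAX_UTIL per server.
    Demand spread over this pool lands at or above TARGET_UTIL."""
    needed = math.ceil(total_demand / MAX_UTIL)
    return min(total_servers, max(1, needed))

# At midnight, demand equivalent to 20 fully busy servers out of 100:
print(active_pool_size(20, 100))  # 25 active servers at 80% planned load
```

    Servers outside the active pool receive no traffic from the load balancer, matching the quoted description; the controller re-evaluates the pool size every control cycle as utilization data comes in.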


    Why is the 7x24 Exchange conference popular with my friends?

    The 7x24 Exchange Conference in Phoenix is now two weeks away, and I am checking in with some friends to see if they’ll be there.  So far I am batting 100% with the people I am looking forward to seeing.  Why are so many of my data center friends going to 7x24 Exchange Conferences?  At 7x24 Exchange Conferences there is a critical mass of friends and ideas that support data center innovation.  Almost every DC conference will claim it is driving data center innovation, but so many times the innovation comes from the conversations, not the program.

    I return to 7x24 to see friends and make new ones.  Is this just a social event?  No, there are good presentations, which is the benefit of not using a “pay to play” presentation model.  At some conferences, a Platinum sponsorship means you get a keynote spot; Silver gets you a small breakout room; and so on.


    What ideas are discussed?  That is constantly changing, which is part of why you return and why past conferences have been valuable.

    Disclosure: In the past I would meet most of my friends at another conference that I am blacklisted from attending, so I have an incentive to help drive my friends to a conference where we will feel free to talk about anything we want.  7x24 Exchange has been supportive and open to feedback on what it takes to be a data center event that my friends find useful for so many reasons.



    Love Your Dog? You may love them more after watching this video

    We have two kids and a dog.  In many ways our dog is like our third child.  In this 60 Minutes video, a dog owner thinks of his dog as his child.

    Anderson Cooper: Do you view Chaser as a family pet? As a friend? How do you see Chaser?

    John Pilley: She's our child.

    Anderson Cooper: She's your child?

    John Pilley: She's our child, a member of the family. Oh yes. She comes first.

    Many people think of their dogs as children, and John Pilley has been teaching Chaser like a child as well.  By assigning names to toys, Pilley has been helping Chaser learn words and simple sentences.

    Check out this video that shows the smartest dog in the world, and you may love your dog more.


    DCIM has not taken off the way people thought. Why?

    In the data center world there has been hype around DCIM.  Multiple start-ups have tried to build businesses on DCIM.  Electrical equipment suppliers have added DCIM solutions.  Yet DCIM has not taken off.  I have had the pleasure, or pain, of seeing some DCIM implementations first hand and seen how they work, or don’t.

    So here are some of the reasons why I think DCIM has not lived up to its hype.

    - Given the limited deployments, many systems don’t scale well.

    - Usability is not there yet.  The main focus has been just getting things to work.

    - Manual data entry is required too many times.

    - Decision makers who choose DCIM are not the operations staff, so there is a disconnect between expectations and reality.  Many people don’t know the operating expense of running a DCIM system.

    - The number of companies running their own data centers is actually decreasing even though overall capacity is increasing, so the pool of potential DCIM buyers is shrinking.

    - The big players have tried many of the services, and none is the killer app.

    Given the hype is dying down, it is pretty hard to launch a start-up targeting DCIM.  I would expect DCIM teams within electrical suppliers are finding it harder to get more resources and money given the limited sales.

    If a DCIM solution scaled to 100K+ servers, was easy to use, automated data entry, bridged the reality of operations with executive expectations, and became a standard at the big data center users, then it would be the killer app.

    I don’t see this happening any time soon.  Do you?  If you do, which one of these can do it?