Loggly, a Hadoop approach in the Cloud to manage servers

Almost everyone puts their management system in the same location as their IT assets. When I worked on management system architecture I asked why management systems don’t get located offsite. This was back in 2005, before the cloud was popular. Recently, I’ve been asking about a Hadoop-based approach to collect IT logs.

Wouldn’t it be cool if there were a cloud-based server management system that used Hadoop to do things the big management tools can’t? And cheaper, too, on a pay-as-you-go basis.

Loggly is a company that uses Hadoop to store log files.  See this job description.

Hadoop Engineer

Forge and weld a different kind of search engine. You are building part of the back end systems that accept data from our customers and push it through to our archiving, indexing and map/reduce framework, then make it available through search and large scale analytics systems. You’re helping form a core team whose responsibilities are to make us bigger, better and faster. You know what to do, and don’t ask twice.

Here's what makes you tick:

  • have constructed a distributed, elastic system before
  • familiar with both solr and lucene, and realize those projects have in fact merged
  • you conduct map reduce jobs on hadoop for breakfast, or for small afternoon snacks
  • achieved authoring or implementing a high throughput distributed queuing system
  • or have authored or implemented a high performance distributed data store
  • understand that high reliability systems are expected to be highly reliable
  • you’re that guy that comes in, in the middle of the night, and makes magic happen

What is Loggly?

Logging as a service — any time — your way — fast.

Loggly collects, indexes, and stores all log data and makes it accessible through search for analysis and reporting.

You can try Loggly today by signing up for the free product. With no up-front investment necessary, you reduce your risk of locking into a software solution. Once you decide to purchase the Loggly service, we run your service at a fraction of the cost you would incur yourself. We manage the infrastructure for you. You don't need to do anything, and you have your logs at your fingertips at any time from anywhere — fast.

Running in AWS.

Loggly – United States (San Francisco, California)

Loggly is a cloud-based server logging service. Loggly provides a way to collect logs from servers in one centralized location and then quickly search them with an intuitive user interface.
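At its core, the service model is "collect centrally, index, search." A toy in-memory sketch of that idea (illustrative Python only, not Loggly's actual API or architecture):

```python
from collections import defaultdict

class LogStore:
    """Toy centralized log store: collect lines from many hosts,
    build a word index, then search them. Hypothetical code, not
    Loggly's implementation."""
    def __init__(self):
        self.entries = []               # list of (host, line)
        self.index = defaultdict(set)   # word -> set of entry ids

    def collect(self, host, line):
        eid = len(self.entries)
        self.entries.append((host, line))
        for word in line.lower().split():
            self.index[word].add(eid)

    def search(self, word):
        ids = sorted(self.index.get(word.lower(), ()))
        return [self.entries[i] for i in ids]

store = LogStore()
store.collect("web1", "GET /index.html 200")
store.collect("web2", "GET /login 500 internal error")
store.collect("db1", "replication error lag 30s")
print(store.search("error"))
```

The point of centralizing is visible even in the toy: one search hits the logs of every server at once, instead of grepping hosts one by one.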

Here is a comparison of Splunk vs. Loggly.

Update: Here’s how chief executive Kord Campbell described the difference between Splunk and Loggly:

We are a hosted solution compared to Splunk’s enterprise software download. Instead of installing your own server, downloading the code, and forwarding logs to that server, you just send them to our system. We run all the servers, storage, code, etc. for you, making life easier in the process. It’s a hell of a lot cheaper too.

We’re leveraging a bunch of Open Source technologies to leap ahead in the search portions of our offering, which makes us more nimble than Splunk. We’re focused on web app developers (like us) initially, providing development and monitoring features for them to maintain their code and systems. Later on we’ll branch out into security, compliance, and analytics.

When it comes to analytics, we’ll be able to use the search system we’ve built to pull data from a customer’s logs, then run a map reduce algorithm on them to crank out statistics on the data. For lots of data. Think of it as a flip side to Google Analytics. They take the log entries from browsers hitting your site – we take the entries from the hits to your server directly, through its logs.
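The analytics approach Campbell describes can be sketched with a toy map/reduce over web-server log lines (illustrative Python, not Loggly's actual pipeline):

```python
from collections import defaultdict

# toy web-server log lines: "METHOD PATH STATUS"
log_lines = [
    "GET /home 200",
    "GET /home 200",
    "POST /login 500",
    "GET /about 404",
    "GET /home 500",
]

# map phase: emit a (status_code, 1) pair per line
mapped = [(line.split()[2], 1) for line in log_lines]

# shuffle phase: group the emitted counts by status code
groups = defaultdict(list)
for status, one in mapped:
    groups[status].append(one)

# reduce phase: sum the counts for each status code
stats = {status: sum(ones) for status, ones in groups.items()}
print(stats)  # {'200': 2, '500': 2, '404': 1}
```

At Hadoop scale the map and reduce steps run in parallel across many machines, but the shape of the job (map to key/value pairs, group by key, reduce each group) is exactly this.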


What will the Private Cloud bring? Really Bad $h*!

I had a full day at the Gartner DC LV conference. At the end of the day I got a good question on what I saw in the future. Cloud is at the top of the topics being discussed.

Lots of people are thinking about building private clouds, but how many people know how to build an operating system for the cloud? A common tweet from #GartnerDC:

barton808 (Barton George), retweeted by sean_kelley_ms:

66% of folks here say they will be pursuing private cloud by 2014. #gartnerDC

So a safe answer is that private cloud is the future of IT. Highly utilized hardware. Dynamic infrastructure.


Gartner has been saying the private cloud is coming for a while; here is a post from 2009.

I believe that enterprises will spend more money building private cloud computing services over the next three years than buying services from cloud computing providers. But those investments will also make them better cloud computing customers in the future.

Building a private cloud computing environment is not just a technology thing – it also changes management processes, organization/culture, and relationship with business customers (our Infrastructure and Operations Maturity Model has a roadmap for all four). And these changes will make it easier for an IT organization and its customers to make good cloudsourcing decisions and transitions in the future.

Understanding the private cloud is daunting. The choices are many and growing faster than people can keep up with. All of this reminds me of the arrival of Desktop Publishing, with new issues for typography, color matching, images, layout, printers, scanners, and software.

Desktop publishing began in 1985 with the introduction of MacPublisher, the first WYSIWYG layout program, which ran on the original 128K Macintosh computer. (Desktop typesetting, with only limited page makeup facilities, had arrived in 1978–9 with the introduction of TeX, and was extended in the early 1980s by LaTeX.) The DTP market exploded in 1985 with the introduction in January of the Apple LaserWriter printer, and later in July with the introduction of PageMaker software from Aldus which rapidly became the DTP industry standard software.

Before the advent of desktop publishing, the only option available to most persons for producing typed (as opposed to handwritten) documents was a typewriter, which offered only a handful of typefaces (usually fixed-width) and one or two font sizes. The ability to create WYSIWYG page layouts on screen and then print pages at crisp 300 dpi resolution was revolutionary for both the typesetting industry and the personal computer industry. Newspapers and other print publications made the move to DTP-based programs from older layout systems like Atex and other such programs in the early 1980s.

Now if you are an experienced operating system developer and have a team who can make the design trade-offs in designing a private cloud, the transition to private cloud will be like print publications moving to Mac-based DTP. But the number of IT organizations with this skill set is only a handful - Google, Microsoft, VMware, Yahoo, Facebook, Amazon, etc. Maybe at most 6% of the installed base has these skills and the ability to recruit top talent, so what happens to the remaining 60% of the 66% that are building private clouds?

We are going to see some really bad $h*!.

Private clouds that perform badly. Clouds with bad UIs. Manageability requires giving the private cloud a UI. How many IT organizations have a user interface design team?

Building a private cloud is like building an operating system to manage the resources in IT, with a UI for system administrators designed for your internal users.

Now the smart guys have figured out they can hire experienced operating system staff. Why do you think Google hired so many Microsoft guys? Microsoft hired a bunch of DEC guys to work on NT.

If you don't want to build some really bad $h*! you should think of hiring some OS guys.  I have a friend who runs a technical executive placement company and I think she should start up a private cloud placement service.

Are you in the 6% group with OS-level talent, or in the 60% group that is new to DTP and has an organization that sees the private cloud as the answer to take control?

Keep in mind this Gartner statement.

I believe that enterprises will spend more money building private cloud computing services over the next three years than buying services from cloud computing providers.

The analysts and vendors are going to market the private cloud, so it is unstoppable. Just saying no to the private cloud is not an option.

Gartner DC LV is a great event to meet people and circulate ideas.  Today is a full day of interviews, business disconnections, and making new connections. 


Skype’s Platform Push

Some of my data center friends have been making the switch to Skype, and here are some posts that mention Skype’s efforts to staff up to be a cloud computing platform.

TechCrunch covers the hiring efforts at Skype.

Skype Staffing Up For A Big Push To The Cloud

As Skype prepares for an IPO in the next year, the VoIP company has been looking for new ways to expand its business both in terms of revenue and product development. One avenue the company is exploring to bring in more revenue is through enterprise offerings, via B2C and B2B offerings. However, it looks like Skype will be moving its VoIP offerings to the cloud.

Note the hiring efforts for cloud and web technology engineers.

We spotted these job postings on Skype’s website, indicating that the company is looking to build a team of cloud and web technology engineers. According to the postings, these staff members will “build an infrastructure capable of supporting hundreds of millions of users.” The products will deliver “voice, video, chat and presence” to the web and “enable radically new Skype applications.”

GigaOm covers Skype copying Netflix to be a platform.

How Netflix Shaped Skype’s Platform Strategy

By Michael Wolf, Jul. 1, 2010, 5:20pm PDT

Skype quietly announced SkypeKit, an SDK for CE manufacturers last week, to push Skype services beyond the PC. However, it wasn’t until the following day when Jonathan Christensen, the General Manager and Head of Platform at Skype, talked in depth about the company’s plans to mimic Netflix by following a platform-centric approach did the company’s broader intentions become clear.


Amazon Web Services Supercomputer configuration, 880 Servers

AWS announced their supercomputer configuration, and Amazon’s James Hamilton posted details on it.

The cc1.4xlarge instance specification:

  • 23GB of 1333MHz DDR3 Registered ECC memory
  • 64GB/s main memory bandwidth
  • 2 x Intel Xeon X5570 (quad-core Nehalem)
  • 2 x 845GB 7200RPM HDDs
  • 10Gbps Ethernet network interface

The AWS supercomputer configuration is 7040 cores.  At 4 cores per processor and 2 processors per server you get 880 servers (nodes) in the compute environment. 

If you assume about 350 watts per server you get roughly 308kW of power. 20 2U servers per rack makes for 44 racks at 7kW per rack. Sounds about right.
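The back-of-the-envelope numbers can be checked in a few lines (the 350 watts per server and 20 servers per rack figures are this post's assumptions, not published AWS specs):

```python
# cc1.4xlarge cluster: 7040 cores, dual-socket quad-core Nehalem servers
cores = 7040
cores_per_cpu, cpus_per_server = 4, 2
servers = cores // (cores_per_cpu * cpus_per_server)

watts_per_server = 350            # assumed power draw per server
total_kw = servers * watts_per_server / 1000

servers_per_rack = 20             # assumed 2U servers per rack
racks = servers // servers_per_rack
kw_per_rack = servers_per_rack * watts_per_server / 1000

print(servers, total_kw, racks, kw_per_rack)  # 880 308.0 44 7.0
```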

Amazon is one of 4 self-made configurations in the Top 500 list.


10GbE is rare in supercomputer clusters, but AWS chose 10G Ethernet, which may explain their self-made configuration.


But AWS was after specific scenarios like Hadoop.

It’s this last point that I’m particularly excited about. The difference between just a bunch of servers in the cloud and a high performance cluster is the network. Bringing 10GigE direct to the host isn’t that common in the cloud but it’s not particularly remarkable. What is more noteworthy is it is a full bisection bandwidth network within the cluster. It is common industry practice to statistically multiplex network traffic over an expensive network core with far less than full bisection bandwidth. Essentially, a gamble is made that not all servers in the cluster will transmit at full interface speed at the same time. For many workloads this actually is a good bet and one that can be safely made. For HPC workloads and other data intensive applications like Hadoop, it’s a poor assumption and leads to vast wasted compute resources waiting on a poor performing network.
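Hamilton's point about bisection bandwidth can be made concrete with some illustrative arithmetic (the 4:1 oversubscription ratio is a hypothetical example of common practice, not AWS's design):

```python
hosts = 880
nic_gbps = 10

# full bisection bandwidth: either half of the hosts can send to the
# other half at full line rate, so the core must carry (hosts/2) * NIC speed
bisection_gbps = (hosts // 2) * nic_gbps

# a statistically multiplexed core gambles that not everyone talks at once;
# a 4:1 oversubscribed core provides only a quarter of that capacity
oversub_ratio = 4
oversub_core_gbps = bisection_gbps / oversub_ratio

print(bisection_gbps, oversub_core_gbps)  # 4400 1100.0
```

For a Hadoop shuffle, where most hosts really do transmit at once, that 4x gap shows up directly as compute sitting idle waiting on the network.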


2 Dell DCS Customers, LBNL & Saudi Aramco

I was talking to a friend and he said whenever they talked to Dell they couldn’t find out much about who Dell DCS’s customers were. There are approximately 30 clients of DCS. Part of their criteria is customers purchasing over 2,000 servers.

So, I spent a few hours researching who the Dell DCS customers are. I came up with about 20 DCS customers. I am not going to share the complete list on this blog, but I am sharing it with a few others who can help me figure out who the rest are.

Let me tell you how I found two of the DCS customers – Lawrence Berkeley National Labs & Saudi Aramco.

LBNL was easy as they have a bunch of Dell DCS Xanadu systems in the Top 500 Supercomputer list.


I blogged about Xanadu first generation.  Here is generation II.

Dell has launched new XS-23 II high-end x86 servers, codenamed Xanadu II in celebration of the IInd birthday of Data Center Solutions (DCS), a division producing servers aimed at easy customizability for cloud computing and other data center applications.

The XS-23 come with a choice of several different processors, including Intel Nehalem chip, although those machines will be offered in blade configurations with a built-in fabric architecture. It will be provided as four two-socket servers in a 2U standard rack mount footprint, accommodating up to 24 disk drives. According to Norrod, customers can configure up to 88 of Dell's new servers with 704 processing cores and 396 TB of storage, plus switching, in a single rack, for 25% higher density in comparison to similarly outfitted blade servers on a per-U basis.
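The quoted rack numbers check out under simple assumptions (two quad-core sockets per server and four servers per 2U chassis, per the description above):

```python
servers = 88                 # max XS-23 II servers per rack, per the quote
cores_per_server = 2 * 4     # two quad-core Nehalem sockets
storage_tb = 396

total_cores = servers * cores_per_server      # matches the quoted 704
tb_per_server = storage_tb / servers          # 4.5 TB of disk per server
rack_units = (servers // 4) * 2               # 22 chassis of 2U = 44U

print(total_cores, tb_per_server, rack_units)  # 704 4.5 44
```

That the servers fit in roughly 44U of rack space is consistent with the claimed 25% density edge over comparably outfitted blades on a per-U basis.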

When I was going through Dell DCS documents on their web site I found a reference to an Oil & Gas customer. 

The Data Center Solutions (DCS) team have an Oil & Gas customer that is always looking to push the envelope when it comes to getting the most out of GPGPU’s in order to deliver seismic mapping results faster.

On the Top 500 Supercomputing site, going through Dell hardware, I found Saudi Aramco with a bunch of HP and Dell hardware: 5 systems, all with 512 servers, dual-processor, with 4-6 core processors.

The other 18 on the list are interesting to study. It’s not that hard to find the Dell DCS customers.
