Devops Drop 022

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 



Gartner Cites Application Release Automation Tools as Key to DevOps


So many things wrong with this...   (ARA) Nolio 


 As an emerging movement, DevOps may have improved communication and collaboration between development and IT operations teams but it still hasn’t absolutely mastered its ultimate goal of unifying their work. 


 In his report, Ronni J. Colville argues that DevOps can be greatly enhanced by the use of application release automation (ARA) tools and specifically cites Nolio as one such solution.


Can tools help “Culture”?  This question was asked at PuppetConf w/Luke... 


Visibility also requires the establishment of a model of the application and its configuration for each environment. ARA tools can provide a mechanism to create application models for each environment, with externalized configuration settings that typically vary by environment.




Selenium and Nagios


I've implemented a Nagios check for Selenium test cases. With this check it is possible to put your recorded test cases from your Selenium IDE into Nagios to use them for monitoring.


Test-->Selenium IDE-->Export-->check_selenium (nagios plugin)-->Selenium Remote Control




DevOps in Milliseconds



AppNexus engineers have it good. 


They don’t lie awake at night wondering if we can handle the next increase of impressions. 


They don’t worry that our systems are down and we don’t know it. 

They don’t develop in a bubble, toss their code over the wall to a mysterious group of people, and wash their hands clean.


Monitoring: Nagios - 1200 services

Metrics: Graphite - 1 million datapoints every minute.


Nagios plugin that queries Graphite and alerts if values of certain metrics go above or below specified thresholds.
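
A rough sketch of how a check like this could work (not the AppNexus plugin itself, which the post does not show). It assumes Graphite's render API with format=json, which returns datapoints as [value, timestamp] pairs. The function names, the `below` flag and the base URL argument are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def latest_value(datapoints):
    """Return the most recent non-null value from [[value, timestamp], ...]."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


def evaluate(datapoints, warn, crit, below=False):
    """Map the latest datapoint to a Nagios status code and message.

    below=False alerts when the value rises to a threshold,
    below=True alerts when it falls to a threshold.
    """
    value = latest_value(datapoints)
    if value is None:
        return UNKNOWN, "UNKNOWN - no datapoints"

    def breached(threshold):
        return value <= threshold if below else value >= threshold

    if breached(crit):
        return CRITICAL, f"CRITICAL - latest value {value}"
    if breached(warn):
        return WARNING, f"WARNING - latest value {value}"
    return OK, f"OK - latest value {value}"


def fetch_datapoints(base_url, target, window="-5min"):
    """Fetch one series from a Graphite server (base_url is a hypothetical host)."""
    query = urllib.parse.urlencode(
        {"target": target, "from": window, "format": "json"}
    )
    with urllib.request.urlopen(f"{base_url}/render?{query}", timeout=10) as response:
        series = json.load(response)
    return series[0]["datapoints"] if series else []
```

A real plugin would print the message and exit with the status code so Nagios can pick it up.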


Deployment: Puppet and Maestro

Puppet backed by a MySQL database and fronted by an in-house application we call Maestro.


At AppNexus there is no wall between engineers and operations, and automation is crucial to scaling our infrastructure. 


Engineers control their own destiny, and we give them the tools to dive deep into production problems, make fixes, and improve their products as quickly as they can code.





CI vs Zombies



Runaway builds.


--A runaway build occurs when not all processes created by the build exit cleanly. 

--Zombies – may hang the build, or simply stay around in the background waiting to wreak havoc. 

--They interfere with test isolation. If processes can hang around from an earlier build (or earlier test within the same build) they may affect unrelated tests.

--difficult-to-diagnose failures.

--They consume resources, eventually leading to exhaustion.

--Manual intervention is required to kill them and clean up. 
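
One common way to avoid the manual cleanup, sketched here as an assumption rather than what the article prescribes: start the build in its own process group and kill the whole group when the build ends or times out. POSIX only; `run_build` is a made-up name.

```python
import os
import signal
import subprocess


def run_build(command, timeout):
    """Run a build command, then kill anything it left behind.

    Returns the exit code, or None if the build hit the timeout.
    """
    # start_new_session=True puts the child in a fresh session and process
    # group, so the group id equals the child's pid.
    proc = subprocess.Popen(command, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    finally:
        try:
            # Kill every process still in the build's group, including
            # background children that outlived the main build process.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # nothing left in the group
        proc.wait()  # reap the main process if it was still running
```

This only catches processes that stay in the build's process group. Anything that starts its own session escapes, so it reduces runaway builds but does not rule them out.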




Openstack Compute API v1.1 support


Implement fog support for the Openstack Compute API v1.1. Includes support for legacy v1.0 style auth and v2.0 keystone auth.





Mean time to pretty chart- DevOps meets data porn



Alex Benik is a principal at Battery Ventures. Battery Ventures is an investor in DataDog and Tracelytics. 


The current mantra in Web operations is to track, record and monitor everything. Data is valuable and storage is cheap.


Favorite Velocity talks: John Rauser at Amazon and Kellan Elliott-McCrea from Etsy.



Mean Time to Pretty Chart (MTPC). For full buzzword compliance, let’s say that WebOps + BigData + Information/Graphic Design = MTPC.


MTPC attempts to quantify the amount of time required to determine the root cause of an operational issue and depict it in an eye-catching way. The MTPC metric is challenging because it encompasses a number of challenges spanning large volumes of data acquisition, storage, correlation and design/representation.






A highly incomplete list of relevant commercial and open source tools would include Ganglia, Nagios, Cacti, Graphite, Munin, Splunk, New Relic, Tracelytics (see disclosure), DataDog (see disclosure), and AppDynamics.


Enter the Data Scientist. While correlation doesn’t imply causation, with large enough sample sizes the old adage “where there is smoke there is usually fire” often applies. When you can visualize that smoke in a pretty chart, it’s easier to pinpoint the fire.





Jesse Robbins interview on DevOps Cafe #19 (w/ full transcript!)


Devops Drop 021

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 



Puppet Conf Recap ...


Great show.. first class.. venue, food, content...


“Operations as a Strategic Weapon”

Damon and I did our combined talks right after Luke’s Keynote.  I thought we rocked.  They will be posting the videos.


Devops Cafe Roundtable with Luke, Teyo, James and Scott  ..

Basically the management team at Puppet Labs


Scott's story about joining Puppet Labs... His Loudcloud experience. 


Damon killed.  We talked about Service Orchestration, PaaS, culture patterns.. great stuff... We will post the audio on Devops Cafe site and the Video should be up in a week or two...



Puppet Enterprise 2.0


A lot of new integration with Mcollective and the GUI...


New GUI, right out-of-the-box PE 2.0 automatically discovers all resources – packages, hosts, groups, and users.  Uses Mcollective to discover. 


Visually Clone Resources To Scale Quickly, Efficiently, and Reliably(From the GUI)


With PE 2.0’s new  compliance capability, you now can visually monitor for any unauthorized changes against your desired-state baseline. Can run compliance reports once a day and watch for changing trends...  Give auditors GUI control to see what they need to see...


PE 2.0’s new provisioning capability allows you to quickly and easily create new instances of VMware and Amazon EC2.  Kind of like “Knife” with the added bare metal sauce... 




“Operating at Scale”

Pedro Canahuati


SRE Manager... 


Dealing with issues at “SCALE” and I mean scale....

Switched from Xen to LXC due to overhead at scale...

Been using cFengine for years... About to change to Chef or Puppet.. Looking at both. 

All the #devops things are going on at FB: CD, Agile in operations, collect and store everything.  Like Google, they had to build a lot of their own stuff.  

They built their own TSDB, kind of like OpenTSDB.  They have built their own monitoring framework and logging framework (they use Scribe).


ODS tool that abstracts and visualizes all events (very cool) 


I was able to talk to Pedro at the speakers dinner and the following day.  I am a junkie and groupie for guys like this and stuff like this.. we talked about CEP and monitoring.  Also about Chef and Puppet.  




Beyond the Node: Arkestration with Noah

John Vincent




Puppet and Juju, scaling the cloud

Marc Cluet & Adam Gandelman


These boys showed up to a gun fight with a knife... 

Slideware of how you can use puppet and Juju together.  I am not a mean guy unless you propose something that you can’t explain in a presentation...


Split brain... Needs to be a hackday .. talked to Dan Bodie about this... Interesting...




Mårten Mickos


CEO Eucalyptus


Great presentation... Talked about what the cloud has done to operations.  Also acknowledges cloud needs devops.... 


My Zing question ... great answer....


We also had some one on one podcasts with the Redhat guys about Openshift and how it works.  


Ended up with an interview with Jay Lyman of 451 group... Post on DTO....


Oh yeah, on the way to have drinks with Gene Kim I got to get my picture taken with Merle Haggard.  


Devops Drop 020

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 


Goteborg 2011 - program


Friday 14 October and Saturday


Yours truly doing the keynote...



Announcing Xeround Cloud Database API


Xeround is an elastic, always-on database-as-a-service
for your MySQL applications.


AWS, Rackspace and Heroku


Benchmarks against an RDS Large at $0.44 vs the $0.08 standard Xeround instance



How GitHub Uses GitHub to Build GitHub


Everyone can push, everyone can deploy 

Master is always deployable

Deploy 10 to 40 times a day

Pull requests are our code review

Master -> Branch -> Pull request -> Master

Pull requests are RAD: no meetings, email is your interface, non-techs get involved


Culture...   Hack days... make things fun... 


Hubot, our valiant Campfire bot, has continued to grow in complexity. A tiny list of his (current) capabilities:


-unlock the door to our office

-print out a list of the people currently in the office based on their wifi presence

-find an apartment in the area to rent

-deploy GitHub

-say an arbitrary string over the office speakers

-play an audio sample of deadmau5 to everyone through hacked Propane HTML5

-give you a quote from any movie or TV show

-tell you the build status of any git branch

-track and map packages

-SMS any GitHubber from Campfire

-embed a seven day weather forecast




PuppetConf as a Service (PCaaS): Sign up for the Free Live Stream


Mårten Mickos

SRE’s from Facebook and Google

John Vincent @lusis Noah dude

Luke of course

Adrian Cole jClouds

Chad Metcalf Cloudera

Jinesh Varia AWS

Mark Hinkle @mrhinkle







Puppet Change Management for DevOps


What is Puppet?

At Atlassian, we use Puppet extensively with our internal systems, our Hosted products, and our build engineering infrastructure. Here's how we do it in build engineering.


Jira, Bamboo,  Greenhopper Rapid Board


Bamboo with puppet...




IBM Infrastructure as a Service (IaaS) -



From September 12 – November 11, you can provision select virtual machines at the Toronto, Ehningen, Tokyo and Singapore IBM SmartCloud data centers—subject to availability—at no charge. You can access:


Virtual machines to run Linux® (Red Hat or Novell SUSE) or Microsoft® Windows® Server 2003/2008

1 block (256 gigabytes) of persistent storage






DataStax gets $11M, fuses NoSQL and Hadoop


Brisk, Hadoop based on Cassandra


Neo raises $10.6M for Neo4j as graph DBs take off





Building Scalable Systems: an Asynchronous Approach


Node.js and RabbitMQ


DevOps Cafe Episode 19

Love it or hate it, the long form interview is back! 

Guest: Jesse Robbins (Opscode)

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 
Follow Jesse Robbins on Twitter: @jesserobbins 


  • Learn more about Jesse's company, Opscode (makers of Chef)
  • John and Damon will be speaking at and doing a live episode of this podcast from PuppetConf in Portland (9/22)


Please leave comments or questions below and we'll read them on the show!


Devops Drop 019

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 


Building data science teams

Data science teams need people with the skills and curiosity to ask the big questions.


People You May Know (PYMK): LinkedIn, Facebook

Netflix and Zynga

Google, Amazon, 


A recent report from the McKinsey Global Institute says that by 2018 the U.S. could face a shortage of up to 190,000 workers with analytical skills.





New CycleCloud HPC Cluster Is a Triple Threat: 30000 cores, $1279/Hour, & Grill monitoring GUI for Chef


We have now launched a cluster 3 times the size of Tanuki, or 30,000 cores, which cost $1279/hour to operate for a Top 5 Pharma. It performed genuine scientific work -- in this case molecular modeling -- and a ton of it. The complexity of this environment did not necessarily scale linearly with the cores.


c1.xlarge instances 3,809

cores 30,472

RAM 26.7-TB

AWS Regions 3    ( us-east, us-west, eu-west )



 Compute Years of Work 10.9 years

 Spot Instances at an average cost of 0.286 USD / instance / hour (0.036 USD / core / hour). Compare that to the 0.68 USD / instance / hour for the same On Demand instance. That’s 57% savings!
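
The numbers above can be cross-checked with a little arithmetic. A small sketch using only the figures quoted in these notes; the 8 cores per c1.xlarge is an assumption inferred from 30,472 cores / 3,809 instances.

```python
SPOT_PER_INSTANCE_HOUR = 0.286       # USD, from the notes above
ON_DEMAND_PER_INSTANCE_HOUR = 0.68   # USD, from the notes above
INSTANCES = 3809
CORES_PER_INSTANCE = 8               # assumption: inferred from 30,472 / 3,809


def savings_fraction(spot, on_demand):
    """Fraction saved by paying the spot price instead of on-demand."""
    return 1 - spot / on_demand


def cost_per_core_hour(per_instance_hour, cores):
    """Spread the hourly instance price evenly across its cores."""
    return per_instance_hour / cores


total_cores = INSTANCES * CORES_PER_INSTANCE
savings = savings_fraction(SPOT_PER_INSTANCE_HOUR, ON_DEMAND_PER_INSTANCE_HOUR)
per_core = cost_per_core_hour(SPOT_PER_INSTANCE_HOUR, CORES_PER_INSTANCE)

print(f"total cores: {total_cores}")
print(f"savings vs on-demand: {savings:.1%}")
print(f"spot cost per core-hour: {per_core:.4f} USD")
```

The savings come out just under 58%, in line with the 57% quoted, and the per-core price lands at roughly 0.036 USD.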




What Exactly is Complex Event Processing Today?


Colin Clark...



Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, and is a lot of fun to use!


The lack of a "Hadoop of realtime" has become the biggest hole in the data processing ecosystem.





Building a Devops team 


Brian Henerey, from Sony Computer Entertainment Europe.


First interview - remote technical test


EC2 instance .. install WordPress with a broken MySQL install 

Tomcat log scraping...

Using screen to watch them...


Round 2 - Face to face interview

Whiteboard test

Pair programming 







How to Think Like a Computer Scientist 

Learning with Python






Node.js and MongoDB on Ubuntu 


haproxy to catch inbound web traffic and route it to our node.js app cluster

mongodb for app storage


With a sample application....





First steps with Cloud Foundry on Amazon EC2


Setting up an IP address and domain name

Making it start the right modules at boot