Tuesday
Sep 20, 2011

Devops Drop 019



Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 

Notes:

Building data science teams

Data science teams need people with the skills and curiosity to ask the big questions.

http://radar.oreilly.com/2011/09/building-data-science-teams.html

 

People You May Know (PYMK): LinkedIn, Facebook

Netflix and Zynga

Google, Amazon

 

A recent report from the McKinsey Global Institute says that by 2018 the U.S. could face a shortage of up to 190,000 workers with analytical skills.

 

http://tech.fortune.cnn.com/2011/09/06/data-scientist-the-hot-new-gig-in-tech/

 

http://www.dataspora.com/2011/09/data-scientists-or-data-composers/

 

 

-------------------------------------------------------------------------

 

New CycleCloud HPC Cluster Is a Triple Threat: 30000 cores, $1279/Hour, & Grill monitoring GUI for Chef

 

http://blog.cyclecomputing.com/2011/09/new-cyclecloud-cluster-is-a-triple-threat-30000-cores-massive-spot-instances-grill-chef-monitoring-g.html

 

We have now launched a cluster 3 times the size of Tanuki, or 30,000 cores, which cost $1279/hour to operate for a Top 5 Pharma. It performed genuine scientific work -- in this case molecular modeling -- and a ton of it. The complexity of this environment did not necessarily scale linearly with the cores.

 

c1.xlarge instances 3,809

cores 30,472

RAM 26.7 TB

AWS Regions 3    ( us-east, us-west, eu-west )

 

 

 Compute Years of Work 10.9 years

 Spot Instances at an average cost of 0.286 USD / instance / hour (0.036 USD / core / hour). Compare that to the 0.68 USD / instance / hour for the same On Demand instance. That’s 57% savings!
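
The per-core price and the savings figure can be re-derived from the numbers quoted above. A minimal sketch, assuming 8 cores per c1.xlarge instance (30,472 cores / 3,809 instances); the function names are made up for illustration.

```python
# Figures quoted in the notes above.
INSTANCES = 3809
CORES_PER_INSTANCE = 8      # assumption: 30,472 cores / 3,809 instances
SPOT_PRICE = 0.286          # USD per instance-hour (average spot price)
ON_DEMAND_PRICE = 0.68      # USD per instance-hour (on demand)


def per_core_price(instance_price, cores=CORES_PER_INSTANCE):
    """Price of one core for one hour."""
    return instance_price / cores


def savings_fraction(spot, on_demand):
    """Fraction saved by paying the spot price instead of on demand."""
    return 1 - spot / on_demand


if __name__ == "__main__":
    print(f"cores: {INSTANCES * CORES_PER_INSTANCE}")
    print(f"spot per core-hour: {per_core_price(SPOT_PRICE):.3f} USD")
    print(f"savings: {savings_fraction(SPOT_PRICE, ON_DEMAND_PRICE):.1%}")
```

The savings come out a little under 58 percent, consistent with the 57% quoted in the post.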

 

-------------------------------------------------------------------------

 

What Exactly is Complex Event Processing Today?

http://blog.cloudeventprocessing.com/2011/09/18/what-exactly-is-complex-event-processing-today/

 

Colin Clark...

-------------------------------------------------------------------------

 

https://github.com/nathanmarz/storm/wiki/Rationale 

 

Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, and is a lot of fun to use!

 

The lack of a "Hadoop of realtime" has become the biggest hole in the data processing ecosystem.

 

-------------------------------------------------------------------------

 

 

Building a Devops team

http://agilesysadmin.net/building-a-devops-team 

 

Brian Henerey, from Sony Computer Entertainment Europe.

 

First interview - remote technical test

Instructions...

EC2 instance .. install WordPress with a broken MySQL install

Tomcat log scraping...

Using screen to watch them...

 

Round 2 - Face to face interview

Whiteboard test

Pair programming 

 

 

 

-------------------------------------------------------------------------

 

 

How to Think Like a Computer Scientist

http://greenteapress.com/thinkpython/thinkCSpy/ 

Learning with Python

 

 

-------------------------------------------------------------------------

 

 

Node.js and MongoDB on Ubuntu

http://cloud.ubuntu.com/2011/09/node-js-and-mongodb-on-ubuntu/ 

 

haproxy to catch inbound web traffic and route it to our node.js app cluster

mongodb for app storage

 

With a sample application....

 

-------------------------------------------------------------------------

 

 

First steps with Cloud Foundry on Amazon EC2

 

http://www.cloudsoftcorp.com/blogs/first-steps-with-cloud-foundry-on-amazon-ec2

 

Setting up an IP address and domain name

Making it start the right modules at boot

 

Friday
Sep 16, 2011

Devops Drop 018



Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 

Notes:

NoSQL Benchmark

http://blog.cubrid.org/dev-platform/nosql-benchmarking/

 

Yahoo! Cloud Serving Benchmark (YCSB)

 

Basic operations are Insert, Update, Read, and Scan. There are basic workload sets that combine the basic operations, but new additional workloads can also be created.
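
To make the "workload = mix of basic operations" idea concrete, here is a toy sketch. It is not YCSB's API or workload file format: `DictStore`, `run_workload` and the mix dictionaries are hypothetical names, and the store is just an in-memory dict with integer keys.

```python
import random


class DictStore:
    """Hypothetical in-memory stand-in for the database under test."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, value):
        self.rows[key] = value

    def update(self, key, value):
        if key in self.rows:
            self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)

    def scan(self, start_key, count):
        # Return up to `count` rows with keys >= start_key, in key order.
        keys = sorted(k for k in self.rows if k >= start_key)
        return [(k, self.rows[k]) for k in keys[:count]]


def run_workload(store, mix, operations, seed=0):
    """Run `operations` ops picked according to `mix` (op name -> weight).

    Assumes keys are the integers 0..n-1, as produced by earlier inserts.
    Returns how many times each operation was chosen.
    """
    rng = random.Random(seed)
    names = list(mix)
    weights = [mix[name] for name in names]
    counts = {name: 0 for name in names}
    next_key = len(store.rows)
    for _ in range(operations):
        op = rng.choices(names, weights=weights)[0]
        counts[op] += 1
        if op == "insert":
            store.insert(next_key, "v")
            next_key += 1
            continue
        if next_key == 0:
            continue  # nothing to read, update or scan yet
        key = rng.randrange(next_key)
        if op == "update":
            store.update(key, "v2")
        elif op == "read":
            store.read(key)
        elif op == "scan":
            store.scan(key, 10)
    return counts
```

A load phase would be `run_workload(store, {"insert": 1.0}, 1000)`, followed by something like `run_workload(store, {"read": 0.5, "update": 0.5}, 1000)` for a read and update mix. A new workload is just a different dictionary of weights.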

 

This article contains tests conducted on the following products and versions.

 

Cassandra-0.7.4

Although Cassandra’s latest version is 0.8.0, we decided to use the previous version, which is known to be stable, because when testing with the 0.8 version the gossip protocol between nodes malfunctioned and the node up/down information was incorrect.

 

HBase-0.90.2 (Hadoop-0.20-append)
HBase-0.90.2 (Hadoop-0.20-append) was selected because, without the Hadoop-append version, durability in HDFS may be reduced.

 

MongoDB-1.8.1

 

Insert, Read Only and Read and Update

 

Insert - Cassandra kills

Read and update - Cassandra beats HBase by a little

Read - HBase wins, of course, but only by a little against Cassandra

Mongo gets blown out...

 

Which leads me into .. why I would love to make this event...

 

------------------------------------------------------

 

Using Cassandra, Brisk, and Mahout to Manage Time Series, and Predict Future Events

http://predictingfuture.eventbrite.com/

 

DataStax ... Brisk, a Cassandra-based Hadoop...

 

 

------------------------------------------------------

 

What is glu?

http://linkedin.github.com/glu/docs/latest/html/index.html

 

glu is a free/open source deployment and monitoring automation platform.

 

a glu agent  is running on each of those nodes

ZooKeeper is used to maintain the live state as reported by the glu agents (blue arrows)

the glu orchestration engine is the heart of the system

 

A glu script is a Groovy class with named closures for the actions... (can be Groovy or Java)

install, configure, start, stop, unconfigure and uninstall
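
The six actions read like transitions in a small lifecycle state machine. A sketch in Python rather than Groovy, and only the six action names come from the notes: the state names (NONE, installed, stopped, running) are assumptions for illustration, and glu's real state machine may differ.

```python
from collections import deque

# (state, action) -> next state. State names are hypothetical.
TRANSITIONS = {
    ("NONE", "install"): "installed",
    ("installed", "configure"): "stopped",
    ("stopped", "start"): "running",
    ("running", "stop"): "stopped",
    ("stopped", "unconfigure"): "installed",
    ("installed", "uninstall"): "NONE",
}


class Lifecycle:
    """Tracks the state of one deployed entry and applies actions to it."""

    def __init__(self):
        self.state = "NONE"

    def fire(self, action):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} from state {self.state}")
        self.state = TRANSITIONS[key]
        return self.state


def plan(current, target):
    """Shortest list of actions that moves `current` to `target` (BFS)."""
    queue = deque([(current, [])])
    seen = {current}
    while queue:
        state, actions = queue.popleft()
        if state == target:
            return actions
        for (src, action), dst in TRANSITIONS.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, actions + [action]))
    raise ValueError(f"no path from {current} to {target}")
```

With these transitions, `plan("NONE", "running")` gives `['install', 'configure', 'start']`, and `plan("running", "NONE")` walks back through stop, unconfigure and uninstall.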

 

The doc is pretty cool .. however, when I started getting into the state machine stuff I had to stop...

 

Orchestration .. ZooKeeper is used to build the live state; compare live and desired state, then generate the delta.
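
The compare-and-delta step can be sketched as a diff of two dictionaries. This is not glu's data model: the entry names, the state strings and the `compute_delta` helper are all made up for illustration.

```python
def compute_delta(desired, live):
    """Compare desired and live state (both dicts of entry name -> state).

    Returns entry name -> (kind, live_state, desired_state) for every entry
    that needs work; entries that already match are left out.
    """
    delta = {}
    for name, want in desired.items():
        have = live.get(name)
        if have is None:
            delta[name] = ("deploy", None, want)
        elif have != want:
            delta[name] = ("change", have, want)
    for name, have in live.items():
        if name not in desired:
            delta[name] = ("undeploy", have, None)
    return delta


if __name__ == "__main__":
    desired = {"web-1": "running", "web-2": "running"}
    live = {"web-1": "stopped", "old-job": "running"}
    for name, change in sorted(compute_delta(desired, live).items()):
        print(name, change)
```

An empty delta means the live state already matches the desired state and there is nothing to orchestrate.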

 

 

------------------------------------------------------

 

NODE.JS AND THE JAVASCRIPT AGE

http://metamarketsgroup.com/blog/node-js-and-the-javascript-age/

 

Three months ago, we decided to tear down the framework we were using for our dashboard, Python’s Django, and rebuild it entirely in server-side JavaScript, using node.js. (If there is ever a time in a start-up’s life to remodel parts of your infrastructure, it’s early on, when your range of motion is highest.)

 

This decision was driven by a realization: the LAMP stack is dead. 

 

1991-1999: The HTML Age.

2000-2009: The LAMP Age.

2010-??: The JavaScript Age.

------------------------------------------------------

 

From $0-100million with no sales people. The Atlassian 10 commandments for startups.

http://blog.businessofsoftware.org/2011/09/from-0-100million-with-no-sales-people-the-atlassian-10-commandments-for-startups.html

 

Jira, Confluence 

 

3 ppl to 300 ppl... 

 

Start with two founders..  50/50 

 

Bootstrapping .. first round is 60M

 

-Sell itself, affordable, global, open 

-Use your own product.... Passionately use your own product...

-Measure everything... Capture everything.... even if you can’t analyze 

-Test everything... 5 users free .. raised money for charity 

-ABM...  ... always sponsor the beer at conferences.. like Dyninc...

-Send stuff in the mail.. t-shirts... 

-Make everything into a campaign.. Turned hiring into a marketing campaign - .. send only 4 resumes otherwise you are blacklisted...

-Don’t be afraid to let your first product fail..

 

------------------------------------------------------

 

Devops Dude of the Week....

 

Jordan Sissel

 

FPM and Logstash and now...

 

eventmachine-tail

https://github.com/jordansissel/eventmachine-tail

 

Jordan Sissel..

 

This project contains two EventMachine extensions.

First, it adds an event-driven file-following similar to the unix ‘tail -f’
command. For example, you could use it to follow /var/log/messages the same way
tail -f would.

Second, it adds event-driven file patterns allowing you to watch a given file
pattern for new or removed files. For example, you could watch /var/log/*.log
for new/deleted files.
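
To show what the two features do, here is a rough Python sketch. It polls instead of being event-driven and it is not eventmachine-tail's API: `FileFollower` and `PatternWatcher` are hypothetical names, and a caller would have to invoke `poll()` on a timer.

```python
import glob
import os


class FileFollower:
    """Poll-based 'tail -f': returns lines appended since the last poll."""

    def __init__(self, path, from_start=False):
        self.path = path
        self.offset = 0 if from_start else os.path.getsize(path)

    def poll(self):
        if os.path.getsize(self.path) < self.offset:
            self.offset = 0  # file shrank (truncated or rotated): start over
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            data = handle.read()
        # Consume complete lines only; a partial last line waits for next poll.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()


class PatternWatcher:
    """Poll-based glob watch: reports files that appeared or disappeared."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.known = set(glob.glob(pattern, recursive=True))

    def poll(self):
        current = set(glob.glob(self.pattern, recursive=True))
        added = sorted(current - self.known)
        removed = sorted(self.known - current)
        self.known = current
        return added, removed
```

Polling is the simplest thing that works for a sketch; the point of eventmachine-tail is to get the same behaviour without blocking the event loop.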

 

For logstash, the log agents were
event-driven using EventMachine. The log agents mainly get their data from
logfiles. To that end, we needed a way to treat log files as a stream.

There’s a ruby gem ‘file-tail’ that implements tailing, but not in an
event-driven way. This makes it hard to use in EventMachine programs like
logstash.

Thus, eventmachine-tail was born.

Further, the usage patterns for logstash required the ability to watch a
directory (or a file pattern) for new log files.

 

rtail -x "*.gz" "/var/log/**/*"

 


 

 

Thursday
Sep 15, 2011

Devops Drop 017



Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 

Notes:

$3m Wellington rail project behind schedule

http://www.stuff.co.nz/dominion-post/news/5606019/3m-Wellington-rail-project-behind-schedule

 

 KiwiRail said Project Sirius, a $3 million project to install an IBM asset management system, is six months behind schedule.

 

------------------------------------------------------

 

LexisNexis Releases Code for Its Hadoop-Killer

 

http://www.devx.com/DailyNews/Article/47281?trk=DXRSS_LATEST 

 

LexisNexis Risk Solutions' division HPCC Systems has announced that it is open sourcing the code for its High Performance Computing Cluster (HPCC) software. HPCC is a data-processing-and-delivery solution that the company is marketing as an alternative to Hadoop. HPCC includes two major components: Thor, which analyzes large datasets in a manner similar to Hadoop, and Roxie, which is closer to a traditional RDBMS or a data warehouse.

 

------------------------------------------------------

 

The Apache Software Foundation Announces Apache Whirr as a Top-Level Project

https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces15 

 

Apache Whirr provides a Cloud-neutral way to run a properly-configured system quickly through libraries, common service API, smart defaults, and command line tool. Whirr is being used for proof of concepts and a way to try out new Cloud services utilizing a variety of Apache products that include Hadoop, HBase, Cassandra, and ZooKeeper. An example of this is enterprise software providers Cloudera, who use Whirr to make it easy to try out their CDH product and run distributed clustered services.

 

------------------------------------------------------

 

Chef Hack Day - Seattle

Saturday, September 24, 2011 from 9:00 AM to 5:00 PM (PT)

Seattle, WA

 

------------------------------------------------------

 

Joyent arms cloud for death match with Amazon

http://www.theregister.co.uk/2011/09/15/the_new_joyent_cloud/ 

 

The Pixar of cloud. Jason Hoffman: Chief Scientist, Founder - PhD in Molecular Pathology

Mark Mayo: CTO... 

 

 

A month after open-sourcing what it calls "the first major hypervisor" to arrive in half a decade, cloud computing pioneer Joyent has added this hypervisor to its flagship service, allowing Linux and Windows applications onto the Joyent Cloud for the first time.

 

Joyent and its firebrand CTO told the world they had ported the KVM hypervisor from Linux to SmartOS. They promptly open-sourced the code in an effort to "make the world a better place", and now they've rolled the hypervisor into a new incarnation of the Joyent Cloud.

 

The company claims that its SmartOS virtual machines are up to 14 times faster than comparable Amazon server instances.

 

------------------------------------------------------

 

Rundeck And Nagios Nrpe Checks

http://www.dzone.com/links/r/rundeck_and_nagios_nrpe_checks.html

 

I’ve played with a few different jobs so far, including triggering Puppet runs across machines triggered by a Jenkins plugin. I’ve also been looking at running all my monitoring tasks at the click of a button (or again as part of a smoke test triggered by Jenkins) and I thought that might make a nice simple example.

My checks are written as Nagios plugins, and run periodically by Nagios. I also trigger them manually, using Dean’s NRPE runner script.
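
For anyone who has not written one: a Nagios plugin prints one status line and reports its result through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A minimal sketch of that convention; the disk check, its thresholds and the function names are made up for illustration and are not from the linked post.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value onto a status; higher values are worse."""
    if value is None:
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def check_disk_percent(path="/", warn=80.0, crit=90.0):
    """Hypothetical check: percentage of disk space used at `path`."""
    try:
        usage = shutil.disk_usage(path)
        used = 100.0 * usage.used / usage.total
    except OSError:
        used = None  # could not measure, so the result is UNKNOWN
    status = evaluate(used, warn, crit)
    detail = "n/a" if used is None else f"{used:.1f}% used"
    return status, f"DISK {LABELS[status]} - {path} {detail}"
```

A real plugin script would print the message and exit with the status, which is what makes it easy to run the same check from Nagios, NRPE or a Rundeck job.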

Wednesday
Sep 14, 2011

Devops Drop 016



Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 

Notes:

 

 

VMware vSphere provider for fog

https://github.com/geemus/fog/pull/505

 

Libvirt integration for fog

https://github.com/geemus/fog/pull/484

http://jedi.be/blog/2011/09/13/libvirt-fog-provider/

 

Added CloudStack support

https://github.com/geemus/fog/pull/456

 

Full AWS support: sqs, sns, rds, elb, dns, cloudformation, cloud_watch

 

----------------------------------

Patrick is going crazy over there.. 

This is some crazy shit essay..

 

Automated VMware ESX Installation - Bonus in VMware Fusion

http://www.jedi.be/blog/2010/12/09/automated-vmware-esx-installation-even-in-vmware-fusion/

 

Using kickstart

After this exercise you should be able to completely script the installation of a VMware ESX virtual machine and make it run inside VMware Fusion.

-------------------------------------

 

Murder: Fast datacenter code deploys using BitTorrent

http://engineering.twitter.com/2010/07/murder-fast-datacenter-code-deploys.html

 

Twitter engineering .. BitTorrent ... Murder took a 40 minute deploy down to 12 seconds! Written in Ruby and Python; there is also a great video on the blog.
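
Why peer-to-peer helps so much: a toy model, assuming every host that already has the payload can pass it to one new host per round. This ignores how BitTorrent actually splits files into pieces, and it will not reproduce the 40 minutes vs 12 seconds numbers; it only shows the shape of the difference.

```python
def central_rounds(hosts):
    """One central server sends the payload to one host per round."""
    return hosts


def swarm_rounds(hosts):
    """Every host holding the payload sends it to one new host per round."""
    holders = 1  # the seeder starts with the payload
    rounds = 0
    while holders - 1 < hosts:  # holders includes the seeder itself
        holders *= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, central_rounds(n), swarm_rounds(n))
```

In this model the central server needs as many rounds as there are hosts, while the swarm needs only about log2 of that.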

-------------------------------------

 

Ruby for Jenkins Goes Pre-Alpha

http://blog.thefrontside.net/2011/09/13/ruby-for-jenkins-goes-pre-alpha/

 

The project was started to make Jenkins fit the Ruby community style...

 

not forced into using JRuby or Maven.

 

boot a plugin written in pure Ruby into a Jenkins server w/o Java or Java knowledge.

 

-----------------------------------

 

Installing on RHEL/CentOS 5

http://support.cloudfoundry.com/entries/20237758-installing-on-rhel-centos-5

 

Decomposed the script-based install and refactored it to work with CentOS, and laid out the steps to use yum

 

4 ways to install

 

  1. bash script that you can invoke from a curl command
  2. in the vcap repo there is a vcap_dev directory with chef cookbooks to install w/chef solo
  3. Keith Hudkins created a chef server install for the barclamp PIT
  4. Canonical now has Debian packages...

-----------------------------------

 

LexisNexis Releases Code for Its Hadoop-Killer

http://www.devx.com/DailyNews/Article/47281?trk=DXRSS_LATEST

 

HPCC Systems, a division of LexisNexis Risk Solutions.

open sourcing the code for its High Performance Computing Cluster (HPCC) 

an alternative to Hadoop. 

Thor, which analyzes large datasets in a manner similar to Hadoop, 

Roxie, which is closer to a traditional RDBMS or a data warehouse.

 

-----------------------------------

 

Ensemble gets some Juju!

juju.ubuntu.com

 

 We figured it should represent the complexities and mystery that often surround those skilled in the DevOps field, and be something that played on the same “u” sound and etymology as Ubuntu.  Thus, “Juju” was born!

 

 

Tuesday
Sep 13, 2011

Devops Drop 015



Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 

Notes: