Devops Drop 019
Tuesday, September 20, 2011 at 8:34AM
DevOpsCafeAdmin




Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 

Notes:

Building data science teams

Data science teams need people with the skills and curiosity to ask the big questions.

http://radar.oreilly.com/2011/09/building-data-science-teams.html

 

People You May Know (PYMK): LinkedIn, Facebook

Netflix and Zynga

Google, Amazon

 

A recent report from the McKinsey Global Institute says that by 2018 the U.S. could face a shortage of up to 190,000 workers with analytical skills.

 

http://tech.fortune.cnn.com/2011/09/06/data-scientist-the-hot-new-gig-in-tech/

 

http://www.dataspora.com/2011/09/data-scientists-or-data-composers/

 

 

-------------------------------------------------------------------------

 

New CycleCloud HPC Cluster Is a Triple Threat: 30000 cores, $1279/Hour, & Grill monitoring GUI for Chef

 

http://blog.cyclecomputing.com/2011/09/new-cyclecloud-cluster-is-a-triple-threat-30000-cores-massive-spot-instances-grill-chef-monitoring-g.html

 

We have now launched a cluster 3 times the size of Tanuki, or 30,000 cores, which cost $1279/hour to operate for a Top 5 Pharma. It performed genuine scientific work -- in this case molecular modeling -- and a ton of it. The complexity of this environment did not necessarily scale linearly with the cores.

 

c1.xlarge instances: 3,809

Cores: 30,472

RAM: 26.7 TB

AWS regions: 3 (us-east, us-west, eu-west)

Compute years of work: 10.9

 Spot Instances at an average cost of 0.286 USD / instance / hour (0.036 USD / core / hour). Compare that to the 0.68 USD / instance / hour for the same On Demand instance. That’s 57% savings!
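
The cluster figures above can be cross-checked with a little arithmetic. This is only a rough sketch: the 8 cores per c1.xlarge is an assumption inferred from the instance and core counts, and the variable names are made up for illustration.

```python
# Figures quoted in the notes above.
instances = 3809
spot_price = 0.286        # USD / instance / hour (average spot price)
on_demand_price = 0.68    # USD / instance / hour

# Assumption: each c1.xlarge exposes 8 cores (30,472 / 3,809 = 8).
cores_per_instance = 8

total_cores = instances * cores_per_instance
price_per_core = spot_price / cores_per_instance
savings = 1 - spot_price / on_demand_price

# Spot charges for the instances alone, per hour.
spot_instances_per_hour = instances * spot_price

print("cores:", total_cores)
print("USD per core-hour:", round(price_per_core, 3))
print("savings vs on demand (%):", round(savings * 100, 1))
print("spot instance cost per hour (USD):", round(spot_instances_per_hour, 2))
```

The instance-only spot cost comes out below the quoted $1279/hour. The notes do not break that figure down, so the gap is left unexplained here.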

 

-------------------------------------------------------------------------

 

What Exactly is Complex Event Processing Today?

http://blog.cloudeventprocessing.com/2011/09/18/what-exactly-is-complex-event-processing-today/

 

Colin Clark...

-------------------------------------------------------------------------

 

https://github.com/nathanmarz/storm/wiki/Rationale 

 

Storm is a distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple, can be used with any programming language, and is a lot of fun to use!

 

The lack of a "Hadoop of realtime" has become the biggest hole in the data processing ecosystem.
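
To give a feel for what "primitives for realtime computation" might mean, here is a toy sketch in plain Python. It is not Storm's API: the function names (sentence_source, split_words, running_counts) are hypothetical, and everything runs in one process rather than on a cluster. The point is only the shape: a source emits items one at a time and each stage updates its result as items arrive, instead of waiting for a batch to finish.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def sentence_source(sentences: Iterable[str]) -> Iterator[str]:
    """Emit sentences one at a time; a real stream would never end."""
    for sentence in sentences:
        yield sentence


def split_words(stream: Iterator[str]) -> Iterator[str]:
    """Stateless stage: turn each sentence into a stream of words."""
    for sentence in stream:
        for word in sentence.lower().split():
            yield word


def running_counts(stream: Iterator[str]) -> Iterator[Tuple[str, int]]:
    """Stateful stage: emit the updated count as soon as a word arrives."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


def run_demo():
    # Made-up input, standing in for an unbounded feed.
    feed = ["storm is fun", "hadoop is batch", "storm is realtime"]
    return list(running_counts(split_words(sentence_source(feed))))


if __name__ == "__main__":
    for word, count in run_demo():
        print(word, count)
```

A batch job would read the whole feed and report totals once at the end. Here every incoming word produces an updated count immediately.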

 

-------------------------------------------------------------------------

 

 

Building a Devops team

http://agilesysadmin.net/building-a-devops-team 

 

Brian Henerey, from Sony Computer Entertainment Europe.

 

First interview - remote technical test

Instructions...

EC2 instance .. install WordPress with a broken MySQL install

Tomcat log scraping...

Using screen to watch them...
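
The notes do not say what the Tomcat log scraping task actually involved. As one possible illustration, here is a minimal sketch that counts HTTP status codes in access log lines. It assumes the "common" access log layout (host, identity, user, timestamp, request, status, bytes); the sample lines and function names are invented for the example.

```python
import re
from collections import Counter

# Assumed layout: host ident user [timestamp] "request" status bytes
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)

# Made-up sample data, not taken from any real server.
SAMPLE_LOG = """\
10.0.0.1 - - [20/Sep/2011:08:34:00 +0000] "GET /index.jsp HTTP/1.1" 200 1043
10.0.0.2 - - [20/Sep/2011:08:34:02 +0000] "GET /missing HTTP/1.1" 404 512
10.0.0.1 - - [20/Sep/2011:08:34:05 +0000] "POST /login HTTP/1.1" 500 -
this line is not an access log entry
"""


def count_statuses(lines):
    """Count responses per status code, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("status")] += 1
    return counts


def error_requests(lines):
    """Return the request strings that ended in a 5xx status."""
    requests = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("status").startswith("5"):
            requests.append(match.group("request"))
    return requests


if __name__ == "__main__":
    log_lines = SAMPLE_LOG.splitlines()
    print(count_statuses(log_lines))
    print(error_requests(log_lines))
```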

 

Round 2 - Face to face interview

Whiteboard test

Pair programming 

 

 

 

-------------------------------------------------------------------------

 

 

How to Think Like a Computer Scientist: Learning with Python

http://greenteapress.com/thinkpython/thinkCSpy/

 

 

-------------------------------------------------------------------------

 

 

Node.js and MongoDB on Ubuntu

http://cloud.ubuntu.com/2011/09/node-js-and-mongodb-on-ubuntu/ 

 

HAProxy to catch inbound web traffic and route it to our Node.js app cluster

MongoDB for app storage

 

With a sample application...
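
The routing idea can be sketched in a few lines of Python. This is a toy model of a load balancer handing requests to app servers in turn, not HAProxy itself. Round robin is assumed here purely for illustration; the notes do not say how the article configured balancing, and the backend addresses and class name are made up.

```python
from itertools import cycle


class RoundRobinRouter:
    """Hand each incoming request to the next backend in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def route(self, request_path):
        """Return (backend, request_path) for the backend chosen this turn."""
        return next(self._rotation), request_path


# Hypothetical addresses for three app servers behind the proxy.
NODE_APPS = ["10.0.0.11:8000", "10.0.0.12:8000", "10.0.0.13:8000"]

if __name__ == "__main__":
    router = RoundRobinRouter(NODE_APPS)
    for path in ["/", "/users", "/posts", "/"]:
        backend, _ = router.route(path)
        print(path, "->", backend)
```

In this layout the app servers keep no data of their own, so any of them can serve any request; the shared state lives in the database.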

 

-------------------------------------------------------------------------

 

 

First steps with Cloud Foundry on Amazon EC2

 

http://www.cloudsoftcorp.com/blogs/first-steps-with-cloud-foundry-on-amazon-ec2

 

Setting up an IP address and domain name

Making it start the right modules at boot

 

Article originally appeared on DevOps Cafe Podcast & Videos (http://devopscafe.org/).
See website for complete article licensing information.