Cloud Camp Chicago Dec 2012 Slides


description

The slides from the December 2012 Cloud Camp Chicago, including the talks from our speakers: Dave Falck, Model Metrics: node.js on AWS; Paul Mantz, CohesiveFT: Working with APIs; Bob Chojnacki, Jellyvision Labs: Hadoop on AWS; Karl Zimmerman, Steadfast: Keep control with the Private Cloud

Transcript of Cloud Camp Chicago Dec 2012 Slides

Page 1: Cloud Camp Chicago Dec 2012 Slides

Welcome to Cloud Chicago

Live Tweet on the second screen by using: #cloudcamp @cloudcamp_chi

Sponsored by

Hosted by

Thursday, December 13, 12

Page 2: Cloud Camp Chicago Dec 2012 Slides

Agenda

6:00pm Registration, Food, Drinks and Networking
6:30 Opening Remarks, Patrick Kerpan, CohesiveFT
6:45 Lightning Talks
  Dave Falck, Model Metrics: node.js on AWS
  Paul Mantz, CohesiveFT: Working with APIs
  Bob Chojnacki, Jellyvision Labs: Hadoop on AWS
  Karl Zimmerman, Steadfast: Keep control with the Private Cloud
7:45 Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”
  Emceed by Mike Dorosh, IBM & Patrick Kerpan, CohesiveFT
8:30 Breakout Sessions
9:00 Wrap Up - Drinks, anyone?

Page 3: Cloud Camp Chicago Dec 2012 Slides

Dave Falck, Customer Solutions Engineer

Page 4: Cloud Camp Chicago Dec 2012 Slides

Node.js + AWS

@davidfalck

Page 5: Cloud Camp Chicago Dec 2012 Slides

Why the Node.js Buzz?

* LinkedIn’s entire mobile software stack is completely built in Node
* Why? Scale.
  * Huge performance gains compared to what they were using before (Ruby on Rails)
  * Went from running 15 servers with 15 instances (virtual servers) on each physical machine, to just four instances that can handle double the traffic.

Page 6: Cloud Camp Chicago Dec 2012 Slides

What is Node.js?

* JavaScript platform based on Google Chrome's V8 JS engine
* Ryan Dahl (Joyent)
* Event-driven, non-blocking I/O model to allow your applications to scale while keeping you from having to deal with threads, polling, timeouts, and event loops
* FAST
* Used for real-time, data-intensive apps (mobile!)
* POPULAR

Page 7: Cloud Camp Chicago Dec 2012 Slides

Node.js on GitHub

Page 8: Cloud Camp Chicago Dec 2012 Slides

Hello World

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, '127.0.0.1');

Page 9: Cloud Camp Chicago Dec 2012 Slides

What makes Node.js so fast?

* Thread-based networking is inefficient and difficult
* Node shows much better memory efficiency under high loads than systems which allocate 2 MB thread stacks for each connection.
* Users of Node are free from worries of dead-locking the process (*there are no locks*)
* Almost no function in Node directly performs I/O, so the process never blocks.
* Because nothing blocks, less-than-expert programmers are able to develop fast systems
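A minimal sketch of the non-blocking behavior described on this slide, assuming only Node's built-in `fs` module. The `order` array and its labels are made up for illustration; they are not from the talk.

```javascript
const fs = require('fs');

// Record each step so the order of execution is visible.
const order = [];

order.push('before stat');

// fs.stat is asynchronous: the call returns right away and the callback
// runs later, after the current code has finished and the result is ready.
fs.stat(process.cwd(), function (err) {
  order.push(err ? 'stat failed' : 'stat finished');
  console.log(order.join(' -> '));
});

// This line runs before the callback above, because nothing waited on the disk.
order.push('after stat');
```

The process keeps going after asking for the file information, which is the point of the slide: while the operating system does the I/O, the single thread is free to serve other requests.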

Page 10: Cloud Camp Chicago Dec 2012 Slides

Under the Node.js hood

JavaScript?

Page 11: Cloud Camp Chicago Dec 2012 Slides

Under the Node.js hood

* JavaScript!
  * Platform independent
  * Easy to use
  * Ubiquitous
* Google Chrome’s V8 JavaScript engine
  * Translates JS into machine code (not interpreted)

Page 12: Cloud Camp Chicago Dec 2012 Slides

When not to use Node.js

* Node.js is not ideal for CPU-intensive jobs like sorting, transformations, number crunching, analytics…
* Traditional CRUD web apps that need to be highly concurrent: performance degradation will occur when the data needs to be transformed…
* You can offload processing to another language that is better at making use of the CPU
* Cultural fit? Too new? You decide…
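To see why CPU-intensive jobs are a poor fit, here is a small sketch (not from the talk) in which a plain loop holds up a timer that was due immediately. The function name `sumTo` and the loop size are arbitrary choices for illustration.

```javascript
let timerFired = false;

// Ask for a callback as soon as possible (0 ms).
setTimeout(function () {
  timerFired = true;
  console.log('timer finally ran');
}, 0);

// CPU-bound work: a plain loop that never yields to the event loop.
function sumTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i += 1) {
    total += i;
  }
  return total;
}

const result = sumTo(1e7);

// The timer was due long ago, but it cannot run until this code returns.
console.log('sum =', result, 'timer fired yet?', timerFired);
```

In a server, every other connection would be waiting in the same way while the loop runs, which is why the slide suggests handing heavy computation to something else.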

Page 13: Cloud Camp Chicago Dec 2012 Slides

Node.js + AWS

* Dec 6th: AWS released developer preview of node.js libraries to access AWS:
  * DynamoDB
  * S3
  * EC2
  * SWF (Simple Workflow Service)
* Allows you to manage parallel calls to several AWS web services
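The idea of issuing several service calls in parallel can be sketched without any SDK. The code below is an assumption-laden illustration, not the AWS SDK's API: `runParallel`, `fakeListBuckets` and `fakeDescribeInstances` are hypothetical names, and the "service calls" are timers standing in for real requests.

```javascript
// Run callback-style tasks at the same time and collect results by name.
// `tasks` maps a label to a function taking a Node-style callback(err, value).
function runParallel(tasks, done) {
  const names = Object.keys(tasks);
  const results = {};
  let pending = names.length;
  let finished = false;

  if (pending === 0) {
    return done(null, results);
  }

  names.forEach(function (name) {
    tasks[name](function (err, value) {
      if (finished) return;          // ignore callbacks that arrive after a failure
      if (err) {
        finished = true;
        return done(err);
      }
      results[name] = value;
      pending -= 1;
      if (pending === 0) {
        finished = true;
        done(null, results);
      }
    });
  });
}

// Hypothetical stand-ins for service calls; a real program would call an SDK client here.
function fakeListBuckets(callback) {
  setTimeout(function () { callback(null, ['logs', 'reports']); }, 20);
}
function fakeDescribeInstances(callback) {
  setTimeout(function () { callback(null, ['i-example']); }, 10);
}

runParallel({ buckets: fakeListBuckets, instances: fakeDescribeInstances }, function (err, results) {
  if (err) throw err;
  console.log(results);
});
```

Both stand-in calls are started before either finishes, so the total wait is roughly the slowest call rather than the sum of the two.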

Page 14: Cloud Camp Chicago Dec 2012 Slides

Node.js + Other Clouds

* Azure
* Joyent
* EngineYard
* Heroku

Page 15: Cloud Camp Chicago Dec 2012 Slides

More info

* http://nodejs.org
* http://en.wikipedia.org/wiki/Nodejs
* http://aws.typepad.com/aws/2012/12/aws-sdk-for-nodejs-now-available-in-preview-form.html
* http://www.jamesward.com/2011/06/21/getting-started-with-node-js-on-the-cloud/
* http://venturebeat.com/2011/08/16/linkedin-node/

Page 16: Cloud Camp Chicago Dec 2012 Slides

Paul Mantz, Software Engineer

Page 17: Cloud Camp Chicago Dec 2012 Slides

Copyright CohesiveFT - Dec 13, 2012

APIs in Cloud Environments

Paul Mantz

Page 18: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

• Benefits to Creating API Command-Line Clients

• Lowers the barrier to entry

• Familiar to technical consumers

• Advanced use cases

• Integrates into existing toolsets

Page 19: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

Excellent Internal Developer Tool

• Excellent for testing and rapid development

• Useful operations tool

Page 20: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

Reference Implementation

• Gives developers an example to integrate the API

• Helps users model workflows

• DSL

Page 21: Cloud Camp Chicago Dec 2012 Slides

API Command-Line Clients

Excellent Demo Tool

• Quick installation, often one file
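As a rough illustration of how small a one-file command-line client can be, the sketch below maps a command such as `servers show 42` onto an HTTP method and path. Everything here is hypothetical: the `servers` resource, the `list`/`show`/`delete` actions and the `parseCommand` helper are invented for the example and do not describe any CohesiveFT API.

```javascript
// Map "<resource> <action> [id]" onto an HTTP method and path.
const ACTIONS = {
  list:   { method: 'GET',    needsId: false },
  show:   { method: 'GET',    needsId: true },
  delete: { method: 'DELETE', needsId: true }
};

function parseCommand(args) {
  const resource = args[0];
  const action = args[1];
  const id = args[2];
  const known = Object.prototype.hasOwnProperty.call(ACTIONS, action);
  if (!resource || !known) {
    throw new Error('usage: client <resource> <list|show|delete> [id]');
  }
  const spec = ACTIONS[action];
  if (spec.needsId && !id) {
    throw new Error(action + ' requires an id');
  }
  const path = '/' + resource + (spec.needsId ? '/' + encodeURIComponent(id) : '');
  return { method: spec.method, path: path };
}

// Example: what a user would type as "client servers show 42".
const request = parseCommand(['servers', 'show', '42']);
console.log(request.method + ' ' + request.path);
```

Because the mapping from command to request is a plain table, the same file doubles as a readable reference for developers integrating the API directly.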

Page 22: Cloud Camp Chicago Dec 2012 Slides

Bob Chojnacki, Programmer

Page 23: Cloud Camp Chicago Dec 2012 Slides

Big Data in the Cloud

A Journey into the unknown

Page 24: Cloud Camp Chicago Dec 2012 Slides

Who Jellyvision is and why analytics are important to us

• We create interactive experiences
  – Desktop
  – Mobile
• … which ask questions, inform people, generate leads
• “Virtual Advisors”
• We also collect analytics in real time to generate reports about:
  – How people answered a question
  – Where they dropped out
  – Lots of impressive stats!

Page 25: Cloud Camp Chicago Dec 2012 Slides

The Problem

• Longer term projects and high volume projects causing MySQL to burst at the seams
• Some types of reports taking too long, or causing MySQL to crash if we include too much data
• In all fairness, we could probably tune MySQL, throw it on bigger servers, more memory
• Diminishing returns
• MySQL is fine for collecting the data…

Page 26: Cloud Camp Chicago Dec 2012 Slides

The Solution

• Hadoop!
• Why Hadoop? Lots of possibilities out there, but which one to use? Cassandra, CouchDB, Hadoop, Membase, MongoDB, Neo4j, …
• Big Data meetups tended to have lots of people using Hadoop
• And I knew others using it.
• And Hortonworks had a fancy point and click solution I could use to get started quickly

Page 27: Cloud Camp Chicago Dec 2012 Slides

Options with options

• Now that I picked Hadoop, I had several options, and options within options, to use to analyze my data:
  – Hive, Pig, MapReduce, Java, R
• I knew Java
• MapReduce seemed to make sense
• I’ll probably play with Hive and Pig next
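For readers new to MapReduce, the shape of such a job can be shown in a few lines. The talk's jobs were written in Java against Hadoop; the sketch below is only an in-memory JavaScript illustration, and the event fields (`visit`, `question`, `answer`) and function names are assumptions made up for the example.

```javascript
// Hypothetical event records: one entry per answer a visitor gave.
const events = [
  { visit: 1, question: 'q1', answer: 'yes' },
  { visit: 2, question: 'q1', answer: 'no' },
  { visit: 3, question: 'q1', answer: 'yes' },
  { visit: 1, question: 'q2', answer: 'maybe' }
];

// Map: emit a (key, 1) pair for each event.
function mapEvent(event) {
  return [[event.question + ':' + event.answer, 1]];
}

// Reduce: add up all the values emitted for one key.
function reduceCounts(key, values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0);
}

// Tiny in-memory driver: map every record, group by key (the "shuffle"),
// then reduce each group. Hadoop does the same steps across many machines.
function mapReduce(records, mapFn, reduceFn) {
  const groups = new Map();
  records.forEach(function (record) {
    mapFn(record).forEach(function (pair) {
      const key = pair[0];
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(pair[1]);
    });
  });
  const output = {};
  groups.forEach(function (values, key) {
    output[key] = reduceFn(key, values);
  });
  return output;
}

const answerCounts = mapReduce(events, mapEvent, reduceCounts);
console.log(answerCounts);
```

This answers a question like "how did people answer each question" with one pass over denormalized event data, with no joins required.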

Page 28: Cloud Camp Chicago Dec 2012 Slides

It’s All About The Data

• Visit data
• Event data
• Denormalization of data
• Generated a ton of fake data:
  – Started with 600K visits, 3M events
  – Moved up to 1.8M visits, 60M events

Page 29: Cloud Camp Chicago Dec 2012 Slides

Make it so

• First experience: Hortonworks Virtual Sandbox
  – Single node AMI at Amazon
  – Hadoop 1.0
  – 600K visits, 3M events
• On our existing platform we needed to break reports up into smaller chunks for some data because MySQL could not handle it.
• Results! What would have taken hours took only 5 minutes on a single-node Hadoop "cluster"
• In reality, I could also run some of the queries with command-line tools (wc, grep, awk) on the data considerably faster than even Hadoop.
• Important lessons learned so far:
  – Think outside the RDBMS: they are great, but they may not make sense for all types of data

Page 30: Cloud Camp Chicago Dec 2012 Slides

Looking at more real data

• Now, let's generate data that is much closer to some of our products
• Instead of one question and answer, how about 15 questions? Adding in some other events gives a total of 34 events.
• Throw in some people returning, some of them multiple times
• Throw in some people who don't start the conversation, etc.
• Run my little auto-data-generator and BOOM! 20 million events and 4.4GB later I have my data…
• … which took up too much disk space to run on the demo system I was using. Might as well turbo-charge this puppy...
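The slides do not show the data generator, so the following is only a guess at its general shape. The event types, the 10% drop-out chance and the `generateVisit` name are all assumptions for illustration; the random source is passed in so that runs can be made repeatable.

```javascript
// Hypothetical sketch of a fake-event generator for a single visit.
// `random` is any function returning a number in [0, 1), e.g. Math.random.
function generateVisit(visitId, questionCount, random) {
  const events = [{ visit: visitId, type: 'start' }];
  for (let q = 1; q <= questionCount; q += 1) {
    // Assumed 10% chance of dropping out before each question.
    if (random() < 0.1) {
      events.push({ visit: visitId, type: 'dropout', question: q });
      return events;
    }
    const answer = random() < 0.5 ? 'A' : 'B';
    events.push({ visit: visitId, type: 'answer', question: q, answer: answer });
  }
  events.push({ visit: visitId, type: 'finish' });
  return events;
}

// Generate a few visits with 15 questions each and count the events.
let total = 0;
for (let visit = 1; visit <= 5; visit += 1) {
  total += generateVisit(visit, 15, Math.random).length;
}
console.log('events generated for 5 visits:', total);
```

A real generator would also model returning visitors and the extra non-question events the slide mentions; this sketch leaves those out.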

Page 31: Cloud Camp Chicago Dec 2012 Slides

More disk space!

• Full install of Hadoop (Hortonworks HDP)
• Single node
• 600K visits, 20M events
  – 6m 29s, ~30s after map phase completed
• 1.8M visits, 60M events
  – 18m 3s, ~90s after map phase completed
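A quick way to compare the two single-node runs is events processed per second. The sketch below only does arithmetic on the figures from this slide; treating "20M" and "60M" as exactly 20 and 60 million is an approximation.

```javascript
// Figures from the slide: event counts and single-node wall-clock times.
const runs = [
  { events: 20e6, seconds: 6 * 60 + 29 },   // 600K visits, 20M events
  { events: 60e6, seconds: 18 * 60 + 3 }    // 1.8M visits, 60M events
];

// Throughput, rounded to the nearest whole event per second.
function eventsPerSecond(run) {
  return Math.round(run.events / run.seconds);
}

runs.forEach(function (run) {
  console.log(run.events + ' events: ' + eventsPerSecond(run) + ' events/s');
});
```

Both runs come out at a little over 50,000 events per second, which suggests the job time grew roughly in proportion to the input on a single node.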

Page 32: Cloud Camp Chicago Dec 2012 Slides

More nodes

• 3 nodes: 11m
• 4 nodes: 9m 16s
• Yay! Nodes!
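The slide does not say which dataset the multi-node runs used. Assuming they used the larger one (1.8M visits, 60M events, 18m 3s on a single node), the speedup and per-node efficiency can be worked out as below; if that assumption is wrong, the numbers do not apply.

```javascript
// Assumption: the baseline is the single-node run on the 60M-event dataset.
const baselineSeconds = 18 * 60 + 3;

const clusterRuns = [
  { nodes: 3, seconds: 11 * 60 },
  { nodes: 4, seconds: 9 * 60 + 16 }
];

// Speedup: how many times faster than the single-node baseline.
function speedup(run) {
  return baselineSeconds / run.seconds;
}

// Efficiency: speedup divided by node count (1.0 would be perfect scaling).
function efficiency(run) {
  return speedup(run) / run.nodes;
}

clusterRuns.forEach(function (run) {
  console.log(
    run.nodes + ' nodes: speedup ' + speedup(run).toFixed(2) +
    'x, efficiency ' + (efficiency(run) * 100).toFixed(0) + '%'
  );
});
```

Under that assumption, adding nodes helps but well short of linearly, which fits the talk's own caveat that Hadoop was not being used to its fullest.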

Page 33: Cloud Camp Chicago Dec 2012 Slides

Caveats

• Not using Hadoop to its fullest / basically a weekend job
• Algorithms employed in this example probably won't end up in a book alongside Knuth’s

Page 34: Cloud Camp Chicago Dec 2012 Slides

Next steps

• Make sure results on real data line up
• Integrate with team to generate reports they need

Page 35: Cloud Camp Chicago Dec 2012 Slides

End stuff

• Thanks to the folks at Hortonworks who answered my frantic and spastic questions.

Page 36: Cloud Camp Chicago Dec 2012 Slides

Karl Zimmerman, President

Page 37: Cloud Camp Chicago Dec 2012 Slides

Keep Your Control.

Private Cloud with Karl Zimmerman, CEO of Steadfast.

Page 38: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: What do we mean?

Private cloud is a form of cloud computing where the customer has some control/ownership of the service implementation. It is a scalable, elastic IaaS solution based on cloud computing but with more control over resources.

Page 39: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: What are the advantages?

Security

Availability

No vendor lock-in

Ease of management

Page 40: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Security

Dedicated & segregated resources

More options to integrate with existing security

Page 41: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Availability

Understanding and control of the infrastructure

Get the resources you need, when you need them

You're not subject to the whims of other users

Page 42: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Vendor Lock-In

No "secret sauce."

Utilize true open source

Page 43: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: Management

Easier to find employees with general IT knowledge

Utilize a broader array of tools and software

Get support/assistance from multiple levels

Page 44: Cloud Camp Chicago Dec 2012 Slides

Private Cloud: To Summarize

Private cloud can deliver what you need from a public cloud while giving you more control. Worries about losing control over security and availability, along with issues like vendor lock-in and management, vanish into thin air like, well, a cloud. And the fact that it doesn’t have to cost you more is a plus, too.

Page 45: Cloud Camp Chicago Dec 2012 Slides

Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”

Emceed by: Mike Dorosh, Program Manager – Cloud Technical Partnerships, IBM & Patrick Kerpan, CEO, CohesiveFT

Page 46: Cloud Camp Chicago Dec 2012 Slides
