Big Data is not Rocket Science

© Cloudera, Inc. All rights reserved. Big Data is not Rocket Science. European Commission Workshop, December 9th 2014. Lars George | EMEA Chief Architect

Transcript of Big Data is not Rocket Science

Page 1: Big Data is not Rocket Science

Big Data is not Rocket Science
European Commission Workshop, December 9th 2014
Lars George | EMEA Chief Architect

Page 2: Big Data is not Rocket Science

“It took 51 years before hard disk drives reached the size of 1 TB. This happened in 2007. In 2009, the first hard drive with 2 TB of storage arrived. So while it took 51 years to reach the first terabyte, it took just two years to reach the second.” (Royal Pingdom)

Source: http://royal.pingdom.com/2010/02/18/amazing-facts-and-figures-about-the-evolution-of-hard-disk-drives/

Page 3: Big Data is not Rocket Science

Architectural Changes Trigger

• This is the third age of data processing
• We were always data-driven, but the scale has changed
• Today data is too fast, too varied, and too heavy to move around
• IT needs to follow this model and move faster

Page 4: Big Data is not Rocket Science

What is the Problem with Big Data?

• Big Data is a buzzword just as much as Cloud is
  • The definition is fuzzy but tries to describe a new piece of technology
• The important takeaway is that Big Data is turning data processing upside down
  • Load before Extract and Transform
• The current technology stack is not suited (yet) for data-centric architectures
• Current academic education is split into multiple, disconnected approaches
  • Science: Trains mathematicians, statisticians, engineers
  • Applied science: Trains polyglot, “generic” developers (coders)
  • Research: Develops new tools to store and process data
• None of the training helps to speed up the adoption of Big Data!
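
The "Load before Extract and Transform" point can be illustrated with a small sketch. It is not taken from the talk: the sample records, the in-memory `store`, and the function names are all hypothetical. The idea shown is that raw records are stored untouched first, and a schema is only applied later, when a specific question is asked of the data.

```python
import json

# Hypothetical raw events, as they might arrive from different sources.
RAW_EVENTS = [
    '{"user": "a", "amount": "12.50"}',
    '{"user": "b", "amount": "7"}',
    'not valid json',
    '{"user": "a", "amount": "3.25", "coupon": "X1"}',
]


def load(raw_lines):
    """Load step: keep every record exactly as received, valid or not."""
    return list(raw_lines)


def extract_and_transform(store):
    """Extract and transform at read time; the schema is applied here."""
    totals = {}
    for line in store:
        try:
            record = json.loads(line)
            user = record["user"]
            amount = float(record["amount"])
        except (ValueError, KeyError, TypeError):
            continue  # skip records this particular query cannot interpret
        totals[user] = totals.get(user, 0.0) + amount
    return totals


store = load(RAW_EVENTS)               # nothing is rejected on the way in
totals = extract_and_transform(store)  # interpretation happens afterwards
print(totals)
```

Because the malformed line is still in `store`, a later query with different rules could still make use of it, which a transform-first pipeline would typically have discarded at ingestion.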

Page 5: Big Data is not Rocket Science

DaaS is not PaaS

• There is a gap between PaaS and being successful with Big Data as a Service (DaaS).
• Big Data engineers need to fill this gap for the time being
• The future will bring building blocks to build data applications

➢ For now there is nothing to simplify the technology for users!

Page 6: Big Data is not Rocket Science

Big Data Engineering

Engineering Task:
• Build reliable, automated, scalable, managed, and governed data processing pipelines.
• Apply all existing knowledge smartly
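
As a rough illustration of two of the properties named above, "reliable" and "governed", the following sketch runs pipeline stages in order, retries a failing stage, and records an audit trail. Everything in it is an assumption made for illustration: `run_pipeline`, the stage functions, and the retry policy are hypothetical and are not part of the talk or of any particular framework.

```python
def run_pipeline(stages, data, max_retries=2):
    """Run stages in order, retrying failures and recording an audit trail."""
    audit_log = []
    for stage in stages:
        for attempt in range(1, max_retries + 2):
            try:
                data = stage(data)
                audit_log.append((stage.__name__, attempt, "ok"))
                break
            except Exception as exc:
                audit_log.append((stage.__name__, attempt, f"failed: {exc}"))
                if attempt == max_retries + 1:
                    raise  # retries exhausted, surface the error
    return data, audit_log


# Hypothetical stages; flaky_filter fails once to simulate a transient error.
_calls = {"count": 0}


def parse(lines):
    return [int(x) for x in lines]


def flaky_filter(values):
    _calls["count"] += 1
    if _calls["count"] == 1:
        raise RuntimeError("transient failure")
    return [v for v in values if v >= 0]


def total(values):
    return sum(values)


result, log = run_pipeline([parse, flaky_filter, total], ["1", "-2", "3"])
print(result)
for entry in log:
    print(entry)
```

A production pipeline would delegate scheduling, scaling, and monitoring to dedicated tooling; the sketch only shows the shape of the retry and audit logic.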

Page 7: Big Data is not Rocket Science

Job Requirements

Question: What are the requirements for a Big Data engineer?

IT systems are built with various layers to handle specific tasks.

There are distinct sections that can be assigned to differently trained people.

[Diagram: IT system layers labeled Operating System, Hardware, Application Software, OSI Model]

Page 8: Big Data is not Rocket Science

Job Requirements

• Developer, DBA, etc.
• System Administrator
• Network Engineer
• Datacenter Technician
• Building Facilities

[Diagram: IT system layers labeled Operating System, Hardware, Application Software, OSI Model]

Page 9: Big Data is not Rocket Science

Job Requirements

The problem is that Big Data needs all of these skills combined: DevOps!

This is a big issue, as it requires change and training on every level.

This is mostly an organisational challenge, not a technology one.

[Diagram: IT system layers labeled Operating System, Hardware, Application Software, OSI Model, with the added label "Big Data Skills"]

Page 10: Big Data is not Rocket Science

Summary

• The biggest issue is that the technology is not complete yet but at the same time requires a complete vertical adoption within the IT department
• There is a requirement to train and educate the existing and new IT and science professionals
• Action Items:
  • Combine existing educational material to reflect new challenges
  • Train staff to understand challenges concerning their responsibilities
  • Develop new middleware that makes adoption of the platform easier

Page 11: Big Data is not Rocket Science

Thank you
Lars George | EMEA Chief Architect
@larsgeorge