The Technology Evaluator’s Cheat Sheets: Business Intelligence & Analytics

Page 1: TheTechnologyEvaluator’s CheatSheetspages.sisense.com/.../images/technology-evaluators-cheat-sheets.pdf · operang*system*( JeOS)*for*itto* runopmally * on industrystandardhardware

The Technology Evaluator’s Cheat Sheets

Business Intelligence & Analytics

WWW.SISENSE.COM  

Summary  

• Software Stacks
  – Full Stacks (DB + ETL Tools + Front-End Software)
  – Back-End Stacks (DB and/or ETL Tools Only)
  – Front-End Stacks (Front-End Software Only)

• Technologies
  – Data Warehouse Class (“Big Scale”)
  – Data Mart Class (“Small Scale”)

Software Stacks

[Diagram: the three software stacks and their components. Box labels: DW, ETL, Query/Import]

• BACK-END STACK: ETL Features, Data Warehouse Features, Data Mart Features
• FRONT-END STACK: Data Visualization Features, Data Analysis & Discovery Features
• FULL-STACK: back-end plus front-end

The Full Stack. When?

• Centralized data management and storage
  – To deliver a single version of critical data
  – To make data easier for non-techies to access, query and share
  – To simplify on-going or ad-hoc data management tasks

• ETL Functionality Is Needed
  – Multiple data sources, or multiple tables where views are too complex/slow
  – The volume of data is expected to cause slow performance
  – Data needs to be restructured before being delivered to users
  – Data is dirty (entry errors, value mismatches)
  – Required metrics are in different tables or sources

• To protect the operational systems from rogue queries
• To access non-queryable data sources
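
As a rough illustration of the ETL cases listed above (multiple sources, dirty values, metrics split across tables), the following Python sketch consolidates two sources into one clean table. The source layouts, the field names and the `COUNTRY_ALIASES` mapping are all assumptions made for the example; they are not taken from the slides or from any specific product.

```python
# Hypothetical sample data: two sources holding different facts about the same customers.
crm_rows = [
    {"customer_id": "1", "country": "USA"},
    {"customer_id": "2", "country": "U.S."},
    {"customer_id": "3", "country": "germany"},
    {"customer_id": "", "country": "France"},  # entry error: missing key
]
billing_rows = [
    {"customer_id": "1", "revenue": "100.50"},
    {"customer_id": "2", "revenue": "20"},
    {"customer_id": "3", "revenue": "n/a"},  # entry error: not a number
]

# Assumed mapping used to reconcile value mismatches between sources.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "germany": "Germany",
    "france": "France",
}


def clean_country(raw):
    """Map spelling variants of a country to one canonical value."""
    return COUNTRY_ALIASES.get(raw.strip().lower(), raw.strip())


def parse_revenue(raw):
    """Return the revenue as a float, or None when the entry is not numeric."""
    try:
        return float(raw)
    except ValueError:
        return None


def consolidate(crm, billing):
    """Extract both sources, clean them, and join them on customer_id."""
    revenue_by_id = {row["customer_id"]: parse_revenue(row["revenue"]) for row in billing}
    result = []
    for row in crm:
        customer_id = row["customer_id"].strip()
        if not customer_id:
            continue  # drop rows that cannot be joined
        result.append({
            "customer_id": customer_id,
            "country": clean_country(row["country"]),
            "revenue": revenue_by_id.get(customer_id),
        })
    return result


clean_table = consolidate(crm_rows, billing_rows)
```

Doing this work once, in a back-end step, is what lets front-end users query a single restructured table instead of repeating the clean-up in every report.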

Full Stack: Traditional Architectures

[Diagram: two traditional full-stack architectures, each connecting Data Sources (IT Department) to End-Users (Business)]

• Data Warehouse + Data Marts: Data Sources, ETL / Mash-up, Data Warehouse (DW), OLAP Cubes or In-Memory Marts, Front-end Tools
• Data Extracts (No DW): Data Sources, ETL / Mash-up, In-Memory Marts, Excel/CSV, Front-end Tools

Data Warehouse: Pros & Cons

                              DW + Data Marts      Data Extracts (No DW)
Approach                      Solution-oriented    Project-specific
Data Quality & Accuracy       Higher               Lower
Scalability                   Higher               Lower
Single Version of the Truth   Yes                  No
Initial Investment            Higher               Lower
Level of Detail               Summarized           Granular
Owner                         IT                   IT or Business (optional)
Implementation Time           Longer               Shorter
Technical Complexity          Higher               Lower

Legend: Advantage / Disadvantage

Technologies  In  The  Space  

Backend Technologies

• Data Mart-Class, we call it “Small Scale”
  – Online Analytical Processing (OLAP)
  – In-Memory Databases (IMDB)

• Data Warehouse-Class, we call it “Big Scale”
  – Database Software Appliances
  – Database Computer Appliances
  – Distributed Databases

Small Scale. When?

• When there is only a single data source, which means the data doesn’t need to be consolidated (ETL) prior to being delivered for business analytics

• When there aren’t many different attributes and metrics to cross-reference (the Data Mart doesn’t need to have many fields)

• For a one-time project (e.g. one dashboard), with no added requirements, new data sources or other changes expected in the future

Big Scale. When?

• For a single centralized data store to serve multiple users and multiple business scenarios (single version of the truth)

• When data volumes are large, are rapidly growing or may unpredictably spike

                                    Big Scale                Small Scale
Max. Data Mart Size                 Terabytes to Petabytes   Gigabytes
Max. Number of Fields (1 mart)      Practically Unlimited    Limited
Max. Number of Records (1 table)    Billions                 Millions
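
The small-scale and big-scale criteria can be read as a simple checklist. The sketch below encodes them in Python, purely as an illustration. The `Workload` fields and the exact cut-offs (`TERABYTE`, `BILLION`) are assumptions, since the slides only give orders of magnitude, and a real evaluation would weigh more factors than these.

```python
from dataclasses import dataclass

# Assumed cut-offs: the slides only give orders of magnitude
# (gigabytes vs. terabytes to petabytes, millions vs. billions of records).
TERABYTE = 10**12
BILLION = 10**9


@dataclass
class Workload:
    data_size_bytes: int        # expected size of the data mart
    largest_table_rows: int     # record count of the largest table
    data_sources: int           # number of sources to consolidate
    shared_central_store: bool  # must serve many users and business scenarios
    fast_or_spiky_growth: bool  # volumes growing rapidly or unpredictably


def suggest_scale(w):
    """Return 'Big Scale' or 'Small Scale' using the checklist criteria."""
    needs_big = (
        w.data_size_bytes >= TERABYTE
        or w.largest_table_rows >= BILLION
        or w.data_sources > 1  # small scale assumes a single source
        or w.shared_central_store
        or w.fast_or_spiky_growth
    )
    return "Big Scale" if needs_big else "Small Scale"


# Hypothetical workloads for illustration.
dashboard = Workload(5 * 10**9, 2_000_000, 1, False, False)
warehouse = Workload(3 * TERABYTE, 4 * BILLION, 6, True, True)
```

Here `suggest_scale(dashboard)` falls in the small-scale class and `suggest_scale(warehouse)` in the big-scale class; any single big-scale criterion is treated as enough to tip the decision, which is a deliberately conservative choice.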

 

Data Mart-Class Technologies (“Small Scale”)

In-Memory Databases (IMDB)

• Achieves fast performance by loading the entire data mart into RAM, thus avoiding slow disk reads (“I/O bottlenecks”)

• Categorized as “Small Scale” because the size of the data mart is effectively limited by the size of RAM, placing it in the gigabyte scale category

• In some IMDB technologies, RAM consumption is also drastically affected by concurrent use.
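
The RAM limit described above can be sketched as a back-of-the-envelope check. The function below is only an estimate under stated assumptions: `per_user_overhead` (extra RAM per concurrent user, as a fraction of the mart size) and `headroom` (share of RAM kept free for the operating system) are hypothetical parameters, and real products differ widely in how concurrency affects memory.

```python
GIB = 1024**3


def fits_in_ram(mart_bytes, ram_bytes, concurrent_users=1,
                per_user_overhead=0.0, headroom=0.2):
    """Estimate whether an in-memory data mart fits in RAM.

    per_user_overhead: assumed extra RAM per concurrent user,
        as a fraction of the mart size.
    headroom: assumed fraction of RAM reserved for the OS and other processes.
    """
    required = mart_bytes * (1 + per_user_overhead * concurrent_users)
    available = ram_bytes * (1 - headroom)
    return required <= available


# A 20 GiB mart on a 64 GiB server, with and without concurrency overhead.
single_user_ok = fits_in_ram(20 * GIB, 64 * GIB)
many_users_ok = fits_in_ram(20 * GIB, 64 * GIB,
                            concurrent_users=20, per_user_overhead=0.1)
```

With these assumed numbers the mart fits for a single user but not for twenty concurrent users, which is the effect the last bullet warns about.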

Online Analytical Processing (OLAP)

• Achieves fast performance by pre-calculating metrics (field aggregations) for all sets and subsets of unique values in all dimensions (fields) ‘over-night’. This avoids performing these slow operations in real-time during the work-day.

• Categorized as “Small Scale” because storing the results of these pre-calculations (“The Cube”) takes exponentially more storage resources than the actual raw data does, limiting the actual size of raw data that can make up a cube to GB scale.

• The query engines behind most OLAP technologies are based on RDBMS technology, whose own scale and performance limitations OLAP cannot overcome (e.g. joining, grouping).
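
To make the pre-calculation idea concrete, here is a minimal Python sketch that builds a full cube by summing one metric for every subset of dimensions. The sample rows, the dimension names and the `build_cube` helper are hypothetical, and real OLAP engines use far more elaborate storage; the point is only to show why the number of stored aggregates grows so quickly.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: three dimensions and one metric.
rows = [
    {"region": "EU", "product": "A", "year": 2012, "sales": 10},
    {"region": "EU", "product": "B", "year": 2012, "sales": 5},
    {"region": "US", "product": "A", "year": 2013, "sales": 7},
    {"region": "US", "product": "B", "year": 2013, "sales": 3},
]
dimensions = ["region", "product", "year"]


def build_cube(rows, dimensions, metric):
    """Pre-calculate SUM(metric) for every subset of dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[metric]
            cube[dims] = dict(totals)
    return cube


cube = build_cube(rows, dimensions, "sales")

# Queries become dictionary lookups instead of scans over the raw rows.
total_sales = cube[()][()]
eu_sales = cube[("region",)][("EU",)]

# Number of aggregate cells stored across all group-bys.
stored_cells = sum(len(group) for group in cube.values())
```

With n dimensions there are 2^n group-bys to store (8 for the three dimensions here), and each one can hold as many cells as there are distinct value combinations. That growth is the storage cost the second bullet describes.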

Data Warehouse-Class Technologies (“Big Scale”)

Software Appliances

A software appliance is a software application that might be combined with just enough operating system (JeOS) for it to run optimally on industry standard hardware (typically a server) or in a virtual machine.

Computer Appliances

A computer appliance is generally a separate and discrete hardware device with integrated software (firmware), specifically designed to provide a specific computing resource. Computer appliances are generally not designed to allow the customers to change the software, or to flexibly reconfigure the hardware.

Distributed Databases

A distributed database may be stored in multiple computers, located in the same physical location; or may be dispersed over a network of interconnected computers. A distributed database system consists of loosely-coupled sites that share no physical components (such as disk, RAM and CPU).
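
One common way to spread data over sites that share nothing is hash partitioning, combined with a scatter-gather query. The Python sketch below simulates it in a single process. The slides do not say how any particular product distributes data, so the partitioning scheme, the function names and the sample rows are all assumptions made for illustration.

```python
import zlib
from collections import Counter


def node_for(key, node_count):
    """Pick a node by hashing the key (a stable hash, so placement is repeatable)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(rows, node_count):
    """Partition rows across nodes; each node holds only its own rows."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row["customer"], node_count)].append(row)
    return nodes


def local_totals(node_rows):
    """Work done independently on one node, using only its local data."""
    totals = Counter()
    for row in node_rows:
        totals[row["customer"]] += row["amount"]
    return totals


def query_totals(nodes):
    """Scatter the aggregation to every node, then merge the partial results."""
    merged = Counter()
    for node_rows in nodes:
        merged.update(local_totals(node_rows))
    return merged


# Hypothetical sample rows.
rows = [
    {"customer": "acme", "amount": 10},
    {"customer": "globex", "amount": 4},
    {"customer": "acme", "amount": 6},
    {"customer": "initech", "amount": 1},
]
nodes = distribute(rows, node_count=3)
totals = query_totals(nodes)
```

Because every row for a given customer lands on the same node, each node can aggregate without consulting the others, and capacity grows by adding nodes rather than by enlarging one server.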

Big Scale Technologies, Compared

                    Software Appliance   Computer Appliance   Distributed Databases
Hardware Class      Commodity            Proprietary          Commodity
Best Architecture   1 Server             1 Server             N Servers
Capacity            Terabytes            Terabytes            Petabytes
Hardware Cost       4-5 figures          6-7 figures          5-6 figures

Full-Stack Vendors

           ETL                   Software Appliance     Hardware Appliance    OLAP                       IMDB                         In-Chip
SiSense    ElastiCube Manager    ElastiCube (Columnar)  -                     -                          -                            ElastiCube (Columnar)
Microsoft  SSIS                  SQL Server (RDBMS)     -                     Analysis Services          PowerPivot                   -
Oracle     Oracle ETL Features   Oracle DB (RDBMS)      ExaData               Hyperion                   Exalytics                    -
IBM        InfoSphere DataStage  DB2 (RDBMS)            Netezza               Cognos                     Cognos                       -
SAP        NetWeaver BW ETL      -                      HANA Columnar In-Mem  BW Business Objects        HANA Columnar In-Mem         -
Microstr.  MSTR ETL Features     -                      -                     Microstrategy In-Mem OLAP  Microstrategy In-Mem OLAP    -
QlikView   QlikView Expressor    -                      -                     -                          QlikView Associative In-Mem  -

Thank  You!  Visit  us  at  www.sisense.com