Running a Virtualized Splunk Enterprise Infrastructure

Ted Knudsen, Co-Founder and Engineering Manager, Message Bus
#splunkconf
Copyright © 2013 Splunk Inc.

Page 1: Running a Virtualized Splunk Enterprise Infrastructure

Ted Knudsen, Co-Founder and Engineering Manager, Message Bus
#splunkconf

Copyright © 2013 Splunk Inc.

Page 2: Agenda

- Message Bus Platform
- Running a platform in the cloud
- Every day is Splunk day
- Splunk architecture
- How it all works together
- Future plans
- Q & A

Page 3: Message Bus Platform

Provides email delivery at scale, done right

Page 4: Message Bus Platform

- API driven
- Cost effective
- No MTA required to send email
- Scalable (> 1000 mps)
- 100% cloud native
- DMARC compliant security

Page 5: Message Bus Platform

SDKs available in 6 languages:

- PHP
- Ruby
- Node.js
- Python
- C#
- Java

[Screenshot]

Page 6: Message Bus Platform

Technologies

- Queuing: HornetQ
- Deployment: Chef (hosted), Artifactory
- Monitoring: Splunk, Nagios, collectd, PagerDuty
- Source Control: GitHub
- OS: CentOS 6.2
- Languages: Scala (JVM 1.6 & 1.7), Node.js, Ruby
- Database: MySQL (Amazon RDS), Google BigQuery
- Caching: Redis (2.4, 2.6)

Page 7: Running Platform in the Cloud

Cloud Native

Message Bus started with cloud-only infrastructure. The caveat is that you have to make some assumptions when running a platform in the cloud:

- You can't assume reliable performance
- Build failure into everything
- Server problems are solved by building a new server
- Size of the server can help ensure better performance
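The advice to build failure into everything can be illustrated with a small retry helper. This is only a sketch under stated assumptions: the function name `call_with_retries`, its parameters, and the backoff schedule are hypothetical and do not come from the talk. It shows one common way to tolerate unreliable cloud performance before giving up and rebuilding the server.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff on any exception.

    Re-raises the last exception once all attempts are used up, so the
    caller can decide to replace the server instead of retrying forever.
    All names and defaults here are illustrative assumptions.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:  # narrow this to expected errors in real code
            last_error = error
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.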

Page 8: Running Platform in the Cloud

Which Cloud Provider?

Choose the provider that works best for you:

- Do they provide a specific solution for your needs?
- Do they have the capacity you need?
- Does their platform have a robust API for automation?
- What kind of pricing discounts do they offer?

Don't be afraid to use more than one provider.

Page 9: Running Platform in the Cloud: Cloud Providers

- Google Cloud Platform: BigQuery, Cloud Storage
- Rackspace: testing, monitoring, continuous integration
- Joyent: API, message sending, api.messagebus.com
- Amazon: federated services, reporting, global account information, RDS

Page 10: Running Platform in the Cloud: Cloud Providers Strengths

- Google Cloud Platform
  - BigQuery scaling is amazing
  - Cloud Storage is cheap and easy
- Rackspace
  - Customer service
- Joyent
  - IPv4 address block (/20)
  - Vyatta for SNAT
  - Custom networking
- Amazon
  - Industry leader
  - Service available for almost any need

Page 11: Running Platform in the Cloud

Where does Splunk fit in?

The Message Bus platform currently requires 80 to 90 servers per cluster. Analyzing log data and monitoring that many servers can only be done effectively with Splunk.

We started with Splunk from the very beginning; the engineering team formats logging with Splunk in mind.
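The slides do not show what "formatting logging with Splunk in mind" looked like at Message Bus. One widely used approach is to emit a timestamp followed by key=value pairs, which Splunk can discover as fields at search time. The sketch below assumes that approach; the function `format_event` and the field names are hypothetical.

```python
from datetime import datetime, timezone


def format_event(timestamp, **fields):
    """Render one log line as an ISO 8601 timestamp plus key=value pairs."""
    parts = [timestamp.isoformat()]
    for key, value in fields.items():
        text = str(value)
        # Quote empty values and values with spaces so each pair stays one token.
        if " " in text or text == "":
            text = '"' + text.replace('"', "'") + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


# Hypothetical field names, chosen only for illustration.
when = datetime(2013, 10, 1, 12, 0, 0, tzinfo=timezone.utc)
print(format_event(when, component="mta", event="delivered", duration_ms=42))
# prints: 2013-10-01T12:00:00+00:00 component=mta event=delivered duration_ms=42
```

Keeping every line in the same shape across components makes it possible to search and chart any field without writing custom extractions for each service.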

Page 12: Every Day is Splunk Day

How Message Bus uses Splunk

Daily operations done with Splunk:

- Message volume validation
- Production troubleshooting
- Data validation
- Monitoring validation
- Customer support
- Development/testing

Page 13: Every Day is Splunk Day

Production monitoring

The engineering team deploys updates and new features at least once per week. Splunk is used to monitor and analyze components before and after deployments.
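The before-and-after comparison can be pictured as checking one metric in equal time windows on either side of a deployment. The talk does not describe the actual searches, so the sketch below is an assumption throughout: the event shape, the `error_rate` and `deployment_regressed` helpers, and the tolerance are all hypothetical.

```python
def error_rate(events, start, end):
    """Fraction of events in [start, end) whose level is ERROR."""
    window = [e for e in events if start <= e["time"] < end]
    if not window:
        return 0.0
    errors = sum(1 for e in window if e["level"] == "ERROR")
    return errors / len(window)


def deployment_regressed(events, deploy_time, window, tolerance=0.01):
    """True if the error rate after the deploy exceeds the rate before it
    by more than the tolerance. Times are plain numbers (e.g. epoch seconds).
    """
    before = error_rate(events, deploy_time - window, deploy_time)
    after = error_rate(events, deploy_time, deploy_time + window)
    return after - before > tolerance
```

In practice the same comparison would be expressed as a Splunk search over the relevant time ranges; the Python version only shows the logic.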

Page 14: Splunk Architecture

Version and Layout

Version: Splunk Enterprise 5.0.2

- Forwarders on every server
- Indexers
- Search heads
- Receivers

Page 15: Splunk Architecture

Receivers

The initial deployment had only forwarders and indexers. As cluster size grew, we found that this had some drawbacks.

If an indexer goes offline, all forwarders need to be updated with Chef. This can take a while, depending on the number of servers and the level of automation.

We implemented the receivers so that the cluster components are unaware of the state of the indexers or the number of indexers currently running.
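The slides do not say how the receivers were built, so the following is a conceptual model only, not Splunk code. It assumes a receiver that tracks which indexers are online and spreads events across them round-robin; the class names `Receiver` and `Forwarder` and the indexer names are hypothetical. The point it illustrates is that a forwarder holds a reference to the receiver only, so an indexer going offline requires no change on the forwarder side.

```python
import itertools


class Receiver:
    """Hypothetical stand-in for the receiver tier between forwarders and indexers."""

    def __init__(self, indexers):
        self.online = {name: True for name in indexers}
        self._cycle = itertools.cycle(indexers)

    def mark_offline(self, name):
        self.online[name] = False

    def route(self, event):
        """Return the indexer chosen for this event, skipping offline ones."""
        for _ in range(len(self.online)):
            candidate = next(self._cycle)
            if self.online[candidate]:
                return candidate
        raise RuntimeError("no indexer available")


class Forwarder:
    """A forwarder knows only the receiver, never the list of indexers."""

    def __init__(self, receiver):
        self.receiver = receiver

    def send(self, event):
        return self.receiver.route(event)


receiver = Receiver(["idx1", "idx2", "idx3", "idx4"])
forwarder = Forwarder(receiver)
receiver.mark_offline("idx4")  # the forwarder is not reconfigured
targets = [forwarder.send(f"event {i}") for i in range(6)]
print(targets)
```

With 80 to 90 forwarders per cluster, moving the indexer list into a small receiver tier replaces a cluster-wide configuration push with a change in one place.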

Page 16: Splunk Architecture: Volume and Server Specifications

Splunk servers:

- 8 CPUs
- 8 GB memory
- 250 GB data volumes
- 6 indexers per data center
- 12 indexers total, 2 search heads

Volume and events:

- 10 million emails/day
- 400 million log lines
- 200 GB/day
- 2 data centers
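A rough back-of-the-envelope calculation helps relate these figures to each other. It rests on several assumptions that the slide does not state: the 400 million log lines are per day, the 200 GB/day is the total across both data centers, load is spread evenly across indexers, a gigabyte is 10^9 bytes, and on-disk size equals raw size (Splunk's compression and index overhead are ignored).

```python
# Figures from the slide.
EMAILS_PER_DAY = 10_000_000
LOG_LINES_PER_DAY = 400_000_000   # assumed to be a daily figure
INGEST_GB_PER_DAY = 200           # assumed to be the total for both data centers
DATA_CENTERS = 2
INDEXERS_PER_DATA_CENTER = 6
VOLUME_GB_PER_INDEXER = 250

indexers = DATA_CENTERS * INDEXERS_PER_DATA_CENTER
lines_per_email = LOG_LINES_PER_DAY / EMAILS_PER_DAY
bytes_per_line = INGEST_GB_PER_DAY * 10**9 / LOG_LINES_PER_DAY
gb_per_indexer_per_day = INGEST_GB_PER_DAY / indexers
# Days of raw data one volume could hold, ignoring compression and overhead.
days_until_volume_full = VOLUME_GB_PER_INDEXER / gb_per_indexer_per_day

print(f"indexers: {indexers}")
print(f"log lines per email: {lines_per_email:.0f}")
print(f"average bytes per log line: {bytes_per_line:.0f}")
print(f"GB per indexer per day: {gb_per_indexer_per_day:.1f}")
print(f"days until a data volume fills: {days_until_volume_full:.0f}")
```

Under these assumptions each email produces about 40 log lines of roughly 500 bytes, and each indexer takes in about 17 GB per day, which would fill a 250 GB volume in about 15 days.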

Page 17: Splunk Infrastructure

[Diagram: Splunk Receivers; four Indexers, one of them offline; Search Head]

Page 18: How it All Works Together

The Message Bus platform runs on three cloud providers in multiple data centers. A single message touches all three providers:

- Joyent (east and west)
- Amazon
- Google

Page 19: How it All Works Together

[Diagram: API Clients; DNS Load Balanced; Messaging Clusters in Joyent West and Joyent East; MTA; AWS; Google Cloud Platform]

Page 20: Future Plans

Splunk summary indexes for key data points from each cluster and cloud provider.

Forward key performance data to a “global” Splunk instance, which will allow high-level analysis by cloud provider. If further detail is required, go to that cluster for detailed analysis using the local Splunk.
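In Splunk, summary indexing is done with scheduled searches, and the talk gives no detail on the searches planned. The sketch below only illustrates the shape of the rollup in Python: detailed events stay in each cluster, and a much smaller set of per-hour counts is what would be forwarded to the global instance. The event fields, cluster names, and the `summarize` and `to_summary_lines` helpers are hypothetical.

```python
from collections import Counter


def summarize(events, bucket_seconds=3600):
    """Count events per (cluster, time bucket, event type)."""
    counts = Counter()
    for event in events:
        # Round the timestamp down to the start of its bucket.
        bucket = event["time"] - event["time"] % bucket_seconds
        counts[(event["cluster"], bucket, event["event"])] += 1
    return counts


def to_summary_lines(counts):
    """Render each count as a key=value line suitable for forwarding."""
    return [
        f"time={bucket} cluster={cluster} event={name} count={count}"
        for (cluster, bucket, name), count in sorted(counts.items())
    ]


events = [
    {"time": 10, "cluster": "joyent-east", "event": "delivered"},
    {"time": 20, "cluster": "joyent-east", "event": "delivered"},
    {"time": 3700, "cluster": "joyent-west", "event": "bounced"},
]
for line in to_summary_lines(summarize(events)):
    print(line)
```

The design choice is the same one the slide describes: the global view answers high-level questions cheaply, and the local Splunk remains the place for detailed analysis.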

   

Page 21: Future Plans: Multi-cloud Implementation

[Diagram: DNS Load Balanced; Messaging Clusters in Joyent West and Joyent East, each with a Splunk instance; MTA; AWS; Global Splunk]

Forward KPI summary data and key errors to master Splunk for global reporting.

Page 22: Summary

- Running in the Cloud: build components and services to handle the unique qualities of cloud servers. Think “cloud native.”
- Cloud Providers: each one has strengths and weaknesses. Choose the one that best suits your needs. Don't be afraid to use multiple vendors.
- Splunk in the Cloud: it works. Plan around the uncertainties of the cloud and you will be successful.

   

Page 23: Questions

Ted Knudsen
[email protected]
www.messagebus.com
github.com/messagebus (SDKs)

- Co-founder of Message Bus in Oct 2010
- Enterprise software since 1998
- Splunk user since Jan 2011
- Presented at .conf2012: “Using Splunk for just about everything”

Page 24: THANK YOU