Messaging Services Update - mig.web.cern.ch IT-PES 20-05-2014 - Messagin… · 3rd era: “service...

Messaging Services Update Lionel Cons IT/SDC 20 May 2014


Messaging Services Update

Lionel Cons IT/SDC

20 May 2014


Messaging Services @ CERN

§  1st era: “by the Grid for the Grid”
    §  only EGEE (then EGI) services
    §  primitive management and monitoring

§  2nd era: “service diversification”
    §  non-Grid services added (managed by IT/PES)
    §  improved management and monitoring

§  3rd era: “service consolidation”
    §  no more EGI services
    §  integrated management and monitoring


Messaging Services @ CERN

§  1st era: “by the Grid for the Grid”
    §  ActiveMQ + Java6 + SL5 + physical machines
    §  Quattor + Nagios + YAIM

§  2nd era: “service diversification”
    §  ActiveMQ + Apollo + RabbitMQ
    §  Java6 + Java7 (+ Erlang)
    §  SL5 + SL6
    §  physical machines + virtual machines
    §  Quattor + Nagios + MBSD (and friends)

§  3rd era: “service consolidation”
    §  ActiveMQ + Java7 + SL6 + physical machines
    §  Puppet + Agile Monitoring + MBSD (and friends)


ActiveMQ vs. Apollo

§  Apollo is ActiveMQ rewritten from scratch, by the ActiveMQ experts themselves

§  Very fruitful collaboration in the OpenLab spirit: “you make it, we break it”

§  Red Hat buying ActiveMQ added delays

§  Some Apollo features have been backported to ActiveMQ:
    §  LevelDB message store
    §  runtime configuration changes


Physical vs. Virtual Machines

§  Brokers are heavy disk I/O users
    §  to comply with JMS, persistent messages must be “persisted to disk” (i.e. sync’ed to disk)

§  Most of the problems that have affected the messaging services were caused by the use of virtual machines

§  What do we gain by using virtual machines?
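
The cost of “persisted to disk” can be illustrated with a minimal Python sketch. This is not how ActiveMQ's message store is implemented; the `persist` and `read_all` helpers, the journal file name and the length-prefixed layout are all hypothetical. It only shows the difference between handing data to the operating system and waiting for `os.fsync` before acknowledging a message.

```python
import os
import tempfile
import time
from typing import List


def persist(path: str, message: bytes, sync: bool = True) -> float:
    """Append one message to a journal file and return the elapsed seconds.

    With sync=True the call returns only after os.fsync() has asked the
    operating system to flush the file's data to the storage device.
    """
    start = time.perf_counter()
    with open(path, "ab") as journal:
        journal.write(len(message).to_bytes(4, "big"))  # 4-byte length prefix
        journal.write(message)
        journal.flush()                 # Python buffer -> OS page cache
        if sync:
            os.fsync(journal.fileno())  # OS page cache -> storage device
    return time.perf_counter() - start


def read_all(path: str) -> List[bytes]:
    """Read back every length-prefixed message from the journal."""
    messages = []
    with open(path, "rb") as journal:
        while True:
            header = journal.read(4)
            if len(header) < 4:
                break
            size = int.from_bytes(header, "big")
            messages.append(journal.read(size))
    return messages


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "journal.log")  # hypothetical journal file
        payload = b"x" * 1024
        synced = sum(persist(path, payload, sync=True) for _ in range(100))
        unsynced = sum(persist(path, payload, sync=False) for _ in range(100))
        print(f"synced: {synced:.4f}s  unsynced: {unsynced:.4f}s")
```

Running the script on different hosts gives a rough feel for how much the sync step depends on the underlying storage. The timings are only indicative.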


Java Application Stack

[Diagram: layered stack, top to bottom: Java Application, Java Virtual Machine, Operating System, hardware (CPU, RAM, disk, NIC)]


ActiveMQ on a Virtual Machine

[Diagram: layered stack. Hardware: CPU, RAM, disk, NIC, plus NAS. Above it: the host Operating System, running the Virtual Machine Application next to Other Apps. Inside the virtual machine: OS, JVM, ActiveMQ. Also shown: Other VMs, Apps.]


Multiple ActiveMQ on a Physical Machine

[Diagram: layered stack. Hardware: CPU, RAM, disk, NIC. Above it: the Operating System. On top: three JVMs, each running ActiveMQ, next to Other Apps.]


Management Consolidation

§  Most of the software we need comes from SL + EPEL
    §  we publish 14 packages in EPEL

§  All the rest is built using CERN’s Koji

§  All our machines will be Puppet-managed
    §  this is work-in-progress

§  We are using MBSD which can be seen as LANDB at the messaging level


Monitoring Consolidation

§  We will stop using Nagios

§  We will integrate as much as possible with the “Agile Monitoring” offering
    §  we however have specific needs such as aggregation-based alerts or per-client alerting

§  The final solution will be fully integrated with GNI and Roger

§  We are investigating how much can be done by Esper
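
To make the two specific needs concrete, here is a minimal sketch. The `Sample` record, the thresholds and both function names are hypothetical and are not part of Agile Monitoring, GNI, Roger or Esper. An aggregation-based alert fires on a value computed across all brokers, whereas per-client alerting evaluates each client on its own.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Sample(NamedTuple):
    """One hypothetical measurement taken from a broker."""
    broker: str   # broker host name
    client: str   # client identifier
    pending: int  # messages waiting for this client on this broker


def aggregate_alert(samples: Iterable[Sample], max_total: int) -> bool:
    """Fire when pending messages summed over all brokers exceed max_total."""
    return sum(s.pending for s in samples) > max_total


def per_client_alerts(samples: Iterable[Sample], max_per_client: int) -> List[str]:
    """Return the sorted clients whose own total backlog exceeds the limit."""
    totals: Dict[str, int] = defaultdict(int)
    for s in samples:
        totals[s.client] += s.pending
    return sorted(c for c, total in totals.items() if total > max_per_client)


if __name__ == "__main__":
    samples = [
        Sample("broker1", "client-a", 400),
        Sample("broker2", "client-a", 700),
        Sample("broker1", "client-b", 50),
    ]
    print(aggregate_alert(samples, max_total=1000))
    print(per_client_alerts(samples, max_per_client=1000))
```

Neither broker alone holds more than 1000 pending messages in this example, so a plain per-host check with that threshold would stay silent; only the aggregated views see the backlog.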


Messaging Services Consolidation

§  Joint effort from IT/PES and IT/SDC

§  Good progress so far

§  Should be finished by the end of the year
    §  new services to be deployed by end of June
    §  monitoring migration will take longer
    §  existing services to be migrated too
        §  ActiveMQ 5.5 to 5.9 (with Red Hat’s additions)!


Next Steps

§  Once the new tools are in place, we can better define the operational procedures

§  Then will come SLDs and SLAs

§  Then we will chase the messaging brokers not run by us…
