CMW infrastructure Status report

P.Charrue – LBCM 14 Sept 2010 For the CMW team



Page 2: CMW infrastructure Status report

Current CMW issues (3)

CMW middle and long term plans

How to report issues to the Controls Group

4th May 2010 P.Charrue - LBCM 2

Page 3: CMW infrastructure Status report

Issue #1 – Blocked socket

Description : Java clients (XPOC project) are blocked and no longer receive data from the devices

Cause : Socket blocking situation in the JacORB CORBA library (part of the CMW infrastructure); this is a known bug in JacORB

Occurrence : Once for the XPOC client; often for the Logging infrastructure

Immediate cure : Restart the client application, as the blocking situation cannot be resolved

CMW proposal :

Today: We provide a callback to the client application, which detects such a blocking situation and can take action (mail, SMS, alarm, restart, log, …)

In 2 weeks: We will deliver a patch to the external JacORB library to solve this blocking situation; the patch is currently being tested.
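The slide does not say how the CMW callback detects the blocking situation. As a rough illustration only, a client could watch for a subscription that has stopped delivering data and run a user-supplied action when that happens. The class `StallWatchdog` and its methods below are hypothetical names, not part of the CMW or JacORB API, and the sketch assumes the subscribed device normally publishes at a regular rate.

```java
import java.util.function.LongSupplier;

/** Hypothetical staleness watchdog; NOT part of the CMW or JacORB API. */
final class StallWatchdog {
    private final long timeoutMillis;
    private final LongSupplier clock;   // time source in milliseconds, injectable for testing
    private final Runnable onStall;     // client-supplied action: mail, alarm, restart, log, ...
    private long lastUpdate;            // time of the most recent data arrival
    private boolean reported;           // true once the current stall has been signalled

    StallWatchdog(long timeoutMillis, LongSupplier clock, Runnable onStall) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.onStall = onStall;
        this.lastUpdate = clock.getAsLong();
    }

    /** Call from the subscription listener every time data arrives. */
    synchronized void dataReceived() {
        lastUpdate = clock.getAsLong();
        reported = false;
    }

    /**
     * Call periodically (for example from a ScheduledExecutorService).
     * Runs the callback once per stall and returns true when it did so.
     */
    synchronized boolean check() {
        boolean stalled = clock.getAsLong() - lastUpdate > timeoutMillis;
        if (stalled && !reported) {
            reported = true;
            onStall.run();
            return true;
        }
        return false;
    }
}
```

In a real client the clock would typically be `System::currentTimeMillis` and `check()` would be scheduled at a fraction of the timeout. The callback fires only once per stall, so a restart or alarm action is not triggered repeatedly.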


Page 4: CMW infrastructure Status report

Description : The CMW Proxy is blocked due to slow-consuming clients

Cause : ‘Slow clients’ subscribed to the Proxy do not consume the data quickly enough and block many notification threads in the Proxy, resulting in a complete blocking of the Proxy

Occurrence : BBQ, Hump Buster

Immediate cure : Kill the ‘slow client’ application, as the blocking situation cannot be resolved automatically

CMW proposal :

A new version of the Proxy has been developed that handles slow clients correctly (by reserving processing resources for every subscribed client) and minimizes the impact of slow consumers on well-behaving clients

It is currently being tested for the CMW-Proxy-BQ

When the tests are completed, the upgraded Proxy will be deployed in close collaboration with Operations (end of this week)
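The slides do not describe the internals of the new Proxy. One common way to keep a slow consumer from stalling a publisher is to give each subscriber its own bounded buffer and never block when that buffer is full. The sketch below illustrates that general idea only. `PerClientBuffers` is a hypothetical class, and dropping the oldest update on overflow is an assumed policy, not something the presentation states.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-subscriber buffering; NOT the actual CMW Proxy design. */
final class PerClientBuffers<T> {
    private final int capacity;
    private final Map<String, ArrayDeque<T>> queues = new ConcurrentHashMap<>();

    PerClientBuffers(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /** Reserve a private buffer for a newly subscribed client. */
    void subscribe(String clientId) {
        queues.putIfAbsent(clientId, new ArrayDeque<>());
    }

    /** Never blocks: a full buffer drops its oldest update instead of stalling the publisher. */
    void publish(T update) {
        for (ArrayDeque<T> q : queues.values()) {
            synchronized (q) {
                if (q.size() == capacity) {
                    q.pollFirst();
                }
                q.addLast(update);
            }
        }
    }

    /** Called by the delivery thread of one client; returns null when nothing is pending. */
    T next(String clientId) {
        ArrayDeque<T> q = queues.get(clientId);
        if (q == null) {
            return null;
        }
        synchronized (q) {
            return q.pollFirst();
        }
    }
}
```

With this layout a client that never calls `next` only loses its own older updates, while the other subscribers keep receiving every value.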


Page 5: CMW infrastructure Status report

Description : Client/server communication is lost inside the Java client application: a busy CMW notification thread inside the Java client prevents any subsequent communication (an idle socket in FIN_WAIT1 is left in the FrontEnd)

Cause : The Java client CMW thread responsible for the socket operation is too busy doing data processing and therefore cannot cleanly close the communication

Occurrence : Collimators

Immediate cure : Restart the Java application, as the blocking situation cannot be resolved

CMW proposal :

Get more data from the blocked Java application to confirm our hypothesis

Organise a code review with the authors of these Java clients to understand why the communication threads are blocked

Help the developers of the Java clients to move to JAPC (as this issue is solved using JAPC)
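If the hypothesis above holds, the general remedy on the client side is to keep the notification thread free by handing data processing to a separate worker thread. The following is a minimal sketch of that pattern using only the standard Java library. `OffloadingListener` and `onUpdate` are hypothetical names, not CMW or JAPC interfaces, and the slides do not say how JAPC avoids the problem internally.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Hypothetical listener wrapper; NOT a CMW or JAPC interface. */
final class OffloadingListener<T> implements AutoCloseable {
    // Single worker thread: updates are processed in arrival order, off the notification thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<T> processor;

    OffloadingListener(Consumer<T> processor) {
        this.processor = processor;
    }

    /** Invoked on the middleware notification thread; only queues the work and returns. */
    void onUpdate(T value) {
        worker.execute(() -> processor.accept(value));
    }

    /** Stops accepting new updates; already queued ones are still processed. */
    @Override
    public void close() {
        worker.shutdown();
    }
}
```

The executor's internal queue is unbounded here, so a processor that is permanently slower than the update rate would still accumulate work; a bounded queue with an explicit overflow policy may be preferable in production code.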


Page 6: CMW infrastructure Status report

Medium term plans

Deploy Proxies with support for slow clients

Deploy the patched JacORB library to solve the Java client blocking situation

Push the usage of JAPC to avoid the loss of communication from certain Java client applications

Long term plans

The CMW team is currently preparing a complete technical review of the Communication Infrastructure
▪ Several clients have already been interviewed
▪ The issues of the present infrastructure have been captured and prioritised along with the new functionality requested
▪ Several solutions have been evaluated
▪ External middleware experts have been contacted to help us confirm our choices

The actual review will take place in October 2010

https://wikis.cern.ch/display/MW/CMW+Review


Page 7: CMW infrastructure Status report

How to report issues to the Controls Group

From the e-logbook, a simple right-click on an entry will create a JIRA issue

Each JIRA issue is then assigned and closely followed up

http://issues/browse/APS

PS and SPS operators are making good use of this

From your browser, go to http://issues and fill in a new JIRA issue

As a last resort: avoid direct email to individuals (they might be on vacation, not reading their mail, sick, on leave, …); instead, use the support mailing lists (e.g. cmw-[email protected], [email protected], …)
