March 2003 CHEP 2003 1 Online Monitoring Software Framework in the ATLAS Experiment Serguei Kolos CERN/PNPI On behalf of the ATLAS Trigger/DAQ Online Software group

March 2003, CHEP 2003

Online Monitoring Software Framework in the ATLAS Experiment

Serguei Kolos

CERN/PNPI

On behalf of the ATLAS Trigger/DAQ Online Software group

Contents

• ATLAS Detector layout

• ATLAS HLT/DAQ/DCS system

• What to monitor and where

• Performance and Scalability Requirements

• Monitoring framework architecture

• Monitoring framework implementation

• Test results

• Summary

ATLAS Detector Layout

• Each partition may be operated independently

• Some partitions may be operated in parallel

• ~1000 Read-Out Drivers (RODs) in ~100 VME crates

• 33 sub-detector partitions:
  • Inner Detector: Pixel, SCT, TRT
  • Calorimeter: LAr, TileCal
  • Muon Spectrometer: MDT, CSC, RPC, TGC

HLT/DAQ/DCS system

[Diagram components]

• Sub-detectors: Inner Detector (Pixel, SCT, TRT), Calorimeter (LAr, TileCal), Muon Spectrometer (MDT, CSC, RPC, TGC)

• Detector Control System (DCS)

• Read Out Systems (ROSs)

• LVL2 Trigger

• Event Builder (EB)

• Event Filter (EF)

• Data Storage

• Online Software: Configure, Control, Monitoring

Where and what will be monitored

• Detector and Physics monitoring:
  • ROD Crate, ROS: data quality and integrity
  • DCS: detector hardware status and conditions
  • Event Builder: correlation between sub-detectors, consistency of LVL1 information
  • Event Filter: monitoring of reconstructed events

• DAQ monitoring:
  • ROS, EB: operational monitoring (buffer occupancies, throughput, s/w and h/w status, errors, etc.)

• Trigger Monitoring:
  • LVL1, LVL2: sample rejected events to check the trigger decision
  • Event Filter: information attached to a sub-set of accepted and rejected events

Monitoring data

Type                                                      | Format                                      | Production        | Access
----------------------------------------------------------+---------------------------------------------+-------------------+--------------------------------
Samples of physics events                                 | Vector of 4-byte integer values             | On request        | On request
Errors                                                    | Error ID + Error Severity + Text            | In case of faults | Via subscription
Histograms                                                | One (or several) standard histogram formats | Always            | On request and via subscription
Other information (status, operational information, etc.) | User-defined                                | Always            | On request and via subscription

Online Monitoring Framework: Functionality and Performance Requirements

• The functionality:
  • Transporting monitoring data requests from consumers to providers
  • Transporting monitoring data from providers to consumers

• The required performance:
  • There are O(1000) sources of the monitoring data
  • There are O(1) consumers for each monitoring data item
  • Transfer rate between one provider and one consumer:
    • For samples of physics events: O(1) MB/s
    • For other monitoring data types: O(1) kB/s

[Diagram: the Monitoring Framework sits between the Monitoring Data Provider and the Monitoring Data Consumer; commands travel from consumer to provider and data travels from provider to consumer]

Online Monitoring Framework: Architecture

• Specific service for each monitoring data type

• IPC: Common communication abstraction layer: implements communication domains to support partitioning

• CORBA: Common communication implementation layer

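[Diagram: layered architecture. The Event Monitoring Service, Message Reporting Service, Information Service and Online Histogramming Service sit on top of the Inter Process Communication (IPC) layer, which is built on the Common Object Request Broker Architecture (CORBA)]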

Event Monitoring Service: Interfaces

• Provides a unified way of getting samples of physics events from any point in the data-flow chain

• Implementation exists for C++ and Java

[Diagram: Event Monitoring Service interfaces]

• Event Consumer: EventIterator (next_event)

• Event Provider: EventSampler (start_sampling, stop_sampling), EventAccumulator (add_event)

Event Monitoring Service: Deployment

[Diagram: Event Samplers run in the ROD Crates, the ROS and the EB; the Event Distributor with its Event Buffer and the Event Consumer run on DAQ Workstations. Numbered calls: 1. select, 1.1 start_sampling, 2. add_event, 3. next_event]

There is no direct connection between Event Consumer and Event Sampler
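
The numbered calls in the deployment diagram can be sketched in Python. This is a minimal single-process illustration, not the ATLAS API (the real service exists in C++ and Java on top of CORBA). The names EventSampler, EventAccumulator, EventIterator, select, start_sampling, stop_sampling, add_event and next_event come from the slides; the method signatures, the internals of EventDistributor, the buffer capacity, the "ROS-1" address and the CountingSampler class are assumptions made for the example.

```python
from collections import deque


class EventAccumulator:
    """Bounded buffer of sampled events (each a list of 4-byte integers)."""

    def __init__(self, capacity=4):
        # Assumption: the oldest events are dropped when the buffer is full.
        self._buffer = deque(maxlen=capacity)

    def add_event(self, event):
        self._buffer.append(list(event))

    def pop(self):
        return self._buffer.popleft() if self._buffer else None


class EventSampler:
    """Provider-side interface, implemented by whoever owns the events."""

    def start_sampling(self, criteria, accumulator):
        raise NotImplementedError

    def stop_sampling(self):
        raise NotImplementedError


class CountingSampler(EventSampler):
    """Hypothetical sampler that produces fake events for this sketch."""

    def __init__(self):
        self.active = False

    def start_sampling(self, criteria, accumulator):
        self.active = True
        for n in range(criteria["count"]):
            accumulator.add_event([n, n + 1])  # step 2: add_event

    def stop_sampling(self):
        self.active = False


class EventIterator:
    """Consumer-side handle; it only reads the distributor's buffer."""

    def __init__(self, accumulator, sampler):
        self._accumulator = accumulator
        self._sampler = sampler

    def next_event(self):
        return self._accumulator.pop()  # step 3: next_event

    def close(self):
        self._sampler.stop_sampling()


class EventDistributor:
    """Intermediary that keeps consumers and samplers decoupled."""

    def __init__(self):
        self._samplers = {}

    def register(self, address, sampler):
        self._samplers[address] = sampler

    def select(self, address, criteria):  # step 1: select
        sampler = self._samplers[address]
        accumulator = EventAccumulator()
        sampler.start_sampling(criteria, accumulator)  # step 1.1
        return EventIterator(accumulator, sampler)


distributor = EventDistributor()
sampler = CountingSampler()
distributor.register("ROS-1", sampler)
iterator = distributor.select("ROS-1", {"count": 2})
first = iterator.next_event()
second = iterator.next_event()
iterator.close()
```

The consumer holds only an EventIterator and never a reference to the sampler's own interface, which mirrors the statement that consumer and sampler are not directly connected.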

Message Reporting Service: Interfaces

• Each error has a unique ID, severity and text, and may optionally have custom parameters and qualifiers

• An Error Consumer may subscribe to messages by defining a range of values for any field of the error message

• Implementation exists for C++ and Java

[Diagram: Message Reporting Service interfaces]

• Error Provider: MRSStream (send_error)

• Error Consumer: MRSReceiver (subscribe), MRSCallback (notify)
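
Subscription by value ranges can be illustrated with a short Python sketch. It is a toy in-process model, not the MRS API: the operation names subscribe, send_error and notify come from the slides, while the ErrorMessage field names, the numeric severity scale, the inclusive (low, high) range convention and the MRSServer internals are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorMessage:
    error_id: int
    severity: int  # assumption: a larger number means a more severe error
    text: str
    parameters: dict = field(default_factory=dict)
    qualifiers: tuple = ()


class MRSServer:
    """Toy in-process stand-in for the MRS server."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, ranges, callback):
        # ranges maps a field name to an inclusive (low, high) pair.
        self._subscriptions.append((ranges, callback))

    def send_error(self, message):
        for ranges, callback in self._subscriptions:
            if all(low <= getattr(message, name) <= high
                   for name, (low, high) in ranges.items()):
                callback(message)  # notify the matching consumer


received = []
server = MRSServer()
server.subscribe({"severity": (2, 3), "error_id": (100, 199)}, received.append)
server.send_error(ErrorMessage(101, 3, "buffer overflow"))
server.send_error(ErrorMessage(102, 1, "link restarted"))     # severity too low
server.send_error(ErrorMessage(250, 3, "crate power fault"))  # ID out of range
```

Only the first message satisfies both ranges, so it is the only one delivered to the subscriber.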

Message Reporting Service: Deployment

[Diagram: DAQ Applications in the ROD Crates, the ROS, the EB and on a DAQ Workstation communicate through an MRS Server running on a DAQ Workstation. Numbered calls: 1. subscribe, 2. send_error, 3. notify]

Information Service: Interfaces

• Allows applications to exchange user-defined information

• Information structure is defined in XML

• Information description is available at run-time

• Implementation exists in C++ and Java

[Diagram: Information Service interfaces]

• Info Provider: InfoDictionary (insert, update, remove)

• Info Consumer: InfoReceiver (subscribe), InfoCallback (notify), get_value, InfoDocument (get_description)
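
The operations named on this slide can be sketched as a toy in-process model in Python. It is not the IS API: the operation names insert, update, remove, subscribe, get_value and get_description come from the slides, while the XML layout, the attribute type names, the "ROS-1.status" item name, the callback signature and the InfoServer class are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML description of one information type; the real schema is
# not given in the slides.
SCHEMA = """
<info name="RunStatus">
  <attribute name="run_number" type="u32"/>
  <attribute name="state" type="string"/>
</info>
"""


class InfoServer:
    """Toy in-process stand-in for an IS server."""

    def __init__(self, schema_xml):
        root = ET.fromstring(schema_xml.strip())
        self._type_name = root.get("name")
        self._attributes = [(a.get("name"), a.get("type"))
                            for a in root.findall("attribute")]
        self._values = {}
        self._callbacks = []

    # Provider side
    def insert(self, name, value):
        if name in self._values:
            raise KeyError(f"{name} already exists")
        self._store(name, value, "insert")

    def update(self, name, value):
        if name not in self._values:
            raise KeyError(f"{name} does not exist")
        self._store(name, value, "update")

    def remove(self, name):
        del self._values[name]
        self._notify(name, "remove")

    # Consumer side
    def subscribe(self, callback):
        self._callbacks.append(callback)

    def get_value(self, name):
        return dict(self._values[name])

    def get_description(self):
        # The structure parsed from XML stays available at run time.
        return {"type": self._type_name, "attributes": list(self._attributes)}

    def _store(self, name, value, reason):
        expected = {attr for attr, _ in self._attributes}
        if set(value) != expected:
            raise ValueError(f"expected attributes {sorted(expected)}")
        self._values[name] = dict(value)
        self._notify(name, reason)

    def _notify(self, name, reason):
        for callback in self._callbacks:
            callback(name, reason)


events = []
server = InfoServer(SCHEMA)
server.subscribe(lambda name, reason: events.append((name, reason)))
server.insert("ROS-1.status", {"run_number": 42, "state": "RUNNING"})
server.update("ROS-1.status", {"run_number": 42, "state": "PAUSED"})
current = server.get_value("ROS-1.status")
description = server.get_description()
server.remove("ROS-1.status")
```

A subscriber is notified of every insert, update and remove, while get_value serves a consumer that only wants the current value on request.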

Information Service: Deployment

[Diagram: DAQ Applications in the ROD Crates, the ROS and the EB publish information (insert, update, remove) to IS Servers running on DAQ Workstations and DAQ Control Workstations; consuming DAQ Applications use subscribe, notify and get_value]

Online Histogramming Service: Interfaces

• Specialization of the Information Service for transporting histograms

• Current implementation supports ROOT histograms and also raw histograms (vectors of data)

• Has an abstract interface layer that allows support for other histogram types to be added

[Diagram: the Histogramming Service is layered on the Information Service]

• Histogram Provider: RootHistoProvider, RawHistoProvider, CustomHistoProvider

• Histogram Consumer: RootHistoReceiver, RawHistoReceiver, CustomHistoReceiver
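
The idea of an abstract layer with one provider per histogram format can be sketched in Python as follows. This is an illustration only, not the OHS API: the class names RawHistoProvider and CustomHistoProvider and the operation get_histogram appear on the slides, but the base class HistoProvider, the to_info and publish methods, the dictionary layout and the use of a plain dict in place of the Information Service are assumptions. A ROOT provider is left out so that the sketch has no external dependencies.

```python
from abc import ABC, abstractmethod


class HistoProvider(ABC):
    """Abstract layer: each histogram format supplies its own converter."""

    def __init__(self, store):
        self._store = store  # a plain dict standing in for the Information Service

    @abstractmethod
    def to_info(self, histogram):
        """Convert a format-specific histogram into a plain dictionary."""

    def publish(self, name, histogram):
        self._store[name] = self.to_info(histogram)


class RawHistoProvider(HistoProvider):
    """Raw histogram: a vector of bin contents plus the axis range."""

    def to_info(self, histogram):
        low, high, bins = histogram
        return {"format": "raw", "low": low, "high": high, "bins": list(bins)}


class CustomHistoProvider(HistoProvider):
    """Hypothetical user format: a mapping from bin label to count."""

    def to_info(self, histogram):
        labels = sorted(histogram)
        return {"format": "custom", "labels": labels,
                "bins": [histogram[label] for label in labels]}


def get_histogram(store, name):
    """Consumer side: fetch a published histogram by name."""
    return store[name]


store = {}
RawHistoProvider(store).publish("ROS-1/occupancy", (0.0, 1.0, [3, 5, 2]))
CustomHistoProvider(store).publish("EB/errors", {"timeout": 4, "crc": 1})
raw = get_histogram(store, "ROS-1/occupancy")
custom = get_histogram(store, "EB/errors")
```

Adding a new histogram type in this sketch means writing one more subclass with its own to_info; the publishing path stays unchanged.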

Histogramming Service: Deployment

[Diagram: DAQ Applications in the ROD Crates, the ROS and the EB publish histograms (insert, update, remove) to IS Servers running on DAQ Workstations and DAQ Control Workstations; consuming DAQ Applications retrieve them with get_histogram]

Implementation is based on the Information Service

Graphical User Interfaces

• IS Monitor (Motif, C++)

• Event Dump (Java)

• Histogram Display (ROOT, C++)

Information Service Performance Tests

[Plot: mean time for one information update (ms) versus the number of information providers (0 to 1000), with one curve each for 1, 5, 10 and 15 receivers]

• 222 dual Pentium III PCs (600-1000 MHz) connected via Fast Ethernet

• Linux RedHat 7.3

• A single IS server ran on a dedicated PC

• Up to 15 IS receivers ran on 15 dedicated PCs

• Up to 1000 IS information providers were equally distributed over 200 PCs

• Each provider published one information item and then updated it once per second
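
The load implied by this setup follows directly from the numbers listed above. The short Python calculation below only restates that arithmetic; the variable names are made up for the example and nothing here is a measured result.

```python
total_pcs = 222          # PCs in the test bed
server_pcs = 1           # single IS server on a dedicated PC
receiver_pcs = 15        # up to 15 receivers, one per dedicated PC
provider_pcs = 200       # PCs hosting the information providers
providers = 1000         # maximum number of information providers
updates_per_second = 1   # each provider updates its item once per second

# Providers hosted on each PC at the largest scale tested.
providers_per_pc = providers // provider_pcs

# Aggregate rate of updates arriving at the single IS server.
server_update_rate = providers * updates_per_second

# PCs with a role named on the slide.
pcs_with_named_role = server_pcs + receiver_pcs + provider_pcs
```

At the largest scale this gives 5 providers per PC and 1000 updates per second arriving at the single server; the roles listed on the slide account for 216 of the 222 PCs.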

Summary

• The Online Monitoring framework in the ATLAS experiment is responsible for transporting different types of monitoring data from providers to consumers

• The framework consists of 4 services to handle different types of monitoring data

• All the services have APIs in both C++ and Java

• The tests show that the performance and scalability of the current implementation are very close to the ATLAS requirements