2013/03/00 - Karabo Design Concepts
Karabo: The European XFEL software framework
Burkhard Heisen for WP76
November, 2013
Design Concepts
A star (*) marks concepts that are not yet implemented in the current release
Functional requirements
DAQ: data readout, online processing, quality monitoring (vetoing)
SC: processing pipelines, distributed and GPU computing, specific algorithms (e.g. reconstruction)
Control: drive hardware and complex experiments, monitor variables & trigger alarms
DM: storage of experiment & control data; data access, authentication, authorization, etc.
setup computation & show scientific results
allow some control & show hardware status
show online data whilst running
[Figure: a typical use case. Accelerator, Undulator, Beam Transport, Sample Injector; applications: DM, SC, Control, DAQ]
Tight integration of applications
Functionality: What are we dealing with?
1. Distributed end points and processes
2. Data containers (Hash, Schema, Image, …)
3. Data transport (data flow, network protocol)
4. Process control (automation, feedback)
5. States (finite state machines, sequencing, automation…)
6. Data acquisition (front end hardware)
7. Time synchronization/tagging (time stamps, cycle ids, etc.)
8. Real-time needs (where necessary)
9. Central services (archive, alarm, name resolution, …)
10. Security (who’s allowed to do what from where?)
11. Statistics (control system itself, operation, …)
12. Logging (active, passive, central, local)
13. Processing workflows (parallelism, pipeline execution, provenance)
14. Clients / User interfaces (API, languages, macro writing, CLI, GUI)
15. Software management (coding, building, packaging, deployment, versioning, …)
Distributed end points and processes
Concept: Device Server Model
Similar to: TANGO, DOOCS, TINE*
Elements are controllable objects managed by a device server.
An instance of such an object is a device, with a hierarchical name.
Device classes can be loaded at runtime (plugins).
Actions pertaining to a device are given by its properties and commands, i.e. get, set, monitor some property or execute some command.
Properties, commands, and (optionally) associated FSM logic are statically defined and further described (attributes) in the device class. Dynamic (runtime) extension of properties and commands is possible.
Devices can be written in either C++ or Python (later maybe also Java) and run on Linux, MacOSX or (later) Windows.
DETAIL: Distributed end points - Configuration API
Class: MotorDevice

static void expectedParameters(Schema& s) {
    FLOAT_ELEMENT(s).key("velocity")
        .description("Velocity of the motor")
        .unitSymbol("m/s")
        .assignmentOptional().defaultValue(0.3)
        .maxInc(10)
        .minInc(0.01)
        .reconfigurable()
        .allowedStates("Idle")
        .commit();

    INT32_ELEMENT(s).key("currentPosition")
        .description("Current position of the motor")
        .readOnly()
        .warnLow(10)
        […]

    SLOT_ELEMENT(s).key("move")
        .description("Will move motor to target position")
        .allowedStates("Idle")
        […]
}

// Constructor with initial configuration
MotorDevice(const Hash& config) { […] }

// Called at each (re-)configuration request
void onReconfigure(const Hash& config) { […] }
Any Device uses a standardized API to describe itself. This information is used to automatically create GUI input masks and for auto-completion on the IPython console.
Device developers do not need to validate any parameters. Validation is done internally, using the expectedParameters as a white-list.
We distinguish between properties, commands and their associated attributes. All of them can be expressed within the expectedParameters function.
Properties and commands can be nested, such that hierarchical groupings are possible.
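The white-list idea can be illustrated with a small Python sketch. It is not Karabo's validator: the dictionary layout, the `validate` function and its error messages are assumptions made up for this example; only the parameter names and limits are taken from the MotorDevice slide.

```python
# Hypothetical, simplified stand-in for a device's expected parameters.
EXPECTED = {
    "velocity": {"type": float, "default": 0.3, "min": 0.01, "max": 10.0,
                 "reconfigurable": True, "allowed_states": {"Idle"}},
    "currentPosition": {"type": int, "read_only": True},
}


def validate(config, state=None, expected=EXPECTED):
    """Check `config` against the white-list and return a clean copy.

    state=None means initial configuration: defaults are injected and the
    state check is skipped. Otherwise `state` is the device's current state.
    """
    clean = {}
    for key, value in config.items():
        spec = expected.get(key)
        if spec is None:
            raise ValueError(f"unexpected parameter: {key!r}")
        if spec.get("read_only"):
            raise ValueError(f"read-only parameter: {key!r}")
        if state is not None:
            if not spec.get("reconfigurable"):
                raise ValueError(f"not reconfigurable: {key!r}")
            if state not in spec.get("allowed_states", {state}):
                raise ValueError(f"{key!r} cannot be set in state {state!r}")
        value = spec["type"](value)  # coerce to the declared type
        if "min" in spec and value < spec["min"]:
            raise ValueError(f"{key!r} below minimum {spec['min']}")
        if "max" in spec and value > spec["max"]:
            raise ValueError(f"{key!r} above maximum {spec['max']}")
        clean[key] = value
    if state is None:
        for key, spec in expected.items():
            if "default" in spec:
                clean.setdefault(key, spec["default"])
    return clean


print(validate({}))                       # initial configuration
print(validate({"velocity": 2}, "Idle"))  # runtime reconfiguration
```

Because the device code only ever sees the validated copy, the constructor and the reconfiguration hook can stay free of parameter checks.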
Legend: Attribute, Property, Command
DETAIL: Distributed end points and processes - Creating a new device
1. Write a class (say: MyDevice) that derives from Device
2. Compile it into a shared library (say libMyDevice.so)
3. Select a running Device-Server or start a new one
4. Copy libMyDevice.so to the plugins folder of the Device-Server
5. The Device-Server will emit a signal to the broker that a new Device class is available; it ships the expected parameters as read from the static context of the MyDevice class
[Figure: libMyDevice.so in the plugins folder; signalNewDeviceClassAvailable (.xsd) sent to GUI, Master, Central DB, GUI-Srv]
DETAIL: Distributed end points and processes - Creating a new device (continued)
6. Given the mask of possible parameters, the user may fill in a valid configuration and emit an instantiate signal to the broker
7. The configuration will be validated by the Device factory and, if valid, an instance of MyDevice will be created
8. The constructor of the device class will be called and provided with the configuration
9. The run method will be called, which starts the state machine and finally blocks by activating the event loop
10. The device will asynchronously listen to allowed events (slots), guided by the internal state machine
[Figure: GUI, GUI-Srv, Master, Central DB and plugins folder; signalInstantiate("MyDevice", xml) triggers factory: create("MyDevice", xml), producing the instance MyDevice1]
Data containers (Hash, Image/Matrix/Vector)
Concept: Have some containers for which Karabo provides special support
Hash
  String-key, any-value associative container
  Keeps insertion order (iteration possible), hash performance for random lookup
  Provides (string-key, any-value) attributes per hash-key
  Fully recursive structure (i.e. Hashes of Hashes)
  Serialization: XML, Binary, HDF5, DB
  Usage: configuration, device-state cache, database interface, message protocol, etc.
Schema
  Describes possible/allowed structures for the Hash. By analogy: a Schema is to a Hash what an XSD document is to an XML file
  Associates meta-data (called attributes) to properties
Image/Matrix/Vector
  Some default containers needed for scientific computing
  Seamless switching between CPU and GPU representation
  Optimized serialization (network transfer)
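To make the Hash properties concrete, here is a toy Python container with ordered string keys, arbitrary values, nesting via dotted paths and per-key attributes. The class and its method names (`set`, `get`, `set_attribute`, `get_attribute`, `paths`) are assumptions for this sketch and are not taken from the Karabo API.

```python
class Hash:
    """Toy ordered, recursive key/value container with per-key attributes."""

    def __init__(self):
        self._values = {}  # dicts keep insertion order (Python 3.7+)
        self._attrs = {}   # key -> {attribute name: attribute value}

    def set(self, path, value):
        head, _, rest = path.partition(".")
        if rest:
            child = self._values.get(head)
            if not isinstance(child, Hash):
                child = self._values[head] = Hash()
            child.set(rest, value)
        else:
            self._values[head] = value

    def get(self, path):
        head, _, rest = path.partition(".")
        value = self._values[head]
        return value.get(rest) if rest else value

    def set_attribute(self, path, name, value):
        head, _, rest = path.partition(".")
        if rest:
            self._values[head].set_attribute(rest, name, value)
        elif head in self._values:
            self._attrs.setdefault(head, {})[name] = value
        else:
            raise KeyError(head)

    def get_attribute(self, path, name):
        head, _, rest = path.partition(".")
        if rest:
            return self._values[head].get_attribute(rest, name)
        return self._attrs[head][name]

    def paths(self, prefix=""):
        """Yield all leaf paths in insertion order."""
        for key, value in self._values.items():
            if isinstance(value, Hash):
                yield from value.paths(prefix + key + ".")
            else:
                yield prefix + key


h = Hash()
h.set("motor.velocity", 0.3)
h.set("motor.position", 12)
h.set("name", "m1")
h.set_attribute("motor.velocity", "unitSymbol", "m/s")
print(list(h.paths()))
```

A Schema would sit next to such a container and describe which paths and value types are allowed.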
Data transport (data flow, network protocol)
Concept: Separation between broker-based (less frequent, smaller data size) and point-to-point (frequent, large data size) communication
Communication is cross-network, cross-language and cross-platform
Broker based
  Highly available full N x N communication between devices of any category (Control, SC, DAQ, DM) via broker
  Patterns: signal/slots, request/response, simple call
Point-to-Point
  Transient (run-time) establishment of direct (brokerless) connections between devices
  TCP-based, high performance for huge data
  Asynchronous IO, memory optimization if local
DETAIL: Data transport - Communication: Event-Driven vs. Scheduled
[Figure: event-driven: Device 1 emits, Devices 2, 3 and 4 are notified; scheduled: request/response between Device 1 and Devices 2, 3 and 4]
Event-driven communication (“Push Model”)
  A minimal set of information is passed
  System is scalable (maintains performance)
  Failure is harder to detect
Scheduled communication (“Poll Model”)
  Direct feedback on request
  Nodes may be spammed (DoS)
  Growing systems lose performance
  Typically, lots of extra traffic is generated
DETAIL: Data transport - Broker based communication - API
Communication happens between ordinary (member or free-standing) functions
Functions on distributed instances are identified by a pair of strings, the instanceId and the functionName
The instanceId uniquely identifies an instance (e.g. a device) connected to a specific topic of the broker
The functionName uniquely identifies an ordinary function registered under a given instanceId
Functions of any signature (currently up to 4 arguments) can be registered to be remotely callable
Registration can be done at runtime without extra tools
Function calls can be done cross-network, cross-operating-system and cross-language (currently C++ and Python; Java will follow)
The language’s native data types are directly supported as arguments
A generic, fully recursive, key to any-value container (Hash) is provided as a data type for complex arguments
DETAIL: Data transport - Broker based communication - Three Patterns
● Signals & Slots
    SLOT ( function, [argTypes] )
    SIGNAL ( funcName, [argTypes] )
    connect ( signalInstanceId, signalFunc, slotInstanceId, slotFunc )
    emit ( signalFunc, [args] )

// Device1 (emits the signal)
SIGNAL("foo", int, std::string);
connect("Device1", "foo", "Device2", "onFoo");
connect("", "foo", "Device3", "onGoo");
connect("", "foo", "Device4", "onHoo");
emit("foo", 42, "bar");

// Device2
SLOT(onFoo, int, std::string);
void onFoo(const int& i, const std::string& s) { }

// Device3
SLOT(onGoo, int, std::string);
void onGoo(const int& i, const std::string& s) { }

// Device4
SLOT(onHoo, int, std::string);
void onHoo(const int& i, const std::string& s) { }

[Figure: Device1 emits; Device2, Device3 and Device4 are notified]
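The signal/slot pattern can be tried out without a real broker. The following Python sketch routes emitted signals to connected slots inside one process. The `Broker` class and its methods are hypothetical stand-ins that only mirror the SLOT / connect / emit vocabulary of the slide; they are not the Karabo API.

```python
class Broker:
    """Toy in-process stand-in for the message broker."""

    def __init__(self):
        self._slots = {}        # (instanceId, funcName) -> callable
        self._connections = {}  # (instanceId, signal) -> [(instanceId, slot)]

    def register_slot(self, instance_id, func_name, func):
        self._slots[(instance_id, func_name)] = func

    def connect(self, signal_instance, signal, slot_instance, slot):
        key = (signal_instance, signal)
        self._connections.setdefault(key, []).append((slot_instance, slot))

    def emit(self, signal_instance, signal, *args):
        # Every connected slot is notified; the emitter needs no knowledge
        # of who is listening.
        for target in self._connections.get((signal_instance, signal), []):
            self._slots[target](*args)


broker = Broker()
received = []
for device, slot in [("Device2", "onFoo"), ("Device3", "onGoo"),
                     ("Device4", "onHoo")]:
    # The default argument binds the current device name to each slot.
    broker.register_slot(
        device, slot,
        lambda i, s, device=device: received.append((device, i, s)))
    broker.connect("Device1", "foo", device, slot)

broker.emit("Device1", "foo", 42, "bar")
print(received)
```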
DETAIL: Data transport - Broker based communication - Patterns
● Direct Call
    call ( instanceId, funcName, [args] )

// Device2
SLOT(onFoo, std::string);
void onFoo(const std::string& s) { }

// Device1
call("Device2", "onFoo", "bar");

[Figure: Device1 calls; Device2 is notified]

● Request / Reply
    request ( instanceId, funcName, [reqArgs] ).timeout( msec ).receive( [repArgs] )

// Device2
SLOT(onFoo, int);
void onFoo(const int& i) { reply( i + i ); }

// Device1
int number;
request("Device2", "onFoo", 21).timeout(100).receive(number);

[Figure: Device1 sends a request, Device2 is notified and replies, Device1 is notified of the reply]
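The request / reply pattern with a timeout can be sketched in Python with a worker thread and a one-element queue. Everything here (the `SLOTS` table, the `request` function, the `onSlow` slot) is invented for illustration; the real transport goes through the broker rather than a local thread.

```python
import queue
import threading
import time

# Hypothetical registry of remotely callable functions.
SLOTS = {
    ("Device2", "onFoo"): lambda i: i + i,
    ("Device2", "onSlow"): lambda i: (time.sleep(0.3), i)[1],
}


def request(instance_id, func_name, *args, timeout_ms=100):
    """Call a slot and wait for its reply, giving up after timeout_ms."""
    reply_box = queue.Queue(maxsize=1)

    def remote_side():
        # Stands in for the remote device executing the slot and replying.
        reply_box.put(SLOTS[(instance_id, func_name)](*args))

    threading.Thread(target=remote_side, daemon=True).start()
    try:
        return reply_box.get(timeout=timeout_ms / 1000.0)
    except queue.Empty:
        raise TimeoutError(
            f"no reply from {instance_id}.{func_name}") from None


number = request("Device2", "onFoo", 21, timeout_ms=500)
print(number)
```

The caller gets a synchronous feel, while the exchange itself stays message based; a missing reply surfaces as a timeout instead of blocking forever.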
DETAIL: Data transport - Illustration
[Figure: example deployment. Labels: HV, Pump, Simulate, Store, Calibrate1, Calibrate2, Load, APD, Logger, RDB, Disk Storage, GUI Server, GUI(s), Terminal(s), Camera. Legend: Device-Server, Application, Message Broker (Event Loop), Device Instance, Device Sub Control]
Process control (automation, feedback)
Concept: Single device processes vs. multiple device processes
Processes which involve a single device and e.g. some hardware
  Implementation of a software FSM that mirrors the hardware FSM
  Automation and feedback are implemented using software FSM events
  Events may be internally triggered (auto) or exposed to the control system (interactive/manual)
Processes which involve coordination of multiple devices (non real-time)
  The process is abstracted into a parent device which sub-instructs child devices (composition)
  The control system protects child devices from direct user control
  The parent device’s FSM describes process automation/feedback
  The parent device is both a device and a device controller
DETAIL: Process control - A standardized hardware device
Concepts
  The hardware is always safe, even without software
  Coupling between h/w devices at a “lower” level than Karabo can exist (real time)
  The authority (h/w or s/w) may be different and may even change during runtime
  A generic state transition table design exists, which allows for flexible h/w control
[Figure: state diagram. States: Ok, Error, HardwareError, CommunicationError, Readjusting. Events: onOutOfSync, onHwError, onComError, onException, reset, reset*, reset / action, none; one transition is marked [autonomous]]
Enter HardwareError
  1. Generic h/w error status bit is set (by PLC)
Exit HardwareError
  2. Clicking the reset button calls resetHardwareAction(), which should take whatever actions are needed to ‘reset’ the h/w; if not successful, HardwareError is re-entered (eventually; timeout?)
Enter CommunicationError
  1. Heartbeat from PLC not received by BeckhoffCom
  2. BeckhoffCom dead
  3. Broker dead
Exit CommunicationError
  4. reset*; the * means driven by internal recovery, where no user action is required (or possible)
Enter Error
  1. An exception is not caught in the FSM s/w thread
  2. The s/w device calls onError() (only used in composite devices)
Exit Error
  3. Clicking the reset button moves the s/w device to AllOk’s Initialization, where the h/w status is requested and the correct state (or Error) is entered depending on the reply
States (finite state machines, sequencing, automation…)
Concept: Devices optionally run finite state machines (FSMs) inside
Devices can implement a custom FSM or inherit a common one
Events into the FSM can be triggered internally (automation, sequencing) or made device commands (remotely triggerable)
The FSM provides four hooks fitting into the event-driven API style of devices (onGuard, srcStateOnExit, onTransitionAction, tgtStateOnEntry)
Any (writable) property or command can be access-restricted according to the device’s current state. This is done using the attribute allowedStates.
As allowedStates is an attribute (and thus part of the static XSD), any UI system is able to proactively reflect the currently (state-dependent) settable properties and commands. The command-line interface uses this information to provide state-aware auto-completion, whilst the GUI uses it for widget disabling (grey out).
Detail: Device – Finite state machine (FSM)
[Figure: Start Stop State Machine. Initialization -(none)-> OK; inside OK: Stopped -(start)-> Started, Started -(stop)-> Stopped; OK -(errorFound)-> Error; Error -(reset)-> OK]

// Ok Machine
FSM_TABLE_BEGIN(OkTransitionTable)
//   SrcState  Event       TgtState  Action       Guard
Row< Started,  StopEvent,  Stopped,  StopAction,  none >,
Row< Stopped,  StartEvent, Started,  StartAction, none >
FSM_TABLE_END
FSM_STATE_MACHINE(Ok, OkTransitionTable, Stopped, Self)

// Top Machine
FSM_TABLE_BEGIN(TransitionTable)
Row< Initialization, none,            Ok,    none,             none >,
Row< Ok,             ErrorFoundEvent, Error, ErrorFoundAction, none >,
Row< Error,          ResetEvent,      Ok,    ResetAction,      none >
FSM_TABLE_END
KARABO_FSM_STATE_MACHINE(StateMachine, TransitionTable, Initialization, Self)
Any device uses a standardized way to express its possible program flow
The state machine calls back device functions (guard, onStateExit, transitionAction, onStateEntry)
The GUI is state-machine aware and enables/disables buttons proactively
DETAIL: States - Finite state machines - There is a UML standard
State Machine: the life cycle of a thing. It is made of states, transitions and processes incoming events.
State: a stage in the life cycle of a state machine. A state (like a submachine) can have entry and exit behaviors
Event: an incident provoking (or not) a reaction of the state machine
Transition: a specification of how a state machine reacts to an event. It specifies a source state, the event triggering the transition, the target state (which will become the newly active state if the transition is triggered), guard and actions
Action: an operation executed during the triggering of the transition
Guard: a boolean operation being able to prevent the triggering of a transition which would otherwise fire
Transition Table: representation of a state machine. A state machine diagram is a graphical, but incomplete representation of the same model. A transition table, on the other hand, is a complete representation
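The definitions above map directly onto a table-driven implementation. The Python sketch below is a minimal illustration, not Karabo's FSM: the `StateMachine` class, its method names and the hook dictionaries are assumptions. It processes an event by looking up the (state, event) row, evaluating the guard, and then calling source-state exit, transition action and target-state entry in that order.

```python
class StateMachine:
    """Minimal flat FSM driven by a transition table."""

    def __init__(self, table, initial, on_entry=None, on_exit=None):
        # Rows: (src_state, event, tgt_state, action, guard);
        # action and guard are callables or None.
        self._rows = {(src, event): (tgt, action, guard)
                      for src, event, tgt, action, guard in table}
        self._on_entry = on_entry or {}
        self._on_exit = on_exit or {}
        self.state = initial

    def allowed_events(self):
        """Events that have a row for the current state (e.g. for a GUI)."""
        return sorted(ev for (src, ev) in self._rows if src == self.state)

    def process(self, event):
        """Return True if the event fired a transition, False otherwise."""
        row = self._rows.get((self.state, event))
        if row is None:
            return False
        target, action, guard = row
        if guard is not None and not guard():
            return False
        hooks = (self._on_exit.get(self.state), action,
                 self._on_entry.get(target))
        for hook in hooks:
            if hook is not None:
                hook()
        self.state = target
        return True


log = []
table = [
    ("Stopped", "start", "Started", lambda: log.append("startAction"), None),
    ("Started", "stop", "Stopped", lambda: log.append("stopAction"), None),
]
fsm = StateMachine(
    table, "Stopped",
    on_entry={"Started": lambda: log.append("enter Started")},
    on_exit={"Stopped": lambda: log.append("exit Stopped")})

fsm.process("stop")   # no row for (Stopped, stop): ignored
fsm.process("start")
print(fsm.state, log, fsm.allowed_events())
```

A user interface could call something like `allowed_events()` to grey out commands that the current state does not accept.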
DETAIL: States - FSM implementation example in C++ (header only)
// Events: first argument is a transition table element,
// second is a regular callable function (triggers the event)
FSM_EVENT2(ErrorFoundEvent, onException, string, string)
FSM_EVENT0(EndErrorEvent, endErrorEvent)
FSM_EVENT0(StartEvent, slotMoveStartEvent)
FSM_EVENT0(StopEvent, slotStopEvent)

// States: first argument is a transition table element,
// the others are regular function hooks (will be called back)
FSM_STATE_EE(ErrorState, errorStateOnEntry, errorStateOnExit)
FSM_STATE_E(InitializationState, initializationStateOnEntry)
FSM_STATE_EE(StartedState, startedStateOnEntry, startedStateOnExit)
FSM_STATE_EE(StoppedState, stoppedStateOnEntry, stoppedStateOnExit)

// Transition Actions
// (ErrorFoundAction and EndErrorAction are used below, but their
// declarations are not shown on the slide)
FSM_ACTION0(StartAction, startAction)
FSM_ACTION0(StopAction, stopAction)

// AllOkState Machine
FSM_TABLE_BEGIN(AllOkStateTransitionTable)
//   SrcState      Event       TgtState      Action       Guard
Row< StartedState, StopEvent,  StoppedState, StopAction,  none >,
Row< StoppedState, StartEvent, StartedState, StartAction, none >
FSM_TABLE_END
FSM_STATE_MACHINE(AllOkState, AllOkStateTransitionTable, StoppedState, Self)

// StartStop Machine
FSM_TABLE_BEGIN(StartStopTransitionTable)
Row< InitializationState, none,            AllOkState, none,             none >,
Row< AllOkState,          ErrorFoundEvent, ErrorState, ErrorFoundAction, none >,
Row< ErrorState,          EndErrorEvent,   AllOkState, EndErrorAction,   none >
FSM_TABLE_END
KARABO_FSM_STATE_MACHINE(StartStopMachine, StartStopTransitionTable, InitializationState, Self)

FSM_CREATE_MACHINE(StartStopMachine, m_fsm);
FSM_SET_CONTEXT_TOP(this, m_fsm)
FSM_SET_CONTEXT_SUB(this, m_fsm, AllOkState)
FSM_START_MACHINE(m_fsm)
DETAIL: States - FSM implementation example in Python
# Events
FSM_EVENT2(self, 'ErrorFoundEvent', 'onException')
FSM_EVENT0(self, 'EndErrorEvent', 'slotEndError')
FSM_EVENT0(self, 'StartEvent', 'slotStart')
FSM_EVENT0(self, 'StopEvent', 'slotStop')

# States
FSM_STATE_EE('ErrorState', self.errorStateOnEntry, self.errorStateOnExit)
FSM_STATE_E('InitializationState', self.initializationStateOnEntry)
FSM_STATE_EE('StartedState', self.startedStateOnEntry, self.startedStateOnExit)
FSM_STATE_EE('StoppedState', self.stoppedStateOnEntry, self.stoppedStateOnExit)

# Transition Actions
FSM_ACTION0('StartAction', self.startAction)
FSM_ACTION0('StopAction', self.stopAction)

# AllOkState Machine
allOkStt = [
    # SrcState        Event         TgtState        Action         Guard
    ('StartedState', 'StopEvent',  'StoppedState', 'StopAction',  'none'),
    ('StoppedState', 'StartEvent', 'StartedState', 'StartAction', 'none')]
FSM_STATE_MACHINE('AllOkState', allOkStt, 'StoppedState')

# Top Machine
topStt = [
    ('InitializationState', 'none',            'AllOkState', 'none', 'none'),
    ('AllOkState',          'ErrorFoundEvent', 'ErrorState', 'none', 'none'),
    ('ErrorState',          'EndErrorEvent',   'AllOkState', 'none', 'none')]
FSM_STATE_MACHINE('StartStopMachine', topStt, 'InitializationState')
self.fsm = FSM_CREATE_MACHINE('StartStopMachine')
self.startStateMachine()
Data acquisition
Concept: FEM -> PC-Layer -> Online-Cache
PCL machines run highly tuned devices which write data to file (online cache) as fast as possible.
Online cache is (one possible) data source for Karabo’s workflow system.
Real time needs (where necessary)
Concept: Karabo itself does not provide real-time processes
Real-time processes (if needed) must be defined and executed in layers below Karabo.
Karabo devices will only start/stop/monitor real-time processes.
Examples: Beckhoff motor-coupling, Beckhoff feedback systems, etc.
Time synchronization (time stamps, cycle ids, etc.)
Concept: Any changed property will carry timing information as attribute(s)
Time information is assigned per property
Karabo’s timestamp consists of the following information:
  Seconds since Unix epoch, uint64
  Fractional seconds (up to attosecond resolution), uint64
  Train ID, uint64
Time information is assigned as early as possible (best: already on hardware) but at the latest in the software device
On event-driven update, the device ships the property key, the property value and associated time information as property attribute(s)
Real-time synchronization is outside the scope of Karabo
Correlation between control system (monitor) data and instrument data will be done using the archived central DB information (or information previously exported into HDF5 files)
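As an illustration of the three-part timestamp, the Python sketch below models it as an immutable record and bundles it with a property update. The class name, the attribute keys (`sec`, `frac`, `tid`) and the helper functions are assumptions for this example; the slide only fixes the three fields and their uint64 width.

```python
from dataclasses import dataclass

UINT64_MAX = 2**64 - 1
ATTOSEC_PER_SEC = 10**18


@dataclass(frozen=True)
class Timestamp:
    seconds: int           # seconds since Unix epoch
    frac_attoseconds: int  # fractional second, in attoseconds
    train_id: int

    def __post_init__(self):
        for name in ("seconds", "frac_attoseconds", "train_id"):
            if not 0 <= getattr(self, name) <= UINT64_MAX:
                raise ValueError(f"{name} does not fit into uint64")
        if self.frac_attoseconds >= ATTOSEC_PER_SEC:
            raise ValueError("fractional part must be below one second")

    @classmethod
    def from_epoch_ns(cls, epoch_ns, train_id):
        """Build a timestamp from integer nanoseconds since the epoch."""
        seconds, nanos = divmod(epoch_ns, 10**9)
        return cls(seconds, nanos * 10**9, train_id)


def property_update(key, value, stamp):
    """Bundle key, value and timing attributes, as shipped on an update."""
    return {"key": key, "value": value,
            "attributes": {"sec": stamp.seconds,
                           "frac": stamp.frac_attoseconds,
                           "tid": stamp.train_id}}


stamp = Timestamp.from_epoch_ns(1_384_000_000_123_456_789, train_id=7)
update = property_update("currentPosition", 12, stamp)
print(update)
```

Keeping seconds and fraction as separate integers avoids the precision loss of a floating-point epoch time.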
DETAIL: Time synchronization - Distributed Train ID clock
Concept: A dedicated machine with a time receiver board (h/w) distributes clocks on the Karabo level
Scenario 1: No time information from h/w
  Example: commercial cameras
  The timestamp is associated with the event-driven data in the Karabo device
  If the clock signal is too late, the next trainId is calculated (extrapolated) from the previous one and the interval between trainIds
  The interval is configurable on the Clock device and must be stable within a run. An error is flagged if a clock tick is lost.
Scenario 2: Time information is already provided by h/w
  The timestamp can be taken from the h/w or the device (configurable). The rest is the same as in scenario 1.
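The extrapolation in scenario 1 is simple arithmetic: the current trainId is the last received one plus the number of whole intervals elapsed since its tick. A Python sketch under assumed names and integer microsecond units (the slide does not specify units or function names):

```python
def extrapolate_train_id(last_train_id, last_tick_us, now_us, interval_us):
    """Return (train_id, missed_ticks) for the moment `now_us`.

    All times are integer microseconds, which avoids floating-point
    rounding at interval boundaries. missed_ticks > 0 means at least one
    clock signal did not arrive in time and should be flagged.
    """
    if interval_us <= 0:
        raise ValueError("interval must be positive")
    elapsed = now_us - last_tick_us
    if elapsed < 0:
        raise ValueError("now_us lies before the last clock tick")
    missed_ticks = elapsed // interval_us
    return last_train_id + missed_ticks, missed_ticks


# Last tick announced train 1000 at t = 0; trains arrive every 100 ms.
print(extrapolate_train_id(1000, 0, 50_000, 100_000))   # within the interval
print(extrapolate_train_id(1000, 0, 250_000, 100_000))  # clock signal late
```

This only works if the interval is stable within a run, which is why the slide requires exactly that.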
[Figure: Time receiver board and Clock Device; signals: 1. trainId, 2. epochTime, 3. interval; annotation: creates timestamp and associates it to trainId]
Central services (archive, alarm, name resolution, …)
Concept: Karabo’s central aspects will be reflected within a database
All properties of all devices will be archived into the DB in an event-driven way by default
Any property carries an “archive policy” attribute to reduce or switch off archiving
Karabo is user-centric (login at client start-up); the DB will provide all information needed to perform later access control on devices
Any user-specific GUI settings will be saved to the DB
The DB gives access to all (user-centric) pre-configurations of future device instances
Name resolution is handled by the message broker (filtering on broker, not client)
Besides the broker, other central services are technically not needed
GUI clients do not talk to the broker directly but go through a GUI server
Distributed alarm conditions are planned to be handled by Python devices that can check any (distributed) condition and can be instantiated (armed) when needed
Central services - Name resolution/access
Concept: The only central service needed is the broker, others are optional
Start-up issues
  A fixed ID can (optionally) be provided prior to start-up (via command line or file)
  If no instance ID is provided, the ID is auto-generated locally
    Servers: hostname_Server_pid
    Devices: hostname-pid_classId_counter
  Any instance ID is validated (by request-response trial) prior to start-up
Running system issues
  The engine for all inter-device communication is the DeviceClient class
  The DeviceClient abstracts the SignalSlotable layer into a set of functions: instantiate, kill, set, execute, get, monitor etc.
  The DeviceClient can act without a central entity and be started anytime
  The DeviceClient can act as master itself and boost performance of other DeviceClients
  Master DeviceClients can come and go, everything is handled transparently
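The two auto-generated ID formats can be written down in a few lines of Python. The function names and the `ids_in_use` set are hypothetical; in particular, the set merely stands in for the request-response trial, which in the real system asks the broker whether an instance with that ID already answers.

```python
import os
import socket


def server_instance_id(hostname=None, pid=None):
    """Auto-generated server ID: hostname_Server_pid."""
    hostname = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    return f"{hostname}_Server_{pid}"


def device_instance_id(class_id, counter, hostname=None, pid=None):
    """Auto-generated device ID: hostname-pid_classId_counter."""
    hostname = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    return f"{hostname}-{pid}_{class_id}_{counter}"


def claim_instance_id(candidate, ids_in_use):
    """Stand-in for the request-response trial before start-up."""
    if candidate in ids_in_use:
        raise ValueError(f"instance ID already in use: {candidate}")
    ids_in_use.add(candidate)
    return candidate


ids_in_use = set()
print(claim_instance_id(server_instance_id("exflhost", 4242), ids_in_use))
print(claim_instance_id(
    device_instance_id("MotorDevice", 1, "exflhost", 4242), ids_in_use))
```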
Central services – Data archiving
Concept: A central data logger device collects event-driven data and persists it
The data logger is a device which listens to all other devices
The event-driven information is cached in the form of a Hash object for some time and then persisted to either file or DB or both
Information is stored on a per-parameter basis
Next to the parameter values, the currently valid schema is saved as well
[Figure: Device-Server instances hosting device instances, a Master Device-Server instance, the Logger, GUI-Srv with GUI-Clients, and the Central DB, all connected via the Message Broker]
DETAIL: Access levels
We will initially have five access levels (enum) with intrinsic ordering
  ADMIN = 4
  EXPERT = 3
  OPERATOR = 2
  USER = 1
  OBSERVER = 0
Any Device can restrict access globally or on a per-parameter basis
  Global restriction is enforced through the “visibility” property (base class)
    A requestor can see/use the device only if their access level is the same or higher
    The “visibility” property is part of the topology info (seen immediately by clients)
  Parameter restriction is enforced through the “requiredAccessLevel” schema-attribute
    Parameter restriction is typically set programmatically but may be re-configured at initialization time (or even runtime?)
  The “visibility” property might be re-configured if the requestor’s access level is higher than the associated “requiredAccessLevel” (should typically be ADMIN)
  The default access level for settable properties and commands is USER
  The default access level for read-only properties is OBSERVER
  The default value for the visibility is OBSERVER
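Because the levels are ordered, both restrictions reduce to integer comparisons. A small Python sketch, with the enum values taken from the slide and the function name `may_use` invented for illustration:

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    OBSERVER = 0
    USER = 1
    OPERATOR = 2
    EXPERT = 3
    ADMIN = 4


def may_use(requestor, visibility=AccessLevel.OBSERVER,
            required=AccessLevel.USER):
    """Global check against the device's visibility, then the per-parameter
    check against requiredAccessLevel. The defaults mirror the slide:
    visibility OBSERVER, settable properties and commands USER."""
    return requestor >= visibility and requestor >= required


print(may_use(AccessLevel.USER))                                 # settable property
print(may_use(AccessLevel.OBSERVER, required=AccessLevel.OBSERVER))  # read-only
print(may_use(AccessLevel.OPERATOR, visibility=AccessLevel.EXPERT))  # hidden device
```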
DETAIL: Access levels
A role is defined in the DB and consists of a default access level and a device-instance specific access list (overwriting the default level), which can be empty.
  SPB_Operator
    defaultAccessLevel => USER
    accessList
      SPB_* => OPERATOR
      Undulator_GapMover_0 => OPERATOR
  Global_Observer
    defaultAccessLevel => OBSERVER
  Global_Expert
    defaultAccessLevel => EXPERT
After authentication the DB computes the user-specific access levels considering current time, current location and associated role. It then ships a default access level and an access level list back to the user.
If the authentication service (or DB) is not available, Karabo falls back to a compiled-in default access level (in-house: OBSERVER, shipped versions: ADMIN)
For an ADMIN user it might be possible to temporarily (per session) change the access list of another user.
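Resolving a role for a concrete device instance could look like the Python sketch below. The role data is copied from the slide; the dictionary layout, the function name and the matching rule (exact entries first, then shell-style wildcards via `fnmatch`) are assumptions, since the slide does not say how `SPB_*` is evaluated.

```python
from fnmatch import fnmatchcase

ROLES = {
    "SPB_Operator": {
        "defaultAccessLevel": "USER",
        "accessList": {"SPB_*": "OPERATOR",
                       "Undulator_GapMover_0": "OPERATOR"},
    },
    "Global_Observer": {"defaultAccessLevel": "OBSERVER", "accessList": {}},
    "Global_Expert": {"defaultAccessLevel": "EXPERT", "accessList": {}},
}


def access_level(role_name, instance_id, roles=ROLES):
    """Return the access level a role grants on one device instance."""
    role = roles[role_name]
    access_list = role["accessList"]
    if instance_id in access_list:  # an exact entry wins
        return access_list[instance_id]
    for pattern, level in access_list.items():
        if fnmatchcase(instance_id, pattern):
            return level
    return role["defaultAccessLevel"]


print(access_level("SPB_Operator", "SPB_Camera_1"))
print(access_level("SPB_Operator", "FXE_Motor_3"))
```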
DETAIL: Security
Broker-Message
  Header: […] __uid=42 __accessLevel=“admin”
  Body: […]
Device
  Locking: if is locked: if __uid == owner then ok
  Access control: if __accessLevel >= visibility: if __accessLevel >= param.accessLevel then ok
GUI or CLI (via GUI-Srv) sends to Central DB: username, password, provider, ownIP*, brokerHost*, brokerPort*, brokerTopic*
Central DB
  1. Authorizes
  2. Computes context based access levels
Central DB returns: userId, sessionToken, defaultAccessLevel, accessList
Statistics (control system itself, operation, …)
Concept: Statistics will be collected by regular devices
The OpenMQ implementation provides a wealth of statistics (e.g. messages in system, average flow, number of consumers/producers, broker memory used…)
Have a (broker-)statistics device that does system calls to retrieve information
Similar idea for other statistical data
Logging (active, passive, central, local)
Concept: Categorized into the following classes
Active Logging: Additional code (inserted by the developer) accompanying the production/business code, which is intended to increase the verbosity of what is currently happening.
  Code Tracing: Macro based, no overhead if disabled, for low-level purposes
  Code Logging: Conceptual analog to Log4j; network appender; remote and runtime priority (re-)configuration
Passive Logging: Recording of activities in the distributed event-driven system. No extra coding is required from developers; passive logging transparently records system-relevant events.
  Broker-message logging: Low-level debugging purpose, start/stop, not active during production
  Transactional logging: Archival of the full distributed state
Processing workflows (parallelism, pipeline execution, provenance)
Concept: Devices as modules of a scientific workflow system
Configurable generic input/output channels on devices
One channel is specific for one data structure (e.g. Hash, Image, File, etc.)
New data structures can be “registered” and are immediately usable
Input channel configuration: copy of connected output’s data or share the data with other input channels, minimum number of data needed
ComputeFsm as base class, developers just need to code the compute method
IO system is decoupled from processing system (process whilst transferring data)
Automatic (API transparent) data transfer optimization (pointer if local, TCP if remote)
Broker-based communication for workflow coordination and meta-data sharing
GUI integration to set up workflows graphically (drag-and-drop featured)
Workflows can be stored and shared (following the general rules of data privacy and security), executed, paused and stepped
Parallel execution
DETAIL: Processing workflows - Parallelism and load-balancing by design
[Figure labels: Memory / CPU-threads; TCP / Distributed processing]
Devices within the same device-server:
  Data will be transferred by handing over pointers to corresponding memory locations
  Multiple instances connected to one output channel will run in parallel using CPU threads
Devices in different device-servers:
  Data will be transferred via TCP
  Multiple instances connected to one output channel will perform distributed computing
Output channel technically is a TCP server, inputs are clients
Data transfer model follows an event-driven poll architecture, which leads to load-balancing and maximum per-module performance even on heterogeneous h/w
Configurable output channel behavior in case no input is currently available: throw, queue, wait, drop
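The load-balancing effect of the "inputs ask for data when ready" model can be shown with a single-threaded Python sketch. The `OutputChannel` class and its methods are invented for this illustration and there is no networking. Of the four policies only throw, queue and drop are modelled; "wait" would need a blocking writer and is left out.

```python
from collections import deque


class OutputChannel:
    """Toy output channel: inputs announce readiness, data goes to whoever
    asked first. Faster inputs ask more often and so receive more data."""

    def __init__(self, policy="queue"):
        if policy not in ("throw", "queue", "drop"):
            raise ValueError(f"unsupported policy: {policy}")
        self.policy = policy
        self._ready = deque()   # inputs waiting for data
        self._queued = deque()  # data waiting for an input
        self.delivered = []     # (input name, data) in delivery order

    def request(self, input_name):
        """An input signals that it can take the next data item."""
        if self._queued:
            self.delivered.append((input_name, self._queued.popleft()))
        else:
            self._ready.append(input_name)

    def write(self, data):
        """Return True if the data was delivered or queued."""
        if self._ready:
            self.delivered.append((self._ready.popleft(), data))
            return True
        if self.policy == "queue":
            self._queued.append(data)
            return True
        if self.policy == "drop":
            return False
        raise RuntimeError("no input available")  # policy "throw"


channel = OutputChannel("queue")
channel.request("fast")
channel.request("slow")
channel.write(0)          # goes to "fast"
channel.write(1)          # goes to "slow"
channel.request("fast")   # "fast" has finished already
channel.write(2)
channel.request("fast")
channel.write(3)
channel.write(4)          # nobody ready: queued
channel.request("slow")   # "slow" finally asks again and gets the queued item
print(channel.delivered)
```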
DETAIL: Processing workflows - GPU enabled processing
Concept: GPU parallelization will happen within a compute execution
The data structures (e.g. image) are prepared for GPU parallelization
Karabo will detect at runtime whether the given hardware is capable of GPU computing; if not, it falls back to the corresponding CPU algorithm
Differences in runtime are balanced by the workflow system
[Figure: IO whilst computing; pixel-parallel processing (one GPU thread per pixel); notification about new data possible to obtain; GPU / CPU]
Clients / User interfaces (API, languages, macro writing, CLI, GUI)
Concept: Two UIs – graphical (GUI) and scriptable command line (CLI)
GUI
  Have one multi-purpose GUI system satisfying all needs
  See following slides for details
Non-GUI
  We distinguish an API for programmatic setup of control sequences (others call those macros) from an API which allows interactive, command-line-based control (IPython based)
  The programmatic API exists for C++ and Python and features:
    Querying of distributed system topology (hosts, device-servers, devices, their properties/commands, etc.): getServers, getDevices, getClasses
    instantiate, kill, set, execute (in “wait” or “noWait” fashion), get, monitorProperty, monitorDevice
  Both APIs are state and access-role aware; caching mechanisms provide the proper Schema and a synchronous (poll-feel) API, although always event-driven in the back-end
  The interactive API integrates auto-completion and improved interactive functionality suited to IPython
GUI: What do we have to deal with?
Client-Server (network protocol, optimizations)
User management (login/logout, load/save settings, access role support)
Layout (panels, full screen, docking/undocking)
Navigation (devices, configurations, data, …)
Configuration (initialization vs. runtime, loading/saving, …)
Customization (widget galleries, custom GUI builder, composition, …)
Notification (about alarms, finished pipelines, …)
Log Inspection (filtering, configuration of log-levels, …)
Embedded scripting (IPython, macro recording/playing)
Online documentation (embedded wiki, bug-tracking, …)
Kerstin Weger (WP76)
Client-Server (network protocol, optimizations)
[Figure: GUI-Client connected to GUI-Srv, which connects to the Message Broker, Master and Central DB. Client: “I only see device ‘A’”; server: onChange information only related to “A”]
Concept: One server, many clients, TCP
Server knows what each client user sees (on a device level) and optimizes traffic accordingly
Client-Server protocol is TCP, messages are header/body style using Hash serialization (default binary protocol)
Client-side socket will be threaded to decouple it from the main event loop
On client start the server provides the current distributed state utilizing the DB; later, clients are updated through the broker
Image data is pre-processed on the server side and brought into QImage format before sending
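The two ideas on this slide, header/body messages and per-client traffic filtering, can be sketched as follows. Everything here is an assumption for illustration: JSON stands in for the Hash serialization, the 4-byte length prefix is an invented framing, and GuiServerSketch with its methods is hypothetical rather than the actual GUI server.

```python
import json
import struct


def pack_message(header, body):
    """Frame a message as two length-prefixed parts: header, then body.

    JSON stands in for Karabo's Hash serialization here (an assumption).
    """
    parts = []
    for part in (header, body):
        raw = json.dumps(part).encode("utf-8")
        parts.append(struct.pack("<I", len(raw)) + raw)
    return b"".join(parts)


def unpack_message(data):
    """Inverse of pack_message; returns (header, body)."""
    decoded, offset = [], 0
    for _ in range(2):
        (size,) = struct.unpack_from("<I", data, offset)
        offset += 4
        decoded.append(json.loads(data[offset:offset + size].decode("utf-8")))
        offset += size
    return decoded[0], decoded[1]


class GuiServerSketch:
    """Hypothetical server-side bookkeeping of what each client displays."""

    def __init__(self):
        self._visible = {}   # clientId -> set of deviceIds shown in that client
        self.outbox = {}     # clientId -> list of framed messages "sent"

    def connect(self, client_id):
        self._visible[client_id] = set()
        self.outbox[client_id] = []

    def show_device(self, client_id, device_id):
        self._visible[client_id].add(device_id)

    def on_device_change(self, device_id, key, value):
        """Forward a property change only to clients that show the device."""
        message = pack_message({"type": "onChange", "deviceId": device_id},
                               {key: value})
        for client_id, devices in self._visible.items():
            if device_id in devices:
                self.outbox[client_id].append(message)


server = GuiServerSketch()
server.connect("clientA")
server.connect("clientB")
server.show_device("clientA", "A")

server.on_device_change("A", "temperature", 21.5)
server.on_device_change("B", "temperature", 30.0)

header, body = unpack_message(server.outbox["clientA"][0])
```

Here clientA receives only the change of device "A", and clientB, which displays no device, receives nothing.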
User management (login/logout, load/save settings, access role support)
Concept: User centralized, login mandatory
Login necessary to connect to system
Access role will be computed (context based)
User specific settings will be loaded from DB
View and control is adapted to access role
User or role specific configuration and wizards are available
[Figure: login exchange with the Central DB, which (1) authorizes and (2) computes the context based access role. Labels: username, password; userId, accessRole, session.]
Layout (panels, full screen, docking/undocking)
Concept: Six dock-able and slide-able (optionally tabbed) main panels
Panels are organized by functionality
  Navigation
  Custom composition area (sub-GUI building)
  Configuration (non-tabbed, changes view based on selection elsewhere)
  Documentation (linked and updated with current configuration view)
  Logging
  Notifications
Panels and their tabs can be undocked (the window then belongs to the OS’s window manager) and made full-screen (distribution across several monitors possible)
Custom composition area (central panel) will be optimized for phones and tablets
GUI behaves natively under Mac OS X, Linux and Windows
DETAIL: Layout - Default panel arrangement, docking and sliding
[Figure: default panel arrangement showing Navigation, Custom composition area, Configuration, Notifications, Logging / Scripting console and Documentation]
Navigation (devices, configurations, data, …)
Concept: Navigate device-servers, devices, configurations, data(-files), etc.
Different views (tabs) on data
  Hierarchical distributed system view
  Device ownership centric (view compositions)
  Available configurations
  Hierarchical file view (e.g. HDF5)
Automatic (by access level) filtering of items
Auto select navigation item if context is selected somewhere else in GUI
Configuration (initialization vs. runtime, loading/saving, …)
Concept: Auto-generated default widgets for configuring classes and instances
Widgets are generated from device information (.xsd format)
2-column layout for class configuration (label, initialization-value)
3-column layout (label, value-on-device, edit-value) for instance configuration
Allows reading/writing properties (all data-types)
Allows executing commands (as buttons)
Is aware of the device’s FSM, enables/disables widgets accordingly
Is aware of the access level, enables/disables widgets accordingly
Single, selection and all apply capability
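How a widget might be enabled or disabled from the device's FSM state and the user's access level can be sketched as below. The role names in AccessLevel, the ElementInfo fields and the example properties and commands are all assumptions for this illustration; the slides do not specify how this metadata is stored.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AccessLevel(IntEnum):
    # Hypothetical role ordering; the slides do not name the roles.
    OBSERVER = 0
    OPERATOR = 1
    EXPERT = 2


@dataclass
class ElementInfo:
    """Subset of per-property/command metadata assumed for this sketch."""
    name: str
    required_level: AccessLevel
    allowed_states: set = field(default_factory=set)  # empty = any state


def is_enabled(element, device_state, user_level):
    """A widget is enabled only if both the FSM state and the role allow it."""
    state_ok = not element.allowed_states or device_state in element.allowed_states
    return state_ok and user_level >= element.required_level


elements = [
    ElementInfo("targetPosition", AccessLevel.OPERATOR, {"Idle"}),
    ElementInfo("start", AccessLevel.OPERATOR, {"Idle"}),
    ElementInfo("stop", AccessLevel.OPERATOR, {"Moving"}),
    ElementInfo("calibrationOffset", AccessLevel.EXPERT),
]


def enabled_names(device_state, user_level):
    return [e.name for e in elements if is_enabled(e, device_state, user_level)]


idle_operator = enabled_names("Idle", AccessLevel.OPERATOR)
moving_expert = enabled_names("Moving", AccessLevel.EXPERT)
idle_observer = enabled_names("Idle", AccessLevel.OBSERVER)
```

With this rule an operator looking at an idle device can edit targetPosition and press start, while an observer sees every widget disabled.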
Customization (widget galleries, custom GUI builder, composition, …)
Concept: Combination of PowerPoint-like editor and online properties/commands with changeable widget types
Tabbed, static panel (does not change on navigation)
Two modes: Pre-configuration (classes) and runtime configuration (instances)
Visual composition of properties/commands of any devices
Visual composition of devices (workflow layouting)
Data-type aware widget factory for properties/commands (edit/display)
PowerPoint-like tools for drawing, arranging, grouping, selecting, zooming of text, shapes, pictures, etc.
Capability to save/load custom panels, open several simultaneously
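A data-type aware widget factory with changeable widget types could be organized as a registry keyed by value type and mode (edit or display). The sketch below is an assumption about one possible structure, not the GUI's actual implementation; the class WidgetFactory and the widget names are hypothetical, and plain strings stand in for real Qt widget classes.

```python
class WidgetFactory:
    """Hypothetical registry mapping (value type, mode) to widget names."""

    def __init__(self):
        self._registry = {}  # (type, mode) -> widget names, default first

    def register(self, value_type, mode, widget_name):
        self._registry.setdefault((value_type, mode), []).append(widget_name)

    def choices(self, value, mode):
        """All widgets that can show this value in the given mode."""
        # The exact type is used, so a bool is not treated as an int.
        return list(self._registry.get((type(value), mode), []))

    def default(self, value, mode):
        options = self.choices(value, mode)
        if not options:
            raise LookupError("no %s widget for %s" % (mode, type(value).__name__))
        return options[0]


factory = WidgetFactory()
factory.register(float, "display", "Label")
factory.register(float, "display", "TrendLine")
factory.register(float, "edit", "SpinBox")
factory.register(bool, "edit", "CheckBox")
factory.register(str, "edit", "LineEdit")

default_display = factory.default(3.2, "display")
alternatives = factory.choices(3.2, "display")
bool_editor = factory.default(True, "edit")
```

The first registered widget acts as the default; the remaining entries are the alternatives a user could switch to for the same property.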
DETAIL: Customization - Property/Command composition
[Figure: drag & drop composition showing a display widget (Trend-Line), a display widget and an editable widget]
DETAIL: Customization - Property/Command composition
[Figure: drag & drop composition showing a display widget (Image View) and a display widget (Histogram)]
DETAIL: Customization - Device (workflow) composition
[Figure: workflow nodes (devices) placed by drag & drop and linked by drawing connections]
DETAIL: Customization - Expert panels - Vacuum
Change between “Design/Control” mode
Open/Save panel view
Insert text, line, rectangle, …
Cut, copy, paste, remove item
Rotate, scale item
Group items
Bring to front/back
Notification (about alarms, finished runs, …)
Concept: Single place for all system relevant notifications, will link out to more detailed information
Can be of arbitrary type, e.g.:
  Finished experiment run/scan
  Finished analysis job
  Occurrences of errors, alarms
  Update notifications, etc.
Intended to be conceptually similar to present-day smartphone notification bars
Visibility and/or acknowledgment of notifications may be user and/or access role specific
May implement some configurable forwarding system (SMS, email, etc.)
Log Inspection (filtering, configuration of log-levels, …)
Concept: Device’s network appenders provide active logging information which can be inspected/filtered/exported
Tabular view
Filtering by: full-text, date/time, message type, description
Export logging data to file
Logging events are decoupled from main event loop (threading)
Uses Qt’s model/view with SQLite DB as model (MVC design)
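The filtering criteria listed above map naturally onto SQL queries when the log is held in SQLite. The following sketch uses Python's standard sqlite3 module; the table layout, the column names and the sample rows are assumptions invented for this example and do not describe the GUI's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (timestamp TEXT, type TEXT, device TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        ("2013-11-04T10:00:00", "INFO", "motor1", "Move finished"),
        ("2013-11-04T10:05:00", "ERROR", "pump2", "Pressure out of range"),
        ("2013-11-05T09:00:00", "WARN", "motor1", "Limit switch reached"),
        ("2013-11-05T09:30:00", "ERROR", "motor1", "Move aborted"),
    ],
)


def filter_log(conn, text=None, since=None, until=None, message_type=None):
    """Combine the optional filters with AND; values are bound as parameters."""
    clauses, params = [], []
    if text is not None:
        clauses.append("(description LIKE ? OR device LIKE ?)")
        params += ["%" + text + "%"] * 2
    if since is not None:
        clauses.append("timestamp >= ?")
        params.append(since)
    if until is not None:
        clauses.append("timestamp <= ?")
        params.append(until)
    if message_type is not None:
        clauses.append("type = ?")
        params.append(message_type)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    sql = ("SELECT timestamp, type, device, description FROM log"
           + where + " ORDER BY timestamp")
    return conn.execute(sql, params).fetchall()


all_errors = filter_log(conn, message_type="ERROR")
recent_motor_errors = filter_log(
    conn, text="motor", since="2013-11-05T00:00:00", message_type="ERROR"
)
```

ISO 8601 timestamps stored as text sort chronologically, so the date/time filter can use plain string comparison.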
Embedded scripting (IPython, macro recording/playing)
Concept: Have the best of two worlds – embed Karabo-CLI into Karabo-GUI
Give users the possibility to work with both interfaces seamlessly
Integrate IPython console into Qt widget (as karabo-CLI is IPython based)
Display for any GUI event the corresponding script commands
Have macro recording/playing possibilities
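Displaying the script command for a GUI event and recording such commands into a macro can be sketched as below. The event dictionaries, the MacroRecorder class and the play helper are hypothetical; only the command names set and execute are borrowed from the programmatic API slide, and their argument order here is an assumption.

```python
class MacroRecorder:
    """Hypothetical recorder translating GUI events into script lines."""

    def __init__(self):
        self.recording = False
        self.lines = []

    def start(self):
        self.recording = True
        self.lines = []

    def stop(self):
        self.recording = False
        return "\n".join(self.lines)

    def to_command(self, event):
        """Return the script command equivalent to one GUI event."""
        kind = event["kind"]
        if kind == "set":
            return "set(%r, %r, %r)" % (event["device"], event["key"], event["value"])
        if kind == "execute":
            return "execute(%r, %r)" % (event["device"], event["command"])
        raise ValueError("unsupported GUI event: " + kind)

    def on_gui_event(self, event):
        command = self.to_command(event)   # always available for display
        if self.recording:
            self.lines.append(command)
        return command


def play(macro, namespace):
    """Replay a recorded macro against the callables given in namespace.

    exec is used only because the text was generated by the recorder itself.
    """
    exec(macro, dict(namespace))


calls = []
recorder = MacroRecorder()
recorder.start()
recorder.on_gui_event({"kind": "set", "device": "motor1", "key": "speed", "value": 2.5})
recorder.on_gui_event({"kind": "execute", "device": "motor1", "command": "start"})
macro = recorder.stop()

play(macro, {
    "set": lambda device, key, value: calls.append(("set", device, key, value)),
    "execute": lambda device, command: calls.append(("execute", device, command)),
})
```

Because the recorded macro is ordinary script text, it could equally be pasted into the embedded console and edited there.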
Online documentation (embedded wiki, bug-tracking, …)
Concept: Make the GUI a rich client with embedded internet access. Use it for web based device documentation, bug tracking, feature requests, etc.
Any device class will have an individual (standardized) wiki page. Pages are automatically loaded (within the documentation panel) as soon as any property/command/device is selected elsewhere in GUI (identical to configuration panel behavior). Depending on access role, pages are immediately readable/editable.
Device wiki pages are also readable/editable via European XFEL’s document management system (Alfresco) using standard browsers
For each property/command the coded attributes (e.g. description, units, min/max values, etc.) are shown.
European XFEL’s bug tracking system will be integrated
Software management (coding, building, packaging, deployment, versioning, …)
Concept: Spiced up NetBeans-based build system, software-bundle approach
Clear splitting of Karabo-Framework (distributed system) from Karabo-Packages (plugins, extensions)
Karabo-Framework (SVN: karabo/karaboFramework/trunk)
  Coding done using NetBeans (for C++ and Python), Makefile based
  Contains: karabo-library (libkarabo.so), karabo-deviceserver, karabo-brokermessagelogger, karabo-gui, and karabo-cli
  Karabo-library already contains Python bindings (i.e. can be imported into Python)
  Makefile target “package” creates a self-extracting shell script which can be installed on a blank (supported) operating system and is immediately functional
  Embedded unit-testing, graphically integrated into NetBeans (C++ and Python)
Karabo-Packages (SVN: karabo/karaboPackages/category/packageName/trunk)
  After installation of Karabo-Framework, packages can be built
  SVN checkout of a package to any location and immediate make possible
  Everything needed to start a full distributed Karabo instance available in package
  A tool for package development is provided (templates, auto SVN integration, etc.)
DETAIL: Software management - The four audiences and their requirements
Framework Developer
  SVN interaction, versioning, releases
  Code development using NetBeans/Visual Studio
  Addition of tests, easy addition of external dependencies
  Tools for packaging the software into either binary + header or source bundles
  Allow for being framework developer and package developer (see below) in one person at the same time
Package Developer
  Flexible access to the Karabo framework ($HOME/.karabo encodes default location)
  Allow "one package - one software" project mode (each device project has its own versioning cycle, individual NetBeans project)
  Standards for in-house development or XFEL developers need to be fulfilled: use parametrized templates provided, development under NetBeans, use SVN, final code review
  Possibility to add further external dependencies to the Karabo framework (see above)
System Integrator/Tester
  Simple installation of Karabo framework and selected Karabo packages as binaries
  Start broker, master, i.e. a full distributed system
  Flexible setup of device-servers + plugins, allow hot-fixes, sanity checks
XFEL-User/Operator
  Easy installation of pre-configured (binary framework + assortment of packages) Karabo systems
  Run system (GUI, CLI)
DETAIL: Software management - Unit-testing
[Figure: unit-testing screenshots for Python and C++]
DETAIL: Software management - Continuous integration
Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. [Wikipedia]
Required Features:
  Support for different build systems and different OS
  Automated builds – nightly builds
  Continuous builds – on demand, triggered by SVN commit
  Build matrix – different OS, compiler, compiler options
  Web interface – configuration, results
  Email notification
  Build output logging – easy access to output of build errors
  Reporting all changes from SVN since last successful build – easy trace of guilty developer
  Plugin for any virtualization product (VirtualBox, VMWare, etc.)
  NetBeans plugin for build triggering
  Easy uploading of build results (installation packages) to web repository
CI systems on the market: Hudson, CruiseControl, buildbot, TeamCity, Jenkins …
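The build matrix requirement amounts to enumerating every combination of operating system, compiler and compiler options, minus combinations that are not supported. A minimal sketch follows; the axis values and the exclusion rule are placeholders chosen for illustration, not the project's actual build targets.

```python
from itertools import product

# Hypothetical axis values chosen for illustration only.
axes = {
    "os": ["Ubuntu", "Scientific Linux", "MacOSX"],
    "compiler": ["gcc", "clang"],
    "options": ["Debug", "Release"],
}

# Combinations that should not be built (again an assumption).
excluded = [{"os": "Scientific Linux", "compiler": "clang"}]


def build_matrix(axes, excluded=()):
    """Return one job dict per allowed combination of axis values."""
    names = list(axes)
    jobs = []
    for values in product(*(axes[name] for name in names)):
        job = dict(zip(names, values))
        # Skip the job if it matches every key of any exclusion rule.
        if any(all(job[k] == v for k, v in rule.items()) for rule in excluded):
            continue
        jobs.append(job)
    return jobs


jobs = build_matrix(axes, excluded)
```

Each resulting job could then be handed to whichever CI system is chosen, typically one build per job and per SVN commit or nightly trigger.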
Conclusions
XFEL.EU software will be designed to allow simple integration of existing algorithms/packages
The provided services focus on solving general problems like data-flow, configuration, project-tracking, logging, parallelization, visualization, provenance
The ultimate goal is to provide a homogeneous software landscape to allow fast and simple crosstalk between all computing-enabled categories (Control, DAQ, Data Management and Scientific Computing)
The distributed system is device-centric (not attribute-centric); devices inherently express functionality for communication, configuration and flow control