Building a Large Location Table to Find Replicas of Physics Objects Koen Holtman Heinz Stockinger...
-
Upload
leslie-pope -
Category
Documents
-
view
214 -
download
0
Transcript of Building a Large Location Table to Find Replicas of Physics Objects Koen Holtman Heinz Stockinger...
Building a Large Location Table to
Find Replicas of Physics Objects
Koen HoltmanHeinz Stockinger
CERN/CMS
CHEP ’2000
Feb 7-11, 2000
Replication modelsDifferent replication models:• File based:
– FTP run files, Objectivity DB files, ...– Import/attach them to local data store
– (Objectivity DRO/FTO)
• Linkable Objectivity AMS server tricks:– Transparent local / remote file access– Page-level replication??
• Object based (this talk):– Can replicate every single object / set of objects– Transparent local / remote object access– Requires large Object Location Table (OLT)
• In Objectivity, go from logical OID to physical OID
The problem• >1010 physics objects (event related
objects), all potentially replicated
• Lookup: ‘Find the best replica of the physics object with (logical) OID xyz’– Automatic (grid-like)– Best replica is not always nearest– Lookup operation needs to be fast, scalable
Regional
CentreRegional
Centre
Regional
Centre
Goal of this talk• Goal: show that an OLT can be built
• By showing one viable implementation– This implementation is a spin-off of the work
on automatic reclustering
• >1010 scalability problem can be solved by– Exploiting specific properties of physics
analysis– Limiting freedom of lookup operations
• No Objectivity-style ‘on demand’ navigation
Divide and conquer• Partition events into chunks, 1 subjob
per chunk
• Definition oflogical OID:
(chunk ID,event sequence number,type/version ID)
Job and storage model• Job model: job consists of
– set of event IDs (chunk ID, event seq nr)– 1 or more
type/version IDs– physics code
• code is run usingsubjobs, these get handed iterators
• Storage model:– unit of storage is collection– subset of horizontal bar above
•Structure…
•Insert and delete collections...•Next slide: 1 subjob, 1 type...
1 subjob, 1 type• Say subjob at CERN
wants to read objects (1,3,1),(1,4,1),(1,7,1), ...– At start of subjob,
global schedule is computed, say:try collection 1 first, thentry collection 3.
• Subjob iteration is in increasing sequencenumber, step throughindices alongside withiteration
Computing the schedule• Optimal schedule computed at start of
subjob– Optimiser finds schedule
with minimal cost– Cost(‘read 3,4,7,…’,[1,3]) =
Ccoll(‘read 3,4,… from 1’) +Ccoll(‘read 7,… from 3) = ….
– Value of Ccoll( ) depends on location of collection, network load, access method, etc.
– Calculations use collection contents indices only • Finding optimal schedule is an NP-complete problem,
exponential time complexity, but in practice....
Runtime of scheduler•Time to compute optimal schedule
(mic
ro s
ec)
•Can also timeout, gives good schedule
1 sec
Conclusions• Large object location tables can be built!
– They are an important component in implementing the grid-like aims of the CMS CTP
– Implementation shown is spinoff from work on reclustering
• Scalability by using specific properties of physics analysis, and by limiting the freedom of lookup operations– No Objectivity-style ‘on demand’ navigation, all
objects needs to be requested from the start
• Cost functions can model any access method– Optimisation is both generic and cheap
• Long paper will appear in future