Metadata: Plans and Progress of the Metadata Working Group

Rick St. Denis

Glasgow University

May 13, 2004

Cast
• Alessandra Forti, Manchester University, BaBar
• Carmine Cioffi, Oxford University, LHCb
• Gavin McCance, CERN, EGEE
• Solveig Albrand, Grenoble, ATLAS
• Stefan Stonjek, Fermilab/Oxford, CDF
• Tim Barrass, Bristol, CMS
• Wyatt Merritt, Fermilab, CDF/D0
• Adam Lyon, Fermilab, CDF/D0
• Morag Burgon-Lyon, Glasgow, CDF
• Rick St. Denis, Glasgow, CDF
• Julie Trumbo, Fermilab, CDF
• Paul Millar, Glasgow, General
• Steven Hanlon, Glasgow, Metadata
• Caitriana Nicholson

Format
• Fired up by a workshop on April 26-28, 2004 in Glasgow
• Goal: Answer the question “What is Metadata” in our document
• Method: Provocateurs
• Topics list: augmented at workshop
• Got acquainted, divide and study topics, presentations together, course of action
• Output of workshop: Revamped deliverables
• Output of Group: Package services for release in SourceForge

Rough Agenda
• Mon:
  – 2-3: 5 min on who we are
  – 3-3:30: Decide on topics
  – 3:30-5:00: Get to Stepps, Hotel
  – 5:00: Meet in 2 West Ave
• Tues: Provocateur sessions and research
• Wed: Final Document with deliverables, Plans for future: MO, CHEP abstract

Topics
1. Metadata Architecture and components
   – Replica Catalogs, file catalogs, physics catalogs
2. Use Cases
3. Query Languages
4. Implementations and Performance. Technology Considerations, Performance reqs
5. Service architectures, Deployment Architectures
6. Database implementations: text/mysql/postgres/oracle/enth

Informing ourselves
• SAM Services (Julie)
• Arda/OGSA-DAI (Gav will outline)
• AMI (Solveig)
• Pool and Graphical Visualization (Carmine)
• Spitfire (Paul)
• SamTV (Adam)
• PNPA-GGF (Rick)
• Project Management (Tony)

• Use Cases: Steve, Rick, Solveig, Tony, Wyatt, Adam
• Services: Gavin, Wyatt, Rick, Julie
• Deployment Architecture: Solveig, Julie, Gavin, Paul
• Monitoring: Adam, Caitriana, Carmine
• Query Languages and Interfaces: Carmine, Rick, Wyatt
• Tools: Julie, Rick, Gavin, Solveig, Adam
• Deliverables: Tony, Gavin

Next Steps

• Design for Keyword-Value

• Schema evolution and self-describing schema

• Use the previous two items to automate the transition from keyword-value to a query-efficient schema, and to determine which queries need to be satisfied.

• Unique dataset tool
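
The transition from keyword-value storage to a query-efficient schema can be illustrated with a minimal sketch. It is not the working group's design: the table names (`kv`, `dataset_meta`), the function names, and the choice of SQLite are assumptions made only for illustration. Arbitrary keys are first collected in a generic table, and keys that turn out to matter for queries are then promoted to real, indexed columns.

```python
import sqlite3

def create_kv_store(conn):
    """Generic keyword-value table: any attribute can be attached to a dataset."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (dataset TEXT, key TEXT, value TEXT, "
        "PRIMARY KEY (dataset, key))"
    )

def put(conn, dataset, key, value):
    """Store one keyword-value pair (values are kept as text in this sketch)."""
    conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?, ?)",
                 (dataset, key, str(value)))

def promote(conn, table, keys):
    """Copy the chosen keys out of kv into a table with one indexed column per key.

    Assumes `table` and `keys` are trusted identifiers (hypothetical names),
    because they are interpolated into the SQL text.
    """
    cols = ", ".join(f'"{k}" TEXT' for k in keys)
    conn.execute(f'CREATE TABLE "{table}" (dataset TEXT PRIMARY KEY, {cols})')
    # Pivot rows into columns: one MAX(CASE ...) expression per promoted key.
    pivots = ", ".join("MAX(CASE WHEN key = ? THEN value END)" for _ in keys)
    placeholders = ", ".join("?" for _ in keys)
    conn.execute(
        f'INSERT INTO "{table}" SELECT dataset, {pivots} FROM kv '
        f"WHERE key IN ({placeholders}) GROUP BY dataset",
        list(keys) + list(keys),
    )
    for k in keys:
        conn.execute(f'CREATE INDEX "idx_{table}_{k}" ON "{table}" ("{k}")')
```

After `promote(conn, "dataset_meta", ["run", "stream"])`, queries on `run` or `stream` can use ordinary indexed columns, while keys that were not promoted remain in `kv`.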

Deliverables
• Docs from next steps
• Use case filtered for our group (draft)
• Services: Decomposition of ER-Diagram into collab diagram
• Deployment Arch: Enumerate problems
• Monitoring: Stats on queries (accumulate/doc)
• QueryLang/Int: Survey of QL (Pool, C&L)
• Tools: Wrap corba w/xml
• Deliverables: longer term

Schedules

• Monthly meeting Last Tues of month at 8:30/14:30/15:30 First: May 25. H323: 8272634

• Mailing list (Paul)

Metadata for the Common Physicist

A working group on metadata with representatives from ATLAS, BaBar, CDF, CMS, D0, and LHCb, in cooperation with EGEE, has identified overlapping user requirements that may be supported by common service implementations. Classes of metadata specific to each service and their relations are described. These include a set of use cases based on a compilation of various HEP documents. These documents are used to inform interfaces in existing and planned services as described in metadata schema.

Emphasis is placed on the evolution of schema using keyword-value pairs that are then transformed into a normalised, performant database schema. A report is made of self-description mechanisms which, coupled with updating processes, allow the APIs to remain static as the schema evolves. A presentation is made of the way use cases drive performance. Requirements are presented for the physical and logical arrangement of service implementations, dictating the degree to which the databases containing the metadata may be distributed or centralised. A set of existing monitoring tools exposes the validity and completeness of the use cases for experiments in various stages of maturity. A survey of the query languages, web service interfaces and tools in use across the experiments is presented.

Future

• Work to deliverables

• Meet according to deadlines

• Workshops according to major deadlines

Rick | Julie | Solveig | Adam | Wyatt | Steve | Mog | Cait | Carm | Tony | Gav | Paul

1Use Cases x x x x L x

Services X x x L

Deployment x L x x

1Monitoring L x x

QueryLang/Int x x x L

1Security(&!£) x L x

Tools x L x x x

Deliverables L x

Use Cases

• CDF5858: physicist use case (Rick)
• HEPCAL II (Solveig, Tony)
  – Production
  – Analysis
  – ADA: Atlas catalogs – David Adams (Steve)
• D0: Wyatt
• Schema Update Document: use cases? (Adam)

Services

• Compare Arda and SAM approaches: Arda architecture: Gavin

• Given Use cases: Define services

• List Services from SAM: Services to services

• Interfaces: The SAM service with one schema – the Grid services implemented in several schemas.

• Interfaces: Physics catalog impact from failure of lower level services. “file content status”.

• Action: outline models of access: physical/logical

• Discrete or related bits of functionality: dependencies between services. Performance implications on interfaces.

• Wyatt, Gavin, Rick, Julie

Deployment Architectures
• Where do the services run? Application servers? Tiers of applications and databases
• Replication for HA. At what tier? Application or DB? Oracle? Is it replication or mirroring?
• What is the time constant for replication?
• When do metadata become stale? Freshness date: status bits.
• Centralized catalogs as a single point of failure: what are single points of failure?
• HA strategies
• Federation of metadata
• Julie, Gavin, Paul, Solveig
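
The questions about the replication time constant and a freshness date can be made concrete with a small sketch. The `ReplicaEntry` class, the `is_stale` function, and the rule "stale after missing two replication cycles" are assumptions chosen for illustration; the slides do not prescribe any particular policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ReplicaEntry:
    key: str
    value: str
    refreshed_at: datetime  # "freshness date": when this copy was last synced

def is_stale(entry, now, replication_period, tolerance=2):
    """Treat a copy as stale once it has missed `tolerance` replication cycles.

    `replication_period` is the assumed time constant of the replication;
    `tolerance` is a hypothetical policy knob, not something from the slides.
    """
    return now - entry.refreshed_at > tolerance * replication_period
```

With a 15 minute replication period and the default tolerance, a copy refreshed 20 minutes ago is still considered fresh, while one refreshed an hour ago is flagged as stale and could have a status bit set accordingly.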

Tools
• DB: jdbc, phpi, text, mysql, msql, oracle, xml, soap, python
• Dbserver
• Tools on top of *sql.
• Relation to deployment architectures: db access directly or application server.
• Replication
• Data Virtualization
• Rick, Gavin, Solveig, Adam, Julie

Query Languages and Interfaces

• SQL

• Chains and Links (Rick)

• General Dimensions (Wyatt)

• Queries against multiple databases. Related to deployment architecture (dimensions, C&L, SBIR II/enth)

• POOL (Carmine)

Monitoring

• SamTV (Adam)

• Mining and instrumenting (Caitriana)

• MonALISA

• File access patterns

• Stats
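
Accumulating statistics on queries, as listed above and in the monitoring deliverable, could look roughly like the following sketch. The `QueryStats` class and its methods are hypothetical; they only illustrate counting which metadata keys queries actually use, which is one possible input for deciding which queries a schema needs to satisfy.

```python
from collections import Counter

class QueryStats:
    """Accumulates how often each metadata key appears in a query."""

    def __init__(self):
        self.key_counts = Counter()
        self.total_queries = 0

    def record(self, keys):
        """Record one query; each key is counted once per query."""
        self.total_queries += 1
        self.key_counts.update(set(keys))

    def hot_keys(self, min_count):
        """Return, in sorted order, the keys used by at least `min_count` queries."""
        return sorted(k for k, n in self.key_counts.items() if n >= min_count)
```

A service wrapper would call `record` with the keys of each incoming query; `hot_keys` then gives a shortlist of attributes that are queried often enough to be worth optimising for.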

Security

• Table Access in a distributed architecture

• Server to Server security

• Access to the Server by the user

• A standard certification protocol

• VOMS

• Spitfire security