Lehrstuhl Informatik III: Datenbanksysteme Astrometric Matching - E-Science Workflow 1 Lehrstuhl...

Post on 26-Mar-2015

217 views 3 download

Tags:

Transcript of Lehrstuhl Informatik III: Datenbanksysteme Astrometric Matching - E-Science Workflow 1 Lehrstuhl...

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow 1

Astrometric Matching - E-Science Workflow

Lehrstuhl Informatik III:1Datenbanksysteme1Fakultät für Informatik1Technische Universität München

Max-Planck-Institut2für Astrophysik

Max-Planck-Institut3für extraterrestrische3Physik

Full paper at the:

2nd IEEE International Conference on e-Science and Grid Computing, Dec. 4-6, 2006, Amsterdam

2

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

The SED Scenario

1. Catalog query

2. Astrometric (spatial) matching

3. Assembly of raw photometry

4. Photometric transformation

5. SED classification

3

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Astrometric (Spatial) Matching

Current solutions … … load all data into main memory

Uses a lot of memory Infeasible if memory size is insufficient

… process all data at once and deliver the complete result at the end Inefficient No results until all processing has completed

4

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Our Contributions

In-network processing Early filtering Parallelization Pipelining (data streaming) Load-balancing

Mobile user-defined operators Dynamic integration Extensible framework Legacy applications

5

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

The StarGlobe Architecture

Super-Peer BackboneQuery 1

Stream 0

Publish

Subscribe

6

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Mobile User-Defined Operators

Infrastructure provided by StarGlobe encapsulated operators provided by community

Load user-defined operators from function provider servers in the network

Common interface for integrating external operators

Flexibility

7

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Communication between Stream Processor and Stream Iterator

8

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Astrophysical Example Workflow

Input ListRASS-BSC

2MASS FIRST USNOB1

NVSS GSC-2

SED assembly

9

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Astrophysical Example Workflow (Setup)

Input ListRASS-BSC

2MASS FIRST USNOB1

NVSS GSC-2

SED assembly

10

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Distributed Query Evaluation Plan

χ²filter-0

join-0

enrichσ-1

transform-1

stream-1

enrichσ-0

transform-0

stream-0

11

Lehrstuhl Informatik III: Datenbanksysteme

Astrometric Matching - E-Science Workflow

Conclusion

StarGlobe Prototype Handling large data volumes efficiently

Early filtering, parallelization, pipelining Returning first results early on

Pipelining (data streaming) Flexible support of domain-specific application logic

Mobile user-defined operators

Results also applicable to other domains