From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories

eScience May 2007 From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories R. Chris Smith NOAO/CTIO, LSST


Transcript of From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories

Page 1: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

eScience May 2007

From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories

R. Chris Smith, NOAO/CTIO, LSST

Page 2: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

Challenges for the Operational VO

• Providing Content
  • capturing and archiving data from diverse instruments, AND capturing metadata (system & science) to make that data useful
• Providing Access
  • implementing the VO standards and services, plus network infrastructure, needed for wide access to the content
  • Ensure not only access, but long-term support and documentation of datasets & metadata (curation)
• Providing User Interfaces and Tools
  • developing and operating user interfaces which enable effective scientific use of ALL of the distributed resources of the VO

Page 3: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

A Case Study: NOAO Data Management

• Management of data from all NOAO and some affiliated facilities = CONTENT
  • 3 mountaintops (Cerro Tololo, Cerro Pachon, Kitt Peak)
  • 11 telescopes
  • More than 30 instruments
• Virtual Observatory “back end” = ACCESS
  • Provide effective access to large volume (TBs to PBs) of archived ground-based optical & infrared data and data products through VO standard interfaces and networks
• Virtual Observatory “front end” = UI and TOOLS
  • Enable science by developing VO user interfaces, tools, and services to work with distributed data sources and large volumes of data

Page 4: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories


Page 5: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

BIG Question: How does this model SCALE?

• Capturing, moving, & processing the data
• Making the data AVAILABLE through VO interfaces
• Making the data USEFUL for scientific analysis

Why do we worry about scaling?

Page 6: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

Turning Photons into Petabytes Today

• MOSAIC, WFI, IMACS: 64 Mpix cameras
  • ~10 to 20 GB/night
• Builds up quickly! In only 3 years of two MOSAIC cameras:
  • ~20 TB raw data
  • ~40-60 TB processed

[Image: IMACS image, Las Campanas Observatory (Danny Steeghs, Jan '04)]
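The "builds up quickly" figure can be checked with a rough calculation. The sketch below multiplies the per-night rates on this slide by the number of cameras and years. It assumes data are taken on every night of the year, which is not stated on the slide and gives an upper bound; the function name is illustrative only.

```python
# Rough accumulation estimate for the two-MOSAIC example on this slide.
# Assumption (not from the slide): data are taken on every night of the year.

NIGHTS_PER_YEAR = 365  # assumed upper bound; weather and scheduling reduce it


def raw_volume_tb(gb_per_night: float, cameras: int, years: float) -> float:
    """Return accumulated raw data in TB, using 1 TB = 1000 GB."""
    return gb_per_night * cameras * years * NIGHTS_PER_YEAR / 1000.0


# Per-night range quoted on the slide: ~10 to 20 GB/night per camera.
low = raw_volume_tb(10, cameras=2, years=3)
high = raw_volume_tb(20, cameras=2, years=3)
print(f"raw data after 3 years: {low:.1f} to {high:.1f} TB")
```

Under these assumptions the range is 21.9 to 43.8 TB, so the low end is close to the ~20 TB of raw data quoted above.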

Page 7: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

Coming Soon: Dark Energy Camera

Focal Plane:
• 64 2K x 4K detectors
• Plus guiding and WFS
• 530 Mpix camera

Page 8: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

The Data: Dark Energy Survey

• Each image = 1 GB
• 350 GB of raw data / night
• Data must be moved to supercomputer center (NCSA) before next night begins (<24 hours)
  • Need >36 Mbps internationally
• Data must be processed within ~24 hours
  • Need to inform next night’s observing
• Total raw data after 5 yrs ~0.2 PB
• TOTAL Dataset 1 to 5 PB
  • Reprocessing planned using TeraGrid resources
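The transfer requirement follows from the nightly volume and the 24-hour window. The sketch below computes the minimum sustained rate and the number of observing nights implied by the 5-year raw total. It uses decimal units (1 GB = 10^9 bytes) and ignores protocol overhead and retransmission; both are assumptions, not statements from the slide.

```python
# Back-of-the-envelope numbers for the Dark Energy Survey data flow.
# Assumptions: decimal units (1 GB = 1e9 bytes), no protocol overhead.

RAW_GB_PER_NIGHT = 350          # from the slide
IMAGE_GB = 1                    # from the slide
TRANSFER_WINDOW_S = 24 * 3600   # "<24 hours", taken here as a full day
TOTAL_RAW_PB = 0.2              # from the slide, after 5 years


def sustained_mbps(gigabytes: float, seconds: float) -> float:
    """Minimum sustained rate in megabits per second."""
    return gigabytes * 1e9 * 8 / seconds / 1e6


images_per_night = RAW_GB_PER_NIGHT / IMAGE_GB
min_rate = sustained_mbps(RAW_GB_PER_NIGHT, TRANSFER_WINDOW_S)
implied_nights = TOTAL_RAW_PB * 1e6 / RAW_GB_PER_NIGHT  # 1 PB = 1e6 GB

print(f"images per night:     {images_per_night:.0f}")
print(f"minimum link rate:    {min_rate:.1f} Mbps")
print(f"nights implied/5 yrs: {implied_nights:.0f}")
```

The floor works out to about 32.4 Mbps, a little below the >36 Mbps quoted on the slide, and the 0.2 PB total corresponds to roughly 571 nights at 350 GB each.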

Page 9: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST: The Large Synoptic Survey Telescope

Survey the entire sky every 3 to 5 nights, to simultaneously detect and study:
• Dark Matter via weak gravitational lensing
• Dark Energy via thousands of SNe per year
• Potentially hazardous near-Earth asteroids
• Tracers of the formation of the solar system
• Fireworks in the heavens: GRBs, quasars…
• Periodic and transient phenomena
• …the unknown

Massively PARALLEL Astronomy

Page 10: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST: The Instrument

• 8.2m telescope
• Optimized for WIDE field of view
  • 3.5 degree FOV
  • 3.5 GIGApixel camera
• Deep images in 15s
• Able to scan whole sky every 3 to 5 nights

Page 11: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST: Deep, Wide, Fast

Field of view (FOV)
• Keck Telescope (10 m): 0.2 degrees
• LSST: 3.5 degrees
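The comparison above is given as angular widths, but survey speed depends on sky area per pointing. The sketch below converts the two figures to areas and takes their ratio. It assumes both numbers are diameters of circular fields and uses a flat-sky approximation; neither assumption is stated on the slide.

```python
import math

# Assumption: the quoted fields of view are diameters of circular fields,
# small enough that a flat-sky area formula is adequate.


def field_area_sq_deg(diameter_deg: float) -> float:
    """Area of a circular field of the given angular diameter, in square degrees."""
    return math.pi * (diameter_deg / 2) ** 2


keck_area = field_area_sq_deg(0.2)   # Keck figure from the slide
lsst_area = field_area_sq_deg(3.5)   # LSST figure from the slide
area_ratio = lsst_area / keck_area

print(f"Keck field: {keck_area:.4f} sq deg")
print(f"LSST field: {lsst_area:.2f} sq deg")
print(f"area ratio: {area_ratio:.0f}x")
```

With these inputs a single LSST pointing covers about 9.6 square degrees, roughly 306 times the area of the Keck field.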

Page 12: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST Site: Cerro Pachon, Chile

[Figure: LSST site plan. Labels: El Penon, LSST, ~1.5m cal telescope, Support, Gemini (South), SOAR]

Page 13: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST: Distributed Data Mgmt

• Long-Haul Communications: data transport & distribution
• Base Facility: real time processing
• Mountain Site: data acquisition, temp. storage
• Archive/Data Access Centers: data processing, long term storage, & public access

Page 14: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST: The Data Flow

• Each image roughly 6.5 GB
• Cadence: ~1 image every 15s
• 15 to 18 TB per night
• ALL must be transferred to U.S. “data center”
  • Mtn-base within image timescale (15s), ~10-20 Gbps
  • Internationally within <24 hours, >2-10 Gbps
• REAL TIME reduction, analysis, & alerts
  • Send out alerts of transient sources within minutes
  • Provide automatic data quality evaluation, alert to problems
• Processed data grows to >100 TB per night!
• Just catalogs = Petabytes per year!
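The image size, cadence, and nightly volume on this slide can be cross-checked against the quoted link speeds. The sketch below computes only the minimum sustained rates, assuming decimal units and no protocol overhead. The slide's bandwidth figures are higher than these floors, and the sketch does not try to reproduce them.

```python
# Minimum sustained rates implied by the LSST data-flow numbers.
# Assumptions: decimal units (1 TB = 1000 GB), no protocol overhead.

IMAGE_GB = 6.5          # from the slide: "roughly 6.5GB"
CADENCE_S = 15          # from the slide: ~1 image every 15 s
NIGHTLY_TB = (15, 18)   # from the slide: 15 to 18 TB per night
DAY_S = 24 * 3600


def gbps(gigabytes: float, seconds: float) -> float:
    """Sustained rate in gigabits per second."""
    return gigabytes * 8 / seconds


# Mountain to base: one image must leave before the next one arrives.
mtn_base_floor = gbps(IMAGE_GB, CADENCE_S)
print(f"mountain-to-base floor: {mtn_base_floor:.2f} Gbps")

per_night = []
for tb in NIGHTLY_TB:
    images = tb * 1000 / IMAGE_GB
    per_night.append({
        "tb": tb,
        "images": images,
        "observing_hours": images * CADENCE_S / 3600,
        "intl_floor_gbps": gbps(tb * 1000, DAY_S),
    })

for row in per_night:
    print(f"{row['tb']} TB/night: {row['images']:.0f} images, "
          f"{row['observing_hours']:.1f} h of exposures, "
          f"international floor {row['intl_floor_gbps']:.2f} Gbps")
```

The floors are about 3.47 Gbps from mountain to base and 1.39 to 1.67 Gbps internationally. The nightly volumes correspond to roughly 9.6 to 11.5 hours of back-to-back 15-second images.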

Page 15: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

LSST Needs

[Figure: Computing Requirements by Year. Vertical axis: Tera Floating Point Operations (TF), 0 to 300. Horizontal axis: Year, 2014 to 2022. Legend: Science/Operations, Spares, Transients, Red. Images, DQ Analysis, Queries, Deep Det., Routine, Nightly, Initial. Groups: Archive Center, Base, Data Access Center]

Page 16: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

Turning Photons into Petabytes: Summary

• Today, ~10 to 20 GB/night
  • MOSAIC, WFI, IMACS: 64 Mpix cameras
• Soon, ~300 to 500 GB/night
  • VISTA: 67 Mpix camera
  • VST: 256 Mpix camera
  • DECam/DES: 520 Mpix camera
• On the horizon, ~15 TB/night
  • LSST Project: 3 Gpix camera

And these are just survey instruments in Chile!

Page 17: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories


DES, LSST, … the REST of the Science?

Ongoing (MOSAIC, WFI, IMACS) and future (DES, LSST, etc.) projects will provide PETABYTES of archived data

Only a small fraction of the science potential will be realized by the planned investigations

How do we maximize the investment in these datasets and provide for their future scientific use?

Page 18: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

VO Challenges: Provider Perspective

• How do we effectively capture, transport, and manage Petabytes of data?
  • Need advanced IT infrastructure
• How do we provide effective access to Petabytes of data?
  • Need advanced data mining interfaces

Fundamentally IT challenges, in support of the astronomical community

Page 19: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

VO Challenges: Scientific Perspective

• Data Discovery
  • From those Petabytes, what data exists that might be useful to help address my scientific query?
• Data Understanding
  • Which data are best suited for my analysis?
• Data Movement
  • How do I get the data from where it is to where it is most useful?
• Data Analysis
  • How do I extract the information I need from the data?

Page 20: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

NVO portal @ NOAO

• Focus on Scientific USER
  • 4 Keys: Data Discovery, Data Understanding, Data Access, Data Analysis
• First focus on supporting data DISCOVERY
  • Discovery in spatial coordinates: NOAO Sky
  • Discovery in temporal coordinates: Timeline
• NOAO NVO portals: http://nvo.noao.edu
• And for South America… http://nvo.ctio.noao.edu
  • Foundation for exploring partnerships with S.A. communities
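Discovery in spatial coordinates comes down to asking which archived observations lie near a given sky position. The sketch below shows one minimal way to express that: a radius search over an in-memory list using the haversine formula for angular separation. The `Observation` record, the `cone_search` function, and the sample fields are all hypothetical. This is not the NOAO portal's implementation or any VO service interface.

```python
import math
from typing import List, NamedTuple


class Observation(NamedTuple):
    """Hypothetical archive record: a name and a pointing in degrees."""
    name: str
    ra_deg: float
    dec_deg: float


def angular_separation_deg(ra1: float, dec1: float,
                           ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))


def cone_search(observations: List[Observation], ra_deg: float,
                dec_deg: float, radius_deg: float) -> List[Observation]:
    """Return the observations within radius_deg of the given position."""
    return [
        obs for obs in observations
        if angular_separation_deg(ra_deg, dec_deg,
                                  obs.ra_deg, obs.dec_deg) <= radius_deg
    ]


# Made-up sample data, for illustration only.
ARCHIVE = [
    Observation("field_a", 10.0, -30.0),
    Observation("field_b", 10.2, -30.1),
    Observation("field_c", 200.0, 45.0),
]

matches = cone_search(ARCHIVE, ra_deg=10.0, dec_deg=-30.0, radius_deg=0.5)
print([obs.name for obs in matches])
```

A linear scan like this is only workable for small lists. At petabyte scale the same question would need a spatial index on the archive side.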

Page 21: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

Summary: VO Challenges

• In Infrastructure
  • Collect and maintain petabytes of content
  • Provide for effective access, including networks, hardware, and software
• In User Interaction
  • Provide effective user interfaces
  • Support distributed analysis
    • Support large queries across distributed DBs
    • Support statistical analysis and processing across distributed resources (Grid processing & storage)
• TOOLS & SERVICES to enable SCIENCE
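A query across distributed databases is often structured as scatter-gather: send the same selection to each archive, let each one filter locally, then merge the partial results. The sketch below illustrates that pattern with Python's standard thread pool. The archive names, the record layout, and the magnitude cut are hypothetical stand-ins for remote services, and the talk does not say that any NOAO or VO system is built this way.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical in-memory stand-ins for remote archive databases.
ARCHIVES: Dict[str, List[dict]] = {
    "archive_north": [{"id": "n1", "mag": 18.2}, {"id": "n2", "mag": 21.5}],
    "archive_south": [{"id": "s1", "mag": 19.9}, {"id": "s2", "mag": 23.1}],
}


def query_archive(name: str, max_mag: float) -> List[dict]:
    """Filter one archive locally and tag each row with its origin."""
    return [dict(row, archive=name)
            for row in ARCHIVES[name] if row["mag"] <= max_mag]


def federated_query(max_mag: float) -> List[dict]:
    """Scatter the query to every archive, then gather and sort the rows."""
    names = sorted(ARCHIVES)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        parts = pool.map(lambda name: query_archive(name, max_mag), names)
        merged = [row for part in parts for row in part]
    return sorted(merged, key=lambda row: row["mag"])


results = federated_query(max_mag=20.0)
for row in results:
    print(row["archive"], row["id"], row["mag"])
```

Filtering at each archive before merging means that only matching rows cross the network, which matters when the underlying tables are far larger than the result.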

Page 22: From Photons  to Petabytes: Astronomy in the Era of Large Scale Surveys and  Virtual Observatories

How? Strategic Partnerships

• In Local Systems
  • Vendors: Local Storage, Processing, Servers
• In Remote Systems
  • Distributed computer centers to provide bulk storage, large scale processing
  • Linked together for Grid processing, Grid storage
• In Connectivity
  • High-speed national and international bandwidth
• Scientific
  • VO Partners to develop standards, provide tools (IVOA)
  • Developing tools and services optimized for scientific analysis over large datasets (e.g., statistical methods)