eScienceptvu/gc/2011/pub/GridandeScience... · 2016. 11. 4. · Grid Computing 6 What is eScience ?...
eScience
Presenters:
Tai Tri Nguyen 10070939
Thang Quyet Nguyen 10070940
April 11, 2011
Grid Computing
Agenda
What is eScience ?
Why eScience ?
New infrastructure enabling eScience
What the future holds for e-Science
Summary
Scientific Data Federation
What is eScience ?
• eScience: the "e" is not an abbreviation. The "e" in eScience is hard, complex, and difficult to pin down. In a sense, eScience can be understood as aiming at "open science".
• An endlessly discussed problem:
Socio-economic interests versus the culture of sharing in science
• "eScience is about global collaboration in key areas of science and the next generation of infrastructure that will enable it"
The term was coined in 1999 by John Taylor, Director General of the UK's Office of Science and Technology, and was used to describe a large funding initiative starting in November 2000.
• eScience is not just high-bandwidth communication and HPC (high-performance computers) running simulations linked through "the GRID"
• eScience is about exploiting digital technology to support all aspects of scientific activity
• eScience is about support for large-scale science through distributed global collaborations
• eScience is about the formation of virtual co-laboratories, allowing scientists to work together irrespective of location
– Universal access to scientific resources
– Support for the scientific community
eScience in general terms:
• e-Science is all about furthering technology in order to advance scientific disciplines.
• If scientific research is to go above and beyond, and reach heights that we would not have thought possible before, then e-Science will be the infrastructure that paves the way.
• A large group of professionals and researchers is currently working to make sure scientists are able to reach their goals.
• eScience is a global collaboration infrastructure for science.
e-Science (UK) and Cyberinfrastructure (US)
• "e-Science is about global collaboration in key areas of science and the next generation of [computing] infrastructure that will enable it."
John Taylor, Director Office of Science and Technology, UK
• "Cyberinfrastructure is the coordinated aggregate of software, hardware and other technologies, as well as human expertise, required to support current and future discoveries in science and engineering. The challenge of Cyberinfrastructure is to integrate relevant and often disparate resources to provide a useful, usable, and enabling framework for research and discovery characterized by broad access and 'end-to-end' coordination."
Fran Berman, San Diego Supercomputer Center, UCSD
e-Science Grid in UK
Why eScience ?
• The scientific imperative
new modes of scientific inquiry
data-intensive science
simulation-based science
remote access to experimental apparatus
virtual community science
• The industrial imperative
The scientific imperative
New modes of scientific inquiry
Data-intensive science:
Research practice is shifting from little data and lots of thinking to …
NOW: lots of data and analysis
eScience is driven by data: data-driven scientific discovery!
[Figure: "The Data Deluge", with a "You are here" marker]
• LHC (Large Hadron Collider): 60 TB/day
• Apache Point Telescope (SDSS): 15 TB/day
• Large Synoptic Survey Telescope: 30 TB/day
• Illumina Genome Analyzer: 1 TB/day
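To get a feel for these rates, the arithmetic below converts each daily volume into the sustained network bandwidth needed just to move the data as fast as it is produced. It is a minimal sketch that assumes decimal terabytes (10^12 bytes) and a uniform 24-hour output; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily volumes as quoted on the slide (decimal terabytes assumed).
DAILY_VOLUME_TB = {
    "LHC": 60,
    "Large Synoptic Survey Telescope": 30,
    "Apache Point Telescope (SDSS)": 15,
    "Illumina Genome Analyzer": 1,
}


def sustained_gbit_per_s(tb_per_day: float) -> float:
    """Average bandwidth (Gbit/s) needed to ship tb_per_day within one day."""
    bits_per_day = tb_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e9


for name, tb in DAILY_VOLUME_TB.items():
    print(f"{name}: {tb} TB/day ~ {sustained_gbit_per_s(tb):.2f} Gbit/s sustained")
```

Under these assumptions, 60 TB/day corresponds to roughly 5.6 Gbit/s around the clock, which is one way to see why such data cannot simply be copied to every researcher.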
Simulation-based science
Numerical simulation is another new problem-solving methodology, used where physical experiments cannot easily be performed but computational simulations are feasible.
• The Japanese Earth Simulator:
Allows simulations to be performed at an unprecedented 10-km horizontal resolution, generating many tens of terabytes of data in a single run.
• The UK Comb-e-Chem project:
http://www.combechem.org/
The goal of this project is to "synthesize" large numbers of new compounds by high-throughput combinatorial methods and then map their structure and properties.
• U.S. Encyclopedia of Life (EOL) project
http://www.eol.org/
Intends to document all of the 1.8-1.9 million living species known to science. It aims to build one "infinitely expandable" page for each species, including video, sound, images, graphics, and text.
Seeks to produce a database of putative functional and 3D structure assignments for all known publicly available complete or partial genomes.
Remote Access to Experimental Apparatus
- The emergence of high-speed networks makes it possible to integrate experimental apparatus into the scientific problem-solving process.
- Network for Earthquake Engineering Simulation (NEES)
http://nees.org/
An ambitious national program whose purpose is to advance the study of earthquake engineering and to find new ways to reduce the hazard earthquakes represent to life and property.
Collaborative tools (middleware) aid in experiment planning and allow engineers at remote sites to perform teleobservation and teleoperation of experiments. They also enable access to computational resources and open-source analytical tools for simulation and analysis of experimental data.
Sharing engineering research equipment, data resources, and leading-edge computing resources.
Virtual community science
• Global collaboration will lead to the creation of virtual community science.
• The most significant impact of Grid technologies on science may be global virtual communities of scientists able to address the fundamental problems of today and tomorrow.
• The Grid as an Enabler for Virtual Organisations.
“Virtual Organization: A set of individuals and/or institutions defined by such sharing rules. This concept is becoming fundamental to much of modern computing”
The industrial imperative
The new model: on-demand computing !!!
Computational resources on demand
New infrastructure for eScience
Requires major investments in physical infrastructure (petabyte archival storage, terabit networks, sensor networks, teraop supercomputers), software infrastructure (Grid middleware, collaboratories), and new application concepts and software
Governments are realizing the importance of these investments as a means of enabling scientific progress and enhancing national competitiveness
Development based on Grid infrastructure
Challenges (by Tony Hey, Director of the UK e-Science Core Program)
• Building a Future Infrastructure
- Developing a Semantic Grid
- Trusted Ubiquitous Systems
- Rapid Customized Assembly of Services
- Autonomic Computing: self-managing distributed computing resources that adapt to unpredictable changes while hiding intrinsic complexity from operators and users
• Putting the Infrastructure to work
- Support for New Forms of Community
- Socio-Economic Impact
- Collaboratory Intellectual Properties Register and legal issues
An eScience Grid-based framework
What the future holds for e-Science
• Innovation
It is the general consensus that the technology of tomorrow must be ready to meet the inspirational thinking of scientists.
• Business
There is a desire to make the technology of e-Science available not only to scientists but also to commercial users, such as engineers.
• Collaboration
Partnership is vital to the development of better storage facilities and the enhancement of Grid infrastructures.
• Complex ideas
Scientists are trying to research things that they have never even touched upon before. E-Science and its conglomerates are well aware that advances in information technology are the only way forward for the advancement of science.
• Education
If e-Science is to improve its image and further its impact upon science, then it is essential that the students of tomorrow are trained in the use of advanced computing technology.
• International development
It is essential to the future success of e-Science that its methods and technology are used across the globe, not just within the UK.
UK eScience projects
• GRIDPP (PPARC)
• ASTROGRID (PPARC)
• Comb-e-Chem (EPSRC)
• DAME (EPSRC)
• DiscoveryNet (EPSRC)
• GEODISE (EPSRC)
• myGrid (EPSRC)
• RealityGrid (EPSRC)
• Climateprediction.com (NERC)
• Oceanographic Grid (NERC)
• Molecular Environmental Grid (NERC)
• NERC DataGrid (NERC + OST-CP)
• Biomolecular Grid (BBSRC)
• Proteome Annotation Pipeline (BBSRC)
• High-Throughput Structural Biology (BBSRC)
• Global Biodiversity (BBSRC)
• Biology of Ageing (BBSRC + MRC)
• Sequence and Structure Data (MRC)
• Molecular Genetics (MRC)
• Cancer Management (MRC + PPARC)
• Clinical e-Science Framework (MRC)
• Neuroinformatics Modeling Tools (MRC)
• MIASGRID (OST-CP)
• AKTing (OST-CP)
• EquatorGrid (OST-CP)
• DIRCGrid (OST-CP)
• MB-NG (OST-CP/PPARC)
• UK EDG (OST-CP/PPARC)
• OGSA-DAI (OST-CP)
Astronomy Grid Application
• Why choose astronomy as the application:
– Scale of data: terabytes now, petabytes soon.
– Datasets are distributed.
– Modern data are carefully peer reviewed and collected with rigorous statistical and scientific standards.
– Data provenance is tracked and derived datasets are curated fairly carefully.
– Most data are publicly available and will remain available for the foreseeable future.
– Old data (which may be less accurate) are essential for studying time-varying phenomena.
– Users cannot download a full copy of each archive for local processing; instead they request a small subset from each archive.
The Virtual Observatory
• Also called the World-Wide Telescope.
• Wiki Definition
– A Virtual Observatory (VO) is a collection of interoperating data archives and software tools which utilize the internet to form a scientific research environment in which astronomical research programs can be conducted.
• Functions:
– Provide portals, protocols, and standards that unify the world's astronomy archives into a giant database containing all astronomy literature, images, raw data, derived datasets and simulation data
– Integrated as a single intelligent telescope.
• IVOA (International Virtual Observatory Alliance) is a standards body created by VO projects to develop and agree on the vital interoperability standards upon which the VO implementations are constructed.
• Examples
– AstroGrid: the UK's Virtual Observatory Service
– Euro-VO: the European VO
– National Virtual Observatory: the USA's VO
– Virtual Observatory, India.
– Iran Virtual Observatory.
• Traditional model for publishing scientific data
– Authors:
• Individual or small group.
• Create the experiments that provide data.
• Write papers that contain and explain the data.
– Publishers:
• The scientific journals.
• Print papers.
– Curators:
• Organize and store the journals and make them available for consumers.
– Consumers:
• Scientists who want to use and cite the data in their own research.
Suitable only when all scientific data relevant for research could easily be included in the publication.
• Publishing scientific data on astronomy
– The role of author belongs to collaborations.
– Projects can also be publishers and curators.
– It takes 5 to 10 years to build the experiment before the authors start producing data.
– The data volume is too large to be contained in a journal.
– Data are published to the collaboration through Web-based archives.
– Consumers must deal with data from many sources.
• Metadata and provenance
– Important to capture the detail of how the data were derived and calibrated.
– UCD (Unified Content Descriptor): words in a compressed dictionary derived by automatically detecting the most commonly used terms in over 150,000 tables in the astronomical literature.
– Find common and comparable attributes in different archives
Web Services
• Access data from a VO:
– Most data will be remote, so data access needs to be as transparent as if the data were local.
– Remote data volumes may be huge, so move as much of the data processing as close to the data as possible.
– Data may be extracted from databases by a query
– Data may not exist at time of request.
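The second point, moving the processing to the data, can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not a VO interface: `RemoteArchive`, `Record`, and both methods are hypothetical names, and "records sent" stands in for network traffic.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Record:
    object_id: int
    magnitude: float


class RemoteArchive:
    """Hypothetical stand-in for a remote archive; counts records it ships out."""

    def __init__(self, records: Iterable[Record]):
        self._records = list(records)
        self.records_sent = 0

    def download_all(self) -> list[Record]:
        # Ship everything to the client, which then filters locally.
        self.records_sent += len(self._records)
        return list(self._records)

    def query(self, predicate: Callable[[Record], bool]) -> list[Record]:
        # Evaluate the filter next to the data and ship only the matches.
        result = [r for r in self._records if predicate(r)]
        self.records_sent += len(result)
        return result


def is_bright(record: Record) -> bool:
    return record.magnitude < 11


archive = RemoteArchive(Record(i, float(10 + i % 15)) for i in range(100_000))
bright = archive.query(is_bright)
print(len(bright), "records shipped instead of", 100_000)
```

Both access paths return the same bright objects; the difference is only in how much data crosses the network.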
• Web services in the Virtual Observatory
– The core services can be combined into more complex portals that:
• Talk to several services
• Create more complex results.
– Modular components, standard interfaces, and access to commercially built toolkits for the lowest-level communication tasks.
– The VO framework and core services need to be carefully defined so that they provide clear standards, interfaces, documentation and reference implementations.
Hierarchical Architecture
• Archive
– Refers to a collection of historical records, as well as the physical place where they are located.
– Stores text, images, and raw data in blobs or files, and stores their schematized data in relational databases.
– Provides data mining tools to allow easy search and sub-setting of the data objects at each archive.
– Provides a web service interface for on-demand queries.
– Provides a file transfer service for answers that involve substantial computation or data transfer.
– Contains metadata about its contents (both physical units and the provenance of the data).
• Web services
– Support a common core schema that extends the VOTable data model.
– VOTable specifies:
• A standard coordinate system and standard representations for core astronomical concepts
• Standard ways to represent both values and errors
– Built on top of SOAP and XML Schema Definitions (XSDs).
– Most are interactive tasks that extract data on demand for portals and for interactive client tools.
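To make the VOTable idea concrete, the sketch below parses a deliberately simplified VOTable-style document with Python's standard library. The element layout (`FIELD` declarations followed by `TABLEDATA` rows) follows the general shape of VOTable, but the XML namespace is omitted, the UCD strings are only illustrative, and the values are made up.

```python
import xml.etree.ElementTree as ET

# Simplified VOTable-like document: no XML namespace, invented values.
VOTABLE_XML = """
<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="example">
      <FIELD name="ra" datatype="double" unit="deg" ucd="pos.eq.ra"/>
      <FIELD name="dec" datatype="double" unit="deg" ucd="pos.eq.dec"/>
      <FIELD name="mag" datatype="float" unit="mag" ucd="phot.mag"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>180.5</TD><TD>-1.25</TD><TD>17.3</TD></TR>
          <TR><TD>181.0</TD><TD>0.75</TD><TD>18.1</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def read_table(xml_text):
    """Return ([(field name, unit), ...], [row dict, ...]) for the first table."""
    root = ET.fromstring(xml_text.strip())
    table = root.find("./RESOURCE/TABLE")
    fields = [(f.get("name"), f.get("unit")) for f in table.findall("FIELD")]
    names = [name for name, _unit in fields]
    rows = []
    for tr in table.findall("./DATA/TABLEDATA/TR"):
        values = [float(td.text) for td in tr.findall("TD")]
        rows.append(dict(zip(names, values)))
    return fields, rows


fields, rows = read_table(VOTABLE_XML)
print(fields)
print(rows)
```

Because units and semantic tags travel with the column declarations, a client can interpret a table from an archive it has never seen before.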
• Registries:
– One or more registries are declared in each archive.
– Record what kinds of information the archive provides
– Are widely replicated, given the overlaps of astronomy with other disciplines.
– Are used by portals to answer user queries by integrating data from many archives.
• Portals:
– Use registries to answer user queries by integrating data from many archives.
– Individuals may build their own custom portals to solve particular problems.
– Sample portals: MAST, GLU, AstroGrid, SkyQuery.
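The division of labour between registries and portals can be sketched as follows. This is a toy model under stated assumptions: the class names, the "kind of information" keys, and the two archives are all hypothetical, and real registries carry far richer metadata.

```python
class Registry:
    """Records which archives provide which kinds of information."""

    def __init__(self):
        self._archives_by_kind = {}

    def declare(self, archive_name, kinds):
        for kind in kinds:
            self._archives_by_kind.setdefault(kind, []).append(archive_name)

    def lookup(self, kind):
        return list(self._archives_by_kind.get(kind, []))


class Portal:
    """Answers a query by consulting the registry, then merging archive results."""

    def __init__(self, registry, archives):
        self._registry = registry
        self._archives = archives  # archive name -> callable(kind) -> list of dicts

    def answer(self, kind):
        results = []
        for name in self._registry.lookup(kind):
            for row in self._archives[name](kind):
                results.append({"archive": name, **row})
        return results


# Hypothetical archives returning canned rows.
archives = {
    "optical_survey": lambda kind: [{"object": "obj-1", "band": "r"}],
    "infrared_survey": lambda kind: [{"object": "obj-1", "band": "K"}],
}
registry = Registry()
registry.declare("optical_survey", ["photometry", "images"])
registry.declare("infrared_survey", ["photometry"])

portal = Portal(registry, archives)
print(portal.answer("photometry"))
print(portal.answer("images"))
```

The portal never hard-codes which archives exist; adding a new archive only requires declaring it in the registry.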
• Portal Example: SkyQuery
– Integrates 5 different Web services: SDSS, 2MASS, Faint Images, the Isaac Newton Telescope Wide Field Survey and Image web services.
– Archives are located on 2 continents at several geographic locations.
– Accepts queries specifying the desired object properties.
– Decides which archives have relevant data (by querying each of them) and calculates an optimal query plan to answer the question.
– The resulting answer set is delivered to the user in tabular form along with images of the objects.
– Is itself a web service.
– Can be used as a component of some other portals.
– Built using SQL and the .NET tools
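One plausible reading of the planning step is sketched below: probe every archive for how many objects satisfy the query, drop archives with nothing relevant, and visit the rest starting with the smallest candidate set. The ordering heuristic, function names, and sample data are assumptions for illustration; the slides do not say how SkyQuery actually computes its plan.

```python
def probe_counts(archives, predicate):
    """Ask each archive how many of its objects satisfy the predicate."""
    return {
        name: sum(1 for obj in objects if predicate(obj))
        for name, objects in archives.items()
    }


def plan_query(estimated_counts):
    """Order archives by ascending match count, skipping those with no matches."""
    relevant = {name: n for name, n in estimated_counts.items() if n > 0}
    return sorted(relevant, key=relevant.get)


# Hypothetical archives holding object magnitudes only.
archives = {
    "archive_a": [14.2, 15.1, 15.9, 16.4, 17.0],
    "archive_b": [15.5, 18.2],
    "archive_c": [19.0, 19.5, 20.1],
}


def is_brighter_than_17(magnitude):
    return magnitude < 17.0


counts = probe_counts(archives, is_brighter_than_17)
print(counts)
print(plan_query(counts))
```

Starting from the archive with the fewest candidates keeps the intermediate result that must be passed along to the next archive as small as possible.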
• Astronomy application:
– Requires access at the granularity of objects rather than entire files.
– Data reside in read-intensive databases, accessed through an associative query interface.
– Comparing observations requires accessing and comparing individual records in several different archives.
– Relies on spatial and other indices, with heavy use of databases.
– Access control must be addressed but is less important.
– Resource management is important.
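Comparing individual records across archives usually means matching objects by sky position, which is where a spatial index pays off. The sketch below uses a simple declination-zone index chosen for illustration (not necessarily what any VO archive uses): objects are bucketed into bands of declination as tall as the match radius, so only three bands need to be checked per object. Catalog entries are hypothetical `(id, ra_deg, dec_deg)` tuples.

```python
import math
from collections import defaultdict


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def build_zone_index(catalog, zone_height_deg):
    """Bucket objects by declination zone: zone number -> list of objects."""
    index = defaultdict(list)
    for obj_id, ra, dec in catalog:
        index[math.floor(dec / zone_height_deg)].append((obj_id, ra, dec))
    return index


def cross_match(cat_a, cat_b, radius_deg):
    """Return (id_a, id_b) pairs separated by at most radius_deg."""
    index = build_zone_index(cat_b, radius_deg)
    matches = []
    for id_a, ra, dec in cat_a:
        zone = math.floor(dec / radius_deg)
        # A match differs in declination by at most the radius, so it can
        # only live in the same zone or one of its two neighbours.
        for z in (zone - 1, zone, zone + 1):
            for id_b, ra_b, dec_b in index.get(z, ()):
                if angular_sep_deg(ra, dec, ra_b, dec_b) <= radius_deg:
                    matches.append((id_a, id_b))
    return matches


cat_a = [("a1", 10.0, 20.0), ("a2", 200.0, -5.0)]
cat_b = [("b1", 10.0002, 20.0001), ("b2", 50.0, 20.0), ("b3", 200.0, -5.0003)]
print(cross_match(cat_a, cat_b, radius_deg=0.001))
```

Within a zone this sketch still scans every object; a production index would also sort or partition by right ascension.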
• Data, networking and computation economics
– All data can be kept online.
– The data and derived products were collected at great expense, so they should be stored safely at 2 or more locations.
– If a query is small, it is simply sent to one of the archive servers.
– If a query exceeds the limit, some planning is required.
The Virtual Observatory and the Grid
• Compute-intensive tasks
– Transformation of raw instrument data into calibrated and cataloged data
– Software is constantly refined and improved, so old data need to be reprocessed about once a year.
• Data mining and statistics of terabytes
– Correlation algorithms involve the computation of pairwise distances.
– Typical matrix sizes today are in the range 10000^2 to 1000000^2.
– Even N log N algorithms are infeasible for datasets involving billions of objects, which motivates the use of approximate and heuristic algorithms.
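The quoted matrix sizes translate into storage as follows. This is a back-of-the-envelope sketch that assumes one 8-byte floating-point number per pairwise distance and stores the full N x N matrix; both are assumptions, and the function name is hypothetical.

```python
def pairwise_matrix_terabytes(n_objects, bytes_per_entry=8):
    """Storage for a full n x n distance matrix, in decimal terabytes."""
    return n_objects ** 2 * bytes_per_entry / 1e12


for n in (10_000, 1_000_000):
    print(f"N = {n}: {n ** 2:.0e} entries, "
          f"{pairwise_matrix_terabytes(n):g} TB")
```

At the upper end of the quoted range the matrix alone occupies about 8 TB under these assumptions, so the cost of exhaustive pairwise computation grows far faster than the catalogs themselves.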
Summary
e-Science and the Grid
"e-Science will change the dynamic of the way science is undertaken."
John Taylor, 2001
"[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information."
Tony Blair, 2002
References
1. Ian Foster and Carl Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers, 2004.
2. NSF Office of Cyberinfrastructure: http://www.nsf.gov/dir/index.jsp?org=OCI
3. A group of UK eScience: http://www.escience-grid.org.uk/
4. Paul A. David (Stanford University), Matthijs den Besten (Oxford e-Research Centre), Ralph Schroeder (Oxford Internet Institute), Collaborative Research in e-Science and Open Access to Information. SIEPR Discussion Paper No. 08-21, Spring 2009.
5. Malcolm Atkinson (NeSC), Jon Crowcroft (Cambridge), Carole Goble (Manchester), John Gurd (Manchester), Tom Rodden (Nottingham), Nigel Shadbolt (Southampton), Morris Sloman (Imperial College), Ian Sommerville (Lancaster), Tony Storey (IBM), Computer Challenges to emerge from eScience (talk).
6. Wikipedia: http://en.wikipedia.org/wiki/
7. National e-Science Centre: http://www.nesc.ac.uk/action/esi/