Scientific Databases Lecture: Virtual Observatories for Space Science
Dr. Kirk Borne, GMU SCS
November 18, 2003
GMU CSI 710
11/18/2003 Virtual Observatories for Space Science 2
Outline
• Quick Review of Astronomy Data
• The National Virtual Observatory (NVO)
• Other Virtual Observatories for Space Science
• Why Virtual Observatories?
• NVO – It’s all about the Science:
– IT-enabled, Science-enabling
• The Enabling Computational Science Technologies for the NVO – where you can help!
• Distributed Data Mining in the NVO
The Nature of Astronomical Data
• Imaging
– 2D map of the sky at multiple wavelengths
• Derived catalogs
– subsequent processing of images
– extracting object parameters (400+ per object)
• Spectroscopic follow-up
– spectra: more detailed object properties
– clues to physical state and formation history
– lead to distances: 3D maps
• Numerical simulations
• All inter-related!
NOAO Deep Wide-Field Survey: http://www.noao.edu/noao/noaodeep/
NASA Astronomy Mission Data: the tip of the data mountain
http://nssdc.gsfc.nasa.gov/astro/astrolist.html
NSSDC’s astrophysics data holdings:
One of many science data collections for astronomy across the US and the world!
NSSDC = National Space Science Data Center @ NASA/GSFC
“Quote of the day”
• “It's just as unpleasant to get more than you expected as it is to get less.”
– George Bernard Shaw
Why so many Telescopes? …
Many great astronomical discoveries have come from inter-comparisons of various wavelengths:
- Quasars
- Gamma-ray bursts
- Ultraluminous IR galaxies
- X-ray black-hole binaries
- Radio galaxies
- . . .
Overlay
Because …
Therefore, our science data archive systems should enable multi-wavelength interdisciplinary distributed database access, discovery, mining, and analysis.
How does one integrate and use these distributed data archives? …
Emerging Computational Environment
• Standardizing distributed data
– Web Services, supported on all platforms
– Custom configure remote data dynamically
– XML: Extensible Markup Language
– SOAP: Simple Object Access Protocol
– WSDL: Web Services Description Language
– UDDI: Universal Description, Discovery and Integration
• Standardizing distributed computing
– Grid Services
– Custom configure remote computing dynamically
– Build your own remote computer, use it, then discard it
– Virtual Data: new data sets on demand
…The National Virtual Observatory (NVO)
• The National Academy of Sciences “Decadal Survey” recommended the NVO as its highest-priority small (<$100M) project:
“Several small initiatives recommended by the committee span both ground
and space. The first among them—the National Virtual Observatory (NVO)—
is the committee’s top priority among the small initiatives. The NVO will
provide a “virtual sky” based on the enormous data sets being created now
and the even larger ones proposed for the future. It will enable a new mode of
research for professional astronomers and will provide to the public an
unparalleled opportunity for education and discovery.” (p.14)
Why is it Virtual?
A Virtual Data System:
• has multiple components
• is (geographically) distributed
• is interoperable
• provides seamless user access to distributed data system components
• provides “one-stop shopping” for the data end-user
Why is it Necessary?
• To maximize cross-enterprise multi-institutional resources
• To minimize duplication of effort
• To streamline operations through shared development
• To serve multiple user communities
• To facilitate simultaneous data mining, knowledge discovery, and information retrieval from multiple distributed data collections
• Because data volumes are huge & growing rapidly ... For example, in Astronomy:
– a few terabytes “yesterday” (10,000 CDROMs)
– tens of terabytes “today” (100,000 CDROMs)
– petabytes “tomorrow” (within 10 years) (100,000,000 CDROMs)
National Virtual Observatory
http://www.us-vo.org/
NVO is a concept. It was recommended by the Astronomy Decadal Survey Committee to the National Academy of Sciences. Currently funded by NSF ($10M Information Technology Research grant); and NASA next year(?).
NVO is not just “National”. It is actually “Global”: http://www.ivoa.net/
Will link geographically distributed astronomical data archives and information resources = provides “one-stop shopping” for data end-user
Will be heterogeneous, interoperable, and federated (autonomy maintained at local sites) … therefore, we are using XML and Web Services.
Requires middleware standards for: metadata, resource descriptions (including the Dublin Core), queries, query results, the data (including the Data Model – see next slide), and semantics (… we are using Unified Content Descriptors = UCDs).
Requires innovative computational science technologies for: data discovery, data mining, data fusion, distributed querying, and code-shipping (“Ship the code, not the data”)
Virtual Observatory Data Model
A data model is the structure in which a computer program stores persistent information.
VxO: becoming an operational system (high TRL)
• What is a VxO?
– Virtual “anything” Observatory – where “anything” currently includes
Astrophysical, Solar, Magnetospheric, Heliospheric, Ionospheric, …
• Summary statement for any VxO … Researchers should be able to find and access seamlessly all existing data relevant to the research they are considering; those data should be independently and correctly usable, and they should be available in useful ways and in useful contexts.
• Without exception, full VxO efforts aim in this direction by providing multi-mission data access and easy browse functionality.
Capabilities of Space Physics Science Databases. The VxO Challenge: to Integrate Data, Tools, Services
http://spdf.gsfc.nasa.gov/
[Figure: Science Data Facility diagram with panels Acquisition & Ingest, Tools & Services, and Science User Support; labels include (Trajectories), ModelsWeb, CDAWLib, HelioWeb]
How do Space Science Databases Change in a Future that has an Increasingly Rich/Robust VxO Framework?
• One definition for this VxO framework could be …
"The distributed implementation of an integrated space sciences data environment"
• The broad goals of the data centers don't fundamentally change with this definition.
– They still must enable new science by adding unique value to the Space Science research community through strong multi-discipline and cross-discipline data resources, with unique services tied to unique databases.
• These services (data, functions, software) should (and will) be increasingly supplied as a key element of that new, broader VxO environment.
– Logically, the data center’s services eventually become consumers as well as providers.
– Visible early user impact of VxO is critical.
• VxOs should develop a good long-term hybrid solution = PIs + missions/projects + Science Data Centers + (other) specialized services
Science Data Formats – part of the glue
• Several key data formats are standard in space science: FITS (Astrophysics & Solar Physics), CDF and netCDF (Space Physics & Earth Science), HDF (Space Physics, Earth Science, & Computational Science).
• Why?
– These provide a baseline data format for all data sets in that discipline and in joint international projects.
– They provide the base for many data center services, data analysis tools, data integration tools, visualization packages.
– They are a key enabling technology for many different space missions and space science projects.
• Plans:
– Translation tools: from FITS <–> CDF <–> HDF <–> netCDF
– Substantial work on format translators via XML and XSLT.
Interfaces to a VxO Environment
• "Web Services" interface to existing data services
– Web Services interfaces and software libraries complement existing FTP and interactive user web interfaces.
– Web Services provide an application-to-application interface, without human intervention.
– Web Services provide a distributed data registry (WSDL), data/resource discovery (UDDI), and data services (SOAP).
– Scientific database services have unique scope and functionality that must be accessible by the VxO environment for it to gain user acceptance.
• e.g., SOAP/XML interface for Space Physics data now enables 3-D interactive graphics of distributed multi-mission data.
– Plans for data format translators and converters
Why Virtual Observatories?
• Because:
– The data are highly distributed.
– Multi-mission data lead to new discoveries.
– The data volumes are HUGE and growing.
– And maybe because of Augustine’s Law …
“Software is like entropy; it always increases.”
- Norman Augustine
Szalay’s Law: The utility of N comparable datasets increases as N²
• Metcalfe’s Law: The value of a network scales as n², where n is the number of nodes connected.
• Hagel & Armstrong’s Axiom: The aggregation of resources is more important than the amount of resources owned.
• Metcalfe’s law applies to telephones, the Internet …
• Szalay argues as follows:
– Each new dataset gives new information.
– 2-way combinations give even more new information.
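The N² scaling in Szalay's argument can be checked by counting 2-way combinations directly. The sketch below is only an illustration: the survey names are taken from later slides, and the helper names are hypothetical. The number of distinct pairs among N datasets is N(N-1)/2, which grows roughly as N² for large N.

```python
from itertools import combinations


def pairwise_combinations(datasets):
    """Return every unordered 2-way combination of the given datasets."""
    return list(combinations(datasets, 2))


def count_pairs(n):
    """Number of distinct 2-way combinations of n datasets: n(n-1)/2."""
    return n * (n - 1) // 2


# Illustrative list only; any set of comparable datasets would do.
surveys = ["SDSS", "2MASS", "FIRST", "ROSAT"]
pairs = pairwise_combinations(surveys)

# Doubling the number of datasets roughly quadruples the number of pairs.
for n in (2, 4, 8, 16):
    print(n, "datasets ->", count_pairs(n), "pairs")
```

Only 2-way combinations are counted here; higher-order combinations would add further terms.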
Size of a Typical Archived Astronomical Data Repository
• Size of the archived data for an all-sky survey -- 40,000 square degrees is two trillion pixels --
– One band: 4 Terabytes
– Multi-wavelength: 10-100 Terabytes
– Time dimension: 10 Petabytes
– LSST project (10 yrs): ~100 Petabytes @ http://www.lsst.org/
All-sky distribution of 526,280,881 stars from the MACHO survey.
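The one-band figure above follows from simple arithmetic. The sketch below works it through under one stated assumption that is not on the slide: 2 bytes (16 bits) per pixel. With that assumption, two trillion pixels give 4 Terabytes, and the implied pixel size is about half an arcsecond.

```python
import math

ARCSEC_PER_DEG = 3600


def pixel_scale_arcsec(area_sq_deg, n_pixels):
    """Side length (arcsec) of a square pixel when n_pixels tile the area."""
    area_arcsec2 = area_sq_deg * ARCSEC_PER_DEG ** 2
    return math.sqrt(area_arcsec2 / n_pixels)


def image_bytes(n_pixels, bytes_per_pixel):
    """Uncompressed size of a single-band image of the whole area."""
    return n_pixels * bytes_per_pixel


SKY_AREA_SQ_DEG = 40_000   # from the slide
N_PIXELS = 2 * 10**12      # "two trillion pixels", from the slide
BYTES_PER_PIXEL = 2        # assumption: 16-bit pixels (not stated on the slide)

scale = pixel_scale_arcsec(SKY_AREA_SQ_DEG, N_PIXELS)
one_band_tb = image_bytes(N_PIXELS, BYTES_PER_PIXEL) / 10**12

print(f"pixel scale ~ {scale:.2f} arcsec, one band ~ {one_band_tb:.0f} TB")
```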
Ongoing Surveys of the Sky
• Large number of new surveys
– multi-TB in size, 100 million objects or more
– individual archives planned, or under way
• Multi-wavelength view of the sky
– coverage in more than 13 wavelengths within 5 years
• Impressive early discoveries
– finding exotic objects by unusual colors
• L,T dwarfs, high-z quasars
– finding objects by time variability
• gravitational microlensing
Surveys: MACHO, 2MASS, DENIS, SDSS, GALEX, FIRST, DPOSS, GSC-II, COBE, MAP, NVSS, ROSAT, OGLE, ...
Sloan Digital Sky Survey Data Products
http://www.sdss.org/
• Full Data Collection: ~20 TB
• Object catalog: 400 GB, parameters of >10⁸ objects
• Redshift Catalog: 1 GB, parameters of 10⁶ objects
• Atlas Images: 1.5 TB, 5-color cutouts of >10⁸ objects
• Spectra: 60 GB, in a one-dimensional form
• Derived Catalogs: 20 GB (clusters, QSO absorption lines)
• 4x4 Pixel All-Sky Map: 60 GB, heavily compressed
Large Synoptic Survey Telescope
Highly ranked in Decadal Review
Optimized for surveys
• scan mode
• deep mode
7 square degree field
6.9m effective aperture
24th mag in 20 sec
> 20 Tbytes/night
Real-time analysis
“Celestial Cinematography”
Simultaneous multiple science goals
Large Mirror Fabrication (for large telescopes, such as LSST)
(Univ. of Arizona Mirror Laboratory)
That’s big!
NVO – It’s all about the Science
Science Discovery - the Old Way
Science Discovery - The New Way - Different!
Systematic data exploration
– will have a central role
– statistical analysis of the “typical” objects
– automated search for the “rare” events
The discovery process will rely heavily on distributed data access and multi-archive data mining.
Conceptual Architecture for a Distributed Data Mining System
[Figure: diagram with components User, Gateway, Data Archives, Discovery tools, Analysis tools]
The Discovery Process
Past: observations of small, carefully selected samples of objects in a narrow wavelength band
Future: high quality, homogeneous multi-wavelength data on millions of objects, allowing us to
– discover significant patterns from the analysis of statistically rich and unbiased image/catalog databases
– understand complex astrophysical systems via confrontation between data and large numerical simulations
The discovery process will rely heavily on advanced visualization, data mining, and statistical analysis tools.
The NVO in 5 words or less:
“The archive is the sky!”
NVO: It is all about the Science
• There is a huge scientific interest in the new data collections -- large sky surveys, multiple telescopes, multiple-wavelength coverage of the sky, time domain coverage ... And it is all available on-line from your desktop …
– “The archive is the sky!”
• Something is needed to help scientists access, mine, and explore these huge data collections.
– 1 Terabyte at 10 Mbyte/s takes 1 day to transmit
– Hundreds of intensive queries and thousands of casual queries per day
– Data will reside at multiple locations, in many different formats
– Existing analysis tools do not scale to Terabyte data sets
• Acute need in a few years; solution will not just happen.
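The "1 Terabyte at 10 Mbyte/s" figure above is easy to verify. This short sketch assumes decimal units (1 TB = 10¹² bytes, 1 MB = 10⁶ bytes) and a sustained transfer rate with no protocol overhead; the function name is illustrative.

```python
def transfer_time_seconds(n_bytes, bytes_per_second):
    """Time to move n_bytes over a link sustaining bytes_per_second."""
    return n_bytes / bytes_per_second


TB = 10**12  # assumption: decimal terabyte
MB = 10**6   # assumption: decimal megabyte

seconds = transfer_time_seconds(1 * TB, 10 * MB)
hours = seconds / 3600
days = seconds / 86400

print(f"{seconds:.0f} s = {hours:.1f} h = {days:.2f} days")
```

The result is 100,000 seconds, or a little over one day, which is why moving whole archives to the user does not scale and the slides argue for shipping the code to the data instead.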
NVO Enables New Science
http://www.us-vo.org/
• Rare and exotic objects
– Very high redshift quasars
– Dark matter in the galactic halo
– Time-variable objects, transient events: distant supernovae and microlensing
– Brown dwarfs
– Variable stars
– Asteroids...
• ...incoming!!
– Serendipity!
NVO Science Cases & Drivers (from Aspen 2001 NVO Workshop)
• Solar System: NEOs, Long-Period Comets, TNOs, Killer Asteroids!!!
• The Digital Galaxy: Find star streams and populations -- relics of past/present assembly phase. Identify components of disk, thick disk, bulge, halo, arms, ??
• The Low-Surface Brightness Universe: spatial filtering, multi-wavelength searches, intersection of the image and catalog domains
• Panchromatic Census of AGN (Active Galactic Nuclei): Complete sample of the AGN zoo, their emission mechanisms, and their environments
• Precision Cosmology & Large-Scale Structure: Hierarchical Assembly History of Galaxies and Structure, Cosmological Parameters, Dark Matter and Galaxy Biasing as f(z)
• Precision science of any kind that depends on very large sample sizes
• "Survey Science Deluxe"
• Search for rare and exotic objects (e.g., high-z QSOs, high-z SNe, L/T dwarfs)
• Serendipity: Explore new domains of parameter space (e.g., time domain, or "color-color space" of all kinds)
Enabling Computational Science Technologies for the NVO
Major Functions of the NVO and the related Enabling Computational Science Technologies
• To facilitate data mining and knowledge discovery within the very large astronomical databases -- Requires: indexing for fast queries, filtering of large queries, data subsetting, visualization, parallelization (queries, access), ...
• To facilitate linkages and cross-archive investigations -- Requires: distributed computing, scalable architectures, load balancing, thin middleware layer, interoperability, code libraries, code-shipping, data-finding services, data standards & interchange formats, query/results protocols, data fusion, quality assessment, archive/metadata profiles, user profiles, intelligent agents, ...
• To serve a broad community of users (professionals, amateur astronomers, schools, general public) -- must support thousands of queries per day
Some General Challenges for NVO (and all Virtual Data Systems)
Data Discovery: Finding data within distributed data systems
Transparent User Access to Data: across heterogeneous environments
(Distributed) Data Mining and Analysis: of terabytes!
Interoperability: of systems, data, metadata, tools
New Technology Infusion: across multiple distributed systems
Sociology: "We don't need it" or "We already have it"
How do you get all of these distributed science databases working together?
Virtual Observatory team motto:
“It’s the middleware, stupid.”
National Virtual Observatory
http://www.us-vo.org/
NVO is a concept. It was recommended by the Astronomy Decadal Survey Committee to the National Academy of Sciences. Currently funded by NSF ($10M Information Technology Research grant); and NASA next year(?).
NVO is not just “National”. It is actually “Global”: http://www.ivoa.net/
Will link geographically distributed astronomical data archives and information resources = provides “one-stop shopping” for data end-user
Will be heterogeneous, interoperable, and federated (autonomy maintained at local sites) … therefore, we are using XML and Web Services.
Requires middleware standards for: metadata, resource descriptions (including the Dublin Core), queries, query results, the data (including the Data Model – see next slide), and semantics (… we are using Unified Content Descriptors = UCDs).
Requires innovative computational science technologies for: data discovery, data mining, data fusion, distributed querying, and code-shipping (“Ship the code, not the data”)
Tools for the NVO & other Virtual Data Systems
XML (eXtensible Markup Language) = "the language of interoperability" - ADC/XML Project was most comprehensive and advanced application of XML to NASA astrophysics data archives - including the XDF (eXtensible Data Format) and FITSML data standards [ http://xml.gsfc.nasa.gov/ ]
Comprehensive Data Mining Resource Guide for Large Scientific Databases - [follow the link at http://nvo.gsfc.nasa.gov/ ]
"The trouble with facts is that there are so many of them." - Samuel McChord Crothers, in "The Gentle Reader"
ISAIA (Interoperable Systems for Archival Information Access) : resource description profiles to enable access to distributed data providers
MOCHA (Middleware based On a Code-sHipping Architecture): middleware tools for search, retrieval, & data fusion from heterogeneous databases using heterogeneous interfaces - transparently federates distributed data access - "Ship the code, not the data"
The GRID! …
What is The Grid?
• The GRID is “a distributed computing infrastructure that facilitates resource-sharing and coordinated problem-solving in dynamic, multi-institutional virtual organizations.”
http://www.globus.org/datagrid/
http://www.gridforum.org/
http://www.nas.nasa.gov/About/IPG/ (NASA’s Information Power Grid)
The Grid: by Foster & Kesselman (Argonne National Laboratory)
Internet computing and GRID technologies promise to change the way we tackle complex problems. They will enable large-scale aggregation and sharing of computational, data and other resources across institutional boundaries …. Transform scientific disciplines ranging from high energy physics to the life sciences
Data Grids vs. Computational Grids
Slide shown earlier:
Conceptual Architecture for a Distributed Data Mining System
[Figure: diagram with components User, Gateway, Data Archives, Discovery tools, Analysis tools]
A Concept for a Data Grid Node for Distributed Data Mining**
Hardware requirements
• Large distributed database engines
– with few Gbyte/s aggregate I/O speed
• High speed (>10 Gbit/s) backbones
– cross-connecting the major archives
• Scalable computing environment
– with hundreds of CPUs for analysis
[Figure: Data Grid node in three layers. Database layer: Objectivity databases on RAID, 2 GBytes/sec. Compute layer: 200 CPUs (compute nodes). Interconnect layer: 1 Gbits/sec/node. Link to other nodes: 10 Gbits/s.]
** Slide provided by Alex Szalay (JHU)
HPC comes to the rescue!
An HPC Application: Parallel Data Mining
Figure: How Parallel Processing Speeds Up Data Mining
The application of parallel computing resources and parallel data access (e.g., RAID) enables concurrent drill-downs into large data collections
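A minimal sketch of concurrent drill-downs, assuming the data collection is already split into partitions (for example, one per disk or archive). The partition contents, the magnitude threshold, and the function names are all illustrative. A thread pool from the Python standard library stands in for the parallel hardware; for CPU-bound analysis in Python, a process pool or a real cluster scheduler would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def count_bright(partition, threshold):
    """Drill-down on one partition: count objects brighter than threshold.

    Magnitudes are used, so a smaller value means a brighter object.
    """
    return sum(1 for mag in partition if mag < threshold)


def parallel_count(partitions, threshold, workers=4):
    """Run the same drill-down on every partition concurrently and combine."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda p: count_bright(p, threshold), partitions)
        return sum(partial_counts)


# Hypothetical magnitudes, split across three partitions.
partitions = [
    [12.1, 18.4, 21.0],
    [15.3, 22.7],
    [9.8, 19.9, 20.1, 23.5],
]
total = parallel_count(partitions, threshold=20.0)
print("objects brighter than mag 20:", total)
```

Each worker scans only its own partition and returns a small partial result, so only the summaries cross the network.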
Distributed Data Mining in the NVO
Data Mining: connecting the dots?
Reference: http://homepage.interaccess.com/~purcellm/lcas/Cartoons/cartoons.htm
Scaling the VO Mountain: Role of Data Mining
[Figure: mountain diagram with levels, top to bottom: Discoveries; Data Mining, Visualization; Data Services; Existing Data Centers and Archives. A “You are here” marker is shown.]
Exploration of new domains of the observable parameter space:
The Time Domain -- Part 1
Moving object appears as little rainbow in multiple-color image overlays
In-coming Killer Asteroid?
NVO = Data Mining in Action
Data Mining through Data Processing: Simple Multiple-Frame Subtraction
SUPERNOVA discovered!!
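A minimal sketch of frame subtraction, assuming two images of the same field that are already aligned and on the same flux scale. The pixel values and the threshold are made up for illustration; real pipelines also handle registration, seeing differences, and noise estimation, none of which is shown here.

```python
def subtract_frames(new_frame, reference_frame):
    """Pixel-by-pixel difference between two aligned frames of equal shape."""
    return [
        [n - r for n, r in zip(new_row, ref_row)]
        for new_row, ref_row in zip(new_frame, reference_frame)
    ]


def find_transients(new_frame, reference_frame, threshold):
    """Return (row, column) of pixels that brightened by more than threshold."""
    diff = subtract_frames(new_frame, reference_frame)
    return [
        (y, x)
        for y, row in enumerate(diff)
        for x, value in enumerate(row)
        if value > threshold
    ]


# Hypothetical 3x3 frames: the bright pixel at (1, 1) is a constant source;
# the pixel at (2, 1) brightens only in the new frame.
reference = [
    [10, 11, 10],
    [12, 50, 11],
    [10, 10, 9],
]
new = [
    [11, 10, 10],
    [12, 51, 11],
    [10, 95, 10],
]
candidates = find_transients(new, reference, threshold=20)
print("transient candidates at:", candidates)
```

Constant sources cancel in the difference image, so only the new object survives the threshold.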
Mega-Flares on normal Sun-like stars = a star like our Sun increased in brightness 300X one night!
… say what??
The Time Domain -- Part 2
NVO = Data Mining in Action
SETI@home searches for E.T.
-- An equivalent data mining tool, VO@home, on anyone’s desktop can find new comets, asteroids, exploding stars, quasars
-- Chunks of data are sent to the user’s screensaver, which begins to mine the data for special or one-of-a-kind astronomical events.
The Time Domain -- Part 3
NVO = Data Mining in Action
VO@home brings science discovery to the desktop of everyone! … a great tool for space science and computational science education.
Requires: access to distributed science databases and data mining & analysis tools.
1. Potential tool for Distributed Data Mining: ConeSearch
http://skyserver.pha.jhu.edu/VOconeprofile/
• Find all astronomical objects within a radius of a point on the sky (= cone).
• Find cross-identifications (e.g., a radio galaxy in one catalog = an Infrared galaxy in another catalog)
• >70 services are now queried.
• Results are returned in XML format (VOTable).
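The geometry behind a cone search can be sketched locally. This is not the ConeSearch service interface; it is an illustrative in-memory filter over a made-up catalog, using the haversine formula for angular separation. All names, coordinates, and the catalog layout are assumptions.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angular distance in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return catalog entries within radius_deg of the position (ra, dec)."""
    return [
        obj for obj in catalog
        if angular_separation_deg(ra, dec, obj["ra"], obj["dec"]) <= radius_deg
    ]


# Hypothetical catalog; positions in decimal degrees.
catalog = [
    {"name": "A", "ra": 180.00, "dec": 0.00},
    {"name": "B", "ra": 180.05, "dec": 0.02},
    {"name": "C", "ra": 181.00, "dec": 0.50},
]
hits = cone_search(catalog, ra=180.0, dec=0.0, radius_deg=0.1)
print([obj["name"] for obj in hits])
```

Running the same filter against two catalogs with a small radius around each source is one simple way to look for cross-identifications; a linear scan like this would need a spatial index to work at survey scale.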
2. Potential Tool for Distributed Data Mining: Data Inventory Service
http://us-vo.org/news/dis.html/
Response from the Data Inventory Service, showing links to relevant images and catalogs:
Uses ConeSearch Profile Service.
3. Potential tool for Distributed Data Mining: http://www.skyquery.net/main.htm
Submits queries to large distributed databases!
2nd place Winner in Microsoft Contest
Sample Data Mining Applications within the NVO:
• Discover data stored in geographically distributed heterogeneous systems.
• Search huge databases for trends and correlations in high-dimensional parameter spaces: identify new properties or new classes of objects.
• Search for rare, one-of-a-kind, and exotic objects in huge databases.
• Identify temporal variations in objects from millions or billions of observations.
• Identify moving objects in huge survey catalogs and image databases.
• Identify parameter glitches / anomalies / deviations either in static databases (e.g., archives) or in dynamic data (e.g., science / telemetry / engineering data streams from remote satellites).
• Find clusters, nearest neighbors, outliers, and/or zones of avoidance in the distribution of astrophysical objects or other observables in arbitrary parameter spaces.
• Serendipitously explore the huge databases that will be part of the NVO, through access to distributed, autonomous, federated, heterogeneous, multi-wavelength, multi-mission astrophysics data archives.
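As one concrete instance of the outlier and anomaly searches listed above, the sketch below flags values that sit far from the median of a single parameter. It uses the median absolute deviation (MAD) as a robust spread estimate; this is a generic technique chosen for illustration, not a method named in the lecture, and the sample values are invented.

```python
import statistics


def find_outliers(values, n_sigma=3.0):
    """Return indices of values more than n_sigma robust sigmas from the median.

    The robust sigma is 1.4826 * MAD, which matches the standard deviation
    for normally distributed data.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    robust_sigma = 1.4826 * mad
    return [
        i for i, v in enumerate(values)
        if abs(v - median) > n_sigma * robust_sigma
    ]


# Hypothetical color index for eight catalog objects; one is far off the locus.
colors = [0.51, 0.48, 0.50, 0.52, 0.49, 0.50, 2.75, 0.47]
outliers = find_outliers(colors)
print("outlier indices:", outliers)
```

The median-based estimate is used because a single extreme object would inflate an ordinary standard deviation and could hide itself.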
Summary - Applications of Data Mining to the NVO
Data Mining Resource Guide for Space Science:
http://nvo.gsfc.nasa.gov/nvo_datamining.html
• Purpose and Content -- to assist NASA scientists in data mining activities by providing comprehensive summaries of: NASA-funded data mining projects, data mining tutorials, algorithms, techniques, software, organizations, conference links, conference summaries, publications, lessons learned, related I.T. technologies, science applications, expert interviews, and applications of data mining to the new National Virtual Observatory (NVO).
http://www.us-vo.org/
Web References
• General:
– http://xml.gsfc.nasa.gov/
– http://nvo.gsfc.nasa.gov/
– http://www.us-vo.org/
• Specific:
– VOTable - XML language for queries and tabular query results: http://www.us-vo.org/VOTable/
– Data Mining Resource Guide: http://nvo.gsfc.nasa.gov/nvo_datamining.html
– Scientific Data Mining Workshop and Reports: http://www.anc.ed.ac.uk/sdmiv/
VO: Creating the Future of Astrophysics Data Analysis
Summary
• Quick Review of Astronomy Data
• The National Virtual Observatory (NVO)
• Other Virtual Observatories for Space Science
• Why Virtual Observatories?
• NVO – It’s all about the Science:
– IT-enabled, Science-enabling
• The Enabling Computational Science Technologies for the NVO – where you can help!
• Distributed Data Mining in the NVO
Next Lecture
• November 25 – Intelligent Archives of the Future