Spatial Data Mining Hari Agung Departemen Ilmu Komputer FMIPA IPB [email protected].


Spatial Data Mining

Hari Agung

Departemen Ilmu Komputer FMIPA IPB

[email protected]

2004/09/09

• Motivation and General Description
• Data Mining: Basic Concepts
• Data Mining Techniques
• Spatial Data Mining
• Spatial Data Mining Scenarios in Meteorology and Weather Forecasting
• Conclusions
• Questions & Discussions


Spatial Data Mining

• Spatial Patterns
  – Spatial outliers
  – Location prediction
  – Associations, co-locations
  – Hotspots, clustering, trends, …
• Primary Tasks
  – Mining Spatial Association Rules
  – Spatial Classification and Prediction
  – Spatial Data Clustering Analysis
  – Spatial Outlier Analysis
• Example: Unusual warming of the Pacific Ocean (El Niño) affects weather in the USA…


Spatial Data Mining Results

• Understanding spatial data, discovering relationships between spatial and nonspatial data, construction of spatial knowledge bases, etc.
• In various forms
  – The description of the general weather patterns in a set of geographic regions is a spatial characteristic rule.
  – The comparison of two weather patterns in two geographic regions is a spatial discriminant rule.
  – A rule like “most cities in Canada are close to the Canada-US border” is a spatial association rule.
    • near(x, coast) ∧ southeast(x, USA) ⇒ hurricane(x), (70%)
  – Others: spatial clusters, …
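The percentage attached to a spatial association rule can be read as its confidence: the share of objects satisfying the left-hand side that also satisfy the right-hand side. A minimal sketch, assuming a made-up table of cities whose spatial predicates have already been evaluated (the records and the helper name `rule_stats` are hypothetical, not from the slides):

```python
# Hypothetical records: one dict of precomputed spatial predicates per city.
cities = [
    {"name": "A", "near_coast": True,  "southeast": True,  "hurricane": True},
    {"name": "B", "near_coast": True,  "southeast": True,  "hurricane": True},
    {"name": "C", "near_coast": True,  "southeast": True,  "hurricane": False},
    {"name": "D", "near_coast": True,  "southeast": False, "hurricane": False},
    {"name": "E", "near_coast": False, "southeast": True,  "hurricane": False},
    {"name": "F", "near_coast": False, "southeast": False, "hurricane": False},
]

def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) of the rule: all(antecedent) => consequent."""
    matches_antecedent = [r for r in records if all(r[p] for p in antecedent)]
    matches_both = [r for r in matches_antecedent if r[consequent]]
    support = len(matches_both) / len(records)
    confidence = (
        len(matches_both) / len(matches_antecedent) if matches_antecedent else 0.0
    )
    return support, confidence

# near(x, coast) AND southeast(x, USA) => hurricane(x)
support, confidence = rule_stats(cities, ["near_coast", "southeast"], "hurricane")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

In this toy table three cities satisfy the antecedent and two of them also see hurricanes, so the rule holds with a confidence of about 67%.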


What is Spatial Data?

• Used in/for: GIS (Geographic Information Systems), meteorology, astronomy, environmental studies, etc.
• The data related to objects that occupy space
  – traffic, bird habitats, global climate, logistics, ...
• Object types:
  – Points, Lines, Polygons, etc.
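Points, lines, and polygons are commonly stored as coordinate tuples and vertex lists. A small sketch, assuming planar (x, y) coordinates and hypothetical helper names, showing how a line's length and a polygon's area and perimeter follow from the vertices:

```python
import math

point = (2.0, 3.0)                                           # a single location
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                 # an open polyline
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # ring, first vertex not repeated

def line_length(vertices):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(
        math.hypot(b[0] - a[0], b[1] - a[1])
        for a, b in zip(vertices, vertices[1:])
    )

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def polygon_perimeter(vertices):
    """Length of the closed boundary (the ring is closed by reusing vertex 0)."""
    return line_length(vertices + [vertices[0]])

print(line_length(line), polygon_area(polygon), polygon_perimeter(polygon))
```

Real GIS data uses longitude/latitude and needs a projection before such planar formulas apply; that step is left out here.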


Basic Concepts (1)

• Spatial data mining performs the same functions as general data mining, with the end objective of finding patterns in geography, meteorology, etc.
• The main difference: spatial autocorrelation
  – the neighbors of a spatial object may have an influence on it and therefore have to be considered as well
• Spatial attributes
  – Topological
    • adjacency or inclusion information
  – Geometric
    • position (longitude/latitude), area, perimeter, boundary polygon
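Spatial autocorrelation can be quantified. One common statistic, not named in the slides and used here only as an illustration, is Moran's I: it is positive when neighboring objects carry similar values and negative when they carry dissimilar ones. A minimal sketch, assuming binary adjacency weights and a hypothetical chain of six cells:

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights (w_ij = 1 if j is a listed neighbor of i)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations over all neighboring pairs (both directions).
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    total_weight = sum(len(nbrs) for nbrs in neighbors.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)

# Hypothetical layout: six cells in a row, each adjacent to its left/right cell.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

clustered = morans_i([1, 1, 1, 5, 5, 5], chain)     # similar values sit together
alternating = morans_i([1, 5, 1, 5, 1, 5], chain)   # neighbors always differ
print(clustered, alternating)
```

The clustered layout yields a positive value and the alternating layout a negative one, which is the sense in which neighbors "have to be considered as well".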


Basic Concepts (2)

• Spatial neighborhood
  – Topological relation
    • “intersect”, “overlap”, “disjoint”, …
  – Distance relation
    • “close_to”, “far_away”, …
  – Direction/orientation relation
    • “left_of”, “west_of”, …
• Global model might be inconsistent with regional models

[Figure: Global Model vs. Local Model]
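The three kinds of neighborhood relation can be expressed as simple predicates. A rough sketch, assuming planar coordinates, points as (x, y) tuples, and regions simplified to axis-aligned bounding boxes (min_x, min_y, max_x, max_y); the function names mirror the relation names on the slide but their definitions are illustrative assumptions:

```python
import math

def close_to(p, q, threshold):
    """Distance relation: Euclidean distance no larger than a chosen threshold."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= threshold

def far_away(p, q, threshold):
    return not close_to(p, q, threshold)

def west_of(p, q):
    """Direction relation: p has a smaller x (easting/longitude) than q."""
    return p[0] < q[0]

def intersects(box_a, box_b):
    """Topological relation on bounding boxes: they share at least one point."""
    return (
        box_a[0] <= box_b[2] and box_b[0] <= box_a[2]
        and box_a[1] <= box_b[3] and box_b[1] <= box_a[3]
    )

def disjoint(box_a, box_b):
    return not intersects(box_a, box_b)

print(close_to((0, 0), (3, 4), 5.0))             # distance is exactly 5
print(west_of((1, 0), (2, 0)))
print(intersects((0, 0, 2, 2), (1, 1, 3, 3)))
print(disjoint((0, 0, 1, 1), (2, 2, 3, 3)))
```

Exact polygon topology (as opposed to bounding boxes) needs a geometry library; bounding boxes are used here only to keep the example self-contained.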


Applications

• NASA Earth Observing System (EOS): Earth science data
• National Inst. of Justice: crime mapping
• Census Bureau, Dept. of Commerce: census data
• Dept. of Transportation (DOT): traffic data
• National Inst. of Health (NIH): cancer clusters
• ……


Example: What Kind of Houses Are Highly Valued?—Associative Classification


Meteorological Data Mining

• Motivation
  – Many analysis methods must be applied to fast-growing data for climate studies
• Result
  – Appropriate presentation instruments (graphs, maps, reports, etc.) must be applied
• Examples
  – Spatial outliers can be associated with disastrous natural events such as tornadoes, hurricanes, and forest fires
  – Associations between disaster events and certain meteorological observations
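A spatial outlier is an observation that differs sharply from its spatial neighbors, even if it is unremarkable globally. One simple way to look for such points, sketched here under the assumption of a hypothetical row of ten stations with made-up temperatures, is to compare each value with its neighborhood average and standardize the differences:

```python
import statistics

def spatial_outliers(values, neighbors, threshold=2.0):
    """Flag sites whose difference from their neighborhood mean is extreme.

    values: {site: measurement}; neighbors: {site: [neighboring sites]}.
    The threshold on the standardized difference is an arbitrary choice.
    """
    diffs = {
        site: values[site] - statistics.mean([values[n] for n in neighbors[site]])
        for site in values
    }
    mu = statistics.mean(list(diffs.values()))
    sigma = statistics.pstdev(list(diffs.values()))
    if sigma == 0:
        return []
    return [site for site, d in diffs.items() if abs((d - mu) / sigma) > threshold]

# Hypothetical temperatures (deg C) at ten stations along a line.
readings = [20, 21, 20, 22, 21, 35, 21, 20, 22, 21]
temps = dict(enumerate(readings))
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(readings)] for i in temps}

outliers = spatial_outliers(temps, chain)
print(outliers)
```

Only the station reporting 35 is flagged; its two neighbors are pulled away from their own neighborhood means as well, but not far enough to cross the threshold.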

2004/09/09 Hong Kong Observatory, Hong Kong Meteorological Society

Case Studies (1): Astronomy

• SKICAT (SKy Image Cataloging and Analysis Tool) (Caltech, US)
• The Palomar Observatory discovered 22 quasars with the help of data mining
• The Second Palomar Observatory Sky Survey (POSS-II)
  – decision tree methods
  – classification of galaxies, stars and other stellar objects
• About 3 TB of sky images were analyzed


Case Studies (2): NCAR & UCAR

• National Center for Atmospheric Research (NCAR) & University Corporation for Atmospheric Research (UCAR), US
  – http://www.ucar.edu/
• “Automatic Fuzzy Logic-based systems now compete with human forecasts”
• Richard Wagoner, Deputy Director at Research Applications Program (RAP), NCAR
• Intelligent Weather System (IWS)
  – Detection and forecast in the areas of en-route turbulence, en-route icing, ceiling/visibility, and convective hazards in the aviation community
  – Road winter maintenance, airport operations, and flash flood forecasting


Case Studies (3): CrossGrid (EU)

• Objective
  – To develop, implement and exploit new Grid components for interactive compute- and data-intensive applications, such as decision support systems for flooding crisis teams and air pollution modelling combined with weather forecasting
• Main tasks in Meteorological applications package
  – Data mining for atmospheric circulation patterns
    • Find a set of representative prototypes of the atmospheric patterns in a region of interest
  – Weather forecasting for maritime applications
  – Ocean wave forecasting by models of various complexity


SOM Application for Data Mining: Downscaling Weather Forecasts

• Data
  – ERA-15 using a T106L31 model (from 1978 to 1994) with 1.125° resolution
  – Terabytes
  – Comprises data from approx. 20 variables (such as temperature, humidity, pressure, etc.) at 30 pressure levels of a 360x360 nodes grid
• Adaptive Competitive Learning
• Sub-grid details escape from numerical models
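The adaptive competitive learning behind a self-organizing map (SOM) can be sketched in a few lines: for every input pattern the closest prototype wins, and the winner and its lattice neighbors move toward the pattern. The sketch below is a generic toy SOM under stated assumptions (a 1-D lattice of two units, linear decay schedules, made-up 2-D patterns); it is not the implementation used in the CrossGrid work.

```python
import math

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def train_som(data, prototypes, epochs=20, lr0=0.5, sigma0=0.5, sigma_end=0.1):
    """Train a 1-D SOM; prototypes[i] sits at lattice position i."""
    protos = [list(p) for p in prototypes]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                        # learning rate decays linearly
        sigma = sigma0 + (sigma_end - sigma0) * frac   # neighborhood width shrinks
        for x in data:
            # Competition: the best matching unit (BMU) is the closest prototype.
            bmu = min(range(len(protos)), key=lambda i: squared_distance(protos[i], x))
            # Adaptation: the BMU moves most, lattice neighbors move less.
            for i, p in enumerate(protos):
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(p)):
                    p[k] += lr * h * (x[k] - p[k])
    return protos

# Hypothetical "patterns": two well-separated groups of 2-D feature vectors.
patterns = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
prototypes = train_som(patterns, [patterns[0], patterns[-1]])
print(prototypes)
```

After training, each prototype sits near one of the two groups, which is the sense in which a SOM yields a small set of representative patterns for a large archive.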


Dept. of Applied Mathematics

Universidad de Cantabria

Santander, Spain


Case Studies (4): Typhoon Image Data Mining

• Objective
  – To establish algorithms and database models for the discovery of information and knowledge useful for typhoon analysis and prediction
  – Content-based image retrieval technology to search for similar cloud patterns in the past
  – Data mining technology to extract spatio-temporal pattern information that is meaningful from a meteorological viewpoint
• Result
  – Alignment of Multiple Typhoons, Explore by Projection to 2D Plane, Diurnal Analysis


Case Studies (6): Rainfall Classification
University of Oklahoma, Norman

• To classify significant and interesting features within a two-dimensional spatial field of meteorological data
  – Observed or predicted rainfall
• Data source
  – Estimates of hourly accumulated rainfall
  – Using radar and rain gauge data
• “Attributes” for classification
  – Statistical parameters representing the distribution of rainfall amounts across the region
• Classification Method
  – Hierarchical cluster analysis
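The pipeline described above, statistical attributes per rainfall field followed by hierarchical cluster analysis, can be illustrated with a small sketch. The attributes chosen here (mean, maximum, wet fraction), the average-linkage criterion, and the tiny 2x2 fields are all assumptions for illustration; the slides do not say which statistics or linkage the Oklahoma study used.

```python
import math
from itertools import combinations

def rainfall_attributes(field):
    """Summarize a 2-D rainfall field (rows of mm/h values) by simple statistics."""
    cells = [v for row in field for v in row]
    wet = [v for v in cells if v > 0]
    return (sum(cells) / len(cells), max(cells), len(wet) / len(cells))

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(euclidean(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters

# Hypothetical hourly rainfall fields: two nearly dry, two stormy.
fields = [
    [[0, 0], [0, 1]],
    [[0, 1], [0, 1]],
    [[10, 20], [15, 0]],
    [[12, 18], [20, 5]],
]
features = [rainfall_attributes(f) for f in fields]
groups = agglomerate(features, n_clusters=2)
print(groups)
```

The attributes are left unscaled, so the rainfall amounts dominate the distance; in practice the features would usually be normalized first.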


What can we learn from those scenarios?

• Data mining is a promising approach to meteorological analysis

• Very strong interaction between scientists and the knowledge discovery system is necessary

• The users define features of the meteorological phenomena based on their expert knowledge

• The system extracts the instances of such phenomena

• Then, further analysis of phenomena is possible


Conclusions

• Data mining: discovering interesting patterns from large amounts of data
• A natural evolution of database technology, in great demand, with wide applications
• A KDD process includes data mining, and other steps
• Data mining can be performed in a variety of information repositories
• Data mining tasks: characterization, discrimination, association, classification, clustering, outlier and trend analysis, etc.


And now discussion