Spring 2003, Data Mining by H. Liu, ASU

6. Spatial Mining

Spatial Data and Structures

Images

Spatial Mining Algorithms

Definitions

• Spatial data is about instances located in a physical space

• Spatial data has location or geo-referenced features

• Some of these features are:

– Address, latitude/longitude (explicit)

– Location-based partitions in databases (implicit)

Applications and Problems

• Geographic information systems (GIS) store information related to geographic locations on Earth

– Weather, climate monitoring, community infrastructure needs, disaster management, and hazardous waste

• Homeland security issues such as prediction of unexpected events and planning of evacuation

• Remote sensing and image classification

• Biomedical applications include medical imaging and illness diagnosis

Use of Spatial Data

• Map overlay – merging disparate data

– Different views of the same area: (Level 1) streets, power lines, phone lines, sewer lines; (Level 2) actual elevations, building locations, and rivers

• Spatial selection – find all houses near ASU

• Spatial join – nearest for points, intersection for areas

• Other basic spatial operations

– Region/range query for objects intersecting a region

– Nearest neighbor query for objects closest to a given place

– Distance scan asking for objects within a certain radius
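The three basic operations above can be illustrated with a brute-force sketch over a handful of points. The house names, coordinates, and function names are made up for illustration; a real system would answer these queries through a spatial index rather than a linear scan.

```python
import math

# Hypothetical sample data: (name, x, y) points on a local grid, in km.
houses = [("h1", 0.5, 0.2), ("h2", 2.0, 2.0), ("h3", 0.1, 0.9), ("h4", 5.0, 1.0)]

def dist(x1, y1, x2, y2):
    """Euclidean distance between two points."""
    return math.hypot(x1 - x2, y1 - y2)

def range_query(points, xmin, ymin, xmax, ymax):
    """Region/range query: objects inside the rectangle [xmin, xmax] x [ymin, ymax]."""
    return [n for n, x, y in points if xmin <= x <= xmax and ymin <= y <= ymax]

def nearest_neighbor(points, qx, qy):
    """Nearest neighbor query: the single object closest to (qx, qy)."""
    return min(points, key=lambda p: dist(p[1], p[2], qx, qy))[0]

def distance_scan(points, qx, qy, radius):
    """Distance scan: objects within `radius` of (qx, qy), nearest first."""
    hits = [(dist(x, y, qx, qy), n) for n, x, y in points]
    return [n for d, n in sorted(hits) if d <= radius]

print(range_query(houses, 0, 0, 1, 1))
print(nearest_neighbor(houses, 0, 0))
print(distance_scan(houses, 0, 0, 3.0))
```

Each function scans every point, so the cost grows linearly with the data; the tree structures on the following slides exist to avoid exactly that.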

Spatial Data Structures

• Minimum bounding rectangles (MBR)

• Different tree structures

– Quad tree

– R-Tree

– kd-Tree

• Image databases

MBR

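• Representing a spatial object by the smallest axis-aligned rectangle [(x1,y1), (x2,y2)] that encloses it, or by several such rectangles

[Figure: an object enclosed by its MBR, with opposite corners labeled (x1,y1) and (x2,y2)]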
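As a minimal sketch, an MBR can be computed from an object's vertices by taking the minimum and maximum of each coordinate. The function names and the sample shapes are illustrative assumptions, not part of the slides.

```python
def mbr(points):
    """Return the MBR of a list of (x, y) vertices as ((x1, y1), (x2, y2))."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))

def mbrs_intersect(a, b):
    """True if two MBRs overlap (touching edges count as overlap)."""
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

# Hypothetical objects given by their vertices.
lake = [(1, 1), (4, 2), (3, 5), (0, 3)]
road = [(5, 0), (6, 6)]

print(mbr(lake))
print(mbrs_intersect(mbr(lake), mbr(road)))
```

Comparing MBRs is a cheap filter: if two MBRs do not intersect, the objects cannot intersect either, so the exact geometry test can be skipped.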

Tree Structures

• Quad Tree: every four quadrants in one layer form a parent quadrant in the layer above

– An example
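The parent/quadrant relationship can be sketched as a point quad tree that splits a square into four child quadrants once it holds more than `capacity` points. The class name, the `capacity` parameter, and the sample points are assumptions made for illustration.

```python
class QuadTree:
    """Point quad tree over the square [x, x+size) x [y, y+size)."""

    def __init__(self, x, y, size, capacity=1):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.points = []
        self.children = None  # list of four child quadrants once split

    def _child_for(self, px, py):
        """Pick the child quadrant that contains (px, py)."""
        half = self.size / 2
        col = 1 if px >= self.x + half else 0
        row = 1 if py >= self.y + half else 0
        return self.children[2 * row + col]

    def insert(self, px, py):
        if self.children is None:
            self.points.append((px, py))
            # Split when over capacity; the size guard stops endless
            # splitting if identical points are inserted.
            if len(self.points) > self.capacity and self.size > 1e-9:
                self._split()
        else:
            self._child_for(px, py).insert(px, py)

    def _split(self):
        """Create four child quadrants and push the stored points down."""
        half = self.size / 2
        self.children = [
            QuadTree(self.x + dx * half, self.y + dy * half, half, self.capacity)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pts, self.points = self.points, []
        for px, py in pts:
            self._child_for(px, py).insert(px, py)

    def depth(self):
        """Number of layers from this node down to its deepest leaf."""
        if self.children is None:
            return 1
        return 1 + max(c.depth() for c in self.children)

tree = QuadTree(0, 0, 8)
for p in [(1, 1), (6, 6), (1, 6), (1, 2)]:
    tree.insert(*p)
print(tree.depth())
```

Dense areas end up with deeper subtrees while sparse areas stay coarse, which is what makes the structure useful for looking at data at different levels of detail.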

R-Tree

• Indexing MBRs in a tree

– An R-tree of order m has at most m entries in one node

– An example (order of 3)

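[Figure: the MBRs R1–R8 drawn in the plane, and the corresponding R-tree with root R8, second-level nodes R6 and R7, and leaf groups {R1, R2, R3} and {R4, R5}]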
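A search over such a tree can be sketched as follows. The rectangle coordinates are invented, and attaching {R1, R2, R3} to R6 and {R4, R5} to R7 is an assumption read off the example's leaf row; only the idea (descend only into nodes whose MBR intersects the query) is the point here.

```python
def union(rects):
    """MBR covering a list of rectangles given as (x1, y1, x2, y2)."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def intersects(a, b):
    """True if rectangles a and b overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Hypothetical coordinates for the leaf MBRs.
leaves = {"R1": (0, 0, 2, 2), "R2": (1, 3, 3, 5), "R3": (3, 1, 4, 2),
          "R4": (7, 6, 9, 8), "R5": (8, 3, 10, 5)}

def leaf(name):
    return {"name": name, "children": [], "mbr": leaves[name]}

def node(name, children):
    """Internal node whose MBR is the union of its children's MBRs."""
    return {"name": name, "children": children,
            "mbr": union([c["mbr"] for c in children])}

# Order 3: no node holds more than three entries.
R6 = node("R6", [leaf("R1"), leaf("R2"), leaf("R3")])
R7 = node("R7", [leaf("R4"), leaf("R5")])
R8 = node("R8", [R6, R7])

def search(n, query, visited):
    """Return names of leaf MBRs intersecting `query`; record visited nodes."""
    visited.append(n["name"])
    if not intersects(n["mbr"], query):
        return []  # prune the whole subtree
    if not n["children"]:
        return [n["name"]]
    hits = []
    for c in n["children"]:
        hits += search(c, query, visited)
    return hits

visited = []
print(search(R8, (0.5, 0.5, 1.5, 3.5), visited))
print(visited)
```

Because the query does not intersect R7, the leaves R4 and R5 are never examined.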

kd-Tree

• Indexing multi-dimensional data, with each level of the tree splitting on one dimension

– An example
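A minimal kd-tree sketch, assuming the common construction that cycles through the dimensions level by level and splits at the median; the sample points and function names are illustrative.

```python
def build(points, depth=0):
    """Build a kd-tree; level `depth` splits on dimension depth mod k."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build(points[:mid], depth + 1),
            "right": build(points[mid + 1:], depth + 1)}

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nearest(node, target, best=None):
    """Nearest neighbor search, pruning subtrees that cannot hold a closer point."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best so far.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(pts)
print(tree["point"], tree["left"]["point"])
print(nearest(tree, (9, 2)))
```

The root splits on x, its children split on y, and so on, so each level of the tree handles one dimension.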

Common Tasks dealing with Spatial Data

• Data focusing

– Spatial queries

– Identifying interesting parts in spatial data

– Progressive refinement can be applied in a tree structure

• Feature extraction

– Extracting important/relevant features for an application

• Classification or others

– Using training data to create classifiers

– Many mining algorithms can be used: classification, clustering, associations

Spatial Mining Tasks

• Spatial classification

• Spatial clustering

• Spatial association rules

Spatial Classification

• Use spatial information at different (coarse/fine) levels (in different indexing trees) for data focusing

• Determine relevant spatial or non-spatial features

• Apply standard supervised learning algorithms

– e.g., decision trees, naive Bayes classifier (NBC), etc.

Spatial Clustering

• Use tree structures to index spatial data

• Cluster locally

– DBSCAN: R-tree

– CLIQUE: Grid or Quad tree
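To make "cluster locally" concrete, here is a compact DBSCAN sketch. The neighborhood lookup is a brute-force scan for brevity; this is the step an R-tree index would speed up. The parameter names `eps` and `min_pts` follow common usage, and the sample points are invented.

```python
import math

def region_query(points, i, eps):
    """Indices of points within eps of point i (brute force stand-in for an index)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels

pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(pts, eps=1.5, min_pts=3))
```

Every cluster grows only through nearby points, so the algorithm spends nearly all of its time in `region_query`, which is why the choice of spatial index matters.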

Spatial Association Rules

• Spatial objects are of major interest, not transactions

• A ⇒ B

– A, B can be either spatial or non-spatial (3 combinations)

– What is the fourth combination?

• Association rules can be found w.r.t. the 3 types
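As an illustration of one of these types, the sketch below evaluates a rule with a spatial antecedent and a non-spatial consequent, close_to(house, lake) ⇒ expensive(house), by counting over spatial objects instead of transactions. The data, the 1.0 distance threshold, and the price cutoff are all hypothetical.

```python
import math

# Hypothetical houses with a location and a non-spatial price attribute.
houses = [
    {"xy": (0.0, 0.0), "price": 900},
    {"xy": (0.5, 0.2), "price": 850},
    {"xy": (0.3, 0.9), "price": 300},
    {"xy": (5.0, 5.0), "price": 250},
    {"xy": (6.0, 4.0), "price": 800},
]
lakes = [(0.2, 0.1)]  # hypothetical lake locations

def close_to_lake(house, max_dist=1.0):
    """Spatial predicate: some lake lies within max_dist of the house."""
    return any(math.dist(house["xy"], lake) <= max_dist for lake in lakes)

def expensive(house, cutoff=700):
    """Non-spatial predicate on the price attribute."""
    return house["price"] >= cutoff

def rule_stats(objects, antecedent, consequent):
    """Support and confidence of the rule antecedent => consequent."""
    n_a = sum(1 for o in objects if antecedent(o))
    n_ab = sum(1 for o in objects if antecedent(o) and consequent(o))
    support = n_ab / len(objects)
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence

support, confidence = rule_stats(houses, close_to_lake, expensive)
print(support, confidence)
```

Swapping the two predicates, or using two spatial predicates, gives the other rule types with the same counting code.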

Summary

• Spatial data can contain both spatial and non-spatial features.

• When spatial information becomes the dominant interest, spatial data mining should be applied.

• Spatial data structures can facilitate spatial mining.

• Standard data mining algorithms can be modified for spatial data mining, with substantial preprocessing to take spatial information into account.

Bibliography

• M. H. Dunham. Data Mining – Introductory and Advanced Topics. Prentice Hall. 2003.

• R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification, 2nd edition. Wiley-Interscience.

• J. Han and M. Kamber. Data Mining – Concepts and Techniques. Morgan Kaufmann. 2001.