Spring 2003Data Mining by H. Liu, ASU1 6. Spatial Mining Spatial Data and Structures Images Spatial...
-
Upload
ethelbert-sanders -
Category
Documents
-
view
212 -
download
0
Transcript of Spring 2003Data Mining by H. Liu, ASU1 6. Spatial Mining Spatial Data and Structures Images Spatial...
Spring 2003 Data Mining by H. Liu, ASU 1
6. Spatial Mining
Spatial Data and Structures
Images
Spatial Mining Algorithms
Spring 2003 Data Mining by H. Liu, ASU 2
Definitions
• Spatial data is about instances located in a physical space
• Spatial data has location or geo-referenced features
• Some of these features are: – Address, latitude/longitude (explicit)
– Location-based partitions in databases (implicit)
Spring 2003 Data Mining by H. Liu, ASU 3
Applications and Problems
• Geographic information systems (GIS) store information related to geographic locations on Earth– Weather, climate monitoring, community infrastructure
needs, disaster management, and hazardous waste
• Homeland security issues such as prediction of unexpected events and planning of evacuation
• Remote sensing and image classification• Biomedical applications include medical imaging
and illness diagnosis
Spring 2003 Data Mining by H. Liu, ASU 4
Use of Spatial Data
• Map overlay – merging disparate data– Different views of the same area: (Level 1) streets, power
lines, phone lines, sewer lines, (Level 2) actual elevations, building locations, and rivers
• Spatial selection – find all houses near ASU• Spatial join – nearest for points, intersection for areas• Other basic spatial operations
– Region/range query for objects intersecting a region
– Nearest neighbor query for objects closest to a given place
– Distance scan asking for objects within a certain radius
Spring 2003 Data Mining by H. Liu, ASU 5
Spatial Data Structures
• Minimum bounding rectangles (MBR)• Different tree structures
– Quad tree
– R-Tree
– kd-Tree
• Image databases
Spring 2003 Data Mining by H. Liu, ASU 6
MBR
• Representing a spatial object by the smallest rectangle [(x1,y1), (x2,y2)] or rectangles
(x1,y1)
(x2,y2)
Spring 2003 Data Mining by H. Liu, ASU 7
Tree Structures
• Quad Tree: every four quadrants in one layer forms a parent quadrant in an upper layer– An example
Spring 2003 Data Mining by H. Liu, ASU 8
R-Tree
• Indexing MBRs in a tree– An R-tree order of m has at most m entries in one node
– An example (order of 3)
R8
R1
R2R3
R6
R5R4
R7
R8
R7R6
R3R2R1 R5R4
Spring 2003 Data Mining by H. Liu, ASU 9
kd-Tree
• Indexing multi-dimensional data, one dimension for a level in a tree– An example
Spring 2003 Data Mining by H. Liu, ASU 10
Common Tasks dealing with Spatial Data
• Data focusing– Spatial queries– Identifying interesting parts in spatial data– Progress refinement can be applied in a tree structure
• Feature extraction– Extracting important/relevant features for an
application
• Classification or others– Using training data to create classifiers– Many mining algorithms can be used
• Classification, clustering, associations
Spring 2003 Data Mining by H. Liu, ASU 11
Spatial Mining Tasks
• Spatial classification• Spatial clustering• Spatial association rules
Spring 2003 Data Mining by H. Liu, ASU 12
Spatial Classification
• Use spatial information at different (coarse/fine) levels (in different indexing trees) for data focusing
• Determine relevant spatial or non-spatial features• Perform normal supervised learning algorithms
– e.g., Decision trees, NBC, etc.
Spring 2003 Data Mining by H. Liu, ASU 13
Spatial Clustering
• Use tree structures to index spatial data• Cluster locally
– DBSCAN: R-tree
– CLIQUE: Grid or Quad tree
Spring 2003 Data Mining by H. Liu, ASU 14
Spatial Association Rules
• Spatial objects are of major interest, not transactions• A B
– A, B can be either spatial or non-spatial (3 combinations)
– What is the fourth combination?
• Association rules can be found w.r.t. the 3 types
Spring 2003 Data Mining by H. Liu, ASU 15
Summary
• Spatial data can contain both spatial and non-spatial features.
• When spatial information becomes dominant interest, spatial data mining should be applied.
• Spatial data structures can facilitate spatial mining.
• Standard data mining algorithms can be modified for spatial data mining, with a substantial part of preprocessing to take into account of spatial information.
Spring 2003 Data Mining by H. Liu, ASU 16
Bibliography
• M. H. Dunham. Data Mining – Introductory and Advanced Topics. Prentice Hall. 2003.
• R.O. Duda, P.E. Hart, D.G. Stork. Pattern Classification, 2nd edition. Wiley-Interscience.
• J. Han and M. Kamber. Data Mining – Concepts and Techniques. 2001. Morgan Kaufmann.