Data Mining - ESA SEOM · seom.esa.int/landtraining2015/files/Day_5/D5T1a_LTC2015... · 2015-10-21


Data Mining

Prof. Dr. Eng. habil. Mihai Datcu

Institut für Methodik der Fernerkundung (Remote Sensing Technology Institute) / Deutsches Fernerkundungsdatenzentrum (German Remote Sensing Data Center)

Slide 2

EO Data Mining

EO Data Mining, in analogy to other forms of data mining such as text mining or semantic web technologies, aims at making image content accessible. It:
• Automatically highlights satellite image subsets
• Generates semantic catalogues
• Explores image archives and helps find specific objects of interest
• Assists image content assessment and understanding
• Supports time-consuming visual interpretation
• Performs analysis on multi-temporal and multi-sensor datasets

Slide 3

EO images

Geodata

Metadata

GIS

Geoinformation extraction

Reasoning & AI

Semantic annotation

Shareable information

Data Features Semantics Information

The Concept: from Data to Information

Slide 4

Physical Parameters

Geometrical Parameters

Patterns Objects Structures

Atmosphere Meteorology LR Superspectral

Single Pass InSAR Altimeter PS InSAR

PolSAR

Multispectral

VHR Optical VHR SAR

A taxonomy of EO sensors

Slide 5

A [log]History

2014 Sentinel-1: free data at 5 m resolution
2010 TanDEM-X: free scientific data at 1-3 m resolution
2007 COSMO-SkyMed, RADARSAT-2, TerraSAR-X: free scientific data at 1 m resolution
2000 SRTM: first time > 80% of continental DEM in 30 m grid
1992 ERS-1: first time a lot of free data!
1978 Seasat: first EO spaceborne SAR
1954 Wiley: patent on SAR
WW I – WW II: RADAR
1904 Hülsmeyer: patent on the Telemobiloskop
1897 Marconi: first wireless communication
1888 Hertz: experimental validation
1865 Maxwell: theory of electromagnetic waves

100 years 10 years

Slide 6


History 1900s

Slide 7

Slide 8

Slide 9

The EO Big Data

• EO archive = O(10 000 000) EO products
• EO product = “image” + “xml file”
• “image” = 30 000 x 30 000 pixels
• “pixel” = 16/32/…/1000s of bits, real/complex/multiband
• “xml file” = a lot about the imaging mode, orbit, position, timing, …
• TOTAL = many ZettaBytes
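The figures on this slide can be multiplied out as a quick sanity check. The sketch below is only a back-of-the-envelope estimate: it counts raw pixel data for one archive of the stated size, ignores the XML files, and uses decimal (SI) prefixes. The function names are illustrative, not part of any EO toolbox.

```python
# Back-of-the-envelope volume of the raw pixel data, using the slide's figures.
PRODUCTS = 10_000_000               # order of magnitude of products in the archive
PIXELS_PER_IMAGE = 30_000 * 30_000  # one "image" as described on the slide


def archive_bytes(bits_per_pixel: int) -> int:
    """Raw pixel volume of the whole archive in bytes (XML files ignored)."""
    return PRODUCTS * PIXELS_PER_IMAGE * bits_per_pixel // 8


def human(n_bytes: int) -> str:
    """Format a byte count with decimal (SI) prefixes."""
    units = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


for bits in (16, 32, 1000):
    print(f"{bits} bits/pixel -> {human(archive_bytes(bits))}")
```

Under these assumptions the raw pixels alone come to tens of petabytes at 16 to 32 bits per pixel, so the sketch is best read as a lower bound for a single archive rather than a reproduction of the slide's total.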

Slide 10

Information vs. data

www.DLR.de • Chart 10 > Lecture > Jagmal Singh • IGARS2012 > 25 July 2012

TerraSAR-X 11-OCT-2008

512x512 pixels

ERS1 24-JUL-1992

512x512 pixels

Slide 11

SPATIAL CONTEXT & INFORMATION CONTENT

At meter resolution, very different scene classes may be confused if context is ignored, that is, if the analysis windows are small compared to the object scale. The example shows a bridge (left) and buildings (right), which can be told apart only when the window is large enough to include the relevant context. Otherwise, the two scene classes “look” similar.

Slide 12

Slide 13

1 HS TerraSAR-X scene = up to 10 000 image patches (100 x 100 m)

SCENE CATEGORIES & INFORMATION CONTENT
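How a scene breaks down into patches can be sketched with a simple tiling routine. The 10 km x 10 km extent used below is an assumption chosen so that the count matches the "up to 10 000" figure; real scene extents depend on the imaging mode, and `tile_scene` is a hypothetical helper.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in metres


def tile_scene(width_m: int, height_m: int, patch_m: int = 100) -> Iterator[Box]:
    """Yield non-overlapping square patches; partial border patches are dropped."""
    for y0 in range(0, height_m - patch_m + 1, patch_m):
        for x0 in range(0, width_m - patch_m + 1, patch_m):
            yield (x0, y0, x0 + patch_m, y0 + patch_m)


# Assumed scene extent: 10 km x 10 km, tiled into 100 m x 100 m patches.
patches = list(tile_scene(10_000, 10_000))
print(len(patches))   # 10000
print(patches[0])     # (0, 0, 100, 100)
```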

Slide 14

EXOGENOUS SOURCES & INFORMATION CONTENT

Slide 15

Concept and Components

Data Model Generation

Visual Data Mining

Interpretation

&

Understanding

Query, Data Mining & Knowledge Discovery

Users DBMS

Data Sources

Slide 16

Visual Data Mining

Interpretation

&

Understanding

Query, Data Mining

&

Knowledge Discovery

Users

DBMS

Data Sources Content Analysis Context Analysis

Image Content

Metadata Content

GIS Content

Context Analysis

GIS

Metadata

EO images

Ø Metadata extraction
Ø Patch generation
Ø Feature extraction methods: Gabor filters, Grey-Level Co-occurrence Matrix (GLCM), Bag of words, dictionary-based compression features, etc.

Data Model Generation
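As an illustration of one of the feature extraction methods named on this slide, the following is a minimal pure-Python sketch of a grey-level co-occurrence matrix for a single pixel offset, plus two common texture features derived from it. It assumes the patch is already quantised to a small number of grey levels; the function names and the toy patch are illustrative and not taken from the lecture's software.

```python
from typing import List

Matrix = List[List[float]]


def glcm(patch: List[List[int]], levels: int, dx: int = 1, dy: int = 0) -> Matrix:
    """Normalised (non-symmetric) co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(patch), len(patch[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[patch[r][c]][patch[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the patch")
    return [[n / total for n in row] for row in counts]


def contrast(p: Matrix) -> float:
    """Weights each co-occurrence by the squared grey-level difference."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def energy(p: Matrix) -> float:
    """Sum of squared probabilities; high for uniform textures."""
    return sum(v * v for row in p for v in row)


patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
print(contrast(p), energy(p))
```

In practice the two values (and others like them) would be computed for several offsets and concatenated into the feature vector of a patch.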

Slide 17

Interpretation & Understanding

Visual Data Mining Data Model

Generation Users

Data Sources

Strabon

SQL_RETRIEVE_QUERY_BY_DICTIONARY = """
WITH user_dict_size AS (
    SELECT uls.id AS id, uls.label AS label, COUNT(*) AS cnt
    FROM user_labels AS uls, user_dictionaries AS uds
    WHERE uls.label LIKE '%(user_label_label)s' AND uls.id = uds.user_label_id
    GROUP BY uls.id, uls.label
),

DBMS

Ø Database scheme: relational DB
Ø Some data mining functions
• Similarity metrics
Ø SQL functions for querying
• Distances

Query, Data Mining &

Knowledge Discovery

Database Management System

Slide 18

Database Management System

Example of data model for Earth-Observation images

SQL_RETRIEVE_QUERY_BY_DICTIONARY = """
WITH user_dict_size AS (
    SELECT uls.id AS id, uls.label AS label, COUNT(*) AS cnt
    FROM user_labels AS uls, user_dictionaries AS uds
    WHERE uls.label LIKE '%(user_label_label)s' AND uls.id = uds.user_label_id
    GROUP BY uls.id, uls.label
),
inter_size AS (
    -- size of dictionary intersection for all pairs of patches
    SELECT
        ud1.user_label_id AS user_label_1,
        ul1.label AS label_1,
        d2.patch_id AS patch_2,
        p2.label AS label_2,
        count(*) AS cnt_1_2
    FROM
    ……………….

Fast Compression distance computation
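The query above counts dictionary sizes and dictionary intersections, which are the two ingredients of a compression-based distance. A minimal sketch of the idea follows, assuming the commonly quoted definition FCD(x, y) = (|D(x)| - |D(x) ∩ D(y)|) / |D(x)|, where D(.) is the set of phrases an LZW pass would add to its dictionary. The helper names are hypothetical, and the sketch works on byte strings rather than on the database tables of the slide.

```python
def lzw_dictionary(data: bytes) -> set:
    """Set of multi-byte phrases that an LZW pass over `data` would learn.

    Single bytes are part of LZW's initial dictionary and are not stored.
    """
    phrases = set()
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if len(candidate) == 1 or candidate in phrases:
            current = candidate          # keep extending the known phrase
        else:
            phrases.add(candidate)       # new phrase learned
            current = bytes([value])
    return phrases


def fcd(x: bytes, y: bytes) -> float:
    """Assumed FCD definition: share of x's phrases that y does not contain."""
    dict_x, dict_y = lzw_dictionary(x), lzw_dictionary(y)
    if not dict_x:
        raise ValueError("x is too short to yield any dictionary phrase")
    return (len(dict_x) - len(dict_x & dict_y)) / len(dict_x)


a = b"abababab"
b = b"cdcdcdcd"
print(fcd(a, a))  # identical inputs share every phrase
print(fcd(a, b))  # no phrase in common
```

Note that the measure is not symmetric in general, since it is normalised by the dictionary size of the first argument only.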

Slide 19

Interpretation & Understanding

Visual Data Mining

Data Model Generation

Query, Data Mining & Knowledge Discovery

Users DBMS

Data Sources

Ø Queries:
• Query Builder
• Ontologies: Strabon
• SQL
• Query by Example
Ø Data Mining
Ø Semantic Definition

Primitive Feature Extraction Blocks

NL-STFT QMF

Gabor GLCM

GUI Relevance Feedback

SVM

Semantic Class DB

Query, Data Mining and Knowledge Discovery

Slide 20

TerraSAR-X metadata used for RDFs and queries:
• The XML file contains information about productComponent, annotation, imageData, missionInfo, acquisitionInfo, sceneInfo, etc.

Extraction of Metadata from XML annotation file

queries

Ontology

Query by Metadata and Ontologies
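Extracting metadata from the XML annotation file can be sketched with Python's standard `xml.etree.ElementTree`. The section names `missionInfo`, `acquisitionInfo` and `sceneInfo` come from the slide; the root element, the leaf tags and all values in the sample document are hypothetical placeholders, not the actual TerraSAR-X schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened stand-in for an annotation file.
SAMPLE_XML = """
<product>
  <missionInfo><mission>TSX-1</mission></missionInfo>
  <acquisitionInfo><imagingMode>HS</imagingMode></acquisitionInfo>
  <sceneInfo><sceneID>hypothetical-scene-001</sceneID></sceneInfo>
</product>
"""


def flatten(element: ET.Element, prefix: str = "") -> dict:
    """Return {"section.leaf": text} for every leaf element below `element`."""
    record = {}
    for child in element:
        path = f"{prefix}.{child.tag}" if prefix else child.tag
        if len(child):                       # element has children: recurse
            record.update(flatten(child, path))
        else:
            record[path] = (child.text or "").strip()
    return record


metadata = flatten(ET.fromstring(SAMPLE_XML))
for key, value in metadata.items():
    print(key, "=", value)
```

A flat key/value record of this kind is easy to load into a relational table or to turn into RDF triples for ontology-based queries.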

Slide 21

Query by Semantics

Semantic annotation of TerraSAR-X image content

queries

Earth-Observation data model

SELECT a.label_id, name FROM annotation a JOIN label l ON a.label_id = l.label_id

[Figure: image patches with their semantic labels, including Urban + Water, Urban type 1 to 7, Vegetation and Water, Buoy, Water type 1 and 2, Water and Boats, Forest type 1 and 2, Grassland, Grassland with objects, Bridge type 1 and 2, Water and Urban, Building reflection, Trees and Buildings, Roads and Urban, Water with land, Channel, Airport, Forest, Harbor, River deposits, Agriculture, Breaking waves, Forest and Water, Road + forest, Roads highway, Skyscrapers, Streets with buildings, Sport fields, Railway tracks]
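A query-by-semantics join of the kind quoted on this slide can be tried out with Python's built-in `sqlite3` module. The table names `annotation` and `label` and the columns `label_id` and `name` follow the slide; the `patch_id` column, the sample rows and the `WHERE` filter are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE label (label_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE annotation (
        patch_id INTEGER,                       -- hypothetical column
        label_id INTEGER REFERENCES label(label_id)
    );
    INSERT INTO label VALUES (1, 'Bridge type 1'), (2, 'Harbor');
    INSERT INTO annotation VALUES (10, 1), (11, 2), (12, 1);
    """
)

# Find every patch annotated with a given semantic label.
rows = conn.execute(
    """
    SELECT a.patch_id, a.label_id, l.name
    FROM annotation AS a
    JOIN label AS l ON a.label_id = l.label_id
    WHERE l.name = ?
    ORDER BY a.patch_id
    """,
    ("Bridge type 1",),
).fetchall()

print(rows)
conn.close()
```

Passing the label as a bound parameter (`?`) rather than formatting it into the SQL string avoids quoting problems when labels contain special characters.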

Slide 22 · 11/09/15

Semantic catalogues

- Bangkok (Thailand)
- Shenyang (China)
- Nazca Lines (Peru)
- Havana (Cuba)
- Venice (Italy)
- Vasteras (Sweden)
- Oran (Algeria)
- Bogota (Colombia)

Slide 23

Interpretation & Understanding

Visual Data Mining

Data Model Generation

Query, Data Mining & Knowledge Discovery

Users DBMS

Data Sources

Ø Queries:
• Query Builder
• Ontologies: Strabon
• SQL
• Query by Example
Ø Data Mining
Ø Semantic Definition

Primitive Feature Extraction Blocks

NL-STFT QMF

Gabor GLCM

GUI Relevance Feedback

SVM

Semantic Class DB

Query, Data Mining and Knowledge Discovery

Slide 24

PF algorithm Classification SVM with RF

Ground truth

Annotated category

Optimal parameters:

• product type (MGD)
• mode (High Resolution Spotlight)
• geometric resolution configuration (RE)
• patch size (160 x 160 pixels)
• PF algorithm (Gabor filters)

Semantic Patches

Collections

Methodology:

Semantic annotation
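Since Gabor filters are named as the preferred primitive feature algorithm, a small sketch of how such a filter bank can be generated may help. It uses the standard real-valued Gabor formula (a Gaussian envelope multiplied by a cosine carrier); the kernel size, wavelengths, number of orientations and sigma below are illustrative assumptions, not the parameters used in the study.

```python
import math
from typing import List

Kernel = List[List[float]]


def gabor_kernel(size: int, wavelength: float, theta: float,
                 sigma: float, gamma: float = 1.0, psi: float = 0.0) -> Kernel:
    """Real part of a 2-D Gabor filter sampled on a size x size grid (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_bank(size: int = 9, wavelengths=(4.0, 8.0),
                orientations: int = 4, sigma: float = 3.0) -> List[Kernel]:
    """One kernel per (wavelength, orientation) pair; all values are assumptions."""
    return [
        gabor_kernel(size, lam, k * math.pi / orientations, sigma)
        for lam in wavelengths
        for k in range(orientations)
    ]


bank = filter_bank()
print(len(bank), "kernels of size", len(bank[0]), "x", len(bank[0][0]))
```

A patch descriptor is then typically built from statistics (for example mean and variance) of the patch filtered with each kernel in the bank.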

Slide 25

Data Model composed of:
• ~110 000 patches
• ~320 semantic categories

Location of the 100 TerraSAR-X scenes and the distribution of the scenes over the World

Data Model Generation: Ingested images

C.O. Dumitru and M. Datcu, “Information Content of Very High Resolution SAR Images: Study of Feature Extraction and Imaging Parameters”, IEEE Trans. Geoscience and Remote Sensing. To be published

Slide 26

Visual Data Mining

Data Model Generation

Query, Data Mining & Knowledge Discovery

Users

DBMS Data

Sources

Ø  Rapid Mapping generation

Interpretation and Understanding

Slide 27

Example of image understanding

The damage to agricultural areas can be clearly seen by comparing the classification of the pre-disaster image (left figure) with that of the post-disaster image (right figure).

Agriculture

Bridges

Aquaculture

H. Voltage poles

Flooded areas

Bridges

Debris

H. Voltage poles

TerraSAR-X scene before Tsunami – 20.10.2010

TerraSAR-X scene after Tsunami – 12.03.2011

Slide 28

Automated Rapid Mapping scenario: Tsunami in Japan 2011

Two multitemporal TerraSAR-X images used for Japan Tsunami scenario

Slide 29

Damage Assessment

[Figure legend: Flooded areas, Ocean, Bridges after, Debris, Agriculture, Bridges before, High voltage poles, Mountains, Structures. Detail panels: Bridges before Tsunami, Bridges after Tsunami, Debris caused by Tsunami.]

[Bar chart "Changes" (axis 0 to 600), values: 504, 20, 0, 8, 12, 75, 16, 2]

Semantic class        Total of annotated patches   Share (pie chart)
Mountains             1874                         9%
Debris                33                           0%
Flooded area          1428                         7%
High voltage poles    50                           0%
Ocean                 16562                        78%
Bridges before        66                           0%
Bridges after         66                           0%
Agriculture           981                          5%
Structures            287                          1%

Semantic Catalogue
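The percentages in the pie chart follow directly from the patch counts of the semantic catalogue. The short sketch below recomputes them; the counts are taken from the slide, while the rounding to whole percent is an assumption about how the chart labels were produced.

```python
# Annotated patch counts per semantic class, as listed on the slide.
PATCH_COUNTS = {
    "Mountains": 1874,
    "Debris": 33,
    "Flooded area": 1428,
    "High voltage poles": 50,
    "Ocean": 16562,
    "Bridges before": 66,
    "Bridges after": 66,
    "Agriculture": 981,
    "Structures": 287,
}

total = sum(PATCH_COUNTS.values())

# Share of each class in whole percent (assumed rounding of the chart labels).
shares = {name: round(100 * count / total) for name, count in PATCH_COUNTS.items()}

print("total patches:", total)
for name, share in shares.items():
    print(f"{name:<20}{share:>3}%")
```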

Slide 30

Interactive Learning for Rapid Mapping

City Industry

Slide 31

Visual Data Mining

Query, Data Mining & Knowledge Discovery

Interpretation and Understanding

Visual Data Mining

Data Model Generation

Users

DBMS Data

Sources

Ø  Representation of the data in the 3D space using a Laplacian eigenmap

Slide 32

Immersive Visual Information Mining for EO image archives

Navigation inside the EO image collections using the CAVE automatic virtual environment

Slide 33

EO data and Users

• Volume and multimodality of data are growing
• Data and information are spatio-temporal and unstructured
• Users want to have the knowledge
• Interactive operation is the only way of operation
• Exploration is the predominant mode of interaction
• Context is critical and relevant
• Users are interested in information and knowledge independent of conjecture

Slide 34

Big Data trends

• Information must be obtained from the data
• Databases and search engines were not designed to provide contents
• Visualization is very important
• HMI are crucial
• Cognitive vs. instruments
• Reactive systems (may be more than conscious ...)
• All data must be available (no data, no discovery)
• Simple, but not naïve
• Selective, not only adaptive

Slide 35

Processing and learning

• Pre-processing integrated with post-processing
• Data cleaning integrated with semantic understanding
• Non-visual integrated in visual
• Latent integrated in intuitive learning
• Learning controlled by unlearning
• Information integrated with human-centric knowledge
• Only real-time

Slide 36

Algorithms and tools

• Pattern Recognition and Data Mining Using Data Compression
• EO Ontologies
• Geospatial Information Retrieval and Indexing Systems
• Content Mining, Semantics Modeling, and Complex Queries
• Mining Image Time Series
• DM supported by Web Services, CLOUD technologies