Data Driven Modeling Beyond Idealization

Post on 17-Jan-2017


svm@arch.ethz.ch
SEC

Data Driven Modeling Beyond Idealization

Vahid Moosavi
PhD Student at Chair for Computer Aided Architectural Design (CAAD), Professor Ludger Hovestadt, ETH Zurich
Researcher at ETH-Singapore Centre, Future Cities Laboratory (FCL)

24 May 2014


Landscape of Scientific Modeling

First Section


Models as the way we conceive of the real phenomena are one of the fundamental elements of any investigation…


On the other hand, what we encounter…

…A Landscape of Modeling Approaches in Competition, Challenging Complex Systems

How to find a unifying (abstract) perspective for assessment, while keeping the diversities?

First Try: Formal Definitions

• No specific definition so far (Stanford Plato) but some classifications:
• Models and Representation
– Scale models (Black 1962)
– Idealized models (Michael Weisberg)
    • Aristotelian (Minimal)
    • Galilean (McMullin 1985)
    • Caricatures
– Analogical models (Hesse 1963)
– Approximations
– Phenomenological models (McMullin 1968)
– …

Idealization toward perfection or simplification?!!


A Model of the Modeling Process
“Rational Models: Models, based on Ideals”

Second Section


[Diagram: Natural System (Causality) and Formal System (Inference), connected by Encoding and Decoding]

A Model of Modeling Process (Let’s call it Rational Modeling.)

By: Robert Rosen
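
As a rough sketch of the modeling relation on this slide, the toy example below checks whether the path "encode, infer, decode" agrees with what the natural system does on its own. The falling-object setup, the function names, and the tolerance are illustrative assumptions, not part of Rosen's formulation.

```python
# Toy illustration of the modeling relation. The "natural system" is a
# numerically integrated free fall; the "formal system" is the idealized
# closed-form law. All names and numbers are hypothetical.

G = 9.81  # gravitational acceleration in m/s^2


def causality(height_m: float, dt: float, steps: int = 1000) -> float:
    """Stand-in for the natural system: step-by-step free fall over dt seconds."""
    velocity = 0.0
    h = dt / steps
    for _ in range(steps):
        velocity += G * h
        height_m -= velocity * h
    return height_m


def encode(height_m: float) -> dict:
    """Encoding: keep only the features the formal system uses."""
    return {"h0": height_m}


def infer(state: dict, dt: float) -> dict:
    """Inference: the idealized law h(t) = h0 - g * t^2 / 2."""
    return {"h": state["h0"] - 0.5 * G * dt ** 2}


def decode(state: dict) -> float:
    """Decoding: read the formal result back as a statement about the world."""
    return state["h"]


def commutes(height_m: float, dt: float, tol: float = 0.05) -> bool:
    """The model is adequate when both paths around the diagram agree."""
    observed = causality(height_m, dt)
    predicted = decode(infer(encode(height_m), dt))
    return abs(observed - predicted) <= tol


agrees = commutes(100.0, 1.0)
```

The choice of what `encode` keeps is exactly the "pair of glasses": anything left out of the code cannot be recovered by inference.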


A Normal Distribution with outlier or a “unique case”?

Fourier Transformation: any form is a linear combination of some ideal forms

Each Code Follows On An Ideal Form
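
To make the Fourier remark concrete, here is a minimal sketch that decomposes an arbitrary signal into sinusoidal "ideal forms" and rebuilds it from them. The sample values are made up, and the plain O(n²) transform is chosen for readability rather than speed.

```python
import cmath
import math


def dft(signal):
    """Discrete Fourier transform: coefficients of the signal on sinusoidal bases."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Rebuild the signal as a linear combination of the sinusoidal basis forms."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# An arbitrary, non-sinusoidal form with one irregular sample (hypothetical data).
signal = [0.0, 1.0, 2.0, 3.0, 0.5, 1.0, 2.0, 7.0]
coeffs = dft(signal)
rebuilt = inverse_dft(coeffs)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Even the irregular last sample is reproduced, but only by spreading it over all basis forms; the "unique case" has no ideal form of its own.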


Each code consists of certain aspects (features) of the natural system (Models as Pairs of Glasses): Minimalist Idealizations and multi-models

Networks: Structural thinking

Agents (actors): Interactions between different agencies

System Dynamics: Process Oriented View

But which glasses?


Challenge: What to do in dealing with Complex Adaptive Systems? Which pair of glasses (i.e. modeling approach) is sufficient, when in principle each view is arbitrary?

Hypothesis: The majority of current modeling approaches are fundamentally limited in dealing with complex systems, and what we need is an abstraction from the concept of “rational modeling”.

Idea: Is it conceptually possible to have all the views at once?


Complex Numbers

Real Numbers

What do we mean by “Abstraction”? (A Metaphor)

Rational Numbers

Natural Numbers


Current Trend: Parametricism (multi-model Idealization) and the Curse of Dimensionality…

…Complicated, but not complex models

[Figure: axes "Properties of the system for modeling" and "Possible Relations (types and numbers)"; labeled regions: Simple Systems, Complex Systems, Minimal idealization, Multi-model idealization]
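
One way to read the two axes of this slide is combinatorial: as more properties are added to a model, the number of relations that could hold among them explodes. The sketch below only counts possibilities under that assumed reading; the function names are hypothetical.

```python
from math import comb


def pairwise_relations(n_properties: int) -> int:
    """Number of unordered pairs among n properties."""
    return comb(n_properties, 2)


def group_relations(n_properties: int) -> int:
    """Number of subsets with at least two properties (one candidate relation each)."""
    return 2 ** n_properties - n_properties - 1


# Print: number of properties, pairwise relations, group relations.
for n in (2, 5, 10, 20):
    print(n, pairwise_relations(n), group_relations(n))
```

With 20 properties there are already more than a million candidate groupings, which is why adding parameters tends to make a model complicated rather than complex.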


A new realm of modeling?!

[Figure: same axes, "Properties of the system for modeling" and "Possible Relations (types and numbers)"; approaches placed on the plane: Multi-Agent Systems, Urban Cellular Automata, Urban Dynamics, Basic Statistics (Hypothesis Testing), Urban Metabolism, Urban Scaling, Social Physics, Fractal Models]


Toward a new formalism for the concept of modeling
“Models without Ideals or All the Potential Ideals”

Third Section


How to avoid the curse of dimensionality? (Or How to Encapsulate all the potentialities?)

Selected Features to Represent the Objects

Objects

Encapsulation
Relationality
Rationality

Examples:
• Cities
• Streets
• Buildings
• People
• Companies
• Food
• Energy
• Medicine
• Internet
• Words in a text

Abstract Universals (ideal forms)
Concrete Universals


It is a self-referential Setup

Page Rank

…can be local or global
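
The self-referential setup can be illustrated with a small power-iteration version of PageRank, in which each node's score is defined only through the scores of the nodes pointing to it. The toy graph, the damping value of 0.85, and the handling of dangling nodes are assumptions made for this sketch.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links: dict mapping each node to the list of nodes it points to."""
    nodes = sorted(set(links) | {m for targets in links.values() for m in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its own rank on, split evenly over its links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical four-node graph.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_links)
most_central = max(ranks, key=ranks.get)
```

No external yardstick enters the computation: the value of every object is expressed through the values of the other objects, which is the sense of "relational" used here.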


Relational Modeling
An Example in Natural Language Modeling

Compared by: Based on, Approach, Main Assumption, Heroes

Rational Modeling
• External Reference
• We have the ideal model of the language
• Hero: Noam Chomsky

Relational Modeling
• No External Reference
• We have a huge corpus of language
• Relational representation of symbols in a language
• Hero: Markov
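
A minimal sketch of the relational side of this comparison: a first-order Markov (bigram) model in which each word is characterized only by what follows it in the corpus, with no grammar supplied from outside. The tiny corpus and the function name are made up for illustration.

```python
from collections import Counter, defaultdict


def bigram_model(tokens):
    """Estimate P(next | current) purely from co-occurrence counts in the corpus."""
    counts = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        counts[current][nxt] += 1
    return {
        word: {nxt: c / sum(followers.values()) for nxt, c in followers.items()}
        for word, followers in counts.items()
    }


# Hypothetical eleven-word corpus.
corpus = "the city is a text and the text is a city".split()
model = bigram_model(corpus)
```

With such a small corpus the estimates are crude; the approach depends on having a very large body of text.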


However, it took a century…

Markov (1907): “For linguists it is hard to believe it as a practical approach”

Shannon (1948): “Interesting idea, but computationally expensive”

Google (2000-): “Getting feasible! With billions of text documents”

Relational Representation of symbols in a language

Data Deluge


…this Data Deluge has inverted the concept of empirical research


Classical Simulation

Space Syntax, London

“The Social Logic of Space” (1984)

33,000+ taxicabs

GPS Trajectory of Taxicabs, Beijing, 2012

Inversion in Modeling


Complicatedness!!! Multi-Model Idealization

(Agent-Based Transportation Modeling)


Symbolization of Complexity (Encapsulating all the potential ideals)

Using GPS tracks of cars within a city: taking urban cells as words in a language, each individual driver is a unique storyteller while driving through the urban grid cells…

…A Markov chain model of traffic dynamics can be developed for:

• Simulation
• Community detection
• Network engineering
• Sensitivity analysis
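
The idea can be sketched as follows, assuming each GPS track has already been converted into a sequence of grid-cell identifiers. The three short trips, the cell names, and both helper functions are hypothetical; this illustrates the Markov chain idea rather than the model used in the talk.

```python
from collections import Counter, defaultdict


def transition_matrix(trajectories):
    """Row-normalized counts of cell-to-cell moves observed in the trajectories."""
    counts = defaultdict(Counter)
    for cells in trajectories:
        for origin, destination in zip(cells, cells[1:]):
            counts[origin][destination] += 1
    return {
        origin: {dest: c / sum(row.values()) for dest, c in row.items()}
        for origin, row in counts.items()
    }


def propagate(distribution, matrix):
    """One step of the chain: push probability mass along observed transitions."""
    result = defaultdict(float)
    for cell, mass in distribution.items():
        # Cells with no observed exit keep their mass in place.
        for dest, p in matrix.get(cell, {cell: 1.0}).items():
            result[dest] += mass * p
    return dict(result)


# Hypothetical trips: each is the sequence of grid cells one car visited.
trips = [
    ["A", "B", "C", "A"],
    ["A", "B", "B", "C"],
    ["C", "A", "B", "C"],
]
P = transition_matrix(trips)

# Simulation: start all traffic in cell A and let the chain run.
state = {"A": 1.0}
for _ in range(200):
    state = propagate(state, P)
```

After many steps `state` approaches the long-run share of traffic per cell. Changing one entry of `P` and rerunning is a simple form of sensitivity analysis.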


Fourth Section

Self Organizing Maps and Data Deluge


How to explain SOM or What is a good story for SOM?


SOM from the Context of (Nonlinear) Transformation: Dimensionality Reduction


First General Approach: Direct Transformation

• Finding an ideal (global) transformation: e.g. PCA
• Observations are instances of an abstract representation

[Diagram: X (Objects × Selected Features to Represent the Objects) is mapped to T through W]
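
For PCA the direct transformation is the linear map T = XW, where the columns of W are the principal directions. The sketch below finds only the first direction by power iteration on the covariance matrix; the five data points and the helper names are invented for illustration.

```python
def center(rows):
    """Subtract the column means so each feature has zero mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already centered rows."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_component(cov, iterations=500):
    """Power iteration: converges to the direction of largest variance."""
    d = len(cov)
    w = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    return w


# Hypothetical data: 5 objects described by 2 strongly correlated features.
X = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
Xc = center(X)
w = leading_component(covariance(Xc))
T = [sum(x * wj for x, wj in zip(row, w)) for row in Xc]  # scores: T = X W
```

The single global direction `w` plays the role of the ideal form: every observation is described by how far it lies along it.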


Second General Approach: Indirect Transformation

• Each observation is a dimension itself: e.g. MDS, LLE, ISOMAP, …
• There is always a mechanism to preserve neighborhood topology.
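
As a small illustration of the indirect approach, classical MDS starts from an n × n matrix of pairwise distances (one row per observation) and looks for coordinates that preserve them. The sketch recovers a one-dimensional embedding with power iteration; the four points and the function name are assumptions for the example.

```python
def classical_mds_1d(distances, iterations=500):
    """Embed n objects on a line from their n x n pairwise distance matrix."""
    n = len(distances)
    sq = [[d * d for d in row] for row in distances]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # Double centering turns squared distances into inner products (Gram matrix).
    gram = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean) for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the leading eigenvector of the Gram matrix.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        v = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    eigenvalue = sum(
        v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return [x * eigenvalue ** 0.5 for x in v]


# Hypothetical objects that truly lie on a line; only their distances are used.
points = [0.0, 1.0, 3.0, 7.0]
D = [[abs(a - b) for b in points] for a in points]
coords = classical_mds_1d(D)
```

The recovered coordinates are only defined up to shift and reflection, since distances carry no information about either.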


Self Organizing Map (SOM) : A generic setup, based on symbolic indexes

• SOM as a transformation based on a topology-preserving mechanism, but at the same time creating an abstraction from observations.

A Primal-Dual Representation

[Diagram: X is mapped to T through SOM]
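
A minimal sketch of a one-dimensional SOM, to show how the map index of the best matching unit becomes a symbolic code for each observation. The number of nodes, the decay schedules, and the two-cluster toy data are all assumptions chosen for brevity, not settings from the talk.

```python
import math
import random


def best_matching_unit(codebook, x):
    """Index of the node whose weight vector is closest to x."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )


def train_som(data, n_nodes=5, epochs=200, seed=0):
    """Train a 1-D self-organizing map; returns one weight vector per node."""
    rng = random.Random(seed)
    codebook = [list(rng.choice(data)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)  # learning rate decays toward zero
        radius = max(n_nodes / 2.0 * (1.0 - frac), 0.5)  # neighborhood shrinks
        for x in rng.sample(data, len(data)):
            bmu = best_matching_unit(codebook, x)
            for i, node in enumerate(codebook):
                # Nodes close to the BMU on the map move more (topology preservation).
                influence = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(len(node)):
                    node[k] += rate * influence * (x[k] - node[k])
    return codebook


# Hypothetical data: two well-separated clusters in the plane.
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(30)]
data += [[data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)] for _ in range(30)]

codebook = train_som(data)
index_a = best_matching_unit(codebook, [0.0, 0.0])
index_b = best_matching_unit(codebook, [10.0, 10.0])
```

Each observation is now represented by a node index (the abstraction), while each node holds a weight vector in the original feature space (the dual view).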


Pre-Specific City Modeling

Footprint of buildings in Orchard area, Singapore

Similar buildings are in the same area of SOM


Data Driven Urban Pollution Modeling beyond Idealization

Idealization in traditional simulation models


SOM: Approximating joint probability distribution


Frequencies of occurrence

P.E. Bieringer et al. / Atmospheric Environment 80 (2013)

[Figure panels: Original Distribution; SOM Based Distribution]
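
One simple reading of "approximating the joint probability distribution" is to count how often each SOM node is the best match for the data and then resample nodes in proportion to those counts. In the sketch below the codebook is a hard-coded stand-in for a trained map, and all names and values are hypothetical.

```python
import random
from collections import Counter


def nearest(codebook, x):
    """Index of the node closest to observation x."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )


def hit_frequencies(codebook, data):
    """Relative frequency with which each node is the best match for the data."""
    hits = Counter(nearest(codebook, x) for x in data)
    return [hits[i] / len(data) for i in range(len(codebook))]


def resample(codebook, frequencies, n, seed=0):
    """Draw node vectors in proportion to their hit frequencies."""
    rng = random.Random(seed)
    return rng.choices(codebook, weights=frequencies, k=n)


# Stand-in codebook (hypothetical; in practice it would come from a trained SOM).
codebook = [[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]]
observed = [[0.2, -0.1], [0.1, 0.3], [9.8, 10.1], [10.2, 9.9], [9.9, 10.0], [5.1, 4.8]]

freqs = hit_frequencies(codebook, observed)
samples = resample(codebook, freqs, n=1000)
```

Because each node carries all features at once, the resamples keep the dependence between features instead of treating them as independent marginals.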


SOM: Computing with contextual numbers (signs!?)


The classic notion of a (natural) number is based on a one-directional arrow.


This is the classical time series analysis.


contextual numbers


1. Median list price
2. Median sale price
3. Median list price per sq. ft.
4. Median sale price per sq. ft.
5. Sold for loss
6. Sold for gain
7. Increasing values
8. Decreasing values
9. Listings with price cut
10. Median price cut
11. Sold in past year
12. Homes for rent
13. Homes foreclosed
14. Foreclosure re-sales
15. Sale-to-list price ratio
16. Price-to-rent ratio

Multi-Dimensional Time Series Modeling (Real Estate Dynamics)


But it is more than visualization…


It improves the overall prediction accuracy


In general, it can be a part of a larger computing machine.


SOM: A Generic Computing Machine Beyond Ideal Forms

Democratic Computing

Social computing (Computing with any function)

Observed Data

Resamples of Data by SOM


Addition, Subtraction, Multiplication, … SOMification as any operation in coexistence with data!!


Thanks!