Algebraic Manipulation of Scientific Datasets Bill Howe and David Maier OGI School of Science and...

Post on 19-Dec-2015

Algebraic Manipulation of Scientific Datasets

Bill Howe and David Maier

OGI School of Science and Engineering at Oregon Health and Science University

Portland State University

Environmental Observation and Forecasting on the Columbia River

Sensors

Simulation

Data Products

Gridded Scientific Datasets

[Figure: gridded datasets annotated with per-cell values, including temperatures from 12.1C to 13.2C.]

Some CORIE Grids

H = 2d Horizontal Grid

T = 1d Time Grid

V = 1d Vertical Grid

[Figure labels: Mean Sea Level; Underground (not shown)]

Thesis

• Grid topology requires explicit data model support.

• Transformations can be expressed via composition of a few logical operators.

• Performance can be preserved via algebraic optimization and specialized operator implementations.

Roadmap

• Domain Introduction

• Model Introduction

• Conventional Approaches

• Examples of Optimization

• Conclusion

Grid Topology

• Grid Topology

– A collection of cells of various dimensions,

– implicit or explicit incidence relationships

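[Figure: two triangles A and B sharing an edge; 0-cells labeled 0 to 3, 1-cells labeled m to q.]

Incidence of 2-cells with 0-cells:

2-Cells  0-Cells
A        0
A        1
A        3
B        1
B        2
B        3

Incidence of 1-cells with 0-cells:

1-Cells  0-Cells
m        0
m        1
n        1
n        2
:        :

2-Cells = {A,B}

1-Cells = {m,n,o,p,q}

0-Cells = {0,1,2,3}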
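The explicit incidence relationship above can be sketched as a set of (cell, node) pairs. This is a minimal illustration, not the authors' data structure; the helper names `nodes_of` and `cells_touching` are hypothetical, and only the rows actually listed in the tables are used.

```python
# Explicit incidence pairs copied from the slide's tables.
# Rows for the remaining 1-cells are elided in the source and are not filled in here.
INCIDENCE = {
    ("A", 0), ("A", 1), ("A", 3),
    ("B", 1), ("B", 2), ("B", 3),
    ("m", 0), ("m", 1),
    ("n", 1), ("n", 2),
}


def nodes_of(cell):
    """Return the sorted 0-cells incident to `cell` (hypothetical helper)."""
    return sorted(node for c, node in INCIDENCE if c == cell)


def cells_touching(node):
    """Return the sorted cells incident to a given 0-cell (hypothetical helper)."""
    return sorted(c for c, n in INCIDENCE if n == node)


# Two 2-cells are neighbors when they share 0-cells; no coordinates are needed.
shared = set(nodes_of("A")) & set(nodes_of("B"))
```

Neighborhood questions such as "which nodes do A and B share?" are answered from the incidence pairs alone, without any geometry.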

Grid Properties

• Topology <> geometry

• A grid may contain cells of

– multiple dimensions

– multiple “shapes”

• Dimension of a grid is the maximum dimension of its cells
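As a small illustration of the last property, the dimension of a grid can be computed from its cells grouped by dimension. The function name `grid_dimension` and the dictionary layout are assumptions made for this sketch.

```python
def grid_dimension(cells_by_dim):
    """Largest k for which the grid contains at least one k-cell."""
    return max(k for k, cells in cells_by_dim.items() if cells)


# Hypothetical grid mixing 0-, 1-, and 2-cells.
mixed = {
    0: {0, 1, 2, 3, 4},
    1: {"m", "n", "o", "p", "q", "r"},
    2: {"A", "B"},
}

# Hypothetical grid made only of nodes and edges (a polyline).
polyline = {0: {0, 1, 2}, 1: {"m", "n"}, 2: set()}
```

Here `grid_dimension(mixed)` is 2 and `grid_dimension(polyline)` is 1, since empty cell sets do not count.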

[Figure: a grid with 0-cells 0 to 4, 1-cells m to r, and 2-cells A and B.]

GridField: Grid with Bound Data

• Tuples of numeric primitives

• Total functions over cells of dimension k

• Two gridfields may share a grid

x   y   salt  temp
x1  y1  29.4  12.1
x2  y2  29.8  12.5
x3  y3  28.0  12.0
x4  y4  30.1  13.2

flux  area
11.5  3.3
13.9  5.5
13.1  4.5
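A gridfield as described above might be sketched as a grid plus a total function from its k-cells to tuples of numbers. The class names, the node ids n1 to n4, and the cell ids a to c are hypothetical; the numeric values are the ones in the tables.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """Cells grouped by dimension: {dimension: [cell ids]}."""
    cells: dict


@dataclass
class GridField:
    """A grid plus one tuple of numbers for every cell of dimension k."""
    grid: Grid
    k: int
    attrs: tuple
    data: dict  # cell id -> tuple of numeric values

    def __post_init__(self):
        # Totality: every k-cell must have a tuple, and nothing else may.
        if set(self.data) != set(self.grid.cells[self.k]):
            raise ValueError("data must cover every k-cell exactly once")
        if any(len(row) != len(self.attrs) for row in self.data.values()):
            raise ValueError("each tuple needs one value per attribute")


grid = Grid({0: ["n1", "n2", "n3", "n4"], 2: ["a", "b", "c"]})

# Two gridfields sharing one grid: one bound to 0-cells, one to 2-cells.
node_field = GridField(grid, 0, ("salt", "temp"), {
    "n1": (29.4, 12.1), "n2": (29.8, 12.5),
    "n3": (28.0, 12.0), "n4": (30.1, 13.2),
})
cell_field = GridField(grid, 2, ("flux", "area"), {
    "a": (11.5, 3.3), "b": (13.9, 5.5), "c": (13.1, 4.5),
})
```

Both fields hold a reference to the same `Grid` object, so the topology is stored once no matter how many datasets are bound to it.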

1) Modeling with Relations

• trivial join dependency embedded in the key

– decomposition won’t help

– no notion of “grid”

Node Data

x   y   t  salt  temp
x1  y1  1  29.4  12.1
x2  y2  1  29.8  12.5
x3  y3  1  28.0  12.0
x4  y4  1  30.1  13.2
x1  y1  2  30.6  12.1
x2  y2  2  31.5  12.2
x3  y3  2  31.7  11.8
x4  y4  2  32.0  10.1

Cell Data

cid  flux  area
a    11.5  3.3
b    13.9  5.5
c    13.1  4.5

Incidence

cid  x   y
a    x1  y1
a    x2  y2
a    x4  y4
b    x2  y2
:    :   :
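To make the relational encoding concrete, the three tables can be loaded into an in-memory SQLite database. This is only a sketch: the table and column names are assumptions, and the incidence rows elided on the slide are left out rather than guessed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node_data (x TEXT, y TEXT, t INTEGER, salt REAL, temp REAL,
                        PRIMARY KEY (x, y, t));
CREATE TABLE cell_data (cid TEXT PRIMARY KEY, flux REAL, area REAL);
CREATE TABLE incidence (cid TEXT, x TEXT, y TEXT);
""")
conn.executemany("INSERT INTO node_data VALUES (?, ?, ?, ?, ?)", [
    ("x1", "y1", 1, 29.4, 12.1), ("x2", "y2", 1, 29.8, 12.5),
    ("x3", "y3", 1, 28.0, 12.0), ("x4", "y4", 1, 30.1, 13.2),
    ("x1", "y1", 2, 30.6, 12.1), ("x2", "y2", 2, 31.5, 12.2),
    ("x3", "y3", 2, 31.7, 11.8), ("x4", "y4", 2, 32.0, 10.1),
])
conn.executemany("INSERT INTO cell_data VALUES (?, ?, ?)", [
    ("a", 11.5, 3.3), ("b", 13.9, 5.5), ("c", 13.1, 4.5),
])
# Only the incidence rows shown on the slide; the rest are elided there.
conn.executemany("INSERT INTO incidence VALUES (?, ?, ?)", [
    ("a", "x1", "y1"), ("a", "x2", "y2"), ("a", "x4", "y4"), ("b", "x2", "y2"),
])

# Mean salinity over the nodes of cell a, per timestep.
rows = conn.execute("""
    SELECT i.cid, n.t, AVG(n.salt)
    FROM incidence AS i
    JOIN node_data AS n ON n.x = i.x AND n.y = i.y
    WHERE i.cid = 'a'
    GROUP BY i.cid, n.t
    ORDER BY n.t
""").fetchall()
```

The grid structure has to be reassembled with a join for every query, and the node coordinates are repeated for every timestep, which is the redundancy the slide points at.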

2) Spatial Extensions

• Incidence relationship dependent on geometry rather than topology

• Geometry information redundantly defined in nodes and cells

• No concept of a “grid”: impedance mismatch with visualization applications

Node Data

Node::Point   t  salt  temp
Point(x1,y1)  1  29.4  12.1
Point(x2,y2)  1  29.8  12.5
Point(x3,y3)  1  28.0  12.0
Point(x4,y4)  1  30.1  13.2
Point(x1,y1)  2  30.6  12.1
Point(x2,y2)  2  31.5  12.2
Point(x3,y3)  2  31.7  11.8
Point(x4,y4)  2  32.0  10.1

Cell Data

Cell::Polygon            flux  area
Polygon(Point(x1,y1),…)  11.5  3.3
Polygon(Point(x2,y2),…)  13.9  5.5
Polygon(Point(x1,y1),…)  13.1  4.5

3) Visualization Libraries

• Different algorithms, each dependent on data characteristics

• Programmer’s responsibility to match algorithms with data

• Logical equivalences are obscured

Grid restriction:

restrict

With VTK:

vtkExtractGeometry, vtkThreshold, vtkExtractGrid, vtkExtractVOI, vtkThresholdPoints

Operators

Task                             Operator
associate grids with data        bind (b)
combine grids topologically      union, intersection, cross product ()
reduce a grid using data values  restrict (r)
transform grids or data          aggregate (a)
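Two of these operators, bind and aggregate, might be sketched over plain dictionaries as follows. The signatures are assumptions made for illustration and are much simpler than a full gridfield implementation; the node values and the triangles A and B are hypothetical.

```python
def bind(cells, values):
    """Associate each cell with a value; the mapping must be total."""
    if set(values) != set(cells):
        raise ValueError("bind expects exactly one value per cell")
    return {c: values[c] for c in cells}


def aggregate(target_cells, assign, combine, source_field):
    """For each target cell, combine the source values selected by `assign`."""
    return {
        c: combine([source_field[s] for s in assign(c)])
        for c in target_cells
    }


def mean(values):
    return sum(values) / len(values)


nodes = [0, 1, 2, 3]
triangles = {"A": (0, 1, 3), "B": (1, 2, 3)}  # 2-cell -> incident 0-cells

# Bind a value to every node, then move the data from nodes to triangles
# by averaging over each triangle's incident nodes.
node_field = bind(nodes, {0: 3.0, 1: 6.0, 2: 9.0, 3: 12.0})
tri_field = aggregate(triangles, lambda c: triangles[c], mean, node_field)
```

In this sketch the assignment step (which source cells feed a target cell) is kept separate from the combining function, so the same `aggregate` can express averaging, maxima, or interpolation by swapping arguments.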

Restrict Semantics

[Figure: restrict(<24) applied in two cases.]

• Values bound to 0-cells (nodes)

• Values bound to 2-cells (triangles)
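The two cases can be sketched as follows. The slide does not spell out the rule for cells of other dimensions, so this sketch assumes that a triangle survives a node restriction only if all of its nodes survive, and that a triangle restriction leaves nodes untouched. The grid and the values are hypothetical.

```python
def restrict_nodes(triangles, node_values, keep):
    """Restrict on 0-cells: drop failing nodes and any triangle touching one.

    Assumption: a triangle is kept only if every incident node is kept.
    """
    kept_nodes = {n for n, v in node_values.items() if keep(v)}
    kept_tris = {
        t: ns for t, ns in triangles.items()
        if all(n in kept_nodes for n in ns)
    }
    return kept_nodes, kept_tris


def restrict_triangles(triangles, tri_values, keep):
    """Restrict on 2-cells: keep only triangles whose own value passes."""
    return {t: ns for t, ns in triangles.items() if keep(tri_values[t])}


triangles = {"A": (0, 1, 3), "B": (1, 2, 3), "C": (2, 3, 4)}
node_values = {0: 19, 1: 21, 2: 26, 3: 21, 4: 27}
tri_values = {"A": 21, "B": 25, "C": 19}

below_24 = lambda v: v < 24
kept_nodes, tris_after_node_restrict = restrict_nodes(triangles, node_values, below_24)
tris_after_cell_restrict = restrict_triangles(triangles, tri_values, below_24)
```

The same predicate yields different subgrids depending on the dimension of the cells the values are bound to, which is why the operator has to know about the grid and not only about the data.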

Working With GridFields

H : (x,y,b)

V : (z)

Operators applied in order: cross product (×), r(z>b), b(s), r(region), render

Intermediate results: H, V, (H × V), r(H × V) (the “wetgrid”), b(r(H × V)), r(b(r(H × V)))
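The recipe above can be sketched on tuples alone. This simplification tracks only 0-cells and ignores the rest of the topology; the coordinates, the salinity function, and the region predicate are hypothetical stand-ins, with `bind` computing a value where a real system would read stored simulation output.

```python
from itertools import product

# H carries (x, y, b) with b the bottom depth; V carries z. Values are made up.
H = [{"x": 0.0, "y": 0.0, "b": -5.0}, {"x": 1.0, "y": 0.0, "b": -2.0}]
V = [{"z": -4.0}, {"z": -1.0}]


def cross(f, g):
    """Cross product: one tuple per pair of cells, attributes merged."""
    return [{**a, **c} for a, c in product(f, g)]


def restrict(pred, field):
    """Keep only the cells whose tuple satisfies the predicate."""
    return [t for t in field if pred(t)]


def bind(name, fn, field):
    """Attach a new attribute to every cell (stand-in for reading a dataset)."""
    return [{**t, name: fn(t)} for t in field]


wetgrid = restrict(lambda t: t["z"] > t["b"], cross(H, V))   # r(z>b)
with_salt = bind("s", lambda t: 30.0 + t["z"], wetgrid)      # b(s)
result = restrict(lambda t: t["x"] < 0.5, with_salt)         # r(region)
```

Each intermediate name corresponds to one stage of the expression r(b(r(H × V))), read from the inside out.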

Optimize: Push Restricts

• salt, temp defined on G

• Materialize pointers to elements of salt, temp

• Bind salt, temp to a subgrid of G, G'

On G:

salt = s1 s2 s3 s4 s5
temp = t1 t2 t3 t4 t5
:

On G':

salt' = s1 s3 s5
temp' = t1 t3 t5
:
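Binding to a subgrid through materialized pointers can be sketched with plain index lists. The function name `bind_subgrid` is hypothetical, and the strings s1 to s5 and t1 to t5 stand in for stored values.

```python
# Arrays defined over the full grid G, one entry per cell, in cell order.
salt = ["s1", "s2", "s3", "s4", "s5"]
temp = ["t1", "t2", "t3", "t4", "t5"]

# G' is described by pointers (positions) into G's cell ordering.
pointers = [0, 2, 4]


def bind_subgrid(pointers, array):
    """Read only the entries of `array` that belong to the subgrid."""
    return [array[i] for i in pointers]


salt_sub = bind_subgrid(pointers, salt)
temp_sub = bind_subgrid(pointers, temp)
```

Once the pointers are computed for one restriction, they can be reused for every dataset defined on G, so only the needed entries are read.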

H : (x,y,b) → r(p(x,y))

V : (z) → r(p(z))

then r(z>b) b(s)
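The rewrite can be checked on a small example: a predicate that splits into a part over (x,y) and a part over z gives the same result whether it is applied after the cross product or pushed onto H and V first. This sketch works on 0-cell tuples only, and all data and predicates are hypothetical.

```python
from itertools import product

H = [
    {"x": 0.0, "y": 0.0, "b": -6.0},
    {"x": 1.0, "y": 0.0, "b": -1.0},
    {"x": 2.0, "y": 0.0, "b": -1.0},
]
V = [{"z": -5.0}, {"z": -2.0}, {"z": 0.0}]


def cross(f, g):
    return [{**a, **c} for a, c in product(f, g)]


def restrict(pred, field):
    return [t for t in field if pred(t)]


p_xy = lambda t: t["x"] < 1.5      # depends only on H's attributes
p_z = lambda t: t["z"] > -4.0      # depends only on V's attributes
wet = lambda t: t["z"] > t["b"]    # needs both, so it stays above the cross

# Unoptimized: build the full cross product, then filter.
full = cross(H, V)
late = restrict(wet, restrict(lambda t: p_xy(t) and p_z(t), full))

# Optimized: restrict each operand first, then cross the smaller grids.
small = cross(restrict(p_xy, H), restrict(p_z, V))
early = restrict(wet, small)
```

Both plans produce the same tuples, but the optimized plan materializes 4 pairs instead of 9 before the remaining restrict runs.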

Optimization Results

[Figure: time (sec), 0 to 45, versus selectivity, 0 to 1, for four series: unopt, opt, vtk, rdbms.]

Horizontal Slice

Inputs: H(x,y,b), V(z)

Operators: r(z>b), b(s), slice

Inputs: H(x,y,b), <depth>

Operators: r(z>b), b(s), apply

Transect (Vertical Slice)

Inputs: H(x,y,b), V(z)

Operators: r(z>b), b(s), “join”

[Figure labels: P, P V, A, B, C]

Transect (Vertical Slice)

Inputs: V(z), P, H(x,y,b)

Operators: “join”, b(s), “join”

[Figure labels: A, B, C, P]

Transect Optimizations

[Figure: chart with axis ticks from 0 to 45.]