Creating High Performance Spatial Databases with SQL Server 2008 Alastair Aitchison.


About Me

• Consultant, Trainer, Author, and Housedad

Session Plan

• Do you need geometry and geography?
• Constructing a spatial index (the theory)
• Filtering spatial query results (the practice)
• Optimising spatial queries

The “Two-Column” Model

CREATE TABLE Customers (
  Name varchar(32),
  Address varchar(255),
  Lat float,
  Long float
);

Point-in-Polygon

SELECT *
FROM Customers
WHERE Lat BETWEEN LtMin AND LtMax
  AND Long BETWEEN LnMin AND LnMax

[Figure: bounding box with corners (LtMin, LnMin) and (LtMax, LnMax)]
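
The two BETWEEN predicates amount to an inclusive bounding-box test. Below is a minimal Python sketch of the same check, runnable without a database. The `Customer` type, the sample rows and the box limits are made up for illustration.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    name: str
    lat: float
    long: float


def in_bounding_box(c, lt_min, lt_max, ln_min, ln_max):
    # Same logic as "Lat BETWEEN LtMin AND LtMax AND Long BETWEEN LnMin AND LnMax":
    # both ends of each range are inclusive.
    return lt_min <= c.lat <= lt_max and ln_min <= c.long <= ln_max


# Hypothetical sample rows, not taken from the slides.
customers = [
    Customer("inside", 51.5, -0.1),
    Customer("outside", 48.9, 2.4),
]

matches = [c.name for c in customers if in_bounding_box(c, 51.0, 52.0, -1.0, 1.0)]
```

Because the test only compares two float columns, ordinary B-tree indexes on Lat and Long can serve it.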

Calculating Distance

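-- Great-circle distance in miles (3963.0 is the approximate radius of the Earth in miles).
-- SIN, COS and ACOS work in radians, so the Lat/Lon values must be in radians.
SELECT 3963.0 * ACOS(
  SIN(Lat1) * SIN(Lat2) +
  COS(Lat1) * COS(Lat2) * COS(Lon2 - Lon1))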

[Figure: two points, (Lat1, Lon1) and (Lat2, Lon2), and the distance between them]
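
The formula is the spherical law of cosines. As a hedged sketch, here is the same calculation in Python with the degree-to-radian conversion made explicit, since latitude and longitude are usually stored in degrees. The function name and the clamping step are additions for illustration. The radius is the 3963.0 miles used on the slide.

```python
import math

EARTH_RADIUS_MILES = 3963.0  # same constant as on the slide


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Distance in miles between two points given in decimal degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(l2 - l1))
    # Rounding can push the value slightly outside [-1, 1], which acos rejects.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_MILES * math.acos(cos_angle)


# One degree of longitude along the equator.
one_degree = great_circle_miles(0.0, 0.0, 0.0, 1.0)
```

This treats the Earth as a perfect sphere, which is exactly the limitation the next slide raises.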

“Two-Column” Model Limitations

• Only stores points
• Calculations on flat plane or perfect sphere
• Limited range of methods

SQL Server 2008

• Points, Linestrings, Polygons
• Accurate calculations
  – Ellipsoid model (geography)
  – Flat plane (geometry)
• Full complement of spatial methods
  – Intersects, Contains, Crosses, Touches
  – Distance, Length, Area
  – DE-9IM

Do You Need geometry / geography?

• Not all “spatial” apps need spatial datatypes
• Example: Store locator

“two column”      geometry / geography
Simple            Complex
Approximate       Accurate
FAST!             SLOW!

Session Plan

• Do you need geometry and geography?
• Constructing a spatial index (the theory)
• Filtering spatial query results (the practice)
• Optimising spatial queries

Querying geometry and geography

• SELECT * WHERE A.STIntersects(B) = 1
• Primary Filter (Based on Index)
  – Approximate
  – Fast
  – Superset of actual results
• Secondary Filter (Based on Table)
  – Refine results of primary filter
  – Accurate

Assigning Order to Spatial Data

• B-Tree indexing for linearly ordered data
  – decimal, float, money etc. – numeric order
  – char, varchar, nvarchar etc. – alphabetic order
  – datetime, date, time etc. – chronological order

• How do we assign order to spatial data?

The Multi-Level Grid

From Grid to Index

• Covered, partially covered, or touched cells
• Maximise accuracy - Minimise index size
• Three Rules
  – Covering Rule
  – Deepest-Cell Rule
  – Cells Per Object Rule

Covering Rule

“If a grid cell is completely covered by a geometry, don’t further subdivide that cell.”

Deepest-Cell Rule

“Once a cell has been subdivided, only store the intersecting cell(s) at the deepest grid level.”

Cells Per Object Rule

“If subdividing a cell would exceed the maximum allowed number of cells for each object, do not subdivide the cell.”
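
To make the first two rules concrete, here is a simplified Python sketch of a recursive grid decomposition. It is an illustration under stated assumptions, not SQL Server's actual tessellation. The shape is an axis-aligned rectangle, each level splits a cell into n × n sub-cells, cells that are only touched along an edge are ignored, and the Cells Per Object rule is left out. The function name and parameters are hypothetical.

```python
def tessellate(shape, cell, level, max_level, n):
    """Return (level, cell) pairs for a rectangular shape.

    Rectangles are (xmin, ymin, xmax, ymax); `cell` is the area being subdivided.
    """
    sx0, sy0, sx1, sy1 = shape
    cx0, cy0, cx1, cy1 = cell
    w, h = (cx1 - cx0) / n, (cy1 - cy0) / n
    out = []
    for i in range(n):
        for j in range(n):
            x0, y0 = cx0 + i * w, cy0 + j * h
            x1, y1 = x0 + w, y0 + h
            # Skip sub-cells that share no area with the shape.
            if min(sx1, x1) <= max(sx0, x0) or min(sy1, y1) <= max(sy0, y0):
                continue
            covered = sx0 <= x0 and sy0 <= y0 and x1 <= sx1 and y1 <= sy1
            if covered or level == max_level:
                # Covering rule: a fully covered cell is stored as-is, not split.
                out.append((level, (x0, y0, x1, y1)))
            else:
                # Deepest-cell rule: a split cell is replaced by its sub-cells,
                # so only the deepest intersecting cells are stored.
                out.extend(tessellate(shape, (x0, y0, x1, y1), level + 1, max_level, n))
    return out


# A 3 x 2 rectangle in a 4 x 4 bounding box, two levels, 2 x 2 cells per level.
cells = tessellate((0, 0, 3, 2), (0, 0, 4, 4), 1, 2, 2)
```

In this example the level-1 cell covering (0, 0) to (2, 2) is stored whole, while its partially covered neighbour is replaced by two level-2 cells.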

Session Plan

• Do you need geometry and geography?
• Constructing a spatial index (the theory)
• Filtering spatial query results (the practice)
• Optimising spatial queries

Creating a Spatial Index I

CREATE TABLE Grid (
  id char(1),
  shape geometry,
  CONSTRAINT [idxGridCluster] PRIMARY KEY CLUSTERED (id ASC)
);

Add Some Points To The Table

INSERT INTO Grid VALUES
  ('A', geometry::Point(0.5, 2.5, 0)),
  ('B', geometry::Point(2.5, 1.5, 0)),
  ('C', geometry::Point(3.25, 0.75, 0)),
  ('D', geometry::Point(3.75, 2.75, 0));

Creating a Spatial Index II

CREATE SPATIAL INDEX idxGrid ON Grid(shape)
USING GEOMETRY_GRID
WITH (
  BOUNDING_BOX = (0, 0, 4096, 4096),
  GRIDS = (
    LEVEL_1 = MEDIUM,
    LEVEL_2 = MEDIUM,
    LEVEL_3 = MEDIUM,
    LEVEL_4 = MEDIUM
  ),
  CELLS_PER_OBJECT = 16
);

-- Each L1 cell is 512 x 512
-- Each L2 cell is 64 x 64
-- Each L3 cell is 8 x 8
-- Each L4 cell is 1 x 1
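
The cell sizes in the comments follow from dividing the bounding box repeatedly by the grid density at each level. A MEDIUM grid splits each cell 8 × 8, LOW is 4 × 4 and HIGH is 16 × 16. The short Python sketch below reproduces the arithmetic. The function name is hypothetical, and the box is assumed to be square.

```python
# Cells along each axis for the three grid densities.
CELLS_PER_AXIS = {"LOW": 4, "MEDIUM": 8, "HIGH": 16}


def cell_sizes(box_width, densities):
    """Side length of one cell at each grid level, top level first."""
    sizes = []
    size = box_width
    for density in densities:
        size = size / CELLS_PER_AXIS[density]
        sizes.append(size)
    return sizes


# The index above: a 4096-wide box with four MEDIUM levels.
sizes = cell_sizes(4096, ["MEDIUM"] * 4)
```

Running the same function with a different box width or mix of densities shows how quickly level-4 resolution changes with either setting.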

Grid Level 4

[Figure: points A, B, C and D plotted on the level-4 grid]

Finding Intersecting Points

DECLARE @Polygon geometry = 'POLYGON ((1.5 0.5, 3.5 0.5, 3.5 2.5, 1.5 2.5, 1.5 0.5))';

SELECT *
FROM Grid
WHERE shape.STIntersects(@Polygon) = 1;
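
To see which rows this query should return, the exact test can be sketched outside the database. The Python below uses a standard ray-casting point-in-polygon check on the four points and the polygon from the slides. It is only an illustration of the expected answer, not how SQL Server evaluates STIntersects(), and it does not treat points lying exactly on the boundary specially (none of the four do).

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Closed ring and points copied from the slides.
polygon = [(1.5, 0.5), (3.5, 0.5), (3.5, 2.5), (1.5, 2.5), (1.5, 0.5)]
points = {"A": (0.5, 2.5), "B": (2.5, 1.5), "C": (3.25, 0.75), "D": (3.75, 2.75)}

hits = [name for name, (x, y) in points.items() if point_in_polygon(x, y, polygon)]
```

Only B and C fall inside the polygon, which agrees with the Number_Of_Rows_Output value of 2 reported later.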

Execution Plan With Spatial Index

sp_help_spatial_geometry_index

EXEC sp_help_spatial_geometry_index
  @tabname = Grid,
  @indexname = idxGrid,
  @verboseoutput = 1,
  @query_sample = 'POLYGON ((1.5 0.5, 3.5 0.5, 3.5 2.5, 1.5 2.5, 1.5 0.5))';

Number_Of_ObjectCells_In_Level4_In_Index: 4

[Figure: points A, B, C and D on the level-4 grid]

Number_Of_ObjectCells_In_Level4_For_QuerySample: 9

Compare the Grid Cells

[Figure: grid cells of points A, B, C and D compared with the grid cells of the query sample]
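
The comparison can be reproduced with a little arithmetic. The Python sketch below assumes level-4 cells are 1 × 1 squares aligned on whole numbers, which follows from the 4096-wide bounding box and four MEDIUM levels, and it treats the query polygon as the rectangle it happens to be. The helper names are hypothetical.

```python
import math

points = {"A": (0.5, 2.5), "B": (2.5, 1.5), "C": (3.25, 0.75), "D": (3.75, 2.75)}
query = (1.5, 0.5, 3.5, 2.5)  # xmin, ymin, xmax, ymax of the query polygon


def cell_of(x, y):
    # Level-4 cell containing a point, identified by its lower-left corner.
    return (math.floor(x), math.floor(y))


def cells_touched(xmin, ymin, xmax, ymax):
    # Every level-4 cell the rectangle overlaps or touches.
    return {(i, j)
            for i in range(math.floor(xmin), math.floor(xmax) + 1)
            for j in range(math.floor(ymin), math.floor(ymax) + 1)}


point_cells = {name: cell_of(x, y) for name, (x, y) in points.items()}
query_cells = cells_touched(*query)

# Primary filter: keep rows whose cell is shared with the query sample.
candidates = sorted(n for n, c in point_cells.items() if c in query_cells)
```

The points occupy 4 distinct cells and the query sample covers 9, matching the two cell counts above. Three points (B, C and D) share a cell with the query, so A is the only row the primary filter discards.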

Percentage_Of_Rows_NotSelected_By_Primary_Filter: 25%

[Figure: points A, B, C and D on the level-4 grid]

Number_Of_Rows_Selected_By_Primary_Filter: 3

[Figure: points A, B, C and D on the level-4 grid]

Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter: 33

[Figure: points A, B, C and D on the level-4 grid]

Number_Of_Times_Secondary_Filter_Is_Called: 2

[Figure: points A, B, C and D on the level-4 grid]

Number_Of_Rows_Output: 2

[Figure: points A, B, C and D on the level-4 grid]

Primary_Filter_Efficiency: 66

[Figure: points A, B, C and D on the level-4 grid]

Internal_Filter_Efficiency: 50

[Figure: points A, B, C and D on the level-4 grid]
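
The percentages on these slides can be derived from four row counts. The Python sketch below shows the ratios that are consistent with the reported values. The count of 1 row selected by the internal filter is not shown on a slide. It is inferred here from 33% of the 3 primary-filter rows, and from the secondary filter being called for the other 2. The function name is hypothetical.

```python
def filter_stats(total_rows, primary_rows, internal_rows, output_rows):
    """Derive the reported percentages from raw row counts."""
    return {
        "Percentage_Of_Rows_NotSelected_By_Primary_Filter":
            100.0 * (total_rows - primary_rows) / total_rows,
        "Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter":
            100.0 * internal_rows / primary_rows,
        # Rows the internal filter could not preselect still need the exact test.
        "Number_Of_Times_Secondary_Filter_Is_Called":
            primary_rows - internal_rows,
        "Primary_Filter_Efficiency":
            100.0 * output_rows / primary_rows,
        "Internal_Filter_Efficiency":
            100.0 * internal_rows / output_rows,
    }


# 4 rows in the table, 3 pass the primary filter, 1 is preselected (inferred),
# and 2 rows are finally output.
stats = filter_stats(total_rows=4, primary_rows=3, internal_rows=1, output_rows=2)
```

The results are 25, 33.3, 2, 66.7 and 50. The slides show the two fractional values as 33 and 66.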

Session Plan

• Do you need geometry and geography?
• Constructing a spatial index (the theory)
• Filtering spatial query results (the practice)
• Optimising spatial queries

Making Sure the Index Is Used

• Use a Supported Method
  – STIntersects()
  – STContains(), STWithin(), STTouches()
  – STDistance()
  – Filter()
• Syntax must be A.STIntersects(B) = 1
• Upgrade to SP1
• Use a HINT where necessary

Making the Index Effective

• Three possible outcomes:
  – Preselection (Internal Filter)
  – Discarding (Primary Filter)
  – Secondary Filter
• Adjust Index Settings to fit data in the column and typical query samples

Improving Performance

• Make bounding box as tight as possible
• Grid Resolution ↑ ... Cells Per Object ↑
• Multiple Indexes (may need HINT)
• Use non-spatial predicates
• Reduce unnecessary detail
• Experiment!
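
As an illustration of the first point, the tightest bounding box for a set of points is just the minimum and maximum of their coordinates. The Python sketch below computes it for the four demo points. The function name and the optional margin parameter are assumptions for illustration. The result is in the (xmin, ymin, xmax, ymax) order that BOUNDING_BOX uses.

```python
def tight_bounding_box(points, margin=0.0):
    """Smallest axis-aligned box around the points, optionally padded by a margin."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs) - margin, min(ys) - margin, max(xs) + margin, max(ys) + margin)


# The four demo points A, B, C and D.
demo_points = [(0.5, 2.5), (2.5, 1.5), (3.25, 0.75), (3.75, 2.75)]
box = tight_bounding_box(demo_points)
```

For the demo data this gives (0.5, 0.75, 3.75, 2.75), a far smaller area than the (0, 0, 4096, 4096) box used earlier, so each grid level would resolve much finer detail within the data's real extent.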

Want To Know More?

Beginning Spatial with SQL Server 2008

MSDN Spatial Forum: http://social.msdn.microsoft.com/Forums/en-US/sqlspatial/threads

[email protected]

A Practical Demonstration

• Geonames export
• 6.9 million points
• Search for those in Newport
• Without Index: ~100 rows. 12,391,230 secs
• With Index: ~100 rows. < 1 sec