Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. ·...

51
Relational Division What it is, when to use it and how to use it.

Transcript of Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. ·...

Page 1: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Relational Division

What it is, when to use it and how to use it.

Page 2: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Who am I?

• Teacher Institute of Education (1990-1992)– Matemathics and Physics

• Self-employed (1993-1998)

• Long time consultant (1993-)

• MVP – Microsoft Valuable Professional (2009-)

• SolidQ (2011-)

• Active on several forums– SwePeso, Peso or Pesomannen

• www.sqlteam.com

• www.sqlservercentral.com

• msdn.microsoft.com

• Blog at– http://weblogs.sqlteam.com/peterl

– http://www.sqltopia.com

Page 3: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

BACKGROUND

Relational database theory

Page 4: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Codd & Date Consulting Group

Edgar Frank "Ted" Codd

• August 23, 1923 – April 18, 2003

• British computer scientist who, while

working for IBM, invented the relational

model for database management, the

theoretical basis for relational

databases.

Chris Date

• January 18, 1941 –

• Is an independent author, lecturer,

researcher, and consultant,

specializing in relational database

theory.

• The Third Manifesto - 1995

Page 5: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Relational Algebra

• Similar to normal algebra, except we use relations as values instead of numbers.

• Not used as a query language in actual DBMS.

• Relations are seen as sets of tuples, which means that no duplicates are allowed.

– SQL behaves differently in some cases.

• T-SQL is declarative, which means that you tell the DBMS what you want, but not

how it is to be calculated

– A .Net-program is procedural, which means that you have to state, step by step, exactly how

the result should be calculated.

• Relational algebra is mathematical expressions

Page 6: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Defining relational division

• Relational division is a binary operation involving two sets, R and S.

– The operation can be written just like a mathematical division, R ÷ S.

– Set R should have at least two attributes, A1 and A2.

– The result, Q will be a set of values from A1 in set R for which there’s a corresponding A2

value to match every row in set S.

• The relational division operation is

– Normally a time-consuming and complex operation.

– Effectively the opposite of the Cartesian product.

Page 7: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Practical uses

• Running a dating site and want to search for an arbitrary number of attributes

• Pattern matching; for example when you have only a part of a song and want to search for the full song, either binary or by the lyrics

• Which customers have been served by James and Gabrielle

• Which rental cars have had both engine overhaul and a paint job

Page 8: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Examples and visualization

Joe Celko

• Is an American relational database

expert from Austin, Texas

• He has participated on the ANSI X3H2

Database Standards Committee, and

helped write the SQL-89 and SQL-92

standards

Examples

• We’ll use his known example with

PilotsSkills and Hangars tables

Page 9: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Main example

PilotSkills

PilotName PlaneType

Celko Piper Cub

Higgins B-52 Bomber

Higgins F-14 Fighter

Higgins Piper Cub

Jones B-52 Bomber

Jones F-14 Fighter

Smith B-1 Bomber

Smith B-52 Bomber

Smith F-14 Fighter

Wilson B-1 Bomber

Wilson B-52 Bomber

Wilson F-14 Fighter

Wilson F-17 Fighter

Hangar

PlaneType

B-1 Bomber

B-52 Bomber

F-14 Fighter

PilotSkills ÷ HangarPilotName

Smith

Wilson

Page 10: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

WilsonSmith

Exakt division

Celko Higgins Jones

Piper Cub

B-1 Bomber

B-52 Bomber

F-14 Fighter

F-17 Fighter

Page 11: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Smith Wilson

Division with remainder

Celko Higgins Jones

Piper Cub

B-1 Bomber

B-52 Bomber

F-14 Fighter

F-17 Fighter

Page 12: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

DIVIDED WE STAND

Have I already used division?

Page 13: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

WilsonSmithJonesHiggins

Single record, exact division

Piper Cub

B-1 Bomber

B-52 Bomber

F-14 Fighter

F-17 Fighter

Celko

Page 14: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

WilsonSmithJones

Single record, with remainder

Piper Cub

B-1 Bomber

B-52 Bomber

F-14 Fighter

F-17 Fighter

HigginsCelko

Page 15: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

What did we just do?

• Single record

• Single column

• Division with remainder

• WHERE clause!

– WHERE Col1 = 99

Page 16: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

WilsonSmithJones

Is WHERE really a division?

Piper Cub

B-1 Bomber

B-52 Bomber

F-14 Fighter

F-17 Fighter

Celko Higgins

Page 17: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Records Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

Types of Relational Division

Relational Division Types

Rows Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

You are already an expert!

• Division is the most common arithmetic operation

• However, it only works for a single record subset

• Wait? Multiple column too?

• Yes– WHERE Col1 = 101

AND Col2 = 9

Page 18: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

EXACT DIVISION

The simple way

Page 19: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Exact division, common

Page 20: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Exact division, advanced

Page 21: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Exact division, advanced

Page 22: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

A new Relational Division algorithm

Page 23: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

COMMON SET-BASED

ALGORITHMS

With remainder

Page 24: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Codd

• Uses double negation– “There ain't no planes in this hangar that I can't fly!“

• Can’t be used for exact division

Page 25: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Chris Date (Celko #1)

• Made popular by Celko

Page 26: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Celko #2

Page 27: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Todd’s Division

• Written by Pierre Mullin

Page 28: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

COMMON SET-BASED

ALGORITHMS

Exact Division

Page 29: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Celko #3

Page 30: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

COMPARISON GRAPH AND

CHART

Exact Division

Page 31: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Celko algorithms

Page 32: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Celko algorithms

Page 33: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Celko algorithms

Page 34: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

New Relational Division algorithms

• http://www.sqlteam.com/forums/

topic.asp?TOPIC_ID=70832

• http://weblogs.sqlteam.com/

peterl/archive/2010/08/19/

checksum-weakness-

explained.aspx

Page 35: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Exact division Comparison Graph

Hangar - Sum of Reads

Hangar - Sum of Scans

PilotSkills - Sum of Reads

PilotSkills - Sum of Scans

Worktable - Sum of ReadsWorktable - Sum of Scans

0

5

10

15

20

25

30

35

40

Hangar - Sum of Reads

Hangar - Sum of Scans

PilotSkills - Sum of Reads

PilotSkills - Sum of Scans

Worktable - Sum of Reads

Worktable - Sum of Scans

Page 36: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Comparison Chart

Column Labels

Hangar PilotSkills Worktable Total Sum of Reads Total Sum of Scans

Row Labels Sum of Reads Sum of Scans Sum of Reads Sum of Scans Sum of Reads Sum of Scans

Celko 1 20 2 6 3 26 5

Celko 2 32 3 8 4 40 7

Celko 3 10 3 2 1 34 3 46 7

Celko 4 38 19 9 2 34 3 81 24

Celko 5 23 3 2 1 25 4

Peso experimental 2 1 2 1 4 2

Peso fast 8 3 4 2 12 5

Grand Total 133 34 33 14 68 6 234 54

Page 37: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

COMPARISON GRAPH AND

CHART

With remainder

Page 38: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Remainder division Comparison Graph

Hangar - Sum of Reads

Hangar - Sum of Scans

PilotSkills - Sum of Reads

PilotSkills - Sum of Scans

Worktable - Sum of Reads

Worktable - Sum of Scans

0

5

10

15

20

25

30

35

40

Celko 1Celko 3

Celko 4Celko 5

Celko 6Peso fast

Hangar - Sum of Reads

Hangar - Sum of Scans

PilotSkills - Sum of Reads

PilotSkills - Sum of Scans

Worktable - Sum of Reads

Worktable - Sum of Scans

Page 39: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Comparison Chart

Column Labels

Hangar PilotSkills Worktable Total Sum of Reads Total Sum of Scans

Row Labels Sum of Reads Sum of Scans Sum of Reads Sum of Scans Sum of Reads Sum of Scans

Celko 1 20 2 6 3 26 5

Celko 3 18 3 2 1 31 3 51 7

Celko 4 38 19 16 8 31 3 85 30

Celko 5 23 3 2 1 25 4

Celko 6 20 10 2 1 22 11

Peso fast 16 8 6 3 22 11

Grand Total 135 45 34 17 62 6 231 68

Page 40: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Rows Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

Rows Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

Types of Relational Division

Relational Division Types What’s next?

• As you can see, neither of the previous

shown algorithms handles all types of

division

Page 41: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

FINDING A COMMON ALGORITHM

Multiple columns

Page 42: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Comparison Graph

Hangar - Sum of Scans

Hangar - Sum of Reads

PilotSkills - Sum of Scans

PilotSkills - Sum of Reads

0

5

10

15

20

25

Celko 5 (exact)Celko 5

(remainder) Peso (exact)Peso (remainder)

Hangar - Sum of Scans

Hangar - Sum of Reads

PilotSkills - Sum of Scans

PilotSkills - Sum of Reads

Page 43: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Comparison Chart

Column Labels

Hangar PilotSkills Total Sum of Scans Total Sum of Reads

Row Labels Sum of Scans Sum of Reads Sum of Scans Sum of Reads

Celko 5 (exact) 3 23 1 2 4 25

Celko 5 (remainder) 3 23 1 2 4 25

Peso (exact) 4 8 2 4 6 12

Peso (remainder) 8 16 3 6 11 22

Grand Total 18 70 7 14 25 84

Page 44: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Comparison Graph, megasize

Sum of CPU

Sum of Duration

Sum of Reads

0

2 000 000

4 000 000

6 000 000

8 000 000

10 000 000

12 000 000

14 000 000

16 000 000

18 000 000

20 000 000

Celko 5 (megasize, exact)

Peso (megasize, exact)

18 472 449

61 962

18 181 022

39 050

11 300 678

51 879

Sum of CPU

Sum of Duration

Sum of Reads

Page 45: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Comparison Chart, megasize

Row Labels Sum of CPU Sum of Duration Sum of Reads

Celko 5 (megasize, exact) 18 472 449 18 181 022 11 300 678

Peso (megasize, exact) 61 962 39 050 51 879

Grand Total 18 534 411 18 220 072 11 352 557

Page 46: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Final example

PilotSkills

PilotName PlaneType Model

Celko Piper Cub A

Higgins B-52 Bomber A

Higgins F-14 Fighter J

Higgins Piper Cub A

Jones B-52 Bomber A

Jones F-14 Fighter J

Smith B-1 Bomber A

Smith B-52 Bomber B

Smith F-14 Fighter K

Wilson B-1 Bomber A

Wilson B-52 Bomber B

Wilson F-14 Fighter J

Wilson F-17 Fighter C

Hangar

SetID PlaneType Model

1 B-1 Bomber A

1 B-52 Bomber B

2 F-14 Fighter J

PilotSkills ÷ HangarSetID PilotName

1 Smith

1 Wilson

2 HIggins

2 Jones

2 Wilson

Page 47: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

One query to rule them all

Page 48: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Rows Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

Rows Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

Rows Columns Remainder You

Single Single Yes

Single Single No

Single Multiple Yes

Single Multiple No

Multiple Single Yes

Multiple Single No

Multiple Multiple Yes

Multiple Multiple No

Types of Relational Division

Relational Division Types And as a bonus, several divisors!

• http://connect.microsoft.com/SQLServ

er/feedback/details/670531/move-t-sql-

language-closer-to-completion-with-a-

divide-by-operator

• http://bit.ly/u3318c

Page 49: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Succesful implementations and PoC

• An international Telecom company

– About 100 million rows

• 2 seconds vs ~1300 seconds

• An archiving medical system

– Projected one billion patients notes

– Already at 300 million rows

• 0.005 seconds vs ~65 seconds

Page 50: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Help make a difference!

• Buy the book!

Page 51: Relational Division Explainedpublications.sqltopia.com/Relational Division.pdf · 2013. 11. 12. · Joe Celko • Is an American relational database expert from Austin, Texas •

Want to know more about me?

• Homepage and contact– http://www.sqltopia.com

– http://www.developerworkshop.net/

• Twitter– @SwePeso

• Co-founder of PASS Scania– http://www.pass-scania.se/

• Local mentor– http://www.SQLUG.se

• Phil Factor Speed Phreak challenges– 3 time winner