Temple University – CIS Dept. CIS331– Principles of Database Systems V. Megalooikonomou Indexing...

Temple University – CIS Dept. CIS331 – Principles of Database Systems

V. Megalooikonomou

Indexing and Hashing I

(based on notes by Silberschatz, Korth, and Sudarshan and notes by C. Faloutsos at CMU)

General Overview - rel. model
  Relational model - SQL
  Formal & commercial query languages
  Functional Dependencies
  Normalization
  Physical Design
  Indexing

Indexing - overview
  primary / secondary indices
  index-sequential (ISAM)
  B - trees, B+ - trees
  hashing
    static hashing
    dynamic hashing

Basic Concepts

Indexing mechanisms speed up access to desired data
  E.g., author catalog in library
Search Key - attribute or set of attributes used to look up records in a file
An index file consists of records (called index entries) of the form
  search-key | pointer
Index files are typically much smaller than the original file
Two basic kinds of indices:
  Ordered indices: search keys are stored in sorted order
  Hash indices: search keys are distributed uniformly across “buckets” using a “hash function”

Indexing

Once the records are stored in a file, how do you search efficiently? (e.g., ssn=123?)

STUDENT
Ssn  Name    Address
123  smith   main str
234  jones   forbes ave
125  tomson  main str

Indexing

Once the records are stored in a file, how do you search efficiently?
  brute force: retrieve all records, report the qualifying ones
  better: use indices (pointers) to locate the records directly

Indexing – main idea:

Index (sorted Ssn values, each pointing to its record): 123, 125, 234

Ssn  Name    Address
123  smith   main str
234  jones   forbes ave
125  tomson  main str

Measuring ‘goodness’

retrieval time?

insertion / deletion?

space overhead?

reorganization?

range queries?

Main concepts

search keys are sorted in the index file and point to the actual records
primary vs. secondary indices
clustering (sparse) vs. non-clustering (dense) indices

Indexing

Primary key index: on primary key (no duplicates)

Index on Ssn: 123, 234, 345, 456, 567

Ssn  Name     Address
123  smith    main str
234  jones    forbes ave
678  tomson   main str
456  stevens  forbes ave
345  smith    forbes ave

Indexing

secondary key index: duplicates may exist

Address-index: forbes ave, main str

Indexing

secondary key index: typically, with ‘postings lists’

Address-index: forbes ave, main str (each entry points to a postings list of the matching records)
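A minimal sketch of a secondary index with postings lists, assuming the STUDENT rows from the earlier slides are held in a Python list and that a "pointer" is simply a position in that list. The names `records`, `build_address_index` and `address_index` are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical in-memory copy of the STUDENT rows: (ssn, name, address).
records = [
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
    (678, "tomson", "main str"),
    (456, "stevens", "forbes ave"),
    (345, "smith", "forbes ave"),
]

def build_address_index(rows):
    """Map each address to a postings list of record positions."""
    index = defaultdict(list)
    for pos, (_ssn, _name, address) in enumerate(rows):
        index[address].append(pos)
    # Keep the search keys in sorted order, as in an ordered index.
    return dict(sorted(index.items()))

address_index = build_address_index(records)
```

Because duplicates are allowed on a secondary key, each search-key value appears once in the index and its postings list holds one pointer per matching record.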

Main concepts – cont’d

Clustering (= sparse) index: records are physically sorted on that key (and not all key values are needed in the index)
Non-clustering (= dense) index: the opposite
E.g.:

Indexing - Sparse index

Clustering/sparse index on ssn

[Figure: index entries 123, 456, …; each entry points to a block of the sorted file (keys >=123, keys >=456)]

Sparse Index Files

Sparse Index: contains index records for only some search-key values
  Applicable when records are sequentially ordered on search-key
To locate a record with search-key value K we:
  Find index record with largest search-key value <= K
  Search file sequentially starting at the record to which the index record points
Less space and less maintenance overhead for insertions and deletions
Generally slower than dense index for locating records
Good tradeoff: sparse index with an index entry for every block in file, corresponding to least search-key value in the block
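The lookup steps above can be sketched as follows. This is only an illustration under stated assumptions: the file is a Python list of blocks sorted on ssn, the sparse index keeps the least key of every block, and the names `blocks`, `sparse_index` and `lookup` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (ssn, name) records, sorted on ssn.
blocks = [
    [(123, "smith"), (234, "jones"), (345, "smith")],
    [(456, "stevens"), (678, "tomson")],
]

# One index entry per block: (least search-key value in the block, block number).
sparse_index = [(block[0][0], i) for i, block in enumerate(blocks)]

def lookup(k):
    """Return the record with search key k, or None if it is absent."""
    keys = [key for key, _ in sparse_index]
    i = bisect_right(keys, k) - 1      # index entry with the largest key <= k
    if i < 0:
        return None                    # k is smaller than every indexed key
    block_no = sparse_index[i][1]
    for record in blocks[block_no]:    # sequential search inside the block
        if record[0] == k:
            return record
        if record[0] > k:
            break                      # passed the spot where k would be
    return None
```

For example, `lookup(345)` uses the index entry 123 and scans the first block, while `lookup(500)` reaches the second block and stops at 678 without a match.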

Indexing – Dense Index

Non-clustering / dense index

Index on Ssn: 123, 234, 345, 456, 567

Ssn  Name     Address
345  tomson   main str
234  jones    forbes ave
567  smith    forbes ave
456  stevens  forbes ave
123  smith    main str

Summary

            Dense   Sparse
Primary             usual
Secondary   usual   rare

• All combinations are possible…

• at most one sparse/clustering index

• as many as desired dense indices

• usually: one primary-key index (maybe clustering) and a few secondary-key indices (non-clustering)

Indexing - overview
  primary / secondary indices
  index-sequential (ISAM)
  B - trees, B+ - trees
  hashing


ISAM

What if index is too large to search sequentially?
use a multilevel index…

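ISAM

[Figure: two levels of index over the sorted file. One level has entries 123, 456, …, each pointing to the data block whose keys are >=123, >=456, …; the other level (entries 123, 3,423, …) is an index on those index blocks]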

ISAM - observations

if index is too large, store it on disk and keep index-on-the-index
usually two levels of indices, one first-level entry per disk block (why?)

ISAM - Multilevel Index

ISAM - observations

What about insertions/deletions?

[Figure: the same two-level index; a new record (124; peterson; fifth ave.) has to be inserted]




[Figure: the new record (124; peterson; fifth ave.) is stored in an overflow block]

overflows

Problems?




[Figure: the same index, with overflow blocks chained to a data block]

• overflow chains may become very long - what to do?

• shut-down & reorganize

• start with ~80% utilization

So far …

indices (like ISAM) suffer in the presence of frequent updates
sequential scan using primary index is efficient, but a sequential scan using a secondary index is expensive
  each record access may fetch a new block from disk
alternative indexing structure: B - trees

Overview
  primary / secondary indices
  multilevel (ISAM)
  B - trees, B+ - trees
  hashing


B-trees

the most successful family of index schemes (B-trees, B+-trees, B*-trees)
can be used for primary/secondary, clustering/non-clustering index
they are balanced “n-way” search trees

B-trees

Disadvantage of indexed-sequential files: performance degrades as file grows, since many overflow blocks get created. Periodic reorganization of entire file is required
Advantage of B+-tree index files: automatic self-reorganization with small, local changes, in the face of insertions and deletions. Reorganization of entire file is not required
Disadvantage of B+-trees: extra insertion and deletion overhead, space overhead
Advantages of B+-trees outweigh disadvantages, and they are used extensively

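B-trees

E.g., B-tree of order 3 (i.e., at most 3 pointers from each node):

[Figure: tree T0. Root holds keys 6 and 9; its three children are the leaves (1, 3) for keys <6, (7) for keys >6 and <9, and (13) for keys >9]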

B-tree properties:

each node, in a B-tree of order n:
  key order
  at most n pointers
  at least ⌈n/2⌉ pointers (except root)
  all leaves at the same level
  if number of pointers is k, then node has exactly k-1 keys

[Figure: node layout with keys v1, v2, …, vn-1 and pointers p1, …, pn]

Properties

“block aware” nodes: each node -> disk page
O(log (N)) for everything! (ins/del/search)
typically, if the order n = 50 - 100, then 2 - 3 levels
utilization >= 50%, guaranteed; on average 69%
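A back-of-the-envelope sketch of why so few levels suffice. It assumes every node has exactly `fanout` children and simply counts how many levels are needed before the bottom level can address `num_keys` entries; the function name `levels_needed` is hypothetical, and real trees with partly filled nodes may need one more level.

```python
def levels_needed(num_keys, fanout):
    """Smallest number of levels h such that fanout**h >= num_keys."""
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout   # one more level multiplies the reach by the fan-out
        levels += 1
    return levels
```

Under these assumptions, a fan-out of 100 reaches 10,000 entries in 2 levels and 1,000,000 entries in 3 levels, which is the logarithmic behaviour the slide refers to.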

Queries

Algorithm for exact match query? (e.g., ssn=8?)

[Figure: tree T0 (root 6, 9; leaves (1, 3), (7), (13)); the search for 8 follows the “>6 <9” pointer from the root down to leaf (7)]

H steps (= disk accesses), where H is the height of the tree
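A minimal sketch of the exact-match search on tree T0, assuming an in-memory node with a sorted key list and a child list (empty for leaves). The names `Node`, `search` and `t0` are hypothetical; each node visited stands for one disk access.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys stored in this node
        self.children = children or []   # empty list for a leaf

def search(node, key):
    """Return (found, nodes_visited) for an exact-match query."""
    visited = 0
    while node is not None:
        visited += 1
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        # Descend into the i-th child, or stop if this node is a leaf.
        node = node.children[i] if node.children else None
    return False, visited

# Tree T0 from the slides: root (6, 9) with leaves (1, 3), (7), (13).
t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
```

Searching for 8 visits the root and the leaf (7) and fails after two steps; searching for 9 succeeds at the root, since a B-tree stores keys in non-leaf nodes too.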

Queries

what about range queries? (e.g., 5<salary<8)
Proximity/nearest neighbor searches? (e.g., salary ~ 8)

[Figure: tree T0 (root 6, 9; leaves (1, 3), (7), (13))]

B-trees: Insertion

Insert in leaf; on overflow, push middle up (recursively)
split: preserves B - tree properties

B-trees

Easy case: Tree T0; insert ‘8’

[Figure: 8 joins 7 in the middle leaf, giving leaves (1, 3), (7, 8), (13) under root (6, 9); no split is needed]

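B-trees

Hardest case: Tree T0; insert ‘2’

[Figure, step 1: 2 goes to leaf (1, 3), which overflows to (1, 2, 3); push middle up]
[Figure, step 2: the leaf splits into (1) and (3), and 2 moves to the root, which overflows to (2, 6, 9); push middle]
[Figure, final state: new root (6) with children (2) and (9); leaves (1), (3), (7), (13)]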

B-trees - insertion

Q: What if there are two middles? (e.g., order 4)
A: either one is fine

B-trees: Insertion

Insert in leaf; on overflow, push middle up (recursively – ‘propagate split’)

split: preserves all B - tree properties (!!)

notice how it grows: height increases when root overflows & splits

Automatic, incremental re-organization (contrast with ISAM!)

Pseudo-code

INSERTION OF KEY ’K’
    find the correct leaf node ’L’;
    add the key ’K’ in node ’L’; /* maintaining the key order in ’L’ */
    if ( ’L’ overflows ) {
        split ’L’, by pushing the middle key upstairs to parent node ’P’;
        if ( ’P’ overflows ) {
            repeat the split recursively;
        }
    }
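The pseudo-code can be turned into a small runnable sketch. This is one possible in-memory rendering, not the slides' own implementation: `ORDER`, `Node`, `_insert` and `insert` are hypothetical names, nodes are plain Python objects rather than disk pages, and duplicate keys are ignored for simplicity.

```python
from bisect import bisect_left

ORDER = 3  # at most 3 pointers per node, hence at most 2 keys

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty for a leaf

def _insert(node, key):
    """Insert key below node; return (middle_key, new_right_node) if node split."""
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None                      # duplicate: nothing to do in this sketch
    if not node.children:                # leaf: add the key, keeping key order
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key)
        if split is None:
            return None
        mid_key, right = split
        node.keys.insert(i, mid_key)     # middle key pushed up from the child
        node.children.insert(i + 1, right)
    if len(node.keys) < ORDER:           # still fits: no overflow
        return None
    mid = len(node.keys) // 2            # overflow: split and push the middle up
    mid_key = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return mid_key, right

def insert(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    mid_key, right = split
    return Node([mid_key], [root, right])  # root split: height grows by one

# Hardest case from the slides: insert 2 into T0.
t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
root = insert(t0, 2)
```

After the call, `root` holds the single key 6, its children hold 2 and 9, and the leaves are (1), (3), (7), (13), which matches the final state of the hardest case.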

Overview
  primary / secondary indices
  multilevel (ISAM)
  B – trees
    Dfn, Search, insertion, deletion
  B+ - trees
  hashing

Deletion

Rough outline of algorithm: Delete key; on underflow, may need to merge

In practice, some implementors just allow underflows to happen…

B-trees – Deletion

Easiest case: Tree T0; delete ‘3’

[Figure: T0 before (leaves (1, 3), (7), (13)) and after (leaves (1), (7), (13)); the root (6, 9) is unchanged]

B-trees – Deletion

Case1: delete a key at a leaf – no underflow
Case2: delete non-leaf key – no underflow
Case3: delete leaf-key; underflow, and ‘rich sibling’
Case4: delete leaf-key; underflow, and ‘poor sibling’

B-trees – Deletion

Case1: delete a key at a leaf – no underflow (delete 3 from T0)

[Figure: tree T0 (root 6, 9; leaves (1, 3), (7), (13))]

B-trees – Deletion

Case2: delete a key at a non-leaf – no underflow (e.g., delete 6 from T0)

Delete & promote, i.e.:
[Figure, step 1: 6 is removed from the root, leaving an empty slot next to 9]
[Figure, step 2: 3 is promoted from the left leaf (1, 3) into the empty slot]
[Figure, FINAL TREE: root (3, 9); leaves (1) for keys <3, (7) for keys >3 and <9, (13) for keys >9]

B-trees – Deletion

Case2: delete a key at a non-leaf – no underflow (e.g., delete 6 from T0)
Q: How to promote?
A: pick the largest key from the left sub-tree (or the smallest from the right sub-tree)

Observation: every deletion eventually becomes a deletion of a leaf key


B-trees – Deletion

Case3: underflow & ‘rich sibling’ (e.g., delete 7 from T0)

Delete & borrow, i.e.:
[Figure: 7 is removed, leaving the middle leaf empty; its left sibling (1, 3) is the rich sibling]

B-trees – Deletion

Case3: underflow & ‘rich sibling’
  ‘rich’ = can give a key, without underflowing
  ‘borrowing’ a key: always THROUGH the PARENT!

[Figure: moving 3 directly from the rich sibling into the empty leaf: NO!!]
[Figure: instead, 6 comes down from the parent into the empty leaf and 3 goes up from the sibling into the parent]
[Figure, FINAL TREE: root (3, 9); leaves (1) for keys <3, (6) for keys >3 and <9, (13) for keys >9]

Delete & borrow, through the parent


B-trees – Deletion

Case4: underflow & ‘poor sibling’ (e.g., delete 13 from T0)

[Figure: 13 is removed, leaving the rightmost leaf empty; its sibling (7) is poor]

A: merge w/ ‘poor’ sibling
  Merge, by pulling a key from the parent
  exact reversal from insertion: ‘split and push up’, vs. ‘merge and pull down’
I.e.:
[Figure: 9 is pulled down from the root and merged with the sibling’s key 7]
[Figure, FINAL TREE: root (6); leaves (1, 3) for keys <6 and (7, 9) for keys >6]

B-trees – Deletion

Case4: underflow & ‘poor sibling’ -> ‘pull key from parent, and merge’
Q: What if the parent underflows?
A: repeat recursively

B-tree deletion - pseudocode

DELETION OF KEY ’K’
    locate key ’K’, in node ’N’;
    if ( ’N’ is a non-leaf node ) {
        delete ’K’ from ’N’;
        find the immediately larger key ’K1’;
            /* which is guaranteed to be on a leaf node ’L’ */
        copy ’K1’ in the old position of ’K’;
        invoke this DELETION routine on ’K1’ from the leaf node ’L’;
    } else {
        /* ’N’ is a leaf node */
        delete ’K’ from ’N’;
        if ( ’N’ underflows ) {
            let ’N1’ be the sibling of ’N’;
            if ( ’N1’ is "rich" ) { /* i.e., N1 can lend us a key */
                borrow a key from ’N1’ THROUGH the parent node;
            } else { /* N1 is 1 key away from underflowing */
                MERGE: pull the key from the parent ’P’, and merge it with the keys of ’N’ and ’N1’ into a new node;
                if ( ’P’ underflows ) { repeat recursively }
            }
        }
    }
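The two underflow repairs in the pseudocode can be sketched on plain key lists. This is a leaf-level illustration only, assuming the sibling sits to the left of the underflowing node and ignoring child pointers; `borrow_from_left`, `merge_with_left` and the parameter `sep` (the position in the parent of the key separating the two nodes) are hypothetical names.

```python
def borrow_from_left(parent_keys, sep, left, node):
    """Case 3: rotate one key from the rich left sibling into node, through the parent."""
    node.insert(0, parent_keys[sep])   # the separator comes down into the underflowing node
    parent_keys[sep] = left.pop()      # the sibling's largest key goes up to the parent

def merge_with_left(parent_keys, sep, left, node):
    """Case 4: pull the separator down and merge left + separator + node."""
    return left + [parent_keys.pop(sep)] + node

# Case 3 on T0: delete 7, then borrow through the parent.
parent, left, node = [6, 9], [1, 3], []
borrow_from_left(parent, 0, left, node)

# Case 4 on T0: delete 13, then merge with the poor sibling (7).
parent4 = [6, 9]
merged = merge_with_left(parent4, 1, [7], [])
```

In the first case the parent becomes (3, 9) and the leaves (1) and (6); in the second the parent shrinks to (6) and the merged leaf is (7, 9), as in the two FINAL TREE slides.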

B-trees in practice

In practice: no empty leaves; pointers to records

[Figure, theory: tree T0 (root 6, 9; leaves (1, 3), (7), (13))]
[Figure, practice: the same tree; each key also points to its record in the data file (Ssn, …)]

B-trees in practice

In practice, the formats are:
- leaf nodes: (v1, rp1, v2, rp2, … vn, rpn)
- non-leaf nodes: (p1, v1, rp1, p2, v2, rp2, …)

[Figure: tree T0 (root 6, 9; leaves (1, 3), (7), (13))]

Overview
  primary / secondary indices
  multilevel (ISAM)
  B – trees
  B+ - trees
  hashing

B+ trees - Motivation

B-tree – print keys in sorted order:

[Figure: tree T0 (root 6, 9; leaves (1, 3), (7), (13))]

B-tree needs back-tracking – how to avoid it?

Solution: B+ - trees

Facilitate sequential ops

They string all leaf nodes together

AND

Replicate keys from non-leaf nodes, to make sure every key appears at the leaf level !!
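A minimal sketch of why chained leaves help sequential operations, assuming the three leaves of the B+-tree shown on the following slide. `Leaf`, `chain` and `range_scan` are hypothetical names; for brevity the scan starts at the first leaf, whereas a real tree would first descend from the root to the leaf containing the lower bound.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys     # sorted search-key values
        self.next = None     # pointer to the next leaf in key order

def chain(leaves):
    """Link the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next = b
    return leaves[0]

def range_scan(start_leaf, lo, hi):
    """Yield every key k with lo <= k <= hi by walking the leaf chain."""
    leaf = start_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return       # keys are sorted, so nothing further can qualify
            if k >= lo:
                yield k
        leaf = leaf.next     # follow the sibling pointer; no back-tracking

first = chain([Leaf([1, 3]), Leaf([6, 7]), Leaf([9, 13])])
```

Because every key appears at the leaf level, `list(range_scan(first, 3, 9))` collects 3, 6, 7, 9 without ever going back up to a non-leaf node.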

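B+ trees

[Figure: B+-tree. Root holds keys 6 and 9; leaves, chained left to right, are (1, 3) for keys <6, (6, 7) for keys >=6 and <9, and (9, 13) for keys >=9]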

B+-Trees (Cont.)

A B+-tree is a rooted tree satisfying the following properties:
  All paths from root to leaf are of the same length
  Each node that is not a root or a leaf has between ⌈n/2⌉ and n children
  A leaf node has between ⌈(n–1)/2⌉ and n–1 values
  Special cases:
    If the root is not a leaf, it has at least 2 children
    If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and (n–1) values

B+-Tree Node Structure

Typical node: P1 | K1 | P2 | … | Pn–1 | Kn–1 | Pn
  Ki are the search-key values
  Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes).
The search-keys in a node are ordered: K1 < K2 < K3 < . . . < Kn–1

Leaf Nodes in B+-Trees - Properties

For i = 1, 2, . . ., n–1, pointer Pi either points to a file record with search-key value Ki, or to a bucket of pointers to file records, each record having search-key value Ki. Only need bucket structure if search-key does not form a primary key.

If Li, Lj are leaf nodes and i < j, Li’s search-key values are less than Lj’s search-key values

Pn points to next leaf node in search-key order

Non-Leaf Nodes in B+-Trees - Properties

Non-leaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with m pointers:
  All the search-keys in the subtree to which P1 points are less than K1
  For 2 ≤ i ≤ m – 1, all the search-keys in the subtree to which Pi points have values greater than or equal to Ki–1 and less than Ki
  All the search-keys in the subtree to which Pm points are greater than or equal to Km–1

B-Tree vs B+-Tree

B-tree (above) and B+-tree (below) on same data

B+ tree insertion

INSERTION OF KEY ’K’
    find the correct leaf node ’L’;
    insert search-key value ’K’ to ’L’ such that the keys are in order;
    if ( ’L’ overflows ) {
        split ’L’;
        insert (i.e., COPY) smallest search-key value of new node to parent node ’P’;
        if ( ’P’ overflows ) {
            repeat the B-tree split procedure recursively;
            /* Notice: the B-TREE split; NOT the B+ -tree */
        }
    }

B+-tree insertion – cont’d

/* ATTENTION:

a split at the LEAF level is handled by COPYING the middle key upstairs;

A split at a higher level is handled by PUSHING the middle key upstairs

*/
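The difference between COPYING and PUSHING can be sketched on plain key lists. This is an illustration only: child and record pointers are left out, and `split_leaf` and `split_internal` are hypothetical names. Each function returns the left half, the key sent to the parent, and the right half.

```python
def split_leaf(keys):
    """Leaf split: the separator is COPIED up and also stays in the right leaf."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right[0], right

def split_internal(keys):
    """Non-leaf split: the middle key is PUSHED up and leaves the node."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

# Inserting 8 in the slides' B+-tree: leaf (6, 7) overflows to (6, 7, 8),
# then the root (6, 9) receives 7 and overflows to (6, 7, 9).
leaf_result = split_leaf([6, 7, 8])
root_result = split_internal([6, 7, 9])
```

The leaf split keeps 7 in the right leaf (7, 8) so that every key still appears at the leaf level, while the non-leaf split removes 7 from both halves, exactly as in a plain B-tree.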

B+ trees - insertion

E.g., insert ‘8’

[Figure, start: root (6, 9); leaves (1, 3), (6, 7), (9, 13)]
[Figure, step 1: 8 goes to leaf (6, 7), which overflows to (6, 7, 8)]
[Figure, step 2: COPY middle upstairs. The leaf splits into (6) and (7, 8), and 7 is copied to the root, which overflows to (6, 7, 9)]
[Figure, step 3: Non-leaf overflow – just PUSH the middle. The root splits into (6) and (9), and 7 moves up]
[Figure, FINAL TREE: root (7); non-leaf children (6) for keys <7 and (9) for keys >=7; leaves (1, 3), (6), (7, 8), (9, 13)]

B-Trees vs B+-Trees

Advantages of B-Tree indices:
  May use fewer tree nodes than a corresponding B+-Tree.
  Sometimes possible to find search-key value before reaching leaf node.
Disadvantages of B-Tree indices:
  Only small fraction of all search-key values are found early
  Non-leaf nodes are larger, so fan-out is reduced. Thus B-Trees typically have greater depth than corresponding B+-Tree
  Insertion and deletion more complicated than in B+-Trees
  Implementation is harder than B+-Trees.
Typically, advantages of B-Trees do not outweigh disadvantages

B*-tree

In B-trees, worst case util. = 50%, if we have just split all the pages
how to increase the utilization of B - trees?
… with B* - trees!

B-trees and B*-trees

E.g., Tree T0; insert ‘2’

[Figure: tree T0 (root 6, 9; leaves (1, 3), (7), (13)); 2 has to go into the full leaf (1, 3)]

B*-trees: deferred split!

Instead of splitting, LEND keys to sibling! (through PARENT, of course!)

[Figure: 2 goes into leaf (1, 3), which overflows to (1, 2, 3); its right sibling (7) has room]
[Figure, FINAL TREE: 3 moves up to the root and 6 moves down to the sibling, giving root (3, 9) and leaves (1, 2) for keys <3, (6, 7) for keys >3 and <9, (13) for keys >9]
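The deferred split can be sketched as a rotation on plain key lists. This assumes the right sibling has room and leaves out child pointers; `lend_to_right` and `sep` (the position in the parent of the key separating the two nodes) are hypothetical names.

```python
def lend_to_right(parent_keys, sep, node, right):
    """Move the largest key of an overflowing node to its right sibling, through the parent."""
    right.insert(0, parent_keys[sep])   # the old separator moves down into the sibling
    parent_keys[sep] = node.pop()       # the overflowing node's largest key moves up

# Insert 2 into T0: leaf (1, 3) overflows to (1, 2, 3); sibling (7) has room.
parent, node, right = [6, 9], [1, 2, 3], [7]
lend_to_right(parent, 0, node, right)
```

The result is parent (3, 9) with leaves (1, 2) and (6, 7): no new node is created, so the tree stays as short as before and its nodes are fuller.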

B*-trees: deferred split!

Notice: shorter, more packed, faster tree

It’s a rare case, where space utilization and speed improve together

BUT: What if the sibling has no room for our ‘lending’?


A: 2-to-3 split: get the keys from the sibling, pool them with ours (and a key from the parent), and split in 3.

Details: too messy (and even worse for deletion)

Conclusions

all B – tree variants can be used for any type of index: primary/secondary, sparse (clustering), or dense (non-clustering)

All have excellent, O(logN) worst-case performance for ins/del/search

It’s the prevailing indexing method

Overview
  ordered indices
    primary / secondary indices
    index-sequential
    multilevel (ISAM)
    B - trees, B+ - trees
  hashing
    static hashing
    dynamic hashing
