Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A...

30
Module 6: Tries CS 240 - Data Structures and Data Management Sajed Haque Veronika Irvine Taylor Smith Based on lecture notes by many previous cs240 instructors David R. Cheriton School of Computer Science, University of Waterloo Spring 2017 Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 1 / 28

Transcript of Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A...

Page 1: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Module 6: Tries

CS 240 - Data Structures and Data Management

Sajed Haque Veronika Irvine Taylor SmithBased on lecture notes by many previous cs240 instructors

David R. Cheriton School of Computer Science, University of Waterloo

Spring 2017

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 1 / 28

Page 2: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries

Trie (Radix Tree): A dictionary for binary stringsI Comes from retrieval, but pronounced “try”I A binary tree based on bitwise comparisonsI Similar to radix sort: use individual bits, not the whole key

Structure of trie:I A left child corresponds to a 0 bitI A right child corresponds to a 1 bit

Keys can have different number of bits

Keys are not stored in the trie: a node x is flagged if the path fromroot to x is a binary string present in the dictionary

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 2 / 28

Page 3: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries

Example: A trie forS = {00, 0001, 01001, 011, 01101, 01111, 110, 1101, 111}

0

0

1

1

10

1

0 1

0 0 1

1 1 1

01001

0

1

0001

1

00

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 3 / 28

Page 4: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Search

Search(x):

start from the root

take the left link if the current bit in x is 0 and take the right link if itis 1; return failure if the link is missing

if there are no extra bits in x left and the current node is flagged then- success (x is found)

recurse

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 4 / 28

Page 5: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Search

Example: Search(011)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 5 / 28

Page 6: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Search

Example: Search(0101)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 5 / 28

Page 7: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Search

Example: Search(1101)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 5 / 28

Page 8: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Insert

Insert(x)I Search for x , and suppose we finish at a node v

Note: x may have extra bits.I Expand the trie from the node v by adding necessary nodes that

correspond to extra bits of x ; flag the last one.

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 6 / 28

Page 9: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Insert

Example: Insert(101)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 7 / 28

Page 10: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Insert

Example: Insert(0100)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

0

1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 8 / 28

Page 11: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Insert

Example: Insert(11101)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

0

1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 9 / 28

Page 12: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Delete

Delete(x)I Search for xI if x found at an internal flagged node, then unflag the nodeI if x found at a leaf vx , delete the leaf and all ancestors of vx until

F we reach an ancestor that has two children orF we reach a flagged node

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 10 / 28

Page 13: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Delete

Example: Delete(011)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 11 / 28

Page 14: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Delete

Example: Delete(0001)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

0

1 1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 12 / 28

Page 15: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Delete

Example: Delete(01001)

0

0

1

1

10

1

0 1

0 0 1

1 1 1

1

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 13 / 28

Page 16: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Tries: Operations

Search(x)

Insert(x)

Delete(x)

Time Complexity of all operations: Θ(|x |)|x |: length of binary string x , i.e., the number of bits in x

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 14 / 28

Page 17: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries (Patricia Tries)

Patricia: Practical Algorithm To Retrieve Information Coded inAlphanumeric

Introduced by Morrison (1968)

Reduces storage requirement: eliminate unflagged nodes with onlyone child

Every path of one-child unflagged nodes is compressed to a singleedge

Each node stores an index indicating the next bit to be tested duringa search (index= 0 for the first bit, index= 1 for the second bit, etc)

A compressed trie storing n keys always has at most n − 1 internal(non-leaf) nodes

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 15 / 28

Page 18: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries (Patricia Tries)

Each node stores an index indicating the next bit to be tested duringa search

Example: A trie and the equivalent compressed trie

0

0

1

1

10

1

0 1

0 0 1

1 1 1

01001

0

1

0001

1

00

0111101101

1101

111

0

0

1

1

10

0 1

0 1

10

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3 011

,3 110 111

1101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 16 / 28

Page 19: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Search(x):I Follow the proper path from the root down in the tree to a leafI If search ends in an unflagged node, it is unsuccessfulI If search ends in a flagged node, we need to check if the key stored is

indeed x

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 17 / 28

Page 20: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Example: Search(01001)

0

0

1

1

10

0 1

0 1

10

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3 011

,3 110 111

1101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 18 / 28

Page 21: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Example: Search(11)

0

0

1

1

10

0 1

0 1

10

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3 011

,3 110 111

1101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 19 / 28

Page 22: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Example: Search(101)

0

0

1

1

10

0 1

0 1

10

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3 011

,3 110 111

1101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 20 / 28

Page 23: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Delete(x):I Perform Search(x)I if search ends in an internal node, then

F if the node has two children, then unflag the node and delete the keyF else delete the node and make his only child, the child of its parent

I if search ends in a leaf, then delete the leaf andI if its parent is unflagged, then delete the parent

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 21 / 28

Page 24: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Example: Delete(110)

0

0

1

1

10

0 1

0 1

10

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3 011

,3 110 111

1101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 22 / 28

Page 25: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Example: Delete(011)

0

0

1

1

10

0 1

0 1

0

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3 011

1111101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 23 / 28

Page 26: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Example: Delete(01101)

0

0

1

1

10

0 1

0 1

0

0 , −

1 , −

2 , −

2 , −

2 , 00

0001 01001 ,3

1111101

01101 01111

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 24 / 28

Page 27: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Compressed Tries: Operations

Insert(x):I Perform Search(x)I If the search ends at a leaf L with key y , compare x against y .I If y is a prefix of x , add a child to y containing x .I Else, determine the first index i where they disagree and create a new

node N with index i .Insert N along the path from the root to L so that the parent of N hasindex < i and one child of N is either L or an existing node on the pathfrom the root to L that has index > i .The other child of N will be a new leaf node containing x .

I If the search ends at an internal node, we find the key corresponding tothat internal node and proceed in a similar way to the previous case.

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 25 / 28

Page 28: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Multiway Tries

To represent Strings over any fixed alphabet Σ

Any node will have at most |Σ| children

Example: A trie holding strings {bear, bell, ben, soul, soup}

bear bell

ben

soul soup

b s

oe

a l nu

r l l p

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 26 / 28

Page 29: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Multiway Tries

Append a special end-of-word character, say $, to all keys

Example: A trie holding strings {bear, bell, be, so, soul, soup}

bear bell

be

soul soup

b s

oe

a l $ u

$ $

l pso

$

rl

$ $

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 27 / 28

Page 30: Module 6: Triesresearch.cs.queensu.ca/home/tsmith/lec/S17CS240/... · Tries Trie (Radix Tree): A dictionary for binary strings I Comes from retrieval, but pronounced \try" I A binary

Multiway Tries

Compressed multi-way tries

Example: A compressed trie holding strings {bear, bell, be, so, soul,soup}

0

2

bear bell be

soul

3

soup

so

b s

a l$ u

$

l p

2

Haque, Irvine, Smith (SCS, UW) CS240 - Module 6 Spring 2017 28 / 28