Indexing By: Arnold Mesa


Indexing

You can think of an index to a file as being like a catalogue to a library.

There are two kinds of indices:

Ordered Indices - sorted ordering of the values.

Hash Indices - a uniform distribution of values across a range of buckets. The distribution is based on a hash function.
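The difference between the two kinds can be sketched in a few lines of Python. This is only an illustration: the keys are the hotel names from the sample data on the later slides, and `NUM_BUCKETS` and the toy hash function `bucket_of` are assumptions made up for the example, not anything the slides prescribe.

```python
from bisect import bisect_left, bisect_right

# Search-key values borrowed from the slides' sample data.
keys = ["Hotel Sofitel", "Hilton", "Westin", "Marriot", "The Ritz"]

# Ordered index: keys are kept in sorted order and found by binary search.
ordered_index = sorted(keys)

def ordered_lookup(key):
    """Return the position of key in the ordered index, or None if absent."""
    pos = bisect_left(ordered_index, key)
    if pos < len(ordered_index) and ordered_index[pos] == key:
        return pos
    return None

def ordered_range(low, high):
    """Return every key with low <= key <= high (easy because keys are sorted)."""
    return ordered_index[bisect_left(ordered_index, low):bisect_right(ordered_index, high)]

# Hash index: a hash function spreads the keys across a fixed set of buckets.
NUM_BUCKETS = 4  # hypothetical bucket count

def bucket_of(key):
    """Toy deterministic hash function (sum of character codes), for illustration only."""
    return sum(ord(ch) for ch in key) % NUM_BUCKETS

hash_index = [[] for _ in range(NUM_BUCKETS)]
for k in keys:
    hash_index[bucket_of(k)].append(k)

def hash_lookup(key):
    """Return True if key is stored in the bucket its hash value selects."""
    return key in hash_index[bucket_of(key)]
```

The ordered index can answer range queries such as `ordered_range("I", "U")` directly, while the hash index only supports exact-match lookups, since neighbouring keys land in unrelated buckets.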


Key Concepts

Access Types - types of access that are supported efficiently

Access Time - time it takes to access a particular data item

Insertion Time - time it takes to insert a data item

Deletion Time - time it takes to delete a data item

Space Overhead - additional space occupied by an index structure


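There are two kinds of ordered indices:

– Dense Index - An index record appears for every search-key value in the file. The index record contains the search-key value and a pointer to the first data record with that value. The remaining records with the same search-key value are stored sequentially after the first record.

– Sparse Index - An index record appears for only some of the search-key values, so there are fewer index records. Each index record contains a search-key value and a pointer to the first data record with that value, as with the dense index. To locate a record, find the index record with the largest search-key value that is less than or equal to the one you want, then scan the file sequentially from the record it points to. This works only when the file is stored in search-key order.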
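As a rough sketch of the two definitions, the Python below builds both kinds of index over the sample records used on the following slides. Some assumptions to note: the file is sorted alphabetically by hotel name here (the slides list the hotels in a different order), a "pointer" is just a list position, and the names `build_dense_index`, `build_sparse_index` and the `every` parameter are invented for this example.

```python
# Sample records (number, hotel, code) as they appear on the slides.
RECORDS = [
    (234, "Hotel Sofitel", "A-212"), (321, "Hilton", "B-321"),
    (389, "Hilton", "C-002"), (396, "Hilton", "A-322"),
    (112, "Westin", "C-034"), (253, "Westin", "B-219"),
    (501, "Marriot", "B-069"), (532, "Marriot", "C-304"),
    (221, "The Ritz", "A-007"),
]

# Assumption: the data file is kept sorted on the search key (hotel name).
data_file = sorted(RECORDS, key=lambda r: r[1])

def build_dense_index(rows):
    """One entry per search-key value, pointing at the first record with that value."""
    index = {}
    for pos, (_, hotel, _) in enumerate(rows):
        index.setdefault(hotel, pos)  # keep only the first position seen
    return index

def build_sparse_index(rows, every=2):
    """Keep only every `every`-th search-key value of the dense index."""
    return list(build_dense_index(rows).items())[::every]

def dense_lookup(rows, index, hotel):
    """Follow the pointer, then read the run of records sharing the key."""
    pos = index.get(hotel)
    found = []
    while pos is not None and pos < len(rows) and rows[pos][1] == hotel:
        found.append(rows[pos])
        pos += 1
    return found

dense_index = build_dense_index(data_file)
sparse_index = build_sparse_index(data_file)
```

With this data the dense index has five entries and the sparse index three, which is the space saving the sparse variant trades against a longer sequential scan.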

Dense Index

Index entries: Hotel Sofitel, Hilton, Westin, Marriot, The Ritz

Data file:

234  Hotel Sofitel  A-212
321  Hilton         B-321
389  Hilton         C-002
396  Hilton         A-322
112  Westin         C-034
253  Westin         B-219
501  Marriot        B-069
532  Marriot        C-304
221  The Ritz       A-007

Sparse Index

Index entries: Hotel Sofitel, Westin, The Ritz

Data file:

234  Hotel Sofitel  A-212
321  Hilton         B-321
389  Hilton         C-002
396  Hilton         A-322
112  Westin         C-034
253  Westin         B-219
501  Marriot        B-069
532  Marriot        C-304
221  The Ritz       A-007

Suppose we want to find the Marriot #532...

The sparse index holds only Hotel Sofitel, Westin and The Ritz, so Marriot has no entry of its own. We take the last index entry that comes before Marriot in the file's order (Westin), follow its pointer to record 112, and scan forward through 253 and 501 until we reach 532 Marriot C-304.
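A sparse-index lookup like this one might be sketched as follows. The sketch makes several assumptions that are not in the slides: the file is sorted alphabetically by hotel name so that keys can be compared, the sparse index keeps every third hotel (so its entries differ from the ones on the slide), and the function names are invented for the example.

```python
from bisect import bisect_right

# Sample records (number, hotel, code) as they appear on the slides.
RECORDS = [
    (234, "Hotel Sofitel", "A-212"), (321, "Hilton", "B-321"),
    (389, "Hilton", "C-002"), (396, "Hilton", "A-322"),
    (112, "Westin", "C-034"), (253, "Westin", "B-219"),
    (501, "Marriot", "B-069"), (532, "Marriot", "C-304"),
    (221, "The Ritz", "A-007"),
]

# Assumption: the data file is kept sorted on the search key (hotel name).
data_file = sorted(RECORDS, key=lambda r: r[1])

def build_sparse_index(rows, every):
    """Index every `every`-th distinct hotel with the position of its first record."""
    first_pos = {}
    for pos, row in enumerate(rows):
        first_pos.setdefault(row[1], pos)
    return list(first_pos.items())[::every]

def sparse_lookup(rows, index, hotel, number):
    """Find the record for (hotel, number) using a sparse index plus a sequential scan."""
    keys = [key for key, _ in index]
    slot = bisect_right(keys, hotel) - 1  # largest indexed key <= hotel
    if slot < 0:
        return None  # hotel sorts before everything in the file
    pos = index[slot][1]
    # Scan forward from the pointed-to record until we pass the wanted hotel.
    while pos < len(rows) and rows[pos][1] <= hotel:
        if rows[pos][1] == hotel and rows[pos][0] == number:
            return rows[pos]
        pos += 1
    return None

sparse_index = build_sparse_index(data_file, every=3)
result = sparse_lookup(data_file, sparse_index, "Marriot", 532)
```

Here Marriot is not in the index, so the search starts at the Hilton entry and scans forward until it reaches record 532.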

Efficiency Issues

Even if we use a sparse index, the index itself may become too large for efficient processing.

If an index is small enough to be kept in main memory, the search time is low.

If the index is so large that it must be kept on disk, a search may require several disk block reads.
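To get a feel for "several disk block reads", here is a back-of-the-envelope calculation in Python. The figures (10,000 index entries, 100 entries per block) are hypothetical, and the sketch assumes the on-disk index is searched by binary search over its blocks, reading one block per probe.

```python
import math

def index_blocks(num_entries, entries_per_block):
    """Number of disk blocks needed to hold the index."""
    return math.ceil(num_entries / entries_per_block)

def binary_search_block_reads(num_blocks):
    """Worst-case probes of a binary search over num_blocks blocks: floor(log2(b)) + 1."""
    return num_blocks.bit_length()

# Hypothetical figures, chosen only for illustration.
blocks = index_blocks(10_000, 100)
reads = binary_search_block_reads(blocks)
```

Under these assumptions the index occupies 100 blocks and a lookup can cost up to 7 block reads, each far slower than a probe of an index held in main memory.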

How to deal with a large index

When the primary index is too large, we can treat it like any other sequential file and construct a sparse index on top of it.

Data file:

234  Hotel Sofitel  A-212
321  Hilton         B-321
389  Hilton         C-002
396  Hilton         A-322
112  Westin         C-034
253  Westin         B-219
501  Marriot        B-069
532  Marriot        C-304
221  The Ritz       A-007

Primary index: Hotel Sofitel, Hilton, Westin, Marriot, The Ritz

Sparse index on the primary index: Hotel Sofitel, Marriot
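A two-level arrangement of this kind can be sketched in Python as below. As before, the file is assumed to be sorted alphabetically by hotel name (unlike the slide), pointers are list positions, and `INNER_BLOCK` (how many primary-index entries fit in one block) along with the function names are assumptions made for the example.

```python
from bisect import bisect_right

# Sample records (number, hotel, code) as they appear on the slides.
RECORDS = [
    (234, "Hotel Sofitel", "A-212"), (321, "Hilton", "B-321"),
    (389, "Hilton", "C-002"), (396, "Hilton", "A-322"),
    (112, "Westin", "C-034"), (253, "Westin", "B-219"),
    (501, "Marriot", "B-069"), (532, "Marriot", "C-304"),
    (221, "The Ritz", "A-007"),
]
data_file = sorted(RECORDS, key=lambda r: r[1])  # assumption: sorted on hotel name

INNER_BLOCK = 2  # hypothetical: primary-index entries per block

def build_inner(rows):
    """Primary (dense) index: one (hotel, first position) entry per hotel."""
    first_pos = {}
    for pos, row in enumerate(rows):
        first_pos.setdefault(row[1], pos)
    return list(first_pos.items())

def build_outer(inner, block):
    """Sparse index on the primary index: one entry per block of primary entries."""
    return [(inner[i][0], i) for i in range(0, len(inner), block)]

def multilevel_lookup(rows, inner, outer, hotel, block):
    """Search the outer index, then one block of the inner index, then the file."""
    slot = bisect_right([key for key, _ in outer], hotel) - 1
    if slot < 0:
        return []
    start = outer[slot][1]
    for key, pos in inner[start:start + block]:
        if key == hotel:
            found = []
            while pos < len(rows) and rows[pos][1] == hotel:
                found.append(rows[pos])
                pos += 1
            return found
    return []

inner_index = build_inner(data_file)
outer_index = build_outer(inner_index, INNER_BLOCK)
```

Only the small outer index needs to be searched in full; each lookup then touches a single block of the primary index before going to the data file.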

Is this looking familiar? Remember B+-trees:

– A B+-tree is said to be of order m, where m is a number of the designer's choosing.

– Each internal node other than the root has between ⌈m/2⌉ and m children.

– All data is stored at the leaf level.

– All leaves are at the same depth.
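To make these properties concrete, here is a minimal lookup over a small hand-built B+-tree in Python. It is a sketch under stated assumptions: the tree has order 3, it is keyed on the record numbers from the sample data (the slides index on hotel name), it is built by hand rather than by insertion, and the `Leaf` and `Internal` classes are invented for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class Leaf:
    keys: list    # search keys stored in this leaf
    values: list  # data for each key (all data lives at the leaf level)

@dataclass
class Internal:
    keys: list      # separator keys
    children: list  # always len(keys) + 1 children

def search(node, key):
    """Descend from the root to a leaf, then look for the key in that leaf."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    for k, v in zip(node.keys, node.values):
        if k == key:
            return v
    return None

def leaf_depths(node, depth=0):
    """Return the set of depths at which leaves occur (one value if balanced)."""
    if isinstance(node, Leaf):
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths

# Hand-built tree of order m = 3: internal nodes have 2 or 3 children.
left = Internal([234, 321], [
    Leaf([112, 221], ["C-034", "A-007"]),
    Leaf([234, 253], ["A-212", "B-219"]),
    Leaf([321, 389], ["B-321", "C-002"]),
])
right = Internal([532], [
    Leaf([396, 501], ["A-322", "B-069"]),
    Leaf([532], ["C-304"]),
])
root = Internal([396], [left, right])
```

Calling `search(root, 532)` walks root, then the right internal node, then the last leaf, and `leaf_depths(root)` confirms that every leaf sits at the same depth.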


Example?