September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to...

47
September 26, 2005 Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5. 1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demain e LECTURE5 Sorting Lower Bounds Decision trees Linear-Time Sorting Counting sort Radix sort Appendix: Punched cards

Transcript of September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to...

Page 1: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L51

Introduction to Algorithms6046J18401J

Prof Erik Demaine

LECTURE5Sorting Lower Boundsbull Decision treesLinear-Time Sortingbull Counting sortbull Radix sortAppendix Punched cards

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L52

How fast can we sort

All the sorting algorithms we have seen so far are comparison sorts only use comparisons to determine the relative order of elements

bull Eg insertion sort merge sort quicksort heapsort

The best worst-case running time that wersquove seen for comparison sorting is O(nlgn)

Is O(nlgn) the best we can do

Decision trees can help us answer this question

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L53

Decision-tree example

Sort 〈 a1 a2 hellip an〉

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 2: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L52

How fast can we sort

All the sorting algorithms we have seen so far are comparison sorts only use comparisons to determine the relative order of elements

bull Eg insertion sort merge sort quicksort heapsort

The best worst-case running time that wersquove seen for comparison sorting is O(nlgn)

Is O(nlgn) the best we can do

Decision trees can help us answer this question

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L53

Decision-tree example

Sort 〈 a1 a2 hellip an〉

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 3: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L53

Decision-tree example

Sort 〈 a1 a2 hellip an〉

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 4: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 5: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 6: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56

Decision-tree example

Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 7: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57

Decision-tree example

Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉

Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 8: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58

Decision-tree model

A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 9: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59

Lower bound for decision-tree sorting

Theorem Any decision tree that can sort n elements must have height Ω(nlgn)

Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h

(lg is mono Increasing)(Stirlingrsquos formula)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 10: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510

Lower bound for comparison sorting

Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 11: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511

Sorting in linear time

Counting sort No comparisons between elements

bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 12: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512

Counting sort

for ilarr1 to k do C[i] larr0

for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 13: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513

Counting-sort example

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 14: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514

Loop 1

for ilarr1 to k do C[i] larr0

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 15: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 16: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 17: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 18: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 19: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519

Loop 2

for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 20: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 21: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 22: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522

Loop 3

for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 23: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 24: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 25: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 26: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 27: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527

Loop 4

for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 28: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528

Analysisfor

for

for

for

to

to

to

downto

do

do

do

do

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 29: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529

Running time

If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 30: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530

Stable sorting

Counting sort is a stable sort it preserves the input order among equal elements

Exercise What other sorts have this property

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 31: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531

Radix sort

bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )

bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 32: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532

Operation of radix sort

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 33: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 34: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 35: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535

Correctness of radix sort

Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 36: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536

Analysis of radix sort

bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits

How many passes should we make

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 37: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537

Analysis (continued)

Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have

Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 38: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538

Choosing r

Minimize T(nb) by differentiating and setting to 0

Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint

Choosing r = lgn implies T(nb) = (bnlgn)

bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 39: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539

Conclusions

In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes

Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 40: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540

Appendix Punched-card technology

bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology

Return to last slide viewed

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 41: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541

Herman Hollerith(1860-1929)

bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 42: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542

Punched cards

bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator

Holleriths tabulating system punch cardin Genealogy Article on the Internet

Image removed due to copyright restrictions

Replica of punch card from the 1900 US census [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 43: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543

Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box

Image removed due to copyright restrictions

ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 44: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544

Operation of the sorter

bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort

Image removed due to copyright restrictions

Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 45: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545

Origin of radix sort

Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort

ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo

Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 46: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546

ldquoModernrdquo IBM card

bull One character per column

Image removed due to copyright restrictions

To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg

Produced by the WWW Virtual Punch-Card Server

So thatrsquos why text windows have 80 columns

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47
Page 47: September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.

September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547

Web resources on punched-card technology

bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history

  • Slide 1
  • Slide 2
  • Slide 3
  • Slide 4
  • Slide 5
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Slide 45
  • Slide 46
  • Slide 47