Part I: Linkages c: Locked Chains Joseph ORourke Smith College (Many slides made by Erik Demaine)
September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to...
-
Upload
alyson-nichols -
Category
Documents
-
view
227 -
download
2
Transcript of September 26, 2005Copyright © 2001-5 Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to...
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L51
Introduction to Algorithms6046J18401J
Prof Erik Demaine
LECTURE5Sorting Lower Boundsbull Decision treesLinear-Time Sortingbull Counting sortbull Radix sortAppendix Punched cards
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L52
How fast can we sort
All the sorting algorithms we have seen so far are comparison sorts only use comparisons to determine the relative order of elements
bull Eg insertion sort merge sort quicksort heapsort
The best worst-case running time that wersquove seen for comparison sorting is O(nlgn)
Is O(nlgn) the best we can do
Decision trees can help us answer this question
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L53
Decision-tree example
Sort 〈 a1 a2 hellip an〉
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L52
How fast can we sort
All the sorting algorithms we have seen so far are comparison sorts only use comparisons to determine the relative order of elements
bull Eg insertion sort merge sort quicksort heapsort
The best worst-case running time that wersquove seen for comparison sorting is O(nlgn)
Is O(nlgn) the best we can do
Decision trees can help us answer this question
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L53
Decision-tree example
Sort 〈 a1 a2 hellip an〉
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L53
Decision-tree example
Sort 〈 a1 a2 hellip an〉
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L54
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L55
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L56
Decision-tree example
Each internal node is labeled ij for i j 1 2hellip isin nbull The left subtree shows subsequent comparisons if ai le ajbull The right subtree shows subsequent comparisons if ai ge aj
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L57
Decision-tree example
Sort 〈 a1 a2 a3〉= 〈 9 4 6 〉
Each leaf contains a permutation 〈 π(1) π(2)hellip π(n)〉 to indicate that the ordering aπ(1) le aπ(2) le hellip le aπ(n) has been established
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L58
Decision-tree model
A decision tree can model the execution of any comparison sortbull One tree for each input size n bull View the algorithm as splitting whenever it compares two elementsbull The tree contains the comparisons along all possible instruction tracesbull The running time of the algorithm = the length of the path takenbull Worst-case running time = height of tree
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L59
Lower bound for decision-tree sorting
Theorem Any decision tree that can sort n elements must have height Ω(nlgn)
Proof The tree must contain gen leaves since there are n possible permutations A height-h binary tree has le2h leaves Thus n le2h
(lg is mono Increasing)(Stirlingrsquos formula)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L510
Lower bound for comparison sorting
Corollary Heapsort and merge sort are asymptotically optimal comparison sorting algorithms
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L511
Sorting in linear time
Counting sort No comparisons between elements
bull Input A[1 n] where A[j] 1 2 hellip isin kbull Output B[1 n] sortedbull Auxiliary storage C[1 k]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L512
Counting sort
for ilarr1 to k do C[i] larr0
for jlarr1 to n do C[A[j]] larr C[A[j]] + 1 ⊳ C[i] = |key = i|
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
for jlarrn downto1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L513
Counting-sort example
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L514
Loop 1
for ilarr1 to k do C[i] larr0
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L515
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L516
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L517
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L518
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L519
Loop 2
for jlarr1 to n do C[A[j]] larrC[A[j]] + 1 ⊳ C[i] = |key = i|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L520
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L521
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L522
Loop 3
for ilarr2 to k do C[i] larrC[i] + C[indash1] ⊳ C[i] = |key lei|
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L523
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L524
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L525
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L526
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L527
Loop 4
for jlarrn downto 1 do B[C[A[j]]] larrA[j] C[A[j]] larrC[A[j]] ndash1
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L528
Analysisfor
for
for
for
to
to
to
downto
do
do
do
do
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L529
Running time
If k = O(n) then counting sort takes (n) timebull But sorting takes Ω(nlgn) timebull Wherersquos the fallacyAnswerbull Comparison sorting takes Ω(nlgn) timebull Counting sort is not a comparison sortbull In fact not a single comparison between elements occurs
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L530
Stable sorting
Counting sort is a stable sort it preserves the input order among equal elements
Exercise What other sorts have this property
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L531
Radix sort
bull Origin Herman Hollerithrsquos card-sorting machine for the 1890 US Census (See Appendix )
bull Digit-by-digit sortbull Hollerithrsquos original (bad) idea sort on most-significant digit firstbull Good idea Sort on least-significant digit first with auxiliary stable sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L532
Operation of radix sort
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L533
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L534
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L535
Correctness of radix sort
Induction on digit position bull Assume that the numbers are sorted by their low-order t ndash1digitsbull Sort on digit t 1048707 Two numbers that differ in digit t are correctly sorted1048707 Two numbers equal in digit t are put in the same order as the input correct orderrArr
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L536
Analysis of radix sort
bull Assume counting sort is the auxiliary stable sortbull Sort n computer words of b bits eachbull Each word can be viewed as having br base-2r digitsExample 32-bit wordr = 8 rArr br = 4 passes of counting sort on base-28 digits or r = 16 rArr br = 2 passes of counting sort on base-216 digits
How many passes should we make
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L537
Analysis (continued)
Recall Counting sort takes (n + k) time to sort n numbers in the range from 0 to k ndash1If each b-bit word is broken into r-bit pieces each pass of counting sort takes (n + 2r) time Since there are br passes we have
Choose r to minimize T(nb)bull Increasing r means fewer passes but as r ≫ lg n the time grows exponentially
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L538
Choosing r
Minimize T(nb) by differentiating and setting to 0
Or just observe that we donrsquot want 2r ≫ n and therersquos no harm asymptotically in choosing r as large as possible subject to this constraint
Choosing r = lgn implies T(nb) = (bnlgn)
bull For numbers in the range from 0 to ndndash1 we have b =d lg n radix sort runs in (rArr dn) time
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L539
Conclusions
In practice radix sort is fast for large inputs as well as simple to code and maintainExample (32-bit numbers)bull At most 3 passes when sorting ge2000 numbersbull Merge sort and quick sort do at least lg2000 = 11passes
Downside Unlike quicksort radix sort displays little locality of reference and thus a well-tuned quicksort fares better on modern processors which feature steep memory hierarchies
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L540
Appendix Punched-card technology
bull Herman Hollerith (1860-1929)bull Punched cardsbull Hollerithrsquos tabulating systembull Operation of the sorterbull Origin of radix sortbull ldquoModernrdquo IBM cardbull Web resources on punched- card technology
Return to last slide viewed
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L541
Herman Hollerith(1860-1929)
bull The 1880 US Census took almost 10 years to processbull While a lecturer at MIT Hollerith prototyped punched-card technologybull His machines including a ldquocard sorterrdquoallowed the 1890 census total to be reported in 6 weeksbull He founded the Tabulating Machine Company in 1911 which merged with other companies in 1924 to form International Business Machines
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L542
Punched cards
bull Punched card = data recordbull Hole = value bull Algorithm = machine +human operator
Holleriths tabulating system punch cardin Genealogy Article on the Internet
Image removed due to copyright restrictions
Replica of punch card from the 1900 US census [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L543
Hollerithrsquos tabulating systembull Pantograph card punchbull Hand-press readerbull Dial countersbull Sorting box
Image removed due to copyright restrictions
ldquoHollerith Tabulator and Sorter Showing details of the mechanical counter and the tabulator press rdquoFigure from [Howells 2000]
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L544
Operation of the sorter
bull An operator inserts a card into the pressbull Pins on the press reach through the punched holes to make electrical contact with mercury- filled cups beneath the cardbull Whenever a particular digit value is punched the lid of the corresponding sorting bin liftsbull The operator deposits the card into the bin and closes the lidbull When all cards have been processed the front panel is opened and the cards are collected in order yielding one pass of a stable sort
Image removed due to copyright restrictions
Hollerith Tabulator Pantograph Press and Sorter (httpwwwcolumbiaeduacishistorycensus-tabulatorhtml)
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L545
Origin of radix sort
Hollerithrsquos original 1889 patent alludes to a most-significant-digit-first radix sort
ldquoThe most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations then reassorting each group according to the second item entering into the combination and so on and finally counting on a few counters the last item of the combination for each group of cardsrdquo
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L546
ldquoModernrdquo IBM card
bull One character per column
Image removed due to copyright restrictions
To view image visit httpwwwmuseumwaalsdorpnlcomputerimagesibmcardjpg
Produced by the WWW Virtual Punch-Card Server
So thatrsquos why text windows have 80 columns
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-
September 26 2005 Copyright copy 2001-5 Erik D Demaine and Charles E Leiserson L547
Web resources on punched-card technology
bull Doug Jonesrsquos punched card indexbull Biography of Herman Hollerithbull The 1890 US Censusbull Early history of IBMbull Pictures of Hollerithrsquos inventionsbull Hollerithrsquos patent application (borrowed from Gordon Bellrsquos CyberMuseum)bull Impact of punched cards on US history
- Slide 1
- Slide 2
- Slide 3
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Slide 8
- Slide 9
- Slide 10
- Slide 11
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Slide 21
- Slide 22
- Slide 23
- Slide 24
- Slide 25
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Slide 33
- Slide 34
- Slide 35
- Slide 36
- Slide 37
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Slide 45
- Slide 46
- Slide 47
-