O'Connor -- Fisher-Yates & Durstenfeld Shuffle Algorithms: History



Durstenfeld's Random Permutation algorithm (1964) is shown to be the first optimal random permutation generator to be published. It is not merely a computer version of the older Fisher-Yates Shuffle algorithm (1938), which is not optimal.


A Historical Note on the

Fisher-Yates and Durstenfeld Shuffle Algorithms

Derek O’Connor

September 16, 2011*

1 Algorithms for Random Permutations

Algorithms and programs for generating random permutations have been around since the start of the computer age. In fact Knuth has tracked down the first known algorithm to R. A. Fisher and F. Yates, Statistical Tables, (London, 1938), Example 12.¹ There appear to be just two types of random permutation algorithms. These are based on: (1) Random transpositions, and (2) Sorting a random vector.

Given a set of n elements $S = \{s_1, s_2, \ldots, s_n\}$, the production of random permutations of S may be viewed in different ways:

1. Take a sample of size n without replacement from the set of n elements S.

2. Take a sample of size 1 from the n! permutations of S.

3. Build a permutation randomly step-by-step, by in-place transformations.

All three views are useful and should be kept in mind.

Conceptually, the first method is the simplest: take a random sample of size n without replacement from a set of n elements $S = \{s_1, s_2, \ldots, s_n\}$.

* Started: 3 Sept 2011. Web: http://www.derekroconnor.net  email: derekroconnor@eircom.net

¹ Available here: http://digital.library.adelaide.edu.au/coll/special/fisher/


Algorithm RandPerm(S) → π
    Generates a random permutation of the elements of the set S = {s_1, s_2, ..., s_n}
    for k := 1 to n do
        Choose an element r_k at random from S
        S := S − {r_k}
        π[k] := r_k
    endfor k
    return π
endalg RandPerm                                                    (1.1)

The element chosen at random at the kth stage is $r_k$. The statement $S := S - \{r_k\}$ ensures that no element is chosen more than once. Notice that all the elements of S will be chosen and put into the array π. The array is needed to preserve the random order in which the elements were drawn. Thus Algorithm 1.1 returns π, which contains a random permutation of S.
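Algorithm 1.1 can be sketched in Python as follows. This is a minimal illustration, not code from the original note: the function name `rand_perm` and the optional `rng` parameter are assumptions made here so that a seeded generator can be supplied.

```python
import random

def rand_perm(s, rng=random):
    """Return a random permutation of the elements of s (sketch of Algorithm 1.1)."""
    remaining = list(s)          # working copy of S, so the caller's data is untouched
    pi = []                      # records the order in which the elements are drawn
    for _ in range(len(remaining)):
        j = rng.randrange(len(remaining))   # choose r_k at random from what is left of S
        pi.append(remaining.pop(j))         # S := S - {r_k};  pi[k] := r_k
    return pi

print(rand_perm(["a", "b", "c", "d", "e"]))   # order varies from run to run
```

Removing an element from the middle of a Python list takes time proportional to the list length, so this direct transcription does quadratic work in total.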

2 The Fisher-Yates Shuffle Algorithm

This is similar to sampling without replacement and is such an important algorithm that it is worth examining the original paper-and-pencil version (1938).²

This is the original Fisher-Yates algorithm in a slightly formal algorithmic style:

Algorithm Fisher-Yates → P
    Write out a horizontal list of numbers L = {1, 2, ..., n}.
    Write out underneath L an empty list P = {}.
    Consider each number to be unticked.
    while there are k > 0 unticked numbers remaining do
        Choose a random number r between 1 and k inclusive.
        Starting from the left, strike out the rth unticked number.
        Add this number to the end of a second list P = {n_1, n_2, ..., n_{k−1}}
    endwhile

² There is an excellent discussion of the Fisher-Yates shuffle in Wikipedia here: http://en.wikipedia.org/wiki/Fisher-Yates_shuffle

© DEREK O’CONNOR, SEPTEMBER 16, 2011


Table 1. ORIGINAL FISHER-YATES SHUFFLE

 k  r_k |  List L                |  List P
        |  1   2   3   4   5   6 |  1  2  3  4  5  6
 6   5  |  1   2   3   4   5*  6 |  5
 5   1  |  1*  2   3   4   X   6 |  5  1
 4   3  |  X   2   3   4*  X   6 |  5  1  4
 3   2  |  X   2   3*  X   X   6 |  5  1  4  3
 2   1  |  X   2*  X   X   X   6 |  5  1  4  3  2
 1      |  X   X   X   X   X   6*|  5  1  4  3  2  6

Once the list of numbers P = {5, 1, 4, 3, 2, 6} has been obtained then it can be used to shuffle any other indexed list. For example $\{a_1, a_2, a_3, a_4, a_5, a_6\}$ becomes $\{a_5, a_1, a_4, a_3, a_2, a_6\}$ when shuffled by the permutation P.
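The pencil-and-paper procedure and the use of P to rearrange another list can be traced with a short Python sketch. The names `fisher_yates_original`, `apply_permutation`, and the `choose` callback are illustrative assumptions, not part of the original note. The random choices are passed in so that the r_k column of Table 1 can be replayed; the table leaves r blank for k = 1, where the only possible value is 1.

```python
def fisher_yates_original(n, choose):
    """Pencil-and-paper method: choose(k) must return an integer in 1..k."""
    ticked = [False] * (n + 1)        # index 0 unused; the numbers are 1..n
    p = []
    for k in range(n, 0, -1):         # k unticked numbers remain
        r = choose(k)
        count = 0
        i = 0
        while count < r:              # scan from the left for the r-th unticked number
            i += 1
            if not ticked[i]:
                count += 1
        ticked[i] = True              # strike it out
        p.append(i)                   # add it to the end of P
    return p

def apply_permutation(p, items):
    """Position j of the result receives items[p[j] - 1] (p holds 1-based indices)."""
    return [items[i - 1] for i in p]

draws = iter([5, 1, 3, 2, 1, 1])      # r_k values from Table 1, with r = 1 for k = 1
p = fisher_yates_original(6, lambda k: next(draws))
print(p)                                                           # [5, 1, 4, 3, 2, 6]
print(apply_permutation(p, ["a1", "a2", "a3", "a4", "a5", "a6"]))  # ['a5', 'a1', 'a4', 'a3', 'a2', 'a6']
```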

This is a MATLAB version of the Fisher-Yates Shuffle algorithm.

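function P = FYSorig(n)
% Original Fisher-Yates shuffle: returns a random permutation of 1:n.
P = zeros(1,n);
ticked(1:n) = false;
for k = n:-1:1
    r = ceil(rand*k);            % Random integer between 1 and k
    nu = 0; i = 0;
    while nu < r                 % Find r-th unticked number
        i = i+1;
        if ~ticked(i)
            nu = nu+1;
        end
    end
                                 % Found r-th unticked at i
    P(k) = i;                    % Add i to P-list (P is filled from the right)
    ticked(i) = true;            % tick it
end % for k

(2.1)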

Analysis of the Original Fisher-Yates Algorithm. The for k-loop statement is clearly performed n times. On the kth pass the nested while loop scans from the left until it reaches the chosen unticked number $i_k \in \{1, 2, \ldots, n\}$, so the while statement is performed $i_k$ times on that pass. Each number is ticked exactly once, so the values $i_k$ are a rearrangement of $1, 2, \ldots, n$. Hence the total number of times the while statement is performed is

$$\sum_{k=1}^{n} i_k = \sum_{k=1}^{n} k = n(n+1)/2.$$

Hence the complexity of the original Fisher-Yates algorithm is $O(n^2)$, and it consumes n random numbers.
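The count n(n + 1)/2 does not depend on which random numbers are drawn, and this can be checked empirically. The sketch below is an illustration only: `scan_steps` is a hypothetical helper that repeats the scanning loop of the original method and counts how many times the body of the while loop runs.

```python
import random

def scan_steps(n, rng):
    """Run the original Fisher-Yates method and count executions of the while body."""
    ticked = [False] * (n + 1)        # index 0 unused; the numbers are 1..n
    steps = 0
    for k in range(n, 0, -1):
        r = rng.randint(1, k)         # random integer in 1..k inclusive
        count = 0
        i = 0
        while count < r:
            i += 1
            steps += 1                # one more pass through the while body
            if not ticked[i]:
                count += 1
        ticked[i] = True
    return steps

n = 100
for seed in range(3):
    # Both numbers on each line are 5050, whatever the seed.
    print(scan_steps(n, random.Random(seed)), n * (n + 1) // 2)
```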


3 Durstenfeld’s Shuffle Algorithm

The first O(n) shuffle or random permutation generator was written by Richard Durstenfeld in 1964.³

Figure 1 shows graphically how Durstenfeld’s algorithm works. The algorithm may be succinctly stated as:

for k := n, n − 1, ..., 2 do
    Choose at random a number r in [1, k]
    Interchange p[r] and p[k]                                      (3.1)

The most succinct statement was given by Reingold, et al.⁴

for k := n downto 2 do $\pi_k \leftrightarrow \pi_{\mathrm{rand}(1,k)}$,                    (3.2)

where π is a permutation of length n, and rand(i, j) returns a random integer from the set $\{i, i+1, \ldots, j-1, j\}$. The random transposition is performed by $\pi_i \leftrightarrow \pi_{\mathrm{rand}(1,i)}$.

This is a truly elegant algorithm: it is tiny, uses no extra space, and is optimal. Also, it can shuffle any array in place, not just permutations of the integers {1, 2, ..., n}.

This algorithm is one of a family of transposition algorithms for generating combinatorial objects. Given an initial permutation π(1 : n), a sequence of n − 1 random transpositions is performed on it to shuffle the elements of π. We know that the algorithm must produce a permutation because it starts with a permutation and performs nothing but transpositions on it.
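Because the algorithm only ever swaps two entries, it applies to any array, not just to the integers 1 to n. A minimal Python sketch follows, using 0-based indices; the function name `durstenfeld_shuffle` and the `rng` parameter are assumptions made for this illustration.

```python
import random

def durstenfeld_shuffle(a, rng=random):
    """Shuffle the list a in place with n - 1 random transpositions."""
    for k in range(len(a) - 1, 0, -1):     # 0-based k = n-1, n-2, ..., 1
        r = rng.randint(0, k)              # random index in 0..k inclusive
        a[r], a[k] = a[k], a[r]            # transpose; r == k leaves a unchanged

deck = ["ace", "king", "queen", "jack", "ten"]
durstenfeld_shuffle(deck)
print(deck)                                # the same five cards, order varies
```

The only extra storage is the pair of indices k and r, which matches the claim that no extra space is needed apart from a few scalars.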

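[Figure 1: snapshots of the array p, indexed 1 to 10, with the current positions r and k marked. Initial array: 1 2 3 4 5 6 7 8 9 10 (k = n). After the first swap: 1 2 10 4 5 6 7 8 9 3. After the second swap: 1 2 10 4 5 9 7 8 6 3. The part of the array beyond position k is labelled "shuffled".]

Figure 1. DURSTENFELD’S SHUFFLE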

³ Richard Durstenfeld, Algorithm 235, Random Permutation, Communications of the ACM, Vol. 7, July 1964, page 420.

⁴ Edward M. Reingold, Jurg Nievergelt, and Narsingh Deo, Combinatorial Algorithms: Theory and Practice, Prentice-Hall, 1977, page 177.


GRPdur below is a MATLAB implementation of the Durstenfeld Shuffle. Table 2 shows the operation of the algorithm.

function p = GRPdur(n)
% Durstenfeld shuffle: returns a random permutation of 1:n.
p = 1:n;                         % Identity permutation
for k = n:-1:2
    r = 1 + floor(k*rand);       % Random integer between 1 and k
    t = p(r);
    p(r) = p(k);                 % Swap(p(r),p(k))
    p(k) = t;
end;
return;                          % GRPdur

(3.3)

Table 2. DURSTENFELD SHUFFLE

 k  r_k   π                          Swap
 8   7    1  2  3  4  5  6  7* 8     7,8
 7   7    1  2  3  4  5  6  8*|7     8,8
 6   1    1* 2  3  4  5  6 |8  7     1,6
 5   5    6  2  3  4  5*|1  8  7     5,5
 4   3    6  2  3* 4 |5  1  8  7     3,4
 3   1    6* 2  4 |3  5  1  8  7     6,4
 2   1    4* 2 |6  3  5  1  8  7     4,2
          2  4  6  3  5  1  8  7

Analysis of Durstenfeld’s Shuffle. The algorithm has a single for loop and the two statements in this loop are performed n − 1 times. Hence the complexity is O(n) and it uses n − 1 random numbers. This is an optimal permutation generator because just to read a permutation requires O(n) steps.
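The trace in Table 2 and the count of n − 1 random numbers can be replayed with a small Python sketch. The function `durstenfeld_with_draws` is a hypothetical helper introduced here: it takes the random values as an argument instead of generating them, and it uses the 1-based values of r from the table.

```python
def durstenfeld_with_draws(n, draws):
    """Apply the shuffle to 1..n using the given 1-based values r_n, ..., r_2."""
    p = list(range(1, n + 1))                     # identity permutation
    used = 0
    for k, r in zip(range(n, 1, -1), draws):      # k = n, n-1, ..., 2
        p[r - 1], p[k - 1] = p[k - 1], p[r - 1]   # swap p[r] and p[k] (1-based)
        used += 1                                 # one random number per pass
    return p, used

p, used = durstenfeld_with_draws(8, [7, 7, 1, 5, 3, 1, 1])   # r_k column of Table 2
print(p, used)                                    # [2, 4, 6, 3, 5, 1, 8, 7] 7
```

One swap and one random number per pass give n − 1 = 7 of each for n = 8, in line with the O(n) analysis.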

4 Historical Note.

The Fisher-Yates Shuffle algorithm was first published in one of the editions before the 6th edition of Statistical Tables for Biological, Agricultural and Medical Research, edited by Fisher, R.A.; Yates, F. (Edinburgh: Oliver & Boyd, 1938, 1943, 1948, 1953, 1957, 1963).⁵

The only readily available edition is the 6th, published in 1963.⁶

⁵ Also published in Spanish and Portuguese.

⁶ http://digital.library.adelaide.edu.au/coll/special/fisher/stat_tab.pdf


The preface to the 6th Edition, written by Frank Yates in March 1963, states, at the bottom of page viii: “Examples 12 and 12·1 give an improved method of forming random permutations”. The shuffle algorithm in that edition was written by C. R. Rao.⁷

Knuth, in the 1st and 2nd editions of Vol. 2, The Art of Computer Programming, page 140, says: “This algorithm [Knuth’s Algorithm P] was first published by L. E. Moses and R. V. Oakford, in Tables of Random Permutations (Stanford University Press, 1963); and by R. Durstenfeld, CACM 7 (1964), 420.”

In the 3rd edition⁸ Knuth claims that “This algorithm was first published by R. A. Fisher and Frank Yates [Statistical Tables (London 1938), Example 12], in ordinary language, and by R. Durstenfeld [CACM 7 (1964), 420] in computer language.” It is interesting to note that Knuth has dropped the reference to L. E. Moses and R. V. Oakford in the 3rd edition.

I believe Knuth is wrong in attributing Algorithm P to Fisher & Yates. I will show that Durstenfeld’s Shuffle algorithm was a new, optimal shuffling algorithm when it appeared in 1964, in the same sense that Hoare’s QuickSort algorithm was a new, optimal sorting algorithm when it appeared in 1962.

The discussion that follows is based on Example 12, page 37, 6th Edition of the Fisher & Yates Tables, which is reproduced verbatim below. This allows us to identify the original Fisher-Yates Shuffle algorithm and the improved algorithm of C. R. Rao.⁹

Example 12: Required to arrange 8 treatments, numbered 1–8, in random order.

[Pre-6th Ed. Method]. This operation can be performed by selecting one of the treatments at random from the eight, then selecting a second from the seven that remain, and so on. When the number of treatments is at all large, however, this procedure is tiresome, since each treatment must be deleted from a list as it is selected and a fresh count made for each further selection.

[New Method by C.R. Rao] To avoid this C.R. Rao (78) has proposed an alternative method. The one-dimensional version of this, suitable for the present example, consists of taking 10 cells numbered 0–9, and allocating the numbers 1–8 to these according to a sequence of 8 single digit random numbers. Thus, using the first column of Table XXXIII(I), which begins 0, 9, 1, 1, 5, 1, 8, 6 we allocate 1 to cell 0, 2 to cell 9, 3 to cell 1, 4 to cell 1, etc., the complete allocation being

Cell:   0    1        2    3    4    5    6    7    8    9
        1    3,4,6    –    –    –    5    8    –    7    2

The three numbers in cell 1 must now be permuted. This can be done by the same process, using the next three random numbers 3,5,1 to give the order 6,3,4, so that the final permutation is 1, 6, 3, 4, 5, 8, 7, 2. Alternatively this . . . etc., etc.

⁷ C.R. Rao, “Generation of random permutations of given number of elements using random sampling numbers”, Sankhya, xxiii, 305–307, (1961).

⁸ The Art of Computer Programming, Volume 2, 3rd Edition, Addison-Wesley, 1998, bottom of page 145.

⁹ I have broken the original first paragraph into two parts so that the old and new methods are clearly visible.


We can clearly see that the first method is not the Durstenfeld shuffle: it is the generic RandPerm algorithm 1.1, which we have implemented as FYSorig 2.1, and is an $O(n^2)$ algorithm.

Equally clearly, Rao’s method is not the Durstenfeld shuffle: apart from the fact that it uses more than n − 1 random numbers, it allocates, in this example, 3 numbers to one cell which must be permuted subsequently. This is not remotely similar to Durstenfeld’s algorithm.
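To make the contrast concrete, the one-dimensional version of Rao’s method quoted in Example 12 can be sketched in Python. This is a rough reading of the quoted passage, not Rao’s published procedure: the function name `rao_permutation` is an assumption, the digits are supplied by the caller, and a cell holding more than one number is permuted by applying the same process to it with the next digits.

```python
def rao_permutation(items, digits):
    """Order items by the cell (0-9) each one draws; re-permute any shared cell."""
    if len(items) <= 1:
        return list(items)                 # nothing to permute, no digits consumed
    cells = [[] for _ in range(10)]        # 10 cells numbered 0-9
    for x in items:
        cells[next(digits)].append(x)      # one single-digit random number per item
    out = []
    for cell in cells:                     # read the cells in the order 0, 1, ..., 9
        out.extend(rao_permutation(cell, digits))
    return out

stream = [0, 9, 1, 1, 5, 1, 8, 6, 3, 5, 1]           # the digits quoted in Example 12
print(rao_permutation(list(range(1, 9)), iter(stream)))   # [1, 6, 3, 4, 5, 8, 7, 2]
print(len(stream))                                   # 11
```

For 8 treatments this run consumes 11 random digits, against the 7 random numbers that the Durstenfeld shuffle needs.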

It is for these reasons that I believe Durstenfeld should get sole credit for his algorithm. It may have been part of the programmers’ folklore at the time but Durstenfeld was the first to publish this algorithm, although many people were writing algorithms for the exhaustive enumeration of permutations.

Durstenfeld’s algorithm, despite its simplicity, is very subtle. It uses the minimum number of random numbers to perform n − 1 random transpositions; it uses no extra space, apart from 3 or 4 scalars. Also, it can generate a partial permutation with just a slight modification (Pike’s modification).
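The note does not spell out the modification. The sketch below assumes it is the usual early-stopping variant, in which the loop is halted after m passes so that the last m positions hold a random partial permutation. The function name `partial_shuffle` and its parameters are assumptions made for this illustration.

```python
import random

def partial_shuffle(a, m, rng=random):
    """Return m elements of a in random order, drawing m random numbers."""
    a = list(a)                            # copy, so the caller's data is untouched
    n = len(a)
    if not 0 <= m <= n:
        raise ValueError("m must satisfy 0 <= m <= len(a)")
    for k in range(n - 1, n - 1 - m, -1):  # stop after m transpositions
        r = rng.randint(0, k)              # random index in 0..k inclusive
        a[r], a[k] = a[k], a[r]
    return a[n - m:]                       # the m positions that are already final

print(partial_shuffle(range(1, 50), 6))    # six distinct numbers from 1..49, order varies
```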

The only reason I can see for not ranking Durstenfeld’s Shuffle with Hoare’s QuickSort is that permuting things is less important (and less difficult) than sorting them.

Very few programmers who had not seen something similar would invent it. Even those who have seen it get it wrong: Microsoft, with all its money and highly-paid programmers, could not randomly permute 5 web browsers (http://www.robweir.com/blog/2010/02/microsoft-random-browser-ballot.html).
