Is There Trouble With The Bubble? Presented by: Gregory H. Leisch, CRE November 14, 2005.
Bubble Trouble
description
Transcript of Bubble Trouble
Bubble Trouble
A statistical analysis of the Bubble Sort.
The Guys Who Bring It To You
Jon KroeningNick JonesErik WeyersAndy SchieberSean Porter
History and Background
Jon Kroening
The Bubble Sort
Popular algorithm used for sorting dataIverson was the first to use name ‘bubble sort’ in 1962, even though used earlierUnfortunately it is commonly used where the number of elements is too large
The Bubble Sort Starts at one end of the array and make repeated scans through the list comparing successive pairs of elements5 3 6 2 8 9 1If the first element is larger than the second, called an inversion, then the values are swapped3 5 6 2 8 9 1
The Bubble Sort
Each scan will push the maximum element to the top3 5 6 2 8 9 13 5 6 2 8 9 13 5 2 6 8 9 13 5 2 6 8 9 13 5 2 6 8 9 13 5 2 6 8 1 9
The Bubble Sort
This is the “bubbling” effect that gives the bubble sort its nameThis process is continued until the list is sortedThe more inversions in the list, the longer it takes to sort
3 5 2 6 8 1 93 5 2 6 8 1 93 2 5 6 8 1 93 2 5 6 8 1 93 2 5 6 8 1 93 2 5 6 1 8 92 3 5 6 1 8 92 3 5 6 1 8 9
2 3 5 6 1 8 92 3 5 1 6 8 92 3 5 1 6 8 92 3 5 1 6 8 92 3 1 5 6 8 92 3 1 5 6 8 92 1 3 5 6 8 91 2 3 5 6 8 9
Bubble Sort Algorithmtemplate <typename R>void bubbleSort(vector<R>& v){
int pass, length;R temp;for( length = 0; length < v.size()-1; length++){
for(pass = 0; pass < v.size()-1; pass ++){
if(v[pass] > v[pass +1]){
temp = v[pass];v[pass] = v[pass + 1];v[pass + 1] = temp;
}}
}}
Best and Worst Case Scenarios
Nick Jones
Best Case of the Bubble Sort
Let X: number of interchanges (discrete).Consider list of n elements already sorted
x1 < x2 < x3 < … < xn
Best Case (cont.)
Only have to run through the list once since it is already in order, thus giving you n-1 comparisons and X = 0.Therefore, the time complexity of the best case scenario is O(n).
Worst Case of the Bubble Sort
Let X: number of interchanges (discrete).Consider list of n elements in descending order
x1 > x2 > x3 > … > xn.
Worst Case (cont.)
# of comparisons = # of interchanges for each pass.X = (n-1) + (n-2) + (n-3) + … + 2 + 1This is an arithmetic series.
Worst Case (cont.)
(n-1) + (n-2) + … + 2 + 1 = n(n-1)/2So X = n(n-1)/2.Therefore, the time complexity of the worst case scenario is O(n2).
Time Complexity Model of the Worst Case
# elements
Tim
e in
seco
nds
Average Case
Nick Jones, Andy Schieber & Erik Weyers
Random Variables A random variable is called discrete if its range is finite or countably infinite.Bernoulli Random Variable – type of discrete random variable with a probability mass function
p(x) = P{X = x}
Bernoulli Random Variable
X is a Bernoulli Random Variable with probability p if
px(x) = px(1-p)1-x, x = {0,1}.Other notes:
E[X] = p
Expected Value
The expected value of a discrete random variable is
nnpxpxpxxpxn
iii
...)( 2211
1
Sums of Random Variables
If are random variables then
i.e. expectation is linear
nXXX ...2,1
n
ii
n
ii XEXE ][
Analytic Estimates for Bubble Sort
I = number of inversionsX = number of interchangesEvery inversion needs to be interchanged in order to sort the list, therefore X must be at least I,E[X] = The average number of inversions
][][ XEIE
and
2)1(][0
nnXE
ji
jiII ),(
invertednotinvertedjiI 0
1),(
Case Worst ][ caseBest XE
ji
jiIEIE )],([][
The probability that i, j are inverted is .5 I is a Bernoulli random variable,
therefore
there are
terms in the above sum
5.}1),({ jiIp 5.)],([ jiIE
2)1(
)!2(!2!
2
nn
nnn
Since p(I) = .5 and there are
terms in the above sum
then
2)1( nn
4)1(
21
2)1(][
nnnnIE
So
The average case is equivalent to the worst case
2)1(][
4)1(
nnXEnn
)(][)( 22 nOXEnO
)(][ 2nOXE
Simulation Results
Jon Kroening, Sean Porter, and Erik Weyers
int MAX = 99; int Random[MAX]; int temp, flag;
for(int i = 0; i<100; i++) { flag = 1; // Random Number temp = 1+(int) (100.0*rand()/(RAND_MAX+1.0)); //obtained for(int check = 0; check <= i; check++){ if(Random[check] == temp) //Checks if number is
flag = 0; // already obtained } if(flag == 0) //if number is already obtained i--; //then it will reloop to find new one else{ Random[i] = temp; //Otherwise enters number cout << Random[i] << endl; //in array and displays } }
Simulation ResultsBUBBLE SORT ANALYSIS
2000
2500
3000
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33
DATASETS
INVE
RSIO
NS
Worst Case: 4950
2)99(100
Closing Comments
The Bubble sort is not efficient, there are other sorting algorithms that are much faster
A Final Note
Thank you Dr. Deckleman for making this all possible
References
“Probability Models for Computer Science” by Sheldon Ross, Academic Press 2002