Honors Research Colloquium Final Poster



Background

Ranking college football teams is a significant challenge with major implications. With 124 colleges and universities represented in the NCAA Football Bowl Subdivision (FBS), there are 124! possible rankings. With this many possibilities, it is hard to tell whether any given ranking is the best one. Other factors that make ranking difficult are what people consider important when ranking a team, how the bias of voters or computer programs affects the ranks, and how conflicting results are handled. Money is also a major issue: getting the rankings right can have a huge monetary impact on the schools that do or do not get into BCS bowl games. The current system for ranking the teams is the Bowl Championship Series (BCS), which consists of two opinion polls and six computer rankings. The opinion polls introduce human error and bias. The computer rankings are controversial because all but one of them are not published for peer review.
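To give a sense of scale, the size of the search space can be computed directly. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
import math

NUM_TEAMS = 124  # FBS teams cited in the Background section

# Every ordering of the teams is a distinct candidate ranking.
num_rankings = math.factorial(NUM_TEAMS)

# Report the order of magnitude rather than printing the full integer.
num_digits = len(str(num_rankings))
print(f"{NUM_TEAMS}! has {num_digits} digits")
```

A count with this many digits rules out checking every ranking one by one, which is consistent with the use of a heuristic search.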

CMS+ Background

The CMS+ program treats ranking as a quadratic assignment problem and uses a genetic algorithm heuristic with two inputs: degree of victory (DOV) and relative distance.

• Degree of victory consists of multiple factors that compare two teams.

• Relative distance shows how the rankings compare with one another, and the values culminate to form a bell curve.
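The poster does not give the CMS+ objective function. As a rough illustration only, the sketch below evaluates a generic quadratic-assignment-style fitness, in which each pair of teams contributes its degree of victory multiplied by a relative-distance weight for the two rank positions. The function name `qap_fitness`, the matrix layout, the toy numbers, and the choice that larger is better are all assumptions, not the CMS+ implementation.

```python
from typing import Sequence

def qap_fitness(ranking: Sequence[int],
                dov: Sequence[Sequence[float]],
                rel_dist: Sequence[Sequence[float]]) -> float:
    """Sum dov[a][b] * rel_dist[pos_a][pos_b] over all ordered team pairs.

    ranking[p] is the team placed at position p (0 = top of the ranking).
    """
    total = 0.0
    for pos_a, team_a in enumerate(ranking):
        for pos_b, team_b in enumerate(ranking):
            if team_a != team_b:
                total += dov[team_a][team_b] * rel_dist[pos_a][pos_b]
    return total

# Toy instance with three teams (hypothetical numbers).
# dov[a][b] = points team a holds over team b.
dov = [[0, 15, 30],
       [0, 0, 15],
       [0, 0, 0]]
# rel_dist[p][q] is positive when position p is above position q,
# and grows with the gap between the two positions.
rel_dist = [[0, 1, 2],
            [-1, 0, 1],
            [-2, -1, 0]]

consistent = qap_fitness([0, 1, 2], dov, rel_dist)  # winners ranked on top
reversed_ = qap_fitness([2, 1, 0], dov, rel_dist)   # winners ranked last
print(consistent, reversed_)
```

In this toy setup, the ranking that agrees with the DOV points scores higher than the reversed one. A genetic algorithm would search over permutations for the best such score.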

Producing Valid College Football Rankings in Reasonable Time

Mark W. Edwards, Department of Mechanical Engineering

Jonathan A. Shumaker, Department of Chemical Engineering

5th Annual FEP Honors Research Symposium

C. Richard Cassady, PhD, Department of Industrial Engineering

April 20, 2013

CMS+ Problems

Degree of Victory can lead to issues because many different factors can be included. Defining which factors should be included, and how they can be varied, is a difficult task. Another issue is that the CMS+ currently takes a significant amount of time to generate a ranking. Decreasing the run time makes the CMS+ more practical to use.

Platform and Objectives

We believe that a four-team playoff increases the need for a ranking system, that a system should clearly state which factors are included, and that a computer-based system should be used. Our first objective was to improve the CMS+ system for ranking college football teams by identifying factors to include and defining a way to quantify them. Our second objective was to reduce the time it takes to run the program.

Degree of Victory

• Defining Factors

We determined that one factor should always be included: the winner of a game should receive points over the team it beat. Four other factors are also part of the DOV calculation and can be turned on or off in any combination:

  • Home and Away: points are given to away teams that win on the road. Half of the points are given for neutral-site wins.

  • AP Rank: points are given to the team that is ranked higher in the AP poll. This is done for every pair of teams.

  • Common Opponents: for every pair of teams, points are given to the team that has more wins over common opponents.

  • Conference Champions: conference champions are given points over all of the other teams in their conference.
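Of the four optional factors, Common Opponents involves the most bookkeeping, so a small sketch may help. The version below is one possible reading of the rule: for each pair of teams, find the opponents both have played, count each team's wins over them, and award the points to the team with more. The function name, the data layout, the handling of ties (no points), and the toy schedule are assumptions made for illustration.

```python
from itertools import combinations

POINTS = 15  # points per factor, the value used on the poster

def common_opponent_points(wins: dict[str, set[str]],
                           played: dict[str, set[str]]) -> dict[tuple[str, str], int]:
    """Return {(a, b): points team a receives over team b}."""
    awards = {}
    for a, b in combinations(sorted(played), 2):
        # Opponents faced by both teams, excluding the two teams themselves.
        common = (played[a] & played[b]) - {a, b}
        wins_a = len(wins[a] & common)
        wins_b = len(wins[b] & common)
        if wins_a > wins_b:
            awards[(a, b)] = POINTS
        elif wins_b > wins_a:
            awards[(b, a)] = POINTS
        # Assumption: equal win counts award nothing.
    return awards

# Hypothetical schedule: A and B each played C and D.
played = {"A": {"C", "D"}, "B": {"C", "D"}, "C": {"A", "B"}, "D": {"A", "B"}}
wins = {"A": {"C", "D"}, "B": {"C"}, "C": set(), "D": {"B"}}

awards = common_opponent_points(wins, played)
print(awards)
```

Here A beat both common opponents while B beat only one, so A receives points over B. D receives points over C because D has one win over their common opponents and C has none.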

• Creating Degree of Victory Matrix

A matrix is necessary to compare every pair of teams. The matrix is created by awarding the points allocated to each of the specified factors; in our case, 15 points are given for each factor that is turned on. The resulting matrix can be entered into the CMS+ program, together with a relative distance matrix, to compute a ranking.

• Results

Sixteen files were created, one for every on/off combination of the four optional factors. A fitness value of 200,000 corresponds to a result 10,000 standard deviations above the mean, with no possible rankings that are better.
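The matrix construction and the sixteen factor combinations can be sketched as follows. This is a minimal illustration under stated assumptions: the factor keys, the function names, and the input layout are hypothetical, and the head-to-head award is assumed to be 15 points like the optional factors, which the poster does not state. The order in which configurations are generated is not claimed to match the poster's DOV file numbering.

```python
from itertools import product

POINTS = 15  # points awarded per factor that is switched on
OPTIONAL_FACTORS = ("home_away", "ap_rank", "common_opponents", "conf_champion")

def all_factor_configs():
    """Enumerate every on/off combination of the optional factors."""
    return [dict(zip(OPTIONAL_FACTORS, flags))
            for flags in product((False, True), repeat=len(OPTIONAL_FACTORS))]

def build_dov(teams, games, awards_by_factor, config):
    """Build dov[i][j] = points team i holds over team j.

    games: list of (winner, loser) pairs; head-to-head is always counted.
    awards_by_factor: {factor: {(a, b): points a receives over b}}.
    """
    index = {team: k for k, team in enumerate(teams)}
    dov = [[0] * len(teams) for _ in teams]
    for winner, loser in games:
        dov[index[winner]][index[loser]] += POINTS  # assumed head-to-head value
    for factor, enabled in config.items():
        if enabled:
            for (a, b), pts in awards_by_factor.get(factor, {}).items():
                dov[index[a]][index[b]] += pts
    return dov

# Hypothetical data: A won at B (road win), B beat C at a neutral site.
teams = ["A", "B", "C"]
games = [("A", "B"), ("B", "C")]
awards_by_factor = {"home_away": {("A", "B"): POINTS, ("B", "C"): POINTS / 2}}

configs = all_factor_configs()
h2h_only = build_dov(teams, games, awards_by_factor, configs[0])
with_home_away = build_dov(teams, games, awards_by_factor,
                           {**configs[0], "home_away": True})
print(len(configs), h2h_only, with_home_away)
```

Four independent on/off switches give 2^4 = 16 configurations, which matches the sixteen files described above.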

Run Time

Through testing we found two things:

• The number of heuristic replications does not affect run time, so we left the number of replications at 20 throughout testing.

• Solution quality and the rankings produced are not affected by the number of generations or replications used in the CMS+.

Knowing these two things, we were able to test the relationship between the number of GA generations and the run time. These results are shown in Figure 1.
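A measurement of this kind might be set up as in the sketch below, which times a small stand-in genetic algorithm at several generation counts. Everything here is an assumption for illustration: the toy fitness, the order crossover, the swap mutation, the population size, and the function names are not taken from CMS+. In this toy, run time will simply grow with the number of generations, so it is not expected to reproduce the shape of Figure 1.

```python
import random
import time

def fitness(ranking, dov):
    """Toy fitness: points held by higher-ranked teams over lower-ranked ones."""
    return sum(dov[a][b]
               for i, a in enumerate(ranking)
               for b in ranking[i + 1:])

def order_crossover(p1, p2, rng):
    """Copy a slice from p1 and fill the rest in p2's order (stays a permutation)."""
    lo, hi = sorted(rng.sample(range(len(p1) + 1), 2))
    middle = p1[lo:hi]
    rest = [t for t in p2 if t not in middle]
    return rest[:lo] + middle + rest[lo:]

def run_ga(dov, generations, pop_size=20, seed=0):
    """Evolve rankings for a fixed number of generations; return the best found."""
    rng = random.Random(seed)
    n = len(dov)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda r: fitness(r, dov), reverse=True)
        parents = population[:pop_size // 2]       # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            child = order_crossover(*rng.sample(parents, 2), rng)
            i, j = rng.sample(range(n), 2)         # swap mutation
            child[i], child[j] = child[j], child[i]
            children.append(child)
        population = parents + children
    return max(population, key=lambda r: fitness(r, dov))

def time_generations(dov, generation_counts):
    """Return {generations: seconds} for one GA run at each setting."""
    timings = {}
    for g in generation_counts:
        start = time.perf_counter()
        run_ga(dov, g)
        timings[g] = time.perf_counter() - start
    return timings

# Hypothetical 8-team instance: team i holds 15 points over every team j > i.
dov = [[15 if i < j else 0 for j in range(8)] for i in range(8)]
timings = time_generations(dov, [50, 100, 200])
best = run_ga(dov, 200)
print(timings, best)
```

Fixing the random seed keeps repeated runs comparable, so differences in the timings come from the generation count rather than from a different random search path.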


Table 1: Rankings Generated by CMS+ System Using Test Cases

Rank    | BCS Rankings    | DOV 01 (Only H2H) | DOV 05 (AP Rank) | DOV 12 (H/A/N, ConfChamp, CommOpp) | DOV 16 (Everything on)
Fitness |                 | 244343            | 246025           | 216012                             | 196313
1       | Auburn          | Texas Christian   | Oregon           | Oregon                             | Oregon
2       | Oregon          | Oregon            | Texas Christian  | Auburn                             | Auburn
3       | Texas Christian | Stanford          | Boise State^     | Texas Christian                    | Texas Christian
4       | Stanford        | Wisconsin         | Auburn           | Michigan State^                    | Boise State^
5       | Wisconsin       | Michigan State    | Ohio State       | Boise State^                       | Michigan State
6       | Ohio State      | Nevada^           | Nevada^          | Wisconsin                          | Wisconsin
7       | Oklahoma        | Ohio State        | Wisconsin        | Virginia Tech^                     | Virginia Tech^
8       | Arkansas        | Boise State       | Stanford         | Nevada^                            | Nevada^
9       | Michigan State  | Auburn*           | Michigan State   | Ohio State                         | Stanford*
10      | Boise State     | Virginia Tech     | Utah^            | Stanford*                          | Ohio State

^ Gained 5 or more spots from BCS Ranking
* Lost 5 or more spots from BCS Ranking
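The ^ and * markers follow a simple rule that can be sketched in a few lines. The function name and the dictionary layout below are illustrative assumptions. The example uses the DOV 12 column and only those teams that also appear in the BCS top 10, since BCS positions outside the top 10 are not shown in the table.

```python
def movement_marker(bcs_rank: int, new_rank: int, threshold: int = 5) -> str:
    """Return '^' for a gain of at least `threshold` spots, '*' for a loss, else ''."""
    change = bcs_rank - new_rank  # positive means the team moved up
    if change >= threshold:
        return "^"
    if change <= -threshold:
        return "*"
    return ""

# BCS positions taken from the first column of Table 1.
bcs = {"Auburn": 1, "Oregon": 2, "Texas Christian": 3, "Stanford": 4,
       "Wisconsin": 5, "Ohio State": 6, "Oklahoma": 7, "Arkansas": 8,
       "Michigan State": 9, "Boise State": 10}

# DOV 12 column, restricted to teams that also appear in the BCS top 10.
dov12 = {"Oregon": 1, "Auburn": 2, "Texas Christian": 3, "Michigan State": 4,
         "Boise State": 5, "Wisconsin": 6, "Ohio State": 9, "Stanford": 10}

flags = {team: movement_marker(bcs[team], rank) for team, rank in dov12.items()}
print(flags)
```

For these teams the computed markers agree with the table: Michigan State and Boise State each gain five spots, and Stanford loses six.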

As seen in Figure 1, run time is lowest when the number of generations is in the 3,500 to 4,000 range. Moving away from that range in either direction increases the time needed to produce a ranking.

Conclusion

We were able to improve both the Degree of Victory and the run time during our research.

• Factors to include in Degree of Victory were identified and quantified, then used to compute rankings with the CMS+ system.

• The run time was decreased by lowering the number of generations in the algorithm from 100,000 to somewhere in the range of 3,500 to 4,000.

Future research should examine more factors to include and more ways to include them in Degree of Victory. The run time could also be decreased further, because more factors are involved in running the CMS+ system than just the generations and replications.

Figure 1: The Effects of Generations on Run Time