Honors Research Colloquium Final Paper

Proceedings of the 5th Annual FEP Honors Research Symposium Copyright, 2013, Edwards, M., Shumaker, J. Please do not use the materials without the expressed permission of the authors.

Producing Valid College Football Rankings in Reasonable Time

Mark Edwards, Department of Mechanical Engineering

Jonathan Shumaker, Department of Chemical Engineering

Faculty Mentor: C. Richard Cassady, Ph.D., Department of Industrial Engineering

Abstract

College football is an annual source of controversy when it comes time to produce the NCAA Football Bowl Subdivision rankings at the end of the regular season. From the secrecy of how the computer systems rank teams to the bizarre ballots submitted for the Coaches Poll, every year people ask the question: is there a better way to do this? In a couple of years, a four-team playoff will determine the two teams that play in the national championship, and the CMS+ ranking system aims to guide the committee that will select the playoff teams. The CMS+ system produces its results using the quadratic assignment problem, a way to mathematically push winning teams to the top of the rankings and losing teams to the bottom. Currently the CMS+ system takes a long time to produce results, and this is where we focused our research. We improved the CMS+ system in two ways. First, we identified and set the parameters on which degree of victory is based within the CMS+ system. Second, we found the number of mutations and repeated tests that the CMS+ system needs to produce high-quality rankings.

1. The Present State of College Football

There are currently 124 teams in the NCAA Football Bowl Subdivision. Each team plays around 12 games each season, which means that only about nine percent of the possible matchups actually happen. Because so many matchups never occur, it is very hard to decide who should play in the National Championship game. Currently only two teams are chosen to play in that game, but this will change in 2014 with the addition of a four-team playoff. Playing in the National Championship and other BCS bowl games has significant financial implications, and there could be even more money on the line once the playoff system begins.

Choosing the two teams that play for the national championship is difficult for several reasons. First, there are many possible rankings: exactly 124!, which is roughly 1.5 × 10^207. With so many possible rankings, it is hard to find the one that is most accurate. Second, it is hard to decide what the rankings should take into account. Many factors can be considered, for example strength of schedule, strength of conference, margin of victory, or location, and it is very hard to apply all of them equally across teams. Third, ranking teams creates conflict, because there will always be teams that are disappointed and feel they should be placed higher. Finally, it is hard to determine which teams should play in the National Championship if more than two teams finish the same season undefeated. All of these points show how difficult it is to rank the teams. This difficulty, along with the financial aspect, is described by Martinich (2002): “Given the substantial financial implications, as well as the desire to select the best teams for the championship and other BCS bowls, it is imperative that the ranking systems included in the selection formula be the most accurate at ranking teams.” Given these difficulties in ranking college football teams, it is evident that a fair, high-quality system is needed to ensure the best rankings.
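As a rough check on the counts in this section, the minimal Python sketch below tallies the possible matchups, the games played, and the number of possible rankings. The function name and the assumption that all 12 games are played against other FBS teams are ours; games against lower-division opponents would pull the share of matchups played down toward the nine percent quoted above.

```python
import math

def matchup_stats(num_teams=124, games_per_team=12):
    """Count possible pairings, games played, share played, and possible rankings."""
    possible_matchups = math.comb(num_teams, 2)
    # Each game involves two teams. Assumption: every game is FBS versus FBS.
    games_played = num_teams * games_per_team // 2
    fraction_played = games_played / possible_matchups
    possible_rankings = math.factorial(num_teams)
    return possible_matchups, games_played, fraction_played, possible_rankings

matchups, games, fraction, rankings = matchup_stats()
print(f"{matchups} possible matchups, {games} games, {fraction:.1%} played")
print(f"124! has {len(str(rankings))} digits")
```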

1.1. How the Teams are Currently Ranked

The Bowl Championship Series (BCS) system is presently used to rank the teams in college football. The BCS system combines computer rankings and human polls to formulate its final rankings. The two opinion polls are the Harris Poll and the USA Today/Coaches Poll. Together the human polls account for two thirds of the total BCS rankings, one third each. Opinion polls have a few flaws. They are, after all, opinion, so there will be bias in the rankings that they produce. One problem with the Coaches Poll is that, in some cases, the person voting is an assistant coach or someone else associated with the school. There are also times when a ballot is filled out in a way that leaves a team off, simply because people make mistakes when filling out their ballots.

The remaining third consists of six computer rankings, which have problems of their own. The first is that they are mostly secret, with only one of the six being public. Because they are not all published and peer reviewed, they could contain major problems that go undetected. Since we do not know how they are formed, we do not know whether they are of high quality or can even produce the correct rankings. There are also data errors: the one ranking that is not secret was found to have errors in its data. Data errors need to be found, and they will not be found if the rankings are secret. Computer rankings also carry design bias, because the way a program is built introduces bias that affects the results it generates. These flaws in the BCS system, combined with the many difficulties that arise in ranking teams, suggest that a better system could rank the teams that should play in the National Championship game and eventually decide the teams that should play in the four-team playoff.

2. A New System for Ranking College Football Teams

Because of the numerous flaws in the BCS system, we believe there should be a new system for ranking college football teams.

2.1. Our Platform

First, we believe that a four-team playoff increases the need for a better ranking system. The selection committee will need help deciding which teams should play in the four-team playoff. It will actually be a more rigorous process to distinguish between the fourth and fifth teams than to distinguish between the second and third teams. Second, we believe that there is no such thing as an unbiased system, so the system should be public. The committee should also state what is important in the rankings so that teams know what to consider in creating their schedules and in how they play their games. Finally, we believe that a computer-based system should be used to create the rankings, because humans cannot simultaneously think about all the games that happened throughout the season. Humans do not remember every game, where it was played, the score, or all the other information pertaining to it. Computers can process each of these aspects simultaneously and compare them. Computers can also apply the bias that the committee wants consistently across all the teams. If the method used to rank teams is applied equally to each school, then each team will be placed in its best and most accurate position.

2.2. Our System

We propose using the CMS+ system for ranking college football teams provided by Sullivan and Cassady (2009). The CMS+ system defines the problem mathematically as a quadratic assignment problem (QAP). The main issue with the QAP is its massive scale. The largest problems that have been solved are around n=40. In our case n=124, so we cannot use an exact solution method. Nevertheless, there are methods for acquiring valid results for such large QAPs. Our QAP has two inputs: degree of victory and relative distance. The degree of victory is a system for comparing teams in the rankings. It is adjusted by things such as head-to-head victories, margin of victory, and location of the game. The other input, relative distance, is defined as the distance between the teams in the ranking. We use the bell curve to set up a general ranking in which teams at the ends of the bell curve are further apart than teams in the middle. We attempt to solve the QAP through a two-stage heuristic approach. First, we use a genetic algorithm that takes a survival-of-the-fittest approach. We run this process for 100,000 generations to make sure that the rankings are as close to perfect as possible. After the genetic algorithm has completed, we use a local search. The search makes one switch of teams at a time; if the switch makes the ranking better, then it keeps that adjustment and restarts the local search. This entire heuristic approach is repeated twenty times because the genetic algorithm starts from a random point.
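To make the second stage concrete, the following is a minimal Python sketch of a QAP-style fitness and a pairwise-swap local search. It is not the CMS+ implementation. The names `bell_curve_positions`, `fitness`, and `local_search` are hypothetical, the position values are assumed to be standard normal quantiles, and larger fitness is assumed to be better. The genetic algorithm stage is not reproduced here.

```python
from statistics import NormalDist

def bell_curve_positions(n):
    """Assumed position values: normal quantiles, largest for rank 1."""
    return [NormalDist().inv_cdf(1 - (k + 0.5) / n) for k in range(n)]

def fitness(ranking, dov, z):
    """Sum of degree of victory times relative distance over all ordered pairs."""
    pos = {team: k for k, team in enumerate(ranking)}
    return sum(
        dov[a][b] * (z[pos[a]] - z[pos[b]])
        for a in ranking for b in ranking if a != b
    )

def local_search(ranking, dov):
    """Swap two teams at a time; keep any improving swap and restart the scan."""
    ranking = list(ranking)
    z = bell_curve_positions(len(ranking))
    best = fitness(ranking, dov, z)
    improved = True
    while improved:
        improved = False
        for i in range(len(ranking) - 1):
            for j in range(i + 1, len(ranking)):
                ranking[i], ranking[j] = ranking[j], ranking[i]
                score = fitness(ranking, dov, z)
                if score > best + 1e-9:  # assumption: higher is better
                    best, improved = score, True
                    break  # keep the swap and restart the search
                ranking[i], ranking[j] = ranking[j], ranking[i]  # undo the swap
            if improved:
                break
    return ranking, best
```

Here `dov[a][b]` holds the points team `a` earns over team `b`. Because the positions follow the bell curve, a swap near the top or bottom of the ranking changes the fitness more than a swap in the middle.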

2.3. Problems with the CMS+ System

Two main features can be studied to further improve the CMS+ system for ranking teams. The first problem is deciding what parameters should be included in degree of victory. We know that some factors should be included, but the difficulty is deciding which factors to include and how to include them. The second problem with the CMS+ system is that it takes a long time to run.

3. Our Research Plan: Improving the CMS+ System

As stated before, there are problems with the CMS+ system. Our goal is to improve the system in two ways: improving the degree of victory and shortening the run time.

3.1. Degree of Victory Research

We worked to improve the degree of victory by finding possible factors to include and ways to quantify them. This was done by collecting data from past seasons. The data came from James Howell's database, ESPN.com, and collegepollarchive.com. Once we had the data, we were able to use it to produce rankings and analyze the effects our factors had on the rankings created for those years. This part of the project was done with the other group working on the project (T. Dodson and A. McElhenney).

3.2. Run Time Research


Our second research goal was to shorten the run time. We had to find ways to get valid rankings while also shortening run time. This was done through experimentation with, and adjustments to, the CMS+ system, using various combinations of GA generations and heuristic replications.

4. Improving Degree of Victory

The first way that we worked to improve the CMS+ system was by establishing what should be included in degree of victory. Upon completing this task, we created a degree of victory matrix. Finding factors to include was done in partnership with T. Dodson and A. McElhenny.

4.1. Factors to Include in Degree of Victory

Ultimately, five factors were chosen for the degree of victory that led to our rankings. Of these five factors, four can be varied in their weight. The total number of degree of victory points that can be given to one team in the degree of victory matrix is 100. The factor that cannot be removed is the game result, which is simply who won the game. Its weight can be lowered depending on how much influence is given to the other four factors. The four other factors, which can be varied, are: whether the game was played at home or on the road, whether the team has more wins against common opponents, whether the team is its conference champion, and whether the team has a higher rank in the AP poll. Together these factors can carry a total weight of up to 60 points in the degree of victory matrix, which means the winner of the game must receive at least 40 points. The winner can receive more than 40 points only if some of the four variable factors are turned off. In that case the unallocated points are all given to the winner of the game.
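The point allocation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the program we used: the factor names and the function `allocate_points` are hypothetical, and the 15-point weight per enabled factor is the value used in the test cases of Section 4.3.

```python
OPTIONAL_FACTORS = ("road_win", "common_opponents", "conference_champion", "ap_poll")

def allocate_points(enabled, points_per_factor=15, total=100):
    """Return (points for the game result, weight of each optional factor)."""
    unknown = set(enabled) - set(OPTIONAL_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    weights = {f: (points_per_factor if f in enabled else 0) for f in OPTIONAL_FACTORS}
    # Points not claimed by an enabled factor go to the winner of the game.
    result_points = total - sum(weights.values())
    return result_points, weights

result_points, weights = allocate_points({"road_win", "ap_poll"})
print(result_points, weights)
```

With every factor enabled the game result keeps 40 points, and with every factor disabled it receives all 100.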

4.2. The Degree of Victory Matrix

The best way to compare teams is through a matrix. The matrix is created with Visual Basic programming, and it is adjustable based on the degree of victory point values that each person chooses. Points are awarded to each team based on the factors established above. For every game played, the system takes into account who won, who lost, and whether it was a home or away game. The points allocated for location are given to an away team that wins, based on the idea that games played on the road are harder to win than games at home. The other categories are compared for every single pair of teams. The team that has more wins over common opponents gets the points allocated to that factor. This establishes a difference between teams that may have the same record but performed differently against the opponents they have in common. The points assigned to the AP poll are given to the team that is ranked higher in that poll. This is a way to include a human aspect, or the “eye test,” which is hard to do within computer rankings. Finally, if the teams are in the same conference, then the champion of that conference is given the points allocated for conference champions over every other team in the conference. Using Visual Basic, the degree of victory matrix is created with each team listed along both the vertical and horizontal sides and the points allocated to each team within the matrix. The CMS+ system then uses this degree of victory matrix to produce rankings.
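The construction of the matrix can be sketched as follows. This is a Python illustration, not the Visual Basic program described above, and the function name, argument layout, and weight keys are all assumptions. It also assumes that each pair of teams meets at most once, that unranked teams earn no AP points against each other, and that ties on common-opponent wins award no points.

```python
from itertools import combinations

def build_dov_matrix(teams, games, ap_rank, conference, champions, w):
    """Return dov[a][b]: degree of victory points team a earns over team b.

    games: list of (winner, loser, winner_was_away) tuples.
    ap_rank: dict team -> AP rank (1 is best); unranked teams are absent.
    conference: dict team -> conference name; champions: set of champions.
    w: weights with keys "result", "road", "common", "ap", "champion".
    """
    dov = {a: {b: 0 for b in teams if b != a} for a in teams}
    played = {t: set() for t in teams}
    beaten = {t: set() for t in teams}

    # Per-game points: the result, plus the road points for an away winner.
    for winner, loser, winner_was_away in games:
        played[winner].add(loser)
        played[loser].add(winner)
        beaten[winner].add(loser)
        dov[winner][loser] += w["result"] + (w["road"] if winner_was_away else 0)

    # Per-pair points: compared for every pair, whether or not they met.
    for a, b in combinations(teams, 2):
        common = played[a] & played[b]
        wins_a, wins_b = len(beaten[a] & common), len(beaten[b] & common)
        if wins_a != wins_b:
            better, worse = (a, b) if wins_a > wins_b else (b, a)
            dov[better][worse] += w["common"]

        rank_a = ap_rank.get(a, float("inf"))
        rank_b = ap_rank.get(b, float("inf"))
        if rank_a != rank_b:
            better, worse = (a, b) if rank_a < rank_b else (b, a)
            dov[better][worse] += w["ap"]

        if conference[a] == conference[b]:
            if a in champions:
                dov[a][b] += w["champion"]
            elif b in champions:
                dov[b][a] += w["champion"]
    return dov
```

Storing the matrix as a dictionary of dictionaries keeps the sketch short; a 124 by 124 array indexed by team number would serve the same purpose.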


4.3. Results

Once we determined the factors that should be included in degree of victory, we tested the effect that the factors have on rankings. This was done by defining sixteen different cases to run on past years. These cases allow us to run the program and evaluate how changing the factors affects the ranking. The test cases come from the four factors that we vary: each factor is turned on or off so that all sixteen possible combinations of the four factors are tested. When we turned a factor on we gave it fifteen points, and we gave it none when it was turned off. The resulting rankings are shown in our Appendix. Based on the information from Dodson and McElhenny, the fitness values that we found were on average close to 10,000 standard deviations above the mean. That value is so large that the calculated number of better rankings is too small to represent and is reported as zero. This suggests that, using our degree of victory, we are able to calculate the ideal rankings that correspond to the factors we included.
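The sixteen cases are simply every on/off combination of the four variable factors, which can be enumerated as in the sketch below. The factor names and the case numbering are hypothetical and are not meant to match the DOV 01 to DOV 16 labels in the Appendix.

```python
from itertools import product

FACTORS = ("road_win", "common_opponents", "conference_champion", "ap_poll")

def enumerate_cases(points_on=15, total=100):
    """List (case number, factor weights, game-result points) for all on/off combinations."""
    cases = []
    switches_list = product((False, True), repeat=len(FACTORS))
    for number, switches in enumerate(switches_list, start=1):
        weights = {f: (points_on if on else 0) for f, on in zip(FACTORS, switches)}
        # Whatever the enabled factors do not claim goes to the game result.
        cases.append((number, weights, total - sum(weights.values())))
    return cases

for number, weights, result_points in enumerate_cases():
    enabled = [f for f, pts in weights.items() if pts] or ["none"]
    print(f"case {number:02d}: result={result_points}, enabled={', '.join(enabled)}")
```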

5. Reducing Run Time

We addressed the problem of a lengthy run time through testing and experimentation on past years. We needed to find the point at which the run time is minimized while a valid ranking is still produced.

5.1. Creating Tests

Run time is reduced by decreasing the number of GA generations and heuristic replications. We needed to test the effect that decreasing the generations and replications has on solution quality. We decreased the generations from the 100,000 initially used in the CMS+ system, in intervals of 25,000. Replications were also reduced, but only once, from 20 to 10. After completing tests down to 25,000 total generations, we noticed that we got virtually the same run time at 10 replications as at 20 replications, as seen in Table 1. Because the number of replications did not affect the run time, we decided to leave it at 20 replications. When the generations are decreased, the run time goes down to a certain point, but if they are lowered too far the time starts to increase again. We found that no matter how low we went with the number of generations and replications, solution quality was not affected at all. To see how time varies, we took a range of values from 100,000 generations down to 100. We went in steps of 25,000 from 100,000 down to 25,000, and then in varied steps from 10,000 to 100 to find the range that corresponds to the ideal time.
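An experiment of this kind can be organized with a small timing harness such as the Python sketch below. It assumes a callable that takes a generation count and a replication count and returns a fitness. The `placeholder_heuristic` shown is only a stand-in workload, not the CMS+ heuristic, so its timings say nothing about the real system.

```python
import time

def time_configurations(run_heuristic, generation_counts, replication_counts):
    """Time run_heuristic(generations, replications) for every combination."""
    results = []
    for replications in replication_counts:
        for generations in generation_counts:
            start = time.perf_counter()
            fitness = run_heuristic(generations, replications)
            elapsed = time.perf_counter() - start
            results.append((generations, replications, elapsed, fitness))
    return results

def placeholder_heuristic(generations, replications):
    """Stand-in workload; NOT the CMS+ heuristic. Returns a dummy fitness."""
    total = 0
    for _ in range(replications):
        for g in range(generations // 100):
            total += g % 7
    return total

if __name__ == "__main__":
    grid = time_configurations(
        placeholder_heuristic, [100_000, 75_000, 50_000, 25_000], [20, 10]
    )
    for generations, replications, seconds, fitness in grid:
        print(f"{generations:>7} generations, {replications:>2} replications: {seconds:.4f} s")
```

Recording the fitness next to each time makes it easy to confirm that a faster configuration still produces the same solution quality.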

5.2. Run Time Results

Table 1 shows that run time is not affected by replications, because nearly the same times are produced for both the sets of 20 and 10 replications. The fitness also never changed, which means that no matter how we adjust the generations and replications, the same fitness is calculated.


Table 1: Run Time and Solution Quality for Combinations of GA Generations and Heuristic Replications

GA Generations | Heuristic Replications | Run Time (seconds) | Fitness
100,000 | 20 | 139 | 196313
75,000 | 20 | 107.68 | 196313
50,000 | 20 | 71.72 | 196313
25,000 | 20 | 40.12 | 196313
10,000 | 20 | 28.62 | 196313
100,000 | 10 | 139 | 196313
75,000 | 10 | 105 | 196313
50,000 | 10 | 71 | 196313
25,000 | 10 | 40.67 | 196313
10,000 | 10 | 27.92 | 196313

Figure 1: The Effect of Generations on Run Time at 20 Replications

We noticed that when we lowered the generations past a certain point, the time started to increase again. This is seen in the 4,000 to 1,000 range, where the time changes direction and increases on either side. Going from 3,500 to 1,000 generations, there was a 19.74 second jump in time. Going from 1,000 to 100 generations, there was a 230.26 second jump in time. The same increase is seen going the other way, just not in jumps as significant. Figure 1, along with this analysis, leads us to believe that the shortest run time corresponds to the range of 3,500 to 4,000 generations. This is a large decrease from the initial 100,000 used in the CMS+ system. The 100,000 generations produced a run time of 139 seconds; through our experimentation we found that in this range we get a run time of about 25 seconds, which is roughly one fifth of the initial time.

7. Conclusions

Throughout this research, we were able to achieve all of our goals. We defined parameters for degree of victory and computed it using data collected from past years. Using our degree of victory, we produced rankings for each combination of parameters, which we compared to the BCS rankings for 2010. Based on calculations from our fitness value, the number of rankings that are better than the one produced is so small that it rounds down to zero, meaning our rankings are of high quality. We were also able to reduce run time by reducing the number of generations. We also varied the number of replications in our tests, but found that changing it had no effect on the ranking or run time.

However, there is plenty of research that can be done in the future on this project. The main area that can be expanded is degree of victory. Our factors were not all that could be included; there are more factors, and more ways to include the factors that we used. It would be possible to adjust factors in different ways and provide an even more personalized approach. Run time can be decreased in other ways as well. Other aspects within the CMS+ system can be adjusted to shorten the time needed to run the program, and the program as a whole can be adjusted to run in less time.

Our research brings to the forefront the idea that there can be a personalized aspect to the ranking, with many factors that can be turned on and off. We also improved the CMS+ system by shortening the time it takes to run the program while still getting valid results. Overall, both goals were met, and the CMS+ system was improved in the areas of degree of victory and run time.

8. References

Cassady, C. Richard, Lisa M. Maillart, and Sinan Salman. 2005. “Ranking Sports Teams: A Customizable Quadratic Assignment Approach.” Interfaces 35(6), 497–510.

Martinich, Joseph. 2002. “College Football Rankings: Do the Computers Know Best?” Interfaces 32(5), 85–94.

Sullivan, Kelly, and C. Richard Cassady. 2009. “The CMS+ System for Ranking College Football Teams.” Proceedings of the 2009 Industrial Engineering Research Conference.



9. Appendix

Rank | DOV 01 | DOV 02 | DOV 03 | DOV 04 | DOV 05 | DOV 06 | DOV 07 | DOV 08 | REAL BCS
Fitness | 244343 | 224675 | 226317 | 219045 | 246025 | 237197 | 221582 | 238751 |
1 | TexasChristian | Oregon | Oregon | Oregon | Oregon | Oregon | Oregon | Oregon | Auburn
2 | Oregon | TexasChristian | TexasChristian | TexasChristian | TexasChristian | TexasChristian | TexasChristian | TexasChristian | Oregon
3 | Stanford | Stanford | BoiseState | Auburn | BoiseState | Auburn | Auburn | Auburn | TCU
4 | Wisconsin | BoiseState | Auburn | BoiseState | Auburn | Stanford | MichiganState | OhioState | Stanford
5 | MichiganState | Auburn | Nevada | Stanford | OhioState | MichiganState | BoiseState | Nevada | Wisconsin
6 | Nevada | Wisconsin | Wisconsin | Nevada | Nevada | OhioState | Wisconsin | MichiganState | Ohio State
7 | OhioState | Nevada | Stanford | OhioState | Wisconsin | Nevada | Nevada | BoiseState | Oklahoma
8 | BoiseState | OhioState | OhioState | Wisconsin | Stanford | BoiseState | VirginiaTech | Stanford | Arkansas
9 | Auburn | MichiganState | MichiganState | MichiganState | MichiganState | Wisconsin | OhioState | Wisconsin | Michigan State
10 | VirginiaTech | Utah | Utah | VirginiaTech | Utah | VirginiaTech | Stanford | VirginiaTech | Boise State
11 | Utah | VirginiaTech | VirginiaTech | Utah | VirginiaTech | Utah | Oklahoma | Utah | LSU
12 | Oklahoma | Missouri | Oklahoma | Oklahoma | Oklahoma | OklahomaState | Utah | Oklahoma | Missouri
13 | Nebraska | Nebraska | Missouri | OklahomaState | OklahomaState | Oklahoma | Nebraska | OklahomaState | Virginia Tech
14 | OklahomaState | Oklahoma | OklahomaState | Missouri | Missouri | Missouri | Missouri | Missouri | Oklahoma State
15 | Missouri | OklahomaState | Arkansas | Arkansas | Arkansas | Nebraska | OklahomaState | Arkansas | Nevada
16 | Arkansas | Arkansas | CentralFlorida | CentralFlorida | Nebraska | Arkansas | CentralFlorida | CentralFlorida | Alabama
17 | SouthCarolina | CentralFlorida | Nebraska | Hawaii | CentralFlorida | CentralFlorida | Arkansas | Hawaii | Texas A&M
18 | WestVirginia | Hawaii | Tulsa | Tulsa | FloridaState | Hawaii | Hawaii | Nebraska | Nebraska
19 | Alabama | Tulsa | Hawaii | FloridaState | Tulsa | FloridaState | WestVirginia | FloridaState | Utah
20 | LouisianaState | FloridaState | WestVirginia | Nebraska | WestVirginia | Tulsa | Tulsa | Tulsa | South Carolina
21 | Hawaii | Alabama | FloridaState | WestVirginia | Hawaii | WestVirginia | FloridaState | WestVirginia | Mississippi State
22 | FloridaState | WestVirginia | LouisianaState | LouisianaState | LouisianaState | LouisianaState | LouisianaState | LouisianaState | West Virginia
23 | Tulsa | LouisianaState | SouthCarolina | Alabama | SouthCarolina | Alabama | Alabama | Alabama | Florida State
24 | CentralFlorida | SouthCarolina | Alabama | SouthCarolina | Alabama | SouthCarolina | Miami(Ohio) | SouthCarolina | Hawaii
25 | TexasA&M | Temple | TexasA&M | Miami(Ohio) | TexasA&M | Toledo | TexasA&M | TexasA&M | UCF

Rank | DOV 09 | DOV 10 | DOV 11 | DOV 12 | DOV 13 | DOV 14 | DOV 15 | DOV 16 | REAL BCS
Fitness | 201622 | 214227 | 203404 | 216012 | 216012 | 217337 | 223232 | 196313 |
1 | Oregon | Oregon | Oregon | Oregon | Oregon | Oregon | Oregon | Oregon | Auburn
2 | TexasChristian | TexasChristian | TexasChristian | Auburn | Auburn | TexasChristian | TexasChristian | Auburn | Oregon
3 | Auburn | Auburn | Auburn | TexasChristian | TexasChristian | Stanford | Auburn | TexasChristian | TCU
4 | BoiseState | MichiganState | BoiseState | MichiganState | MichiganState | Auburn | BoiseState | BoiseState | Stanford
5 | MichiganState | BoiseState | Wisconsin | BoiseState | BoiseState | Nevada | MichiganState | MichiganState | Wisconsin
6 | Wisconsin | Wisconsin | MichiganState | Wisconsin | Wisconsin | BoiseState | Wisconsin | Wisconsin | Ohio State
7 | Nevada | VirginiaTech | Nevada | VirginiaTech | VirginiaTech | OhioState | Nevada | VirginiaTech | Oklahoma
8 | Stanford | Nevada | VirginiaTech | Nevada | Nevada | MichiganState | VirginiaTech | Nevada | Arkansas
9 | VirginiaTech | OhioState | Stanford | OhioState | OhioState | Wisconsin | OhioState | Stanford | Michigan State
10 | OhioState | Stanford | OhioState | Stanford | Stanford | VirginiaTech | Stanford | OhioState | Boise State
11 | Oklahoma | Oklahoma | Oklahoma | Oklahoma | Oklahoma | Utah | Oklahoma | Oklahoma | LSU
12 | Utah | Utah | Utah | Utah | Utah | Oklahoma | Utah | Utah | Missouri
13 | Missouri | OklahomaState | Missouri | OklahomaState | OklahomaState | OklahomaState | Missouri | CentralFlorida | Virginia Tech
14 | Nebraska | Missouri | CentralFlorida | CentralFlorida | CentralFlorida | Missouri | OklahomaState | OklahomaState | Oklahoma State
15 | OklahomaState | Nebraska | OklahomaState | Missouri | Missouri | Nebraska | CentralFlorida | Missouri | Nevada
16 | CentralFlorida | CentralFlorida | Hawaii | Hawaii | Hawaii | Arkansas | Arkansas | Hawaii | Alabama
17 | Hawaii | Hawaii | Arkansas | Arkansas | Arkansas | CentralFlorida | Hawaii | Arkansas | Texas A&M
18 | Arkansas | Arkansas | WestVirginia | WestVirginia | WestVirginia | Hawaii | WestVirginia | WestVirginia | Nebraska
19 | WestVirginia | WestVirginia | Tulsa | FloridaState | FloridaState | FloridaState | Nebraska | Miami(Ohio) | Utah
20 | Tulsa | FloridaState | Nebraska | Tulsa | Tulsa | Tulsa | Tulsa | Tulsa | South Carolina
21 | FloridaState | Tulsa | FloridaState | Nebraska | Nebraska | WestVirginia | FloridaState | FloridaState | Mississippi State
22 | Alabama | LouisianaState | Miami(Ohio) | LouisianaState | LouisianaState | LouisianaState | LouisianaState | LouisianaState | West Virginia
23 | LouisianaState | Alabama | LouisianaState | Miami(Ohio) | Miami(Ohio) | Alabama | Alabama | Nebraska | Florida State
24 | Miami(Ohio) | Miami(Ohio) | Alabama | Alabama | Alabama | SouthCarolina | Miami(Ohio) | Alabama | Hawaii
25 | NorthCarolinaState | TexasA&M | Connecticut | TexasA&M | TexasA&M | TexasA&M | Connecticut | TexasA&M | UCF

