Prediction of Sports Performance based on Genetic ... · PDF filePrediction of Sports...

Prediction of Sports Performance based on Genetic Algorithm and

Artificial Neural Network

Biao Xu

Physical education department, Suzhou University of Science and Technology, Suzhou,

215011

Email: [email protected]

Abstract

In order to predict sports performance more accurately, we investigated the recent algorithms, and

proposed a hybrid prediction system based on genetic algorithm and artificial neural network (GANN).

This is the first paper using the GANN to predict sport performance. We created a database of 1000

samples by questionnaire and physical test. The dataset was divided into two categories of training set

(30%) and test set (70%). The experiments demonstrate the MSE of our method achieves as small as

26.637 on the training set and 32.334 on the test set. Moreover, the training time only costs 1.563s.

Keywords: Sports performance, Genetic algorithm, Prediction, Artificial Neural Network

1. Introduction

A prediction or forecast is a statement about the way things will happen in the future [1], Sports

performance prediction is determined by many factors, such as the morale of a player and a team, skill

and the training strategy and so on [2]. The performance prediction is related to many people, including

the coach, fans and the athletes. It also raises interesting to the researchers. We explore the use of

advanced non-linear modeling techniques such as neural networks [3] to provide an analysis in such

decision situations.

Stekler et al. [4] found that a large quantity of effort was spent on predicting the results of sporting

events, but researchers focused only on the characteristics of sports forecasts. On the contrary, many

researchers paid attention to the efficiency of sports betting markets. As it turned out, it was possible to

derive a considerable amount of information about the forecasts and the forecasting process from

studies that had tested the markets for economic efficiency. Moreover, the huge number of observations

provided by betting markets made it possible to obtain robust tests of various forecasting hypotheses.

Their paper was concerned with a number of predicting topics in horse racing and several team sports.

The first topic involved the type of forecast that is made: picking a winner or forecasting whether a

particular team would beat the point spread. Different evaluation procedures would be tested and

alternative predicting methods (models, experts, and the market) compared. Iyer et al. [5] supposed that

the selection for international sports competitions required forecasting performance of individual

athletes. They employed the neural networks to rate players and select specific players for a

competition. Cricket as an example, they employed neural networks to predict each cricketer’s

performance in the future based upon their past performance. They classified cricketers into three

categories which are performer, moderate and failure. They collected data on cumulative player

performance from 1985 onwards until the 2006-2007 season. The neural network models were

Prediction of Sports Performance based on Genetic Algorithm and Artificial Neural Network Biao Xu

International Journal of Digital Content Technology and its Applications(JDCTA) Volume6,Number22,December 2012 doi:10.4156/jdcta.vol6.issue22.14

141

progressively trained and tested using four sets of data. The trained neural network models were then

applied to generate a forecast of the cricketer’s near term performance. Min et al. [6] proposed a

framework for sports prediction based on Bayesian inference and rule-based reasoning, together with

an in-game time-series approach. The framework consisted of two major components: a rule-based

reasoner and a Bayesian network component. The two different approaches cooperated in forecasting

the results of sports matches. It was motivated by the observation that sports matches are highly

stochastic, but meanwhile, the strategies of a team could be approximated by crisp logic rules. Plessner

et al [7] presented a social-cognitive overview of empirical work on judging sport performance. It

followed the basic steps of social information processing such as perception, encoding/categorization,

memory processes, and information integration. The application of a social cognition approach

provided insights into the processes that underlie biases in judgments of sport performance and, thus,

some hints on how to prevent them. In addition, they proposed possible future applications of social

cognition concepts in sports judgment research.

In this paper, we proposed a method based on genetic algorithm and BP network model to

forecasting the sports performance. The paper is organized as following: section 2 will describe the

basic back-propagation Neural Network, Section 3 explains the algorithm employed in this paper,

section 4 will demonstrate the experiment, Section 5 is the conclusion and our future work.

2. The Artificial Neural Network

The BP is a type of supervised learning neural network. A general model of the BP has a structure as

shown in Figure 1.

Figure 1. The architecture of an Artificial Neural Network

In Figure 1 we find that there are three layers exist in BP: input layer, hidden layer, and output layer.

Two nodes of each adjacent layer are directly connected called as a link [8]. Each link has a weighted


142

value as the relation degree between two nodes [9]. Assume that there are n input neurons, m hidden

neurons, and one output neuron [10], we can infer a training process described by the following

equations to update these weighted values, which can be divided into two steps [11]:

1) Hidden layer stage: The outputs of all neurons in the hidden layer are calculated by following steps:

0

1, 2, ,n

j ij i

i

net v x j m=

= =∑ L (1)

( ) 1, 2, ,j H jy f net j m= = L (2)

Here jnet is the activation value of the jth node, jy is the output of the hidden layer, and Hf is

called the activation function of a node, usually a sigmoid function as follow:

1

( )1 exp( )

Hf xx

=+ −

(3)

2) Output Stage: The outputs of all neurons in the output layer are given as follows:

0

( )m

O jk j

j

O f yω=

= ∑ (4)

Here of is the activation function, usually a line function [12]. All weights are assigned with random

values initially, and are modified by the delta rule according to the learning samples traditionally.

3. Introduction of Genetic Algorithm

In the computer science field of artificial intelligence, a genetic algorithm (GA) is a search heuristic

that mimics the process of natural evolution. This heuristic is routinely used to generate useful

solutions to optimization and search problems. Genetic algorithms belong to the larger class of

evolutionary algorithms (EA)[13] , which generate solutions to optimization problems using techniques

inspired by natural evolution, such as inheritance, mutation, selection, and crossover.

GAs are powerful stochastic search techniques based on the principle of natural evolution [14]. The

individuals of GA are encoded in the form of strings [15]. A collection of such string is called a

population. Initially, a random population is created, which represents different points in the search

space [16]. An objective function is associated with each string that represents the degree of the

goodness of the string. Based on the principle of survival of the fittest [17], a few of the string are

selected and each is assigned a numbers of copies that go into the mating pool. Crossover and mutation

operators are applied on these strings. The process of selection, crossover and mutation continues for a

fixed number of generations or till a termination condition is met [18]. Figure 2 shows the flowchart of

GA.


143

3.1. Initialization

In the beginning, a lot of individual solutions are randomly generated to form an initial population.

The population size is decided by the nature of the problem, but usually contains several hundreds or

thousands of possible solutions. Traditionally, the population is generated randomly, allowing the entire

range of possible solutions.

3.2. Selection

During each successive generation, a proportion of the existing population is selected to generate a

new generation. Individuals are selected via a fitness-based process, during which the solutions with

high fitness value are more likely to be chosen [19]. Certain selection methods rate the fitness of each

solution and preferentially select the best solutions. Other methods rate only a random sample of the

population, as the former process consumes a lot of time [20].

3.3. Crossover and Mutation

After the selection process, it is time to generate the second generation population of solutions from

those selected through genetic operators: crossover and mutation [21].

For each new solution to be produced, a pair of "parent" solutions is chosen for generating from the

pool selected previously. By producing a "child" solution using the above methods of crossover and

mutation, a new solution is created which shares many of the characteristics of its parents. New parents

are selected for each new child, and the process continues until a new population of solutions of

appropriate size is generated [22].

These processes finally make the next generation population of chromosomes different from the

initial generation. Generally the average fitness will be increased via this procedure for the population,

since only the best organisms from the first generation are selected for breeding, along with a small

proportion of less fit solutions, for reasons already mentioned above.

3.4. Termination

This generational process is repeated until a termination condition has been reached as shown in

Figure 2. Usually the conditions are:

A solution is found that satisfies minimum criteria

� Fixed number of generations reached;

� Allocated budget such as the computation time reached;

� The highest ranking solution's fitness is reached or has reached a plateau such that successive

iterations no longer produce better results;

� Manual inspection;

� All of the above.

GAs are very different from classical optimization techniques such as gradient-based algorithm in


144

the following points [23]: 1) GAs make use of the encoding of the parameters not the parameters

themselves; 2) GAs work on a population of points not a single one; 3) GAs only use the values of the

objective function not their derivatives or other auxiliary knowledge; 4) GAs use probabilistic

transition functions and not deterministic ones [24].

3.5. Genetic algorithm Neural Network

In order to employ the genetic algorithm to train the neural network, the fitness function is shown in

equation (5)

1

( ( ) ) ^ 2

( )

n

i i

i

P O

fn

=

−

=

∑ w

w (5)

In which Pi(w) and Oi means the predicted performance and original score of one specific athlete

Figure 2. Flow chart of the genetic algorithm.

4. Experiment

The experiments are carried on the Windows XP operation system with 2GB Hz processor and 1GB

memory. We also developed an in-house GUI as friendly interface between human and computers. The

GUI can run at any computer with Matlab.

4.1. Database

For the sports performance, the features used for the experiment is shown in Table 1. We select

fatigue, weather, experience, training time, weight, height and nutrition facts of the athletes as the


145

criteria of the prediction [25]. The performance of the athletes are scored within [60,100] [26].

Table 1. Features used for the prediction

Feature Grades

Fatigue Weak (A) Normal (B) Strong (C)

weather Excellent (A) Normal (B) Bad (C)

experience Long (A) Normal (B) Short (C)

Training time Long (A) Average (B) Short (C)

weight Above (A) Average (B) Less (C)

Height Tall (A) Average (B) Short (C)

Nutrition facts Excellent (A) Average (B) Bad (C)

We collect the data via questionnaire and physical ability tests based on 1000 students randomly

selected from Suzhou University of Science and Technology. We choose 30% for training which means

that we have 300 training samples and 700 hundred test samples. Figure 3 and Figure 4 shows the

performance score for 1000 samples and the histogram of the 1000 samples.

Figure 3. Original Performance Samples in the experiment

0 100 200 300 400 500 600 700 800 900 100060

65

70

75

80

85

90

95

100

Samples


146

Figure 4. Histogram of score samples

4.2. Experiment Result

We randomly select 300 hundred samples from the 1000 original data samples and the remaining

700 data samples will be used for testing. We employ the back-propagation neural network as the

training model and genetic algorithm to optimize the weight values [27]. The prediction results

compared with the original score are shown in Figure 5. The medium square error (MSE) between

prediction score and original score of the training samples is 26.637 and is 32.334 of the test samples

value. The training time of 300 data samples is 1.563 seconds.

5. Conclusion and Future work

We employ the BP neural and genetic algorithm for the sports performance prediction. From the

experiment result, it demonstrates that we can approximately predict the value.

Our future work should collect some real data from different school and different ages. Furthermore,

we should optimize the algorithm to make the prediction more accurate. Finally, we should design a

user-interface to the wide application.

60 65 70 75 80 85 90 95 1000

5

10

15

20

25

30

35

Grade/Points

Fre

quency*1

00


147

Figure 5. Prediction result vs Original result

Reference

[1] Yudong Zhang, Lenan Wu, "Stock market prediction of S&P 500 via combination of improved BCO

approach and BP neural network," Expert systems with applications, Vol. 36, No. 5, pp. 8849-8854, 2009.

[2] Marius Janta, Christoph Ebert, Veit Senner, "Functionality and performance of customized sole inlays

for various sports applications," Procedia Engineering, Vol. 34, No. 0, pp. 290-294, 2012.

[3] Yudong Zhang, Shuihua Wang, Lenan Wu, Yuankai Huo, "PSONN used for Remote-Sensing Image

Classification," Journal of Computational Information Systems, Vol. 6, No. 13, pp. 4417- 4425, 2010.

[4] H. O. Stekler, David Sendor, Richard Verlander, "Issues in sports forecasting," International Journal of

Forecasting, Vol. 26, No. 3, pp. 606-621, 2010.

[5] Subramanian Rama Iyer, Ramesh Sharda, "Prediction of athletes performance using neural networks:

An application in cricket team selection," Expert Systems with Applications, Vol. 36, No. 3, Part 1, pp.

5510-5522, 2009.

[6] Byungho Min, Jinhyuck Kim, Chongyoun Choe, Hyeonsang Eom, R. I. McKay, "A compound

framework for sports results prediction: A football case study," Knowledge-Based Systems, Vol. 21, No. 7,

pp. 551-562, 2008.

[7] Henning Plessner, Thomas Haar, "Sports performance judgments from a social cognitive perspective,"

Psychology of Sport and Exercise, Vol. 7, No. 6, pp. 555-575, 2006.

[8] Yudong Zhang, Lenan Wu, "A Rotation Invariant Image Descriptor based on Radon Transform,"

International Journal of Digital Content Technology and its Applications, Vol. 5, No. 4, pp. 209-217, 2011.

[9] Alireza Emami, Pirooz Noghreh, "New approach on optimization in placement of wind turbines within

wind farm by genetic algorithms," Renewable Energy, Vol. 35, No. 7, pp. 1559-1564, 2010.

[10] Teng Wu, Ahsan Kareem, "Modeling hysteretic nonlinear behavior of bridge aerodynamics via cellular

automata nested neural network," Journal of Wind Engineering and Industrial Aerodynamics, Vol. 99, No. 4,

pp. 378-388, 2011.

[11] Yudong Zhang, Lenan WU, "A Fast Document Image Denoising Method based on Packed Binary

format and Source Word Accumulation," Journal of Convergence Information Technology, Vol. 6, No. 2, pp.

131-137, 2011.

0 100 200 300 400 500 600 70060

65

70

75

80

85

90

95

100

Sample

Gra

des/P

oin

ts

Prediction

Original


148

[12] S. P. Asok, K. Sankaranarayanasamy, T. Sundararajan, K. Rajesh, G. Sankar Ganeshan, "Neural

network and CFD-based optimisation of square cavity and curved cavity static labyrinth seals," Tribology

International, Vol. 40, No. 7, pp. 1204-1216, 2007.

[13] Min-Yuan Cheng, Andreas F. V. Roy, "Evolutionary fuzzy decision model for construction

management using support vector machine," Expert Systems with Applications, Vol. 37, No. 8, pp.

6061-6069, 2010.

[14] Daniel Tuhus-Dubrow, Moncef Krarti, "Genetic-algorithm based approach to optimize building

envelope design for residential buildings," Building and Environment, Vol. 45, No. 7, pp. 1574-1581, 2010.

[15] Yudong Zhang, Lenan Wu, "A Novel Genetic Ant Colony Algorithm," Journal of Convergence

Information Technology, Vol. 7, No. 1, pp. 268-274, 2012.

[16] J. Sasikala, M. Ramaswamy, "Optimal gamma based fixed head hydrothermal scheduling using genetic

algorithm," Expert Systems with Applications, Vol. 37, No. 4, pp. 3352-3357, 2010.

[17] Yudong Zhang, Lenan Wu, "A Genetic Ant Colony Classifier System," in Computer Science and

Information Engineering, 2009 WRI World Congress on, 2009, pp. 744-748.

[18] I. J. Graham, K. Case, R. L. Wood, "Genetic algorithms in computer-aided design," Journal of

Materials Processing Technology, Vol. 117, No. 1–2, pp. 216-221, 2001.

[19] Francesco Isgr, Domenico Tegolo, "A distributed genetic algorithm for restoration of vertical line

scratches," Parallel Comput., Vol. 34, No. 12, pp. 727-734, 2008.

[20] Yudong Zhang, Lenan Wu, "Face Pose Estimation by Chaotic Artificial Bee Colony," International

Journal of Digital Content Technology and its Applications, Vol. 5, No. 2, pp. 55-63, 2011.

[21] Sanjay Kumar Shukla, M. K. Tiwari, "Soft decision trees: A genetically optimized cluster oriented

approach," Expert Systems with Applications, Vol. 36, No. 1, pp. 551-563, 2009.

[22] Yudong Zhang, Yuankai Huo, Qing Zhu, Shuihua Wang, Lenan Wu, "Polymorphic BCO for Protein

Folding Model," Journal of Computational Information Systems, Vol. 6, No. 6, pp. 1787-1794, 2010.

[23] A. B. Koc, P. H. Heinemann, G. R. Ziegler, "Optimization of Whole Milk Powder Processing Variables

with Neural Networks and Genetic Algorithms," Food and Bioproducts Processing, Vol. 85, No. 4, pp.

336-343, 2007.

[24] Yudong Zhang, Lenan Wu, "A Hybrid TS-PSO Optimization Algorithm," Journal of Convergence

Information Technology, Vol. 6, No. 5, pp. 169-174, 2011.

[25] François Marclay, Elia Grata, Laurent Perrenoud, Martial Saugy, "A one-year monitoring of nicotine

use in sport: Frontier between potential performance enhancement and addiction issues," Forensic Science

International, Vol. 213, No. 1–3, pp. 73-84, 2011.

[26] Trevor Thompson, Tony Steffert, Tomas Ros, Joseph Leach, John Gruzelier, "EEG applications for

sport and performance," Methods, Vol. 45, No. 4, pp. 279-288, 2008.

[27] Yudong Zhang, Lenan Wu, "Weights optimization of neural network via improved BCO approach,"

Progress in Electromagnetics Research, Vol. 83, No. pp. 185-198, 2008.


149

Prediction of Sports Performance based on Genetic ... · PDF filePrediction of Sports...

Documents

Transcript of Prediction of Sports Performance based on Genetic ... · PDF filePrediction of Sports...