Expert Systems with Applications 38 (2011) 9196–9206

Contents lists available at ScienceDirect

Expert Systems with Applications

journal homepage: www.elsevier.com/locate/eswa

Elliott Wave Theory and neuro-fuzzy systems, in stock market prediction: The WASP system

George S. Atsalakis *, Emmanouil M. Dimitrakakis, Constantinos D. Zopounidis

Department of Production Engineering and Management, Technical University of Crete, Chania 73100, Greece

Article info

Keywords: Neurofuzzy forecasting; Stock market forecasting; Price trend forecasting; Elliott waves; Technical analysis; ANFIS

0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.01.068

* Corresponding author. E-mail addresses: [email protected] (G.S. Atsalakis), [email protected] (E.M. Dimitrakakis), [email protected] (C.D. Zopounidis).

Abstract

This paper presents the WASP (Wave Analysis Stock Prediction) system, a system based on the neuro-fuzzy architecture, which utilizes aspects from the Elliott Wave Theory, presented by Ralph Nelson Elliott. This theory has been found to be extremely useful and accurate, particularly in problems of forecasting. A neuro-fuzzy logic technique has been used to forecast the trend of the stock prices and the results derived are very encouraging.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Finding a reliable forecasting method has always been the goal of many investors. The possibility of easy profits by forecasting the market has been the underlying force that motivates many researchers to invent new methodologies and models. Forecasts have always been made with the help of statistical models, technical analysis, econometric methods, and others. Recently, artificial intelligence has been found to provide significant results for such problems.

2. Neurofuzzy systems and stock forecasting

Interesting efforts have been made to apply neuro-fuzzy systems to stock forecasting. Abraham (2004) presented various neuro-fuzzy modeling techniques. Wong, Wang, Goh, and Quek (1992) proposed a neuro-fuzzy system to forecast annual returns. The inputs they used were the beta coefficient, a three-year moving average, Tobin's q, and others. The results were satisfactory. Nishina and Hagiwara (1997) proposed a model that aimed to forecast the exact return of a stock. Their model achieved better results than a neural network. Rast (1999) compared a neuro-fuzzy system with a neural network for the years 1987 and 1988. His system was trained with the DAX index. He found that the neuro-fuzzy system outperformed the neural network. Siekmann, Kruse, Gebhardt, Van Overbeek, and Cooke (2001) proposed a neuro-fuzzy model to forecast the DAX index. Their model outperformed a linear model, using the hit rate as a measure of success. Abraham, Nath, and Mahanti (2001) combined a neural network with a neuro-fuzzy system to forecast the next day's return of the NASDAQ index. They used the neuro-fuzzy system to evaluate the neural network's result. Their overall results were satisfactory. Wu, Fung, and Flitman (2001) proposed a Feed Forward Neuro Fuzzy (FFNF) system to forecast the monthly trend of the S&P500 index. As inputs they used previous returns, as well as various economic indices such as the unemployment rate, lending rates and others. Atsalakis and Valavanis (2009) proposed a system based on an inverted neuro-fuzzy controller, using returns as inputs. The model achieved one of the highest percentages of correct forecasts, over various periods and for various stocks.

Ghandar, Schmidt, To, and Zurbruegg (2007) used a neuro-fuzzy technique to select stocks, based on the buy and sell signals of various technical analysis indexes, such as the moving average, the double moving average and the On Balance Volume index. The model assigns every stock a score. According to this score, the system chooses the stocks that form the portfolio. The results were very interesting. Bekiros (2007) presents the results of an ANFIS system which forecasts the next day's rate of change of the NIKKEI index. As inputs, both returns and their lags were used. The system was compared with an ARMA model and a neural network. The neuro-fuzzy system outperformed both in bear markets and in bull markets. Pokropinska and Scherer (2008) proposed a neuro-fuzzy system based on the Mamdani architecture to forecast buy and sell signals for the Warsaw stock market. The inputs used were closing prices, opening prices, maximum and minimum, the difference between maximum and minimum, volume, as well as a combination of moving averages and other technical analysis indexes and oscillators. The system gave correct signals for the testing period. More stock market forecasting techniques have been presented by Atsalakis et al. (2001), Atsalakis and Ucenic (2005), Atsalakis, Skiadas, and Braimis (2007), Atsalakis and Nezis (2008), Atsalakis and Valavanis (2008, 2010a, 2010b), and Atsalakis and Zopounidis (2009).


2.1. Elliott Wave Theory

The Elliott Wave Theory was introduced by Ralph Nelson Elliott during the 1930s. Elliott believed that stock trends follow a repeating pattern, which can be forecasted both in the long and in the short term. He published his ideas in his book "The Wave Principle" in 1938. Using data from stocks he concluded that what seems to be a chaotic movement actually outlines a harmony found in nature. Elliott's discovery was based entirely on observation, but he tried to explain his findings using psychological reasons. The main principle of his theory is that a pattern consists of eight waves, as can be seen in Diagram 1.

Waves 1, 3 and 5 follow the overall trend, while waves 2 and 4 correct it. Waves a, b and c together correct the overall trend: waves a and c follow the correction and wave b resists it. Elliott observed that each wave consists of smaller waves which follow exactly the same pattern, as shown in Diagram 2, thereby forming a super-cycle.

The numbers in the diagram represent the number of waves when counted at a different scale. For example, the whole diagram represents two big waves, the impulse and the correction. The impulse consists of 5 waves, and the correction of 3. The 5 waves of the impulse consist of 21 sub-waves which, in turn, consist of 89 smaller waves, while the correction wave consists of 13 sub-waves which, in turn, consist of 55 even smaller waves. As can be observed, all the above numbers are part of the Fibonacci series, a series which can be found in many different areas in nature. According to Prechter and Frost (1998), when Elliott first expressed his theory he was not aware of the Fibonacci series. What is even more remarkable is the fact that some ratios related to the Fibonacci series can be observed in many stock movements and charts, as will be presented later.
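The wave counts quoted above can be reproduced with a short recurrence. The sketch below assumes the usual Elliott subdivision, which the text implies but does not state as a formula: an impulse contains three trend waves that subdivide like an impulse and two corrections, while a correction contains two waves (a and c) that subdivide like an impulse and one (b) that subdivides like a correction. The function name `wave_counts` is illustrative.

```python
def wave_counts(levels):
    """Return (impulse, correction) sub-wave counts for each level of subdivision."""
    impulse, correction = 5, 3  # first subdivision: 5 impulse waves, 3 corrective waves
    counts = [(impulse, correction)]
    for _ in range(levels - 1):
        # Impulse: 3 trend waves + 2 corrections; correction: 2 trend-like waves + 1 correction.
        impulse, correction = 3 * impulse + 2 * correction, 2 * impulse + correction
        counts.append((impulse, correction))
    return counts


for imp, cor in wave_counts(3):
    print(imp, cor)  # 5 3, then 21 13, then 89 55
```

Every count produced this way is a Fibonacci number, which is the coincidence the text refers to.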

Elliott believed that there are nine cycles of different durations, the bigger of which are formed by the smaller ones. From the largest to the smallest, these are: (1) Grand Super-Cycle, (2) Super-Cycle, (3) Cycle, (4) Primary, (5) Intermediate, (6) Minor, (7) Minute, (8) Minuette and (9) Subminuette.

Diagram 1. Basic Elliott wave pattern. Source: Prechter and Frost (1998).

The duration of the cycles varies from minutes to decades. Each pattern (cycle) is outlined by the following rules:

(1) The second wave cannot be longer than the first wave, and cannot return to a lower price than that set at the beginning of the first wave.

(2) The third wave is never the smallest wave compared to the first and the fifth.

(3) The fourth wave does not return to a lower price than the price found at the end of the first wave. The same applies for wave A.

(4) Usually, the third wave shows a greater dynamic, except in some cases where the fifth wave is extended (the case when the fifth wave is made up of five smaller waves).

(5) The fifth wave usually leads to a higher point than the third.
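The three hard rules (1) to (3) can be checked mechanically once the turning points of a candidate pattern are known. The following is a minimal sketch for an upward impulse, assuming the six pivot prices have already been identified by some other means; the function name and the dictionary keys are illustrative and not part of the paper.

```python
def check_impulse_rules(pivots):
    """pivots: six prices [start, end of wave 1, ..., end of wave 5] of an upward impulse."""
    p0, p1, p2, p3, p4, p5 = pivots
    len1, len3, len5 = p1 - p0, p3 - p2, p5 - p4
    return {
        # Rule 1: wave 2 does not fall below the start of wave 1.
        "rule1_wave2_holds_above_start": p2 > p0,
        # Rule 2: wave 3 is not the shortest of waves 1, 3 and 5.
        "rule2_wave3_not_shortest": not (len3 < len1 and len3 < len5),
        # Rule 3: wave 4 does not fall below the end of wave 1.
        "rule3_wave4_above_wave1_end": p4 > p1,
    }


print(check_impulse_rules([100, 110, 104, 125, 118, 130]))
```

Rules (4) and (5) are stated as tendencies ("usually"), so they are better treated as soft evidence than as pass/fail checks.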

2.1.1. Explaining the wave behavior

The first wave is the "new beginning" of an impulse. It is difficult to differentiate it from a correction of a previous downtrend, and therefore it is not a powerful wave. Most investors prefer to wait for better timing. The force behind the wave pattern is the number of investors that decide to enter and exit the market at a given time. After some initial winnings, investors decide to exit the market as the price becomes higher and the stock becomes overpriced for these few investors. This behavior explains the second wave. As the price begins falling, the stock becomes more attractive to a greater number of investors who regret not having entered the market during the first wave. Demand begins rising, which pushes the price up. More and more investors are determined to enter the market, creating a powerful, fast-paced wave, which in turn attracts even more investors to enter the market at a higher price. Those who entered at the beginning of the wave are satisfied with their winnings, and have most likely exited the market. Investors realize that the price has reached a level that makes it difficult to attract any further investors. Demand begins falling, which leads to the fourth wave. Major investors are out of the market, waiting for the end of the fourth wave to enter again and reap the profits of the fifth wave. It is important to note that the fourth and fifth waves are the easiest ones to follow, as they come after the third wave, which is the easiest to spot due to its length, power, and speed. Major investors have bought stocks at lower prices from investors who had bought them at the end of the third wave and feared the price might go lower. However, as the major investors enter the market again, they create a small hype, the fifth wave, which is smaller than the third wave and usually reaches the peak of the third wave, and sometimes goes even higher. Investors who know the market know that the price is extremely overrated and have therefore exited the market. Wave A is a corrective wave, which is often mistaken for a second wave. This explains wave B. Smaller investors think that wave A corrected the price enough for an upward trend to follow. Unfortunately, this is the wave where most smaller and occasional investors lose huge amounts of money, as wave C starts, pushing the price lower until the price becomes underrated again and a new pattern can start.

The above explanation is by no means a statistical explanation of the wave behavior, but it explains the difference between major and occasional investors and their knowledge of the market. It is very important to know the exact wave patterns, otherwise it is very easy to misinterpret signs. It is important to note that the above explanation regards an overall impulse trend. The opposite would happen in the case of an overall correction.

Diagram 2. Elliott wave super-cycle. Source: Prechter and Frost (1998).

2.1.2. The Fibonacci series

As previously mentioned, the Elliott wave principle is accidentally (according to Elliott) connected with the Fibonacci sequence. The Fibonacci sequence is a sequence of numbers in which each number is the sum of the previous two (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...). One can observe that all these numbers are also the numbers of waves, depending on the size of the Elliott wave patterns. What is remarkable about this sequence is that dividing a number of the sequence by its predecessor gives, with the exception of the first few numbers, approximately the same result, the number 1.618, which is called Phi (the Greek letter φ). This number is also called the golden ratio, or the golden number. Dividing any number of the sequence by its successor gives approximately 0.618, another important number as will be seen later. There is historical evidence that this number was known to ancient civilizations, as such ratios are found in the Egyptian pyramids and in ancient Greek architecture (the Parthenon). This number is also found in nature, on both the microcosmic and the macrocosmic scale. It can be found in the architecture of human DNA, in spiral galaxies, and in planet movements. It is not surprising that such ratios are also found in stock market movements. In a later book, "Nature's Law", Elliott states that "All human activities have three distinctive features, pattern, time and ratio, all of which observe the Fibonacci summation series".
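The convergence of consecutive Fibonacci ratios towards 1.618 and 0.618 is easy to verify numerically. The sketch below is only an illustration of that arithmetic; the helper name `fibonacci` is not from the paper.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting 1, 1, 2, ..."""
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


seq = fibonacci(12)
for prev, cur in zip(seq[6:], seq[7:]):
    # cur / prev approaches 1.618 and prev / cur approaches 0.618 as the numbers grow.
    print(cur, round(cur / prev, 4), round(prev / cur, 4))
```

The ratios of the first few terms (1/1, 2/1, 3/2) are far from the limit, which is why the text excludes them.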

Elliott observed that many ratios in his patterns are derived from the Fibonacci sequence. Wave 2 usually corrects up to 50% or 62% of wave 1. Wave 3 is usually 1.62, 2.62 or 4.25 times the length of wave 1. Wave 4 corrects up to 24% or 28% of wave 3. Wave 5 is somewhat more complicated, and depends either on wave 1 (1.62 or 2.62 times wave 1), or on the length from the beginning of wave 1 to the end of wave 3 (being 0.62, 1, or 1.62 times this length). It is important to note that these ratios are not to be used for forecasting the market, but rather for explaining the market and spotting the waves. Elliott's wave theory cannot always explain the market perfectly, but it can give fuzzy estimates of the market behavior. Of course, other factors affect the market, but as the results of the suggested system show, the theory is capable of improving stock market forecasting.

2.2. Fuzzy logic

Fuzzy logic was proposed by Zadeh (1965) as an alternative way to express data. One of the most typical examples used to explain fuzzy logic is the expression of age. Classical reasoning measures age in years. Fuzzy logic can be used to express age in three categories: young, middle-aged, old. This is done with the help of a membership function, which converts age in years into an appropriate category. A membership function takes on values in the interval [0, 1], according to how much the data belong to the corresponding category. This is the main difference between fuzzy logic and classical reasoning. In fuzzy logic, data can belong to more than one category, while in classical reasoning something is either true or false. The above reasoning is depicted in Diagram 3.

According to the diagram, the age of 30 can be categorized both as young and as middle-aged, with a membership grade of 0.5, while the age of 80 belongs with a grade of 1 to the fuzzy set "old". Classical reasoning would use a threshold stating, for example, that people older than 30 are middle-aged, which would mean that a person of 29 years of age would be young. One of the biggest advantages of fuzzy logic is this use of verbal variables, which are easily understood by humans. It is important to note that classical reasoning can be seen as a subset of fuzzy logic. In the above example, one could use one fuzzy set for each year, or even a fuzzy set for every 6 months.

Fuzzy logic can be used in many problems where information does not have to be precise. This is the main reason why it was chosen for stock market forecasting, a problem where information is not noise free, as there are different factors affecting the output.

There are various membership functions, which work better for different problems. The most common are the triangular, trapezoidal, Gaussian, and generalized bell membership functions, shown in Diagram 4.

The triangular membership function requires three parameters (a, b, c), according to the following relation given by Jang and Sun (1997):

$$\mathrm{triangle}(x; a, b, c) = \max\left(\min\left(\frac{x-a}{b-a}, \frac{c-x}{c-b}\right), 0\right)$$

The trapezoidal membership function needs four parameters (a, b, c, d), according to the following relation:

$$\mathrm{trapezoid}(x; a, b, c, d) = \max\left(\min\left(\frac{x-a}{b-a}, 1, \frac{d-x}{d-c}\right), 0\right)$$

The Gaussian function utilizes two parameters, c and σ, with c being the center of the curve and σ indicating the width:

$$\mathrm{gaussian}(x; c, \sigma) = e^{-\frac{1}{2}\left(\frac{x-c}{\sigma}\right)^{2}}$$

Finally, the generalized bell curve uses three parameters (a, b, c):

$$\mathrm{bell}(x; a, b, c) = \frac{1}{1 + \left|\frac{x-c}{a}\right|^{2b}}$$

Diagram 3. Fuzzy logic example. Source: Jang and Sun (1997).

Diagram 4. Membership functions. Source: Jang and Sun (1997).
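The four membership functions translate directly into code. The sketch below follows the formulas of this section; the parameter values in the age example are hypothetical, chosen only so that an age of 30 has grade 0.5 in "middle-aged" and an age of 80 has grade 1 in "old", as in the discussion of Diagram 3.

```python
import math


def triangle(x, a, b, c):
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


def trapezoid(x, a, b, c, d):
    return max(min((x - a) / (b - a), 1.0, (d - x) / (d - c)), 0.0)


def gaussian(x, c, sigma):
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)


def bell(x, a, b, c):
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Hypothetical fuzzy sets for age (parameters are assumptions, not from the paper).
print(triangle(30, 20, 40, 60))        # grade of age 30 in "middle-aged": 0.5
print(trapezoid(80, 50, 70, 100, 120)) # grade of age 80 in "old": 1.0
```

All four functions return values in [0, 1]; the triangular and trapezoidal forms are piecewise linear, while the Gaussian and bell forms are smooth, which matters when the parameters are tuned by gradient descent.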


2.3. Fuzzy systems

According to Jang and Sun (1997), a fuzzy system consists of three parts:

• the rule base;
• the database, or dictionary, that includes the membership functions;
• the reasoning mechanism.

There are three main types of fuzzy systems:

(1) the Mamdani system;
(2) the Sugeno system;
(3) the Tsukamoto system.

2.3.1. Mamdani system

The difference between the systems lies in the way the inputs interact and lead to an output. A Mamdani system produces a fuzzy output, which has to be defuzzified.

Diagram 5 depicts a Mamdani system, where the value x belongs to both the A1 and A2 fuzzy sets, the value y belongs to the B1 and B2 fuzzy sets, and the output Z is expressed by two fuzzy sets, C1 and C2. This system basically has two rules:

If x is A1 and y is B1 then Z is C1

If x is A2 and y is B2 then Z is C2

Diagram 5. Mamdani architecture.

Diagram 7. Sugeno architecture (Jang & Sun, 1997).

Diagram 8. Tsukamoto architecture.

The final result C′ is obtained by using the max operator. A different operator can also be used. As mentioned before, the Mamdani system produces a fuzzy output, which needs to be defuzzified. There are five different methods that can be used: (a) the smallest of the max; (b) the largest of the max; (c) the centroid of the area; (d) the bisector of the area and (e) the mean of the max. The above methods can be seen in Diagram 6.
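The five defuzzification methods can be sketched on a sampled output set. The code below is a discrete approximation under stated assumptions: the aggregated fuzzy output is given as membership grades `mu` at equally spaced points `xs`, areas are approximated by sums, and the bisector is taken as the first sample at which the accumulated area reaches half of the total. The function name `defuzzify` is illustrative.

```python
def defuzzify(xs, mu):
    """xs: increasing, equally spaced sample points; mu: grades of the aggregated output set."""
    peak = max(mu)
    at_peak = [x for x, m in zip(xs, mu) if m == peak]
    total = sum(mu)
    centroid = sum(x * m for x, m in zip(xs, mu)) / total
    # Bisector: first sample at which the accumulated area reaches half of the total area.
    running, bisector = 0.0, xs[-1]
    for x, m in zip(xs, mu):
        running += m
        if running >= total / 2:
            bisector = x
            break
    return {
        "smallest_of_max": min(at_peak),
        "largest_of_max": max(at_peak),
        "mean_of_max": sum(at_peak) / len(at_peak),
        "centroid": centroid,
        "bisector": bisector,
    }


print(defuzzify([0, 1, 2, 3, 4], [0.0, 0.5, 1.0, 1.0, 0.5]))
```

The three "max" methods only look at where the output set peaks, whereas the centroid and the bisector take the whole shape of the set into account.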

2.3.2. Sugeno system

Sugeno type fuzzy systems differ from the Mamdani ones in the sense that they use a function as an output, and therefore the output is already a crisp number and not a fuzzy set. The same example as before is used in Diagram 7.

This time the rules have the following structure:

$$\text{If } x \text{ is } A_1 \text{ and } y \text{ is } B_1 \text{ then } z_1 = p_1 x + q_1 y + r_1$$

$$\text{If } x \text{ is } A_2 \text{ and } y \text{ is } B_2 \text{ then } z_2 = p_2 x + q_2 y + r_2$$

The final output Z is calculated by using a weighted average of z1 and z2.

Sugeno type systems are less demanding in computational power, but not as flexible as Mamdani systems. The WASP system presented later is based on the ANFIS neuro-fuzzy architecture, which uses a Sugeno type fuzzy system.
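A two-rule Sugeno inference step can be sketched as follows. The code assumes that the "and" of each rule is implemented as a product of membership grades (the choice ANFIS makes in its second level) and that the weights of the weighted average are the rule firing strengths; the function and argument names are illustrative.

```python
def sugeno_two_rules(x, y, mu_a, mu_b, coeffs):
    """mu_a, mu_b: two membership functions each; coeffs: [(p1, q1, r1), (p2, q2, r2)]."""
    # Firing strength of each rule (product used as the AND operator: an assumption).
    weights = [mu_a[j](x) * mu_b[j](y) for j in range(2)]
    # Crisp rule outputs z1 and z2.
    outputs = [p * x + q * y + r for p, q, r in coeffs]
    # Final output Z: weighted average of the rule outputs.
    return sum(w * z for w, z in zip(weights, outputs)) / sum(weights)


# Toy membership functions returning fixed grades, for illustration only.
mu_a = [lambda x: 0.5, lambda x: 0.5]
mu_b = [lambda y: 0.5, lambda y: 1.0]
print(sugeno_two_rules(2.0, 3.0, mu_a, mu_b, [(1.0, 1.0, 0.0), (0.0, 0.0, 2.0)]))
```

Because every rule already yields a number, no defuzzification step is needed, which is the source of the lower computational cost mentioned above.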

2.3.3. Tsukamoto system

Tsukamoto systems are used less often than the Mamdani and Sugeno type systems. The difference lies again in the way the output is calculated. Tsukamoto systems use a monotonic function, as illustrated in Diagram 8.

Diagram 6. Defuzzification methods.

Diagram 9. Simple neural network (single layer perceptron: inputs x1(n), ..., xp(n) and a bias, weights w1, ..., wp, sum v(n), output y(n)).

The main disadvantage of fuzzy systems is that knowledge must be predefined, meaning that the membership functions must be set.

2.4. Neural networks

An artificial neural network (ANN), or neural network (NN), is a computational method used to model data, derived from the field of artificial intelligence. Neural networks try to imitate the architecture of the human brain. Diagram 9 shows a simple neural network with one neuron and n inputs. Every input is multiplied by a different parameter (weight). All weighted inputs are added in the neuron, giving the final result. There is no limit on the number of neurons used, although a very large number can make the network extremely demanding in computational power.
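The single neuron described above amounts to a weighted sum. The sketch below includes a bias term, as drawn in Diagram 9, and an optional activation function; the activation is an assumption added for illustration, since the text only describes the summation.

```python
def neuron_output(inputs, weights, bias=0.0, activation=None):
    """Weighted sum of the inputs plus a bias, optionally passed through an activation."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(v) if activation else v


print(neuron_output([1.0, 2.0, 3.0], [0.5, -1.0, 2.0], bias=0.5))  # 5.0
```

Training such a neuron means adjusting `weights` and `bias` so that the output matches known examples.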

Neural networks are very efficient in modeling non-linear problems. Neural networks are trained from data with the help of adaptive algorithms such as the back propagation algorithm (Rumelhart, Hinton, & Williams, 1986; Werbos, 1974). The main advantage of neural networks is the ability to learn by example, in other words to create knowledge from past data. Additionally, such networks have been found to be extremely useful in pattern recognition. On the other hand, neural networks have also been criticized, mainly due to the high computational power that is required, which limits the number of input variables that can be used. The main disadvantage of a neural network is the lack of information regarding the impact of every input on the output. Neural networks are commonly called a "black box", as no information is given other than the output.

2.5. Neuro fuzzy systems – ANFIS

Neuro-fuzzy systems emerged from the need to find a solution to the disadvantages of neural networks and fuzzy logic, while maintaining the advantages of both theories. A neuro-fuzzy system is basically a neural network in which the inputs are transformed into fuzzy sets. The parameters are determined by an adaptive algorithm from existing data. The advantages of the neuro-fuzzy architecture are that knowledge is created from the data, and that the results are easily interpreted with the use of fuzzy verbal variables. The WASP system uses the neuro-fuzzy architecture ANFIS proposed by Jang and Sun (1997), which has been found to be extremely efficient in forecasting time series. As mentioned before, the ANFIS architecture uses a Sugeno type fuzzy system. Diagram 10 shows an ANFIS system with two inputs and five levels.

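Diagram 10. ANFIS architecture (Jang & Sun, 1997).

For every level, let $O_{i,j}$ be the output of node $j$ in level $i$. For level one

$$O_{1,j} = \mu_{A_j}(x) \quad \text{for } j = 1, 2,$$

and

$$O_{1,j} = \mu_{B_{j-2}}(y) \quad \text{for } j = 3, 4,$$

where $\mu_{A_j}(x)$ and $\mu_{B_j}(y)$, $j = 1, 2$, are the membership grades of the inputs $x$ and $y$ in the fuzzy sets $A_j$ and $B_j$, given by the corresponding membership functions.

For level two

$$O_{2,j} = w_j = \mu_{A_j}(x)\,\mu_{B_j}(y) \quad \text{for } j = 1, 2$$

For level three

$$O_{3,j} = \bar{w}_j = \frac{w_j}{w_1 + w_2} \quad \text{for } j = 1, 2$$

For level four

$$O_{4,j} = \bar{w}_j f_j = \bar{w}_j\,(p_j x + q_j y + r_j) \quad \text{for } j = 1, 2$$

Finally, for level five

$$O_{5,1} = \sum_{k} \bar{w}_k f_k = \frac{\sum_{k} w_k f_k}{\sum_{k} w_k} \quad \text{for } k = 1, 2$$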
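The five levels can be traced in a few lines of code for the two-input, two-rule case. This is a minimal sketch, not the authors' implementation: it assumes generalized bell membership functions for both inputs, and the names `anfis_forward`, `premise_a`, `premise_b` and `consequents` are illustrative.

```python
def bell(x, a, b, c):
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """premise_a, premise_b: two (a, b, c) tuples each; consequents: two (p, q, r) tuples."""
    # Level 1: membership grades of x in A1, A2 and of y in B1, B2.
    mu_a = [bell(x, *params) for params in premise_a]
    mu_b = [bell(y, *params) for params in premise_b]
    # Level 2: firing strength of each rule.
    w = [mu_a[j] * mu_b[j] for j in range(2)]
    # Level 3: normalised firing strengths.
    w_bar = [wj / sum(w) for wj in w]
    # Level 4: weighted output of each rule.
    o4 = [w_bar[j] * (p * x + q * y + r) for j, (p, q, r) in enumerate(consequents)]
    # Level 5: overall output.
    return sum(o4)


print(anfis_forward(2.0, 2.0,
                    [(2.0, 2.0, 0.0), (2.0, 2.0, 4.0)],
                    [(2.0, 2.0, 0.0), (2.0, 2.0, 4.0)],
                    [(1.0, 1.0, 0.0), (0.0, 0.0, 10.0)]))
```

In the example both rules fire equally, so the output is the plain average of the two rule outputs, 4 and 10.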

Training of the system requires establishing the parameters of the membership functions, and the parameters of the output functions of the Sugeno type fuzzy system (pi, qi, ri). The latter are found in levels four and five of the ANFIS system. Jang proposed a hybrid method to optimize the ANFIS system, by using two phases, a forward pass and a backward pass. The forward pass uses the least squares method to optimize the consequent parameters (pi, qi, ri) in levels four and five, while the backward pass uses a gradient descent algorithm, such as the back propagation algorithm, to optimize the premise parameters of the membership functions used as inputs in levels one to three. The parameters of ANFIS are optimized in two sets (premise parameters and consequent parameters) to reduce the computational power needed: the consequent parameters are linear, and therefore a linear method such as the least squares method can be used, while the premise parameters are non-linear.

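2.5.1. Training ANFIS – forward pass

The final output of the ANFIS system is a function of the parameters p, q, r:

$$O_{5,1} = \sum_{i} \bar{w}_i f_i = (\bar{w}_1 x)\,p_1 + (\bar{w}_1 y)\,q_1 + (\bar{w}_1)\,r_1 + (\bar{w}_2 x)\,p_2 + (\bar{w}_2 y)\,q_2 + (\bar{w}_2)\,r_2$$

which is a linear function of these parameters when the premise parameters are held fixed.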
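Because the output is linear in (p1, q1, r1, p2, q2, r2), the forward pass reduces to an ordinary least squares problem. The sketch below is one way to solve it under stated assumptions: the normalised firing strengths of every training sample are already computed, the problem is solved in a single batch through the normal equations, and a small Gaussian elimination routine replaces a linear algebra library. The function names are illustrative, and the original ANFIS description may use a recursive estimator instead.

```python
def solve_linear(m, v):
    """Solve m * theta = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    theta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * theta[c] for c in range(i + 1, n))
        theta[i] = (aug[i][n] - tail) / aug[i][i]
    return theta


def fit_consequents(samples, targets, w_bar):
    """samples: [(x, y)]; w_bar: [(w1, w2)] normalised firing strengths per sample."""
    # One row per sample: the coefficients multiplying p1, q1, r1, p2, q2, r2.
    rows = [[w1 * x, w1 * y, w1, w2 * x, w2 * y, w2]
            for (x, y), (w1, w2) in zip(samples, w_bar)]
    n = 6
    # Normal equations: (A^T A) theta = A^T t.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear(ata, atb)  # [p1, q1, r1, p2, q2, r2]
```

The routine needs at least six samples whose rows are linearly independent; otherwise the normal equations are singular and the elimination fails with a division by zero.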

2.5.2. Training ANFIS – backward pass

During the backward pass the parameters of the membership functions are optimized. The gradient descent method is based on the minimization of a cost function. The cost function is derived from the sum of the squared errors. Assuming the training data contain P input vectors, the cost for the p-th vector is

$$J_p = \sum_{i=1}^{\#(L)} \left(T_{i,p} - O^{L}_{i,p}\right)^2$$

where $\#(L)$ is the number of nodes in the output level $L$, $T_{i,p}$ is the i-th element of the p-th target output vector, and $O^{L}_{i,p}$ is the i-th element of the actual output vector generated by the p-th input vector. The total error is

$$J = \sum_{p=1}^{P} J_p$$

To use a gradient descent method, the error rate $\partial J_p / \partial O$ for the p-th input vector of the training data must be computed. The error rate for the i-th node of the output level L is derived from the previous function by differentiation:

$$\frac{\partial J_p}{\partial O^{L}_{i,p}} = -2\left(T_{i,p} - O^{L}_{i,p}\right)$$

For an internal node the error rate is

$$\frac{\partial J_p}{\partial O^{k}_{i,p}} = \sum_{m=1}^{\#(k+1)} \frac{\partial J_p}{\partial O^{k+1}_{m,p}}\,\frac{\partial O^{k+1}_{m,p}}{\partial O^{k}_{i,p}}$$

where $1 \le k \le L-1$ and $\#(k+1)$ is the number of nodes in level $k+1$. In conclusion, the error rate of an internal node is a linear combination of the error rates of the nodes of the subsequent level. Thus, by combining the previous functions every error rate can be computed. For the parameter "a" of a membership function:

$$\frac{\partial J_p}{\partial a} = \sum_{O \in S} \frac{\partial J_p}{\partial O}\,\frac{\partial O}{\partial a}$$

where S indicates the set of nodes whose outputs depend on the parameter "a". The derivative of the total cost J with respect to the parameter a is:

$$\frac{\partial J}{\partial a} = \sum_{p=1}^{P} \frac{\partial J_p}{\partial a}$$

In every epoch, parameter a is updated according to the function:

$$\Delta a = -\eta\,\frac{\partial J}{\partial a}$$

where η is the learning rate, which changes during the repetitions of the algorithm according to the following relation:

$$\eta = \frac{k}{\sqrt{\sum_{a}\left(\frac{\partial J}{\partial a}\right)^{2}}}$$

where k is the step size, a very important parameter for finding an optimum. A very small k might trap the algorithm in a local optimum instead of a global optimum. On the other hand, a larger step size might overshoot a local or global optimum. The value of k also affects the speed of the algorithm.

Diagram 11. Elliott wave oscillator.
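The update rule with the normalised learning rate can be sketched as a single step. The code assumes the gradients of the total cost with respect to all premise parameters have already been computed by back propagation; the function name `update_parameters` is illustrative.

```python
import math


def update_parameters(params, grads, k):
    """One gradient descent step with eta = k / sqrt(sum of squared gradients)."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0:
        return list(params)  # zero gradient: already at a stationary point
    eta = k / norm
    return [a - eta * g for a, g in zip(params, grads)]


print(update_parameters([1.0, 2.0], [3.0, 4.0], k=0.1))  # approximately [0.94, 1.92]
```

With this choice of η every step has Euclidean length k in parameter space, whatever the size of the gradient, which is why k is referred to as the step size.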

The parameters that ANFIS needs are: (a) the type of membership functions; (b) the number of membership functions for each input; (c) the epochs (repetitions) of the training algorithm and (d) the step size.

3. Wave Analysis Stock Prediction – WASP

The greatest challenge of the Elliott Wave Theory is to count the waves and to spot the current position of the market or stock on an Elliott wave pattern. The same chart can be interpreted in various ways, and this can lead to disastrous results for the investor. For this reason, researchers looked for an index to help track the waves. As mentioned before, the easiest wave to track is the third wave, which pushed researchers to analyze the behavior during this wave. During this wave, a recent moving average would be significantly higher than a longer moving average. This is how the Elliott wave oscillator emerged. The Elliott wave oscillator is derived by subtracting a 35-day moving average from a 5-day moving average. The oscillator will have higher values on the third wave, lower but positive values on the first and fifth waves (but will show corrections and their significance), and finally, will have negative values on the biggest corrections or on downtrend impulse waves. The Elliott wave oscillator is used in technical analysis. The first part of Diagram 11 depicts the price movement of a stock. The second part shows the moving averages of 5 and 35 days. The third part indicates the difference of the moving averages, which is the Elliott wave oscillator (EWO).
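The oscillator itself is a one-line difference of two moving averages. The sketch below assumes simple (unweighted) moving averages of closing prices, which the text does not specify, and returns `None` for the days on which fewer than 35 prices are available; the function names are illustrative.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def elliott_wave_oscillator(prices, fast=5, slow=35):
    """EWO = fast moving average minus slow moving average."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    return [None if s is None else f - s for f, s in zip(fast_ma, slow_ma)]


prices = [float(p) for p in range(1, 41)]  # a steady uptrend, for illustration
print(elliott_wave_oscillator(prices)[34:])
```

In a steady uptrend the short average stays above the long one, so the oscillator is positive, in line with the behaviour described for impulse waves.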

The EWO is a suitable oscillator to be used with fuzzy logic as itis important for a system to recognize ‘‘high’’ values of the index.Depending on the value of the stock, the same number can be highor low in different time periods; therefore, a crisp logic cannot beused. The system will use three inputs. Many trials were, carriedout and the best results appeared when we used the Elliot waveoscillator and two lags of the oscillator. This will help track downthe change of trend. Following the ANFIS architecture, the systemwill create different rules (linear functions), for various non-linearscenarios that would look like the following statement:

If EWO(t) is High and EWO(t-1) is High and EWO(t-2) is Low then Forecast is . . .

The output of the system is the forecast of the next day’s movement. On the training data, the output has been modified to +1 indicating a positive rate of change, -1 for a negative rate of change, and 0 for an unchanged price. This was done to reduce the complexity of the results, as it is very challenging to forecast the exact rate of change, while the important information is the direction of the price trend.
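A minimal sketch of how the three inputs and the -1/0/+1 target could be assembled is given below. The helper names and the convention that the target for day t is the move from day t to day t+1 are assumptions made for illustration; the oscillator series is taken as given, with None where it is not yet defined.

```python
from typing import List, Optional, Tuple


def direction_labels(prices: List[float]) -> List[int]:
    """labels[i] is +1, -1 or 0 for the price move from day i to day i+1."""
    return [(cur > prev) - (cur < prev) for prev, cur in zip(prices, prices[1:])]


def build_samples(
    ewo: List[Optional[float]], prices: List[float]
) -> List[Tuple[Tuple[float, float, float], int]]:
    """Pair (EWO_t, EWO_t-1, EWO_t-2) with the direction of the next day's move."""
    labels = direction_labels(prices)
    samples = []
    for t in range(2, len(prices) - 1):
        window = (ewo[t], ewo[t - 1], ewo[t - 2])
        if any(value is None for value in window):
            continue  # skip days where the oscillator or a lag is undefined
        samples.append((window, labels[t]))
    return samples
```

The last day of the series has no known next-day move, so it yields an input row without a target; that row is the one used for the new forecast.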

3.1. Explaining the WASP system

The proposed system is not a single model, but a repetitive method of selecting nine different ANFIS sub-models, whose combination gives the final forecast. This method was chosen because various models (ANFIS with different membership function types, numbers of membership functions, step sizes and epoch numbers) gave very good results for some periods, but not consistently, while other sets of parameters gave very good results for later periods. The reasoning of WASP is depicted in Diagram 12. From the price, the moving averages of 5 and 35 days are calculated, and from them the EWO and the oscillator lags, which form the input data. The returns can also be calculated from prices and transformed into the values [-1, 0, 1], thereby creating the output data. The last set of the input data is used for the new forecast. The remaining data, combined with the already known output data, comprise the total data, which are divided into training data and testing data. The data used are 2060 daily observations. The last 60 entries are used as testing data and the remaining 2000 are used to train the neuro-fuzzy sub-systems. Two membership functions for each input were chosen, as the critical point of the EWO is the change from negative to positive values. The number of epochs used to train every sub-system is 15. In many cases, the greatest improvements in the root mean square error took place during the first 15–20 epochs. More epochs would mean a longer time needed to train the 42 candidate models, without necessarily improving the hit rate. The 42 sub-systems are trained with different combinations of step sizes and membership functions. The values of step sizes used are [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5], and the

Diagram 12. The WASP system. (Flow chart; blocks: Price; Moving Average 5 Day; Moving Average 35 Day; Returns; Returns (-1,0,+1); EWO(5/35)t; EWO(5/35)t-1; EWO(5/35)t-2; Input Data; Output Data; Total Data; New Data; Testing Data; Training Data; Various ANFIS; Model Evaluation; Best model Choice; Average Prediction; Final Prediction.)

G.S. Atsalakis et al. / Expert Systems with Applications 38 (2011) 9196–9206 9203

membership functions used are Bell, Gaussian, Gaussian2 (a Gaussian variation), Triangular, Trapezoidal, and Pi. These 42 sub-models are evaluated with the testing data. The nine models that give the highest hit rates are selected and applied to the new data, giving nine different forecasts for the next day. The average of these nine forecasts is used as the final forecast. The method has further advantages besides the final result.

One can also take into consideration the number of positive and negative forecasts. Furthermore, the system presents a confidence index. The confidence index is the average hit-rate that the nine sub-models achieved on the testing data. If the confidence index is below 52%, one is advised not to follow the system, as it would make no difference from tossing a coin or guessing the next day’s movement. This is not an irrational restriction, as periods in which movements seem to follow a chaotic pattern are often observed in the marketplace. It is important to be able to recognize such periods in order to reduce risk. The typical output of the WASP system is presented in Diagram 13.
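The selection and combination step can be sketched as follows. The 42 candidates come from pairing the seven step sizes with the six membership function types; the hit rates and next-day outputs are assumed to be produced elsewhere by the trained ANFIS sub-models, and the class, function and field names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from statistics import mean
from typing import Dict, List, Tuple

STEP_SIZES = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
MF_TYPES = ["bell", "gaussian", "gaussian2", "triangular", "trapezoidal", "pi"]


@dataclass
class SubModelResult:
    mf_type: str
    step_size: float
    test_hit_rate: float      # fraction of correct directions on the 60 test days
    next_day_forecast: float  # sub-model output for the newest input row


def parameter_grid() -> List[Tuple[str, float]]:
    """All 6 x 7 = 42 (membership function, step size) combinations."""
    return list(product(MF_TYPES, STEP_SIZES))


def combine(results: List[SubModelResult], n_best: int = 9,
            min_confidence: float = 0.52) -> Dict[str, object]:
    """Keep the n_best sub-models by test hit rate and average their forecasts."""
    best = sorted(results, key=lambda r: r.test_hit_rate, reverse=True)[:n_best]
    confidence = mean(r.test_hit_rate for r in best)
    return {
        "forecast": mean(r.next_day_forecast for r in best),
        "confidence": confidence,
        "positive_votes": sum(1 for r in best if r.next_day_forecast > 0),
        "negative_votes": sum(1 for r in best if r.next_day_forecast < 0),
        # Below the 52% threshold the forecast is treated as no better than a coin toss.
        "follow_system": confidence >= min_confidence,
    }
```

Returning the vote counts next to the averaged forecast mirrors the remark that the number of positive and negative forecasts can be considered alongside the final result.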

The output window presents three box-plots. The first one indicates the hit-rates of the nine sub-models on the testing data, whose average is the confidence index. The second indicates the returns that these sub-models would achieve on the testing data. The third box-plot depicts the predictions, which are derived by evaluating the latest set of input data on the sub-models. The results show that for the next day, eight models forecasted a positive movement, while one forecasted a negative one. On average, the prediction is positive and the confidence index is high, with hit-rates ranging from 66.6% to 75%.

For every subsystem, the characteristics of the ANFIS system are of the kind indicated in Table 1, which presents the case of a subsystem using the Bell function. Depending on the membership functions selected by WASP, the number of non-linear parameters varies: 12 for a two-parameter MF such as the Gaussian function, 18 for a three-parameter MF such as the Bell and Triangular functions, and 24 for a four-parameter MF such as the Trapezoidal function.
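The parameter counts quoted above follow from simple arithmetic, sketched below. The sketch assumes a grid partition of the input space (one rule per combination of membership functions) and first-order linear consequents, which is consistent with the 8 rules and 32 linear parameters reported in Table 1; the function name is illustrative.

```python
from typing import Dict

# Number of shape parameters per membership function type.
MF_PARAM_COUNT = {"gaussian": 2, "bell": 3, "triangular": 3, "trapezoidal": 4}


def anfis_parameter_counts(n_inputs: int, mfs_per_input: int, mf_type: str) -> Dict[str, int]:
    """Count rules and trainable parameters of a grid-partitioned ANFIS."""
    n_rules = mfs_per_input ** n_inputs                       # 2^3 = 8 rules
    nonlinear = n_inputs * mfs_per_input * MF_PARAM_COUNT[mf_type]
    linear = n_rules * (n_inputs + 1)                         # one linear function per rule
    return {
        "rules": n_rules,
        "nonlinear": nonlinear,
        "linear": linear,
        "total": nonlinear + linear,
    }
```

With three inputs and two membership functions per input, the Bell case gives 18 non-linear and 32 linear parameters, 50 in total.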

3.2. Results

The WASP system was tested with the stock of the National Bank of Greece. The system was retrained daily. A paper portfolio worth 10,000 euros was simulated. Buy and sell decisions did not take the confidence index into account, as its use is subjective and depends on the risk the investor is willing to take, even though a threshold of 52% is widely acceptable. Stocks were bought whenever the forecast was positive, and the position was closed when the forecast became negative. Transaction costs were not taken into consideration. The system was tested for the period April 2007 to November 2008, for a total of 400 trading days. It is one of the longest periods over which any model has been tested for daily trading decisions. It is worth noting that this period also includes the great recession of October 2008, where the system achieved interesting results. For the whole period of 400 trading days, the hit rate was 58.75%, held down mainly by the crisis. Breaking this period into four sub-periods of 100 observations, the hit rates achieved are 58%, 64%, 60% and 53%, respectively. It is important to note that during the last sub-period, the confidence index often had a value below 52%, which would prevent investors from investing, as will be shown in the diagrams that follow. Again not including the confidence index, the return of the portfolio during the whole period was +6.79%, while the stock lost 60.9% of its value. It is interesting to note that just before the crisis, the confidence index fell below 52%. The respective returns on that day were +71.49% for the WASP system, while the Buy&Hold strategy to that day would have produced a loss of 37.13%. For ease of presentation, the total period has been divided into three periods. The first includes 200 trading days, between 11/04/2007 and 23/01/2008, the second runs from 24/01/2008 to 25/06/2008 (100 trading days), and the third from 26/06/2008 to 14/11/2008.
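The trading rule described above (buy on a positive forecast, close the position on a negative one, no transaction costs) can be sketched as a small simulation. Several details are assumptions not stated in the text: trades execute at the price of the day the forecast is issued, the whole portfolio is invested, and fractional shares are allowed.

```python
from typing import List


def simulate_long_only(prices: List[float], forecasts: List[float],
                       initial_cash: float = 10_000.0) -> List[float]:
    """Return the daily portfolio value; forecasts[t] is the signal available on day t."""
    cash = initial_cash
    shares = 0.0
    equity = []
    for price, forecast in zip(prices, forecasts):
        if forecast > 0 and shares == 0.0:
            shares = cash / price      # open the position with all available cash
            cash = 0.0
        elif forecast < 0 and shares > 0.0:
            cash = shares * price      # close the position
            shares = 0.0
        # A zero forecast, or a signal that matches the current position, changes nothing.
        equity.append(cash + shares * price)
    return equity


def buy_and_hold(prices: List[float], initial_cash: float = 10_000.0) -> List[float]:
    """Benchmark: invest everything on the first day and never trade again."""
    shares = initial_cash / prices[0]
    return [shares * price for price in prices]
```

Comparing the last elements of the two returned lists gives the strategy return against the Buy&Hold return for the chosen window.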

3.2.1. First period: 11/04/2007–23/01/2008

For every period, the results include the portfolio return compared to the return produced by a Buy&Hold strategy (Diagram 14), the confidence index, the hit-rate, and the moving hit-rate (Diagram 15). The moving hit-rate is a diagram which shows the hit-rate since the first day of the period. Table 2 outlines some of the results. Following the WASP system, the return achieved was 17.73%, which is 27.31 percentage points more than the return of the Buy&Hold strategy. The hit-rate for this period is 61%. It can be seen in Diagram 15 that the hit-rate moves towards the 60% region. The confidence index remained above 52% for the whole period, with an average of 61%.
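The moving hit-rate is a cumulative ratio, which the following sketch makes explicit. It follows the paper's convention that days with an unchanged price count as correct forecasts; treating a forecast of exactly zero on a day with a price move as a miss is an assumption of this sketch.

```python
from typing import List, Sequence


def is_hit(forecast: float, actual: int) -> bool:
    """True when the forecast direction matches the realised move (-1, 0 or +1)."""
    if actual == 0:
        return True  # unchanged price: no effect on the portfolio, counted as correct
    return (forecast > 0 and actual > 0) or (forecast < 0 and actual < 0)


def moving_hit_rate(forecasts: Sequence[float], actuals: Sequence[int]) -> List[float]:
    """Hit-rate from the first day of the period up to and including each day."""
    rates = []
    hits = 0
    for day, (forecast, actual) in enumerate(zip(forecasts, actuals), start=1):
        hits += int(is_hit(forecast, actual))
        rates.append(hits / day)
    return rates
```

The early values of the series swing widely because few days have been observed; the series settles as the period lengthens, which is why the diagrams show it drifting towards a stable region.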

3.2.2. Second period: 24/01/2008–25/06/2008

During this second period of 100 trading days, the results are even better. The WASP system returned a profit of +19.65% against a price drop of 19.39%, as can be seen in Table 3. Diagram 17 shows the portfolio returns using WASP and the Buy&Hold strategy. The hit-rate for this period was 60%, with the moving hit-rate remaining again in the 60% region, as can be seen in Diagram 16.

The confidence index again remained above 52%, with a low for this period of 58.7% and a maximum of 71.67%, giving the investor no reason not to trust the system’s output.

3.2.3. Third period: 26/06/2008–14/11/2008

During this period the market was much harder to explain. This is the period where the “famous” 2008 crisis hit the markets worldwide, as can be seen in Table 4. The WASP system would

Diagram 13. WASP system output.

Table 1
Characteristics of ANFIS sub-system.

ANFIS parameter type               Value
MF type                            Bell function
Number of MFs                      6
Output MF                          Linear
Number of nodes                    34
Number of linear parameters        32
Number of nonlinear parameters     18
Total number of parameters         50
Number of training data pairs      2000
Number of evaluating data pairs    60
Number of fuzzy rules              8

Diagram 14. Returns for the first period. (Line chart “Strategy Returns”: portfolio value in Euro, 8000 to 13000, against Day, 1 to 197; series: Buy & Hold and W.A.S.P.)

Diagram 15. Moving hit rate for the first period. (Line chart “Moving Hit-Rate”: Hit Rate, 0 to 1, against Days, 1 to 197.)


have helped investors to reduce losses, by always achieving better results than the Buy&Hold strategy. By the end of this period, the stock of NBG had declined by 46.35%, while the WASP system lost 24.28%.

In Diagram 19, the confidence index is also marked as gray circles on the WASP return line. Every gray circle denotes a day on which the confidence index was below 52%. For these days, as mentioned before, the system is unable to support its results, and thus the investor should not follow the system. The best action would be to sell the stocks and wait until the index gives an acceptable result. The system gave early signs to exit the market. At the point of the first exit signal, the WASP system had produced profits of 26.61% against profits of 3.95% for the Buy&Hold strategy. After this signal, the index moved again slightly above 52%, but not higher than 53%, giving another signal 9 days later, with the results being +15.2% for the WASP system and -9.21% for the Buy&Hold strategy. Diagram 18 shows the moving hit-rate, which moves to the 60% region at the beginning, but falls to approximately 53% at the end. The most important conclusion for this period is the significance of the confidence index, and how it can be used to give exit signals.
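The exit rule discussed above can be written as a small filter on the daily forecasts. This is a sketch under stated assumptions: the function name, the encoding of an exit as -1, and the choice to accept a confidence index exactly at the threshold are illustrative and not taken from the paper, whose reported returns do not apply this filter.

```python
from typing import List, Sequence


def gate_by_confidence(forecasts: Sequence[float], confidences: Sequence[float],
                       threshold: float = 0.52) -> List[float]:
    """Replace the forecast with an exit signal (-1) on low-confidence days."""
    gated = []
    for forecast, confidence in zip(forecasts, confidences):
        if confidence < threshold:
            gated.append(-1.0)  # sell and wait until the index recovers
        else:
            gated.append(forecast)
    return gated
```

Feeding the gated signals into a trading simulation instead of the raw forecasts keeps the investor out of the market on the days marked by gray circles.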

3.2.4. Further results

Diagram 20 shows the results of the WASP system for the whole period. The WASP system outperformed the Buy&Hold strategy.

Since the forecasting problem has been converted to a classification problem, as the return has been converted to -1, 0, +1 values for negative, unchanged or positive rates of change, it is

Table 2
Results for the first period.

Date          Buy&Hold (%)    WASP (%)    Difference (%)
20/06/2007    +0.47           +9.61       +9.14
30/08/2007    +3.10           +8.01       +4.90
08/11/2007    +10.12          +21.04      +10.91
23/01/2008    -9.57           +17.73      +27.31

Table 3
Results for the second period.

Date          Buy&Hold (%)    WASP (%)    Difference (%)
26/02/2008    +3.68           +16.79      +13.11
09/04/2008    -6.57           +7.13       +13.71
19/05/2008    -3.71           +21.18      +24.89
25/06/2008    -19.39          +19.65      +39.05

Diagram 16. Moving hit-rate for the second period. (Line chart “Moving Hit-Rate”: Hit Rate, 0 to 1, against Days, 1 to 97.)

Diagram 17. Results for the second period. (Line chart “Strategy Returns”: portfolio value in Euro, 8000 to 13000, against Day, 1 to 97; series: Buy & Hold and W.A.S.P.)

Table 4
Results for the third period.

Date          Buy&Hold (%)    WASP (%)    Difference (%)
03/09/2008    +8.53           +26.50      +7.97
24/10/2008    -59.45          -35.28      +18.17
14/11/2008    -46.35          -24.28      +26.51

Diagram 18. Moving hit rate for the third period. (Line chart “Moving Hit-Rate”: Hit Rate, 0 to 1, against Days, 1 to 97.)

Diagram 19. Results for the third period. (Line chart “Strategy Returns”: portfolio value in Euro, 3000 to 13000, against Day, 1 to 97; series: Buy&Hold and Wasp, with Confidence Index markers.)

Diagram 20. Returns for whole period. (Line chart “Strategy Returns”: portfolio value in Euro, 0 to 20000, against Day, 1 to 401; series: Buy & Hold and W.A.S.P.)

Table 5
Classification of results for the whole period.

Forecast    Movement positive    Movement negative
Positive    114 (59.37%) TP      87 (41.82%) FP
Negative    78 (40.625%) FN      121 (58.17%) TN
Total       192                  208


common to present classification results. Results are classified into four categories: True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN). FP cases are the ones that incur losses. FN cases prevent profits, while TP cases produce profits and TN cases prevent losses. It is important to note that the days on which the stock price remained unchanged are counted as correct forecasts, as they bring no change to the portfolio value.

Table 5 shows the classification results for the whole testing period. The results are much better if the third period is excluded, as is indicated in Table 6.
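The percentages in the classification tables can be reproduced from the four counts, as the sketch below shows. The function and key names are illustrative; the percentages are taken per column, that is, relative to the number of days on which the price actually rose or fell.

```python
from typing import Dict


def classification_summary(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Hit-rate and per-movement rates from true/false positive/negative counts."""
    total = tp + fp + fn + tn
    return {
        "hit_rate": (tp + tn) / total,
        "positive_days_caught": tp / (tp + fn),  # share of rising days forecast as positive
        "negative_days_caught": tn / (tn + fp),  # share of falling days forecast as negative
    }


# Counts for the whole period (Table 5) and for the first two periods (Table 6).
whole_period = classification_summary(tp=114, fp=87, fn=78, tn=121)
first_two_periods = classification_summary(tp=86, fp=54, fn=64, tn=96)
```

For the whole period the hit-rate is (114 + 121) / 400 = 58.75%, matching the figure reported for the 400 trading days; for the first two periods it is (86 + 96) / 300, about 60.7%.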

Another interesting result is the number of consecutive correct and wrong forecasts achieved by the system. The ideal situation would be one where the false forecasts occur singly between many correct forecasts. Table 7 presents the analysis of the whole testing period.

Column 1 shows the number of consecutive forecasts. Columns 2 and 4 give the frequency of wrong and correct forecasts, respectively, while columns 3 and 5 give the number of

Table 6
Classification of results for the first two periods.

Forecast    Movement positive    Movement negative
Positive    86 (57.33%) TP       54 (36.00%) FP
Negative    64 (42.66%) FN       96 (64.00%) TN
Total       150                  150

Table 7
Consecutive correct and wrong forecasts.

Number of       Frequency of    Number of    Frequency of    Number of
consecutive     wrong           wrong        correct         correct
forecasts       forecasts       forecasts    forecasts       forecasts
(1)             (2)             (3)          (4)             (5)
1               59              59           40              40
2               20              40           23              46
3               12              36           12              36
4               4               16           8               32
5               0               0            6               30
6               1               6            7               42
7               0               0            0               0
8               1               8            1               8


observations that were part of each category (simply multiplying columns 1 and 2, or columns 1 and 4).

For example, a single wrong forecast occurred 59 times. Two consecutive wrong forecasts occurred 20 times, three consecutive wrong forecasts occurred 12 times, etc. On the other hand, a single correct forecast occurred 40 times, two consecutive correct forecasts occurred 23 times, three consecutive correct forecasts occurred 12 times, etc.

From columns 3 and 5, it can be concluded that the system gave more consecutive correct forecasts than wrong ones.
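The streak analysis of Table 7 amounts to a run-length count over the daily sequence of correct and wrong forecasts. A minimal sketch follows; the function names are illustrative, and the input is assumed to be a list of booleans, one per trading day.

```python
from collections import Counter
from itertools import groupby
from typing import Iterable, Tuple


def streak_frequencies(hits: Iterable[bool]) -> Tuple[Counter, Counter]:
    """Count how often each streak length occurs, for correct and wrong forecasts."""
    correct: Counter = Counter()
    wrong: Counter = Counter()
    for is_correct, run in groupby(hits):
        length = sum(1 for _ in run)
        (correct if is_correct else wrong)[length] += 1
    return correct, wrong


def observations_in_streaks(frequencies: Counter) -> int:
    """Total observations covered: streak length times frequency, summed (columns 3 and 5)."""
    return sum(length * count for length, count in frequencies.items())
```

Applied to the frequencies of Table 7, the second helper gives 165 observations in wrong streaks and 234 in correct streaks.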

4. Conclusions – future research

The system presented in this paper achieved remarkable results over a very long period of 400 trading days, with the system retrained daily. As mentioned before, the WASP system is a methodology that selects 9 ANFIS systems, based on the hit-rate each sub-system achieves on out-of-sample data. Even those results are extremely interesting, with some sub-models achieving a hit-rate above 75% for a sample of 60 trading days. This is evidence of how effective neuro-fuzzy architectures can be in stock market forecasting. The system showed a tendency to achieve hit rates around the 60% mark, which is significantly better than forecasting with the help of a coin. During this period of 400 trading days, the WASP system made 63 transactions. This gives a rough average of 1 transaction every 6 days.

Variations of the WASP system could produce different results. There are many parameters that can be changed, as can the number of sub-models to be chosen. The selection process of the sub-models can also differ. Risk-averse investors might want to choose the sub-models depending on the number of false positive cases.

A daily forecast needs approximately 90 s on a Core 2 Duo laptop at 1.86 GHz with 3 GB of RAM.

Of course the system has some restrictions, deriving from fuzzy logic theory, neural networks, and the Elliott Wave Theory.

The first is that the Elliott wave oscillator is not a normalized index, so price variations change the value of the oscillator significantly, producing values outside the range used to train the neuro-fuzzy system. The second restriction also concerns the Elliott wave oscillator. The oscillator is slow moving, as it is derived from two moving averages. This makes the separation of “if . . . then” scenarios more difficult.

References

Abraham, A. (2004). Neuro fuzzy systems: State-of-the-art modelling techniques. ArXiv Computer Science e-prints: cs/0405011.

Abraham, A., Nath, B., & Mahanti, P. (2001). Hybrid intelligent systems for stock market analysis. Lecture Notes in Computer Science, 337–345.

Atsalakis, G., Skiadas, C., & Braimis, I. (2007). Probability of trend prediction of exchange rate by neuro-fuzzy techniques. Recent advances in stochastic modelling and data analysis. London: World Scientific Publishing Co. Pte. Ltd.

Atsalakis, G., & Ucenic, C. (2005). Time series prediction of the Greek manufacturing index for the non-metallic minerals sector using a neuro-fuzzy approach (ANFIS). In Conference international symposium on applied stochastic models and data analysis, May, Brest, France (p. 211).

Atsalakis, G., & Nezis, D. (2008). Moving average, neural networks and genetic algorithms for stock market prediction. Journal of Computational Optimization in Economics and Finance, 1(1), 42–57.

Atsalakis, G. S., & Valavanis, K. P. (2008). Surveying stock market forecasting techniques – Part II: Soft computing methods. Expert Systems with Applications, 36, 5932–5941.

Atsalakis, G., & Valavanis, K. A. (2009). Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Systems with Applications, 36, 10696–10707.

Atsalakis, G., & Valavanis, K. (2010a). Forecasting stock trends using a combined technical analysis and neuro-fuzzy based approach. Journal of Financial Decision Making, 6(1), 79–94.

Atsalakis, G., & Valavanis, K. (2010b). Surveying stock market forecasting techniques – Part I: Conventional methods. Journal of Computational Optimization in Economics and Finance, 2(1).

Atsalakis, G., & Zopounidis, C. (2009). Forecasting turning points in stock market prices by applying a neuro-fuzzy model. International Journal of Engineering and Management, 1(1), 19–28.

Atsalakis, G., Valavanis, K., & Ucenic, C. (2001). Elliot waves, marketing and neuro-fuzzy based techniques for marketing of stocks. Finance, accounting, management, marketing (vol. 2, pp. 398–405). Tîrgu-Mureș: The Works of Scientific Session of Petru Maior University, Publishing House of Petru Maior University. ISBN 973-8084-53-9.

Bekiros, S. D. (2007). A neuro-fuzzy model for stock market trading. Applied Economics Letters, 14(1), 53–57.

Ghandar, A., Schmidt, Z., To, M., & Zurbruegg, R. (2007). A computational intelligence portfolio construction system for equity market trading. In IEEE Congress on Evolutionary Computation, 2007, CEC 2007.

Jang, J.-S. R., & Sun, C.-T. (1997). Neuro-fuzzy and soft computing: A computational approach to learning and machine intelligence. Upper Saddle River, NJ, USA: Prentice-Hall, Inc.

Nishina, T., & Hagiwara, M. (1997). Fuzzy inference neural network. Neurocomputing, 14(3), 223–239.

Pokropinska, A., & Scherer, R. (2008). Financial prediction with neuro-fuzzy systems. Lecture Notes in Computer Science, 5097, 1120–1126.

Prechter, R., & Frost, A. (1998). Elliott wave principle: Key to market behaviour. New Classics Library.

Rast, M. (1999). Forecasting with fuzzy neural networks: A case study in stock market crash situations. In 18th International Conference of the North American Fuzzy Information Processing Society, NAFIPS.

Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning internal representations by error propagation. Parallel distributed processing: Explorations in the microstructure of cognition (vol. 1, p. 319).

Siekmann, S., Kruse, R., Gebhardt, J., Van Overbeek, F., & Cooke, R. (2001). Information fusion in the context of stock index prediction. International Journal of Intelligent Systems, 16(11), 1285–1298.

Werbos, P. J. (1974). Beyond regression: New tools for prediction and analysis in the behavioral sciences. Ph.D. dissertation, Harvard University.

Wong, F., Wang, P., Goh, T., & Quek, B. (1992). Fuzzy neural systems for stock selection. Financial Analysts Journal, 48(1), 47–52.

Wu, J., Fung, M., & Flitman, A. (2001). Forecasting stock market performance using hybrid intelligent system. Lecture Notes in Computer Science, 447–458.

Zadeh, L. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.