
THE UNIVERSITY OF HONG KONG

Final Year Project 2015/2016

Final Report

Sibie Arunmozhi

UID: 2012555916


Abstract

This final year undergraduate project explores the field of financial forecasting, with a focus on a neural network solution in Python. A multitude of approaches can be taken in this regard, and extensive work has been done in the field. Fundamental concepts and research relating to financial forecasting are explored first. This is followed by a discussion of the project's methodology with respect to environment setup and the tools used in constructing our first neural network. The methods and classes used are explained as each component's structure is described. To conclude, there is a review of how the network performed on the 15 stocks selected for analysis, and of the scope for future work.


Acknowledgement

I would like to thank my supervisor, Dr. Chi Lap Yip, for the guidance and encouragement he has provided me over the course of the last year. His advice has been paramount to my understanding of the novel fields of financial forecasting, data mining and neural networks. I would also like to thank Mr. Shanmugam Nagarajan, founder and CPO of [24]7 Inc., for introducing me to text mining and opinion mining of social media, two areas of future work also explored in this project.


Contents

Abstract
Acknowledgement
Introduction
Problem Statement
Project Objectives
Project Significance
Project Background
  The Two Schools of Thought
  Technical Analysis
  Fundamental Analysis
Statistical Approaches
  Time Series Analysis
  ARIMA and GARCH
  Limitations of Statistical Approaches
Soft Computing Approaches
  Data Mining
  Artificial Neural Networks
  Limitations of Soft Computing Approaches
Experimentation with Weka
  The Experiment
  Key Observations
Project Methodology
Choice of Python for PL
Environment Setup
  Python 2.7.10
  IntelliJ IDEA Community Edition 2016.1
  Enthought Canopy
  SciPy, NumPy and other essential machine learning libraries
  PyBrain
The Concept
SupervisedDataset for this Problem
Building a Feed Forward Network using buildNetwork
Training with the BackpropTrainer
Commencing Training and Forecasting Results
Experiments and Results
Choice of Stocks for Training
Data Sources
Results
Performance Analysis
Limitations
Future Work
Hybrid Approaches
Financial Indicators to Improve Accuracy
Text Mining
Sentiment Analysis
References


Introduction

This project aims to explore the fundamental concepts of financial forecasting, a field of significant interest in data analytics today, by applying them to the prediction of the future values of a select set of stocks. The idea of forecasting is applicable in a multitude of fields beyond finance, from weather forecasting to medical diagnosis. In finance in particular, soft computing techniques such as neural networks are revolutionizing the domain, finding success in a number of difficult problems such as economic indicator forecasting, creditworthiness assessment, bankruptcy prediction and fraud detection, to name a few. However, while acknowledging the usefulness of such technologies, one should also note the mathematical aspect of the problem. Statistical approaches using concepts such as ARIMA and GARCH have been applied to financial forecasting with limited success, but are informative nevertheless. Time series analysis offers yet another route to a potential solution; this project, however, heads in the other direction, exploring neural networks.

Artificial Neural Networks (ANNs) have found much success in a broad spectrum of areas beyond financial applications, such as character recognition, image compression and medicine. Warren McCulloch and Walter Pitts developed the first computational neural network model in 1943. This eventually paved the way for the two predominant approaches to utilizing neural networks today: the biological and the artificial-intelligence approach. The backpropagation algorithm was another massive breakthrough, in 1975, following which a plethora of different, more complicated algorithms have been developed. While some of these algorithms are applicable to a number of intrinsically similar problems, many are made to tackle one specific issue.
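The mechanics of backpropagation can be sketched in a few lines of NumPy. This is a hand-rolled toy, fitting a simple non-linear function with one hidden layer; the layer sizes, learning rate and target function are illustrative choices, not the project's PyBrain-based implementation.

```python
import numpy as np

# A minimal one-hidden-layer network trained with backpropagation on a toy
# regression task (fitting y = x^2). Illustrative sketch only.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))          # toy inputs
y = X ** 2                                # toy targets

W1 = rng.normal(0, 0.5, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.1

for epoch in range(3000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    err = out - y
    # backward pass: propagate the error gradient layer by layer
    grad_out = 2 * err / len(X)
    grad_W2 = h.T @ grad_out
    grad_h = (grad_out @ W2.T) * (1 - h ** 2)   # tanh derivative
    grad_W1 = X.T @ grad_h
    W2 -= lr * grad_W2; b2 -= lr * grad_out.sum(axis=0)
    W1 -= lr * grad_W1; b1 -= lr * grad_h.sum(axis=0)

mse = float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))
```

Each backward-pass line applies the chain rule one layer deeper, which is the essence of the algorithm regardless of network size.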


Today, it is easy for anyone to get started in understanding the basics of neural networks. While more fundamental machine learning software such as Weka is available, programming languages such as Python, R and MATLAB offer much more scope for ANN development. In Python in particular, there are a number of vast scientific computing and machine learning libraries available to experiment with. This project utilizes the PyBrain library for Python in the development of a supervised, backpropagation-trained neural network for financial forecasting.

It is interesting to note that, while a problem of much interest, perfectly accurate financial forecasting is fundamentally impossible in practice, due to the volatile nature of the market. It is, however, a problem tailor-made for neural networks, and can offer key insights useful to those in the financial services domain. The backpropagation algorithm is by no means the best possible solution to the question of financial forecasting; it was, however, a good place to start, with much scope for future improvement. The methodology and results of all experiments conducted will be discussed here, as will suggestions for the aforementioned improvements.


Problem Statement

"Given the historical data of a number of stock or index prices at different points in time, design an algorithm that would predict their values in the future. Some factors the algorithm can take into consideration include the day in month, weekday of day, time of day, various financial indicators and correlations between data from different time series. Time series data of at least 15 stocks, futures and/or indices are expected to be analyzed."

Project Objectives

The primary objectives of this project are as follows:

- To build a collection of training data comprising the historical data of the stocks to be forecasted. There are to be 15 stocks in total, not limited to one particular domain.
- To develop an Artificial Neural Network (ANN) to mine this data, searching for patterns and systematic relationships that can be used as a basis for stock price prediction. Python is the programming language of choice.
- To apply the neural network to the selected stocks after training is complete. Analysis of the forecasted results, error margin and performance, as well as the scope for future improvement, is to be discussed.


Project Significance

As discussed earlier, financial forecasting with accuracy high enough that the methodology could actually be applied in real time to stock exchange data is not very viable. It is still a problem of much significance, however, due to its complexity and the variety of approaches that can be taken towards a potential solution. For mathematicians, it is the perfect problem for expanding on the increasingly popular ARIMA and GARCH concepts, and applying time series analysis to financial forecasting is a popular graduate thesis topic today. From a Computer Science perspective, however, it is the breakthroughs in artificial intelligence, namely neural networks, that catch the eye. Neural networks are a proven technique, having offered answers to some of the most complex problems known to data scientists. It is therefore still of great educational value to understand the nature of the financial forecasting problem, and how neural networks can make use of big data to get us closer than ever in this "impossible" task. Numerous different takes on potential solutions have been made, yet there remains scope for future breakthroughs, as technology today allows us to improve further on what has been done. This project is essentially a first attempt at developing such a neural-network-based solution at the undergraduate level, so as to better understand this novel field and propose future enhancements.


Project Background

The Two Schools of Thought

Financial forecasting has long been an area of much fascination to mathematicians, computer scientists and data analysts worldwide. To develop a methodology by which one could predict the prices of commodities such as stocks on a daily basis, both consistently and accurately, would solve one of the most complex problems of today. Much research has already been done in this regard, and over the years a consensus has emerged on the two broad categories from which most existing solutions derive their inspiration: Technical Analysis and Fundamental Analysis. Despite the vast difference in outlook, studies suggest that the two schools of thought can complement each other to make highly accurate predictions. Here is a discussion of how these two approaches differ.

Technical Analysis

Technical analysis holds that the numbers offer more than enough information to complete the forecasting process. Statistics generated from market activity, such as past prices and volume, are compiled over a period of time and then searched for specific patterns and trends that could indicate future stock activity. The assumption made here is that the price of a stock is simply a matter of supply and demand. Analysts following this approach develop and study graphs of the price and volume history of chosen stocks to hypothesize stock price fluctuations according to observed trends.

Trends are the most essential aspect of this school of thought. On plotting the day-to-day fluctuations in stock valuation, graphs can give us an enormous amount of information on the direction in which prices are heading. From an analytical point of view, specific types of trends and trend reversals are searched for. Some examples of these patterns are the "saucer bottom" and the "head-and-shoulders" pattern. A saucer bottom suggests that a stock has reached the lowest price at which it may be traded, signaling a future upward spike. The head-and-shoulders pattern, on the other hand, consists of three successive peaks with the middle one the highest, and typically signals a reversal from an upward to a downward trend. These are just a few examples of the graphical patterns explored in technical analysis.

There are additional, highly useful indicators used in technical analysis, one of which is the moving average graph, which shows changes in the average share price over a period of time. The accumulation and distribution lines, pertaining to stock volume, are useful as well. Volume is considered equally important in technical analysis, as it is believed that changes in volume always occur before a change in price. Essentially, technical analysis, the school from which most statistical approaches obtain their inspiration, is all about these graphs and indicators, generated from historical data and thereby applied to the forecasting process.
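A simple moving average, the first of the indicators mentioned above, is easy to compute. The closing prices and window length below are made up for illustration:

```python
import numpy as np

# n-day simple moving average of a closing-price series: each output value
# is the mean of the most recent `window` prices.
def moving_average(prices, window):
    prices = np.asarray(prices, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(prices, kernel, mode="valid")

close = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]   # hypothetical closing prices
print(moving_average(close, 3))                 # mean of each 3-day window
```

The "valid" mode keeps only windows fully inside the series, so a series of length n yields n - window + 1 averaged points.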

Fundamental Analysis

Fundamental analysis, on the other hand, is the preferred choice of economists, as it takes into account the underlying factors that actually affect a company's business: overall economic and industry conditions, the fiscal condition of the company, key management decisions that could impact stock value, and so on. Essentially, it evaluates stock price by measuring the "intrinsic" value of the stock. A standard, integral part of the fundamental analysis process is the set of financial statements of the company whose stock is being studied; balance sheets and income statements are two such examples. Generally, any business or internal decision that could potentially impact stock value is reflected in these financial statements and thereby applied to the forecasting process. Financial ratios are considered useful as well, offering key insights into the inner workings of the company.

While financial statements offer one direction to head in with respect to fundamental analysis, it is not always easy to develop a methodology to incorporate them into forecasting. Aside from the numbers, there are also miscellaneous happenings that affect stock price. Organizing all this information in a concise manner, parsing it for key patterns, and then determining the amount by which the stock price could rise or fall due to these patterns are all very tricky things to do. As a result, such beliefs go against the principles of technical analysis. With the rise of soft computing approaches such as text mining and neural networks, however, the scope for a hybrid approach combining mathematics and economics is highly viable in the current environment.

(Figure: financial statements are integral to fundamental analysis)

(Figure: the differences between the two schools summarized)


Statistical Approaches

Most solutions developed today fall under two broad categories: statistical techniques and soft computing techniques. In time series analysis, financial forecasting is a common problem studied by statisticians. While many statistical techniques have been explored along with their strengths with respect to the predictive process, the most popularly known are Exponential Smoothing, the Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroskedasticity (GARCH).
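Of the three, exponential smoothing is the simplest to state: each smoothed value is a weighted average of the newest observation and the previous smoothed value, s_t = alpha * x_t + (1 - alpha) * s_{t-1}. A small sketch, with an illustrative alpha and made-up prices:

```python
# Simple exponential smoothing: recent observations are weighted by alpha,
# older ones decay geometrically through the previous smoothed value.
def exponential_smoothing(series, alpha):
    smoothed = [series[0]]                  # seed with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

prices = [100.0, 102.0, 101.0, 105.0, 104.0]   # hypothetical prices
print(exponential_smoothing(prices, 0.5))
# → [100.0, 101.0, 101.0, 103.0, 103.5]
```

A larger alpha tracks the series more closely; a smaller alpha suppresses more of the noise.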

Time Series Analysis

The two main objectives of time series analysis are to identify the nature of the fluctuations represented by a set of observations, and to forecast future values of the time series variable. To start, identification of key patterns in observed time series data is paramount; any findings made can then be integrated with other data in making forecasts.

The core idea behind time series analysis is that data generally consists of a systematic pattern, usually a set of distinguishable elements, plus random noise, the error that makes identification of such patterns difficult. The aim of time series analysis techniques is to eliminate this random noise so as to make the patterns easier to detect. The two most common classes of patterns searched for are trend and seasonality. In trend analysis, the process is to "smooth" the data so as to fit a function to the observed patterns (a technique known as smoothing). Analysis of seasonality, on the other hand, concerns a more variable indicator, detected via correlograms, which are simply graphical displays of the autocorrelation function of the given problem. After identification of seasonal dependencies, the usual procedure is to remove them so as to make the series "stationary", an absolute necessity before applying more complex techniques such as ARIMA to the resultant data.
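One common step toward stationarity is first-order differencing, which removes a linear trend. The deterministic toy series below is made up to show the effect:

```python
import numpy as np

# First-order differencing: replace each value with its change from the
# previous value. A steady linear trend differences to a constant.
t = np.arange(10, dtype=float)
series = 2.0 * t + 5.0        # toy series with a steady upward trend
diffed = np.diff(series)      # successive differences x_t - x_{t-1}
print(diffed)                 # constant once the trend is removed
```

Seasonal dependencies are removed analogously, by differencing at the seasonal lag rather than at lag one.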

Generally plotted in line charts, time series analysis is a key tool in statistical approaches to financial forecasting. Aside from finance, other applications of the concept include signal processing, pattern recognition, econometrics, disaster prediction and more. As long as the dataset is real-valued and continuous, applying time series analysis is simple and offers many benefits.


ARIMA and GARCH

With respect to time series analysis, the ARIMA model is a generalization of the autoregressive moving average (ARMA) model. An ARMA(p, q) model combines elements of an autoregressive model and a moving average model. While an AR model uses previous terms in the series as predictors, an MA model uses noise terms. ARMA linearly merges the two, defined as:

X_t = c + ε_t + Σ_{i=1}^{p} φ_i X_{t−i} + Σ_{j=1}^{q} θ_j ε_{t−j}

where the φ_i are the autoregressive coefficients, the θ_j the moving average coefficients, and the ε_t white-noise error terms. An ARIMA(p, d, q) model is essentially such an ARMA(p, q) model differenced "d" times, making the series stationary in nature.
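The ARMA recursion is short enough to simulate directly. The coefficients phi and theta below are illustrative choices, not values fitted to any stock:

```python
import numpy as np

# Simulate an ARMA(1,1) process: each new value combines the previous value
# (AR part) and the previous noise term (MA part) with fresh white noise.
rng = np.random.default_rng(1)
phi, theta = 0.6, 0.3           # illustrative AR and MA coefficients
n = 500
eps = rng.normal(0, 1, n)       # white-noise innovations
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t] + theta * eps[t - 1]
```

With positive phi and theta the simulated series shows the positive serial correlation the model is designed to capture.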

The GARCH model focuses on the heteroskedastic nature of a time series, such as volatility clustering, in addition to previous values and noise values. What distinguishes it from ARIMA is its use of autoregression to model the variance of the series. While Bollerslev's original GARCH model remains the standard, a variety of variants have been developed over the years, an entire field of study on its own. ARIMA and GARCH can be used both in forecasting and in data analytics; ARIMA in particular is an integral part of the Box-Jenkins approach to time series modeling. Despite the pros of such approaches, however, the limitations of statistical solutions should be noted.
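The variance recursion at the heart of GARCH(1,1) can likewise be simulated in a few lines. The parameters omega, alpha and beta are illustrative (chosen so alpha + beta < 1, the stationarity condition), not estimates from market data:

```python
import numpy as np

# GARCH(1,1) simulation: today's variance is driven by yesterday's squared
# shock and yesterday's variance, producing volatility clustering.
rng = np.random.default_rng(2)
omega, alpha, beta = 0.1, 0.1, 0.8
n = 1000
var = np.zeros(n)
ret = np.zeros(n)
var[0] = omega / (1 - alpha - beta)   # start at the long-run variance
for t in range(1, n):
    var[t] = omega + alpha * ret[t - 1] ** 2 + beta * var[t - 1]
    ret[t] = np.sqrt(var[t]) * rng.normal()
```

Large shocks inflate the next day's variance, which is exactly the clustering of calm and turbulent periods observed in real returns.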


Limitations of Statistical Approaches

While studies indicate that ARIMA and GARCH are more robust and efficient than more complex models, this usually holds only for short-term forecasting, making such models inextensible in nature. In other words, while a model may work for one stock, for a prediction on a certain date, there is no guarantee that it will work with equivalent, let alone better, accuracy on another stock or over a longer horizon. Developing a "one-size-fits-all" statistical model is essentially impossible as a result, making this an even trickier route to navigate.

Moreover, when it comes to the financial markets, there is a large amount of volatility in the price of a commodity. In reality, any event, big or small, happening in any part of the world can inadvertently affect commodity prices, and fitting this into a statistical model is highly difficult. This is because such statistical techniques derive their basis from technical analysis, whose data does not give the model all the information it needs to correctly forecast a stock price. Clearly there are aspects of fundamental analysis, such as the management decisions of a company, which could be vital to forecasting correctly but may not be entirely known to us. Another problem is the absence of information on key business decisions made by the board of a firm, such as stock splits. Such decisions can have a huge impact on how we read the data: after a stock split, prices would suddenly seem much lower than usual. From fundamental analysis it would be instantly clear why this is the case, whereas from a statistical point of view there would just seem to be a sudden rise in variance with no apparent cause. To conclude, while there is still a lot of value in time series analysis and traditional solutions like ARIMA and GARCH, it is an extensively researched field with clear limitations. This being a Computer Science project, this direction was therefore not taken.


Soft Computing Approaches

Soft computing techniques refer to the use of machine learning in solving complex problems such as financial forecasting. Data mining is a key component of any such technique, as the patterns discovered are essential for training a predictive tool before it can forecast in real time. Artificial Neural Networks (ANNs) are an example of such a computing technique. Here is a quick discussion of these concepts.

Data Mining

Data mining is a novel concept used to intelligently read patterns from vast amounts of raw data, in this case the historical data of the stocks analyzed. Key uses of data mining aside from pattern discovery are pattern association, path analysis and clustering (documenting groups of facts not previously known). The power of data mining can thus be used to forecast stock performance and predict potential future values. Combining the concept with neural networks can result in powerful applications, not limited to the financial domain but extending to other industries as well. While mining has its pros and cons, there is no doubt that it is an integral part of any solution to the given problem, as will be highlighted by the tests run on Weka during the research phase.

Some of the relationships searched for in data mining include classes, clusters,

associations and sequential patterns. While classification is used to separate data into groups

based on distinguishing characteristics, clustering does the same job except on the basis of

logical relationships as opposed to predetermined groups. Associations and sequential patterns

are more specific tasks. It is important to note that for most data mining tasks, a central


repository, usually called a data warehouse, is needed to accumulate and organize the data. Once storage is accounted for, the data can be analyzed for patterns using approaches such as neural networks, genetic algorithms, decision trees and data visualization. In this particular project, artificial neural networks were the tool of choice, used to mine the historical data of the selected stocks. By training a neural network with this mined data, one can then make financial forecasts, although the accuracy of such a model depends heavily on the methodology applied and varies considerably from case to case.

The Optimus Data Mining software in Action


Artificial Neural Networks

An Artificial Neural Network (ANN) is a data-processing entity inspired by the way biological nervous systems, such as the brain, process information. ANNs are made up of a vast number of highly interconnected processing nodes (neurones) which work together to solve complex problems. ANNs are data-driven, self-adaptive methods that make few assumptions. They are also good predictors, able to make generalized observations from what they learn from the original data; this builds upon patterns already discovered during data mining. Most importantly, ANNs have had considerable success in solving non-linear problems, including real-world ones. This is in contrast to statistical models like ARIMA, which assume the series is generated by a linear process and are as a result inapplicable to most real-world problems.

Neural networks are made up of an input layer, a hidden layer and an output layer. While the input and output layers are standard, the hidden layer is where the computation is done. The nodes that form these layers collectively make up the ANN, and the connections between these nodes reflect the real relationships between different data elements. Neural networks typically need additional parameters such as biases and transfer functions. More important, however, is the training phase of an ANN, which requires complete datasets, enhanced by mining techniques. During training, the network typically starts from a random configuration when generating output. Following each "epoch", or cycle, the network compares its results with the expected output and tries a new configuration by varying the "couplings" between the nodes, seeking to minimize the error. Once this training is completed on the entire dataset, forecasting in real time can be attempted.


However, as discussed earlier, accuracy is still an issue, although the more sophisticated ANNs

of today have gotten us closer than ever before.

Advanced Neural Network to forecast stock values in action

The typical problems for which an ANN is applicable include those which:

- capture associations and similarities between sets of patterns;

- have a large volume of data, consisting of a number of variables;

- involve relationships between variables that are not clearly understood;

- involve relationships that are difficult to explore with conventional techniques.


Limitations of Soft Computing Approaches

While data mining has its clear uses, it also has its limitations. Though not directly a problem in financial forecasting, data privacy could be an issue in some experiments. There is also the possibility that the amount of data becomes too large, so that storage and performance become persistent issues; fixing these concerns could incur great cost during implementation and development. Most importantly, there is the issue of data corruption. While even a small mistake in designing some component of the ANN can be fatal, "dirty" data can corrupt the entire training and forecasting process and can be highly difficult to detect. It may even force development to start again from scratch. These are the most noteworthy limitations of data mining.

With respect to neural networks, the limitations depend upon the approach taken during development. When building one from scratch, the difficulty level is high, as even a small mistake in one module can ruin the entire training/testing process. The more common approach, using available software tools and packages, is however equally tricky due to the black-box nature of such ANNs. Finally, there are the usual issues of performance and training. While ANNs have been known to process large amounts of data and perform highly difficult tasks, the training period itself can be lengthy and tedious. Training can sometimes take thousands of epochs when the network is run on a single machine, making time a priority issue in this field today.

In relation to this project, it is important to note that while techniques like ANNs may get us closer than ever to highly accurate financial forecasting, it is still categorically impossible to use one standard solution to predict any stock price with a high degree of accuracy and consistency. Even after development, the neural network designed for this project


will be observed to behave differently with different stock data. This is because, in reality, no financial forecasting solution can offer a "one-size-fits-all" answer; rather, it must be an amalgam of different approaches, selected on the basis of which perform best with the selected stocks.


Experimentation with Weka

Before the actual construction of a neural network, experimentation was done on Weka, an easy-to-use machine learning package, so as to better understand how data mining works. These tests were documented in greater detail in the intermediate report; however, it is important to revisit some key observations made during them, as these led me in the direction I took for this project.

The Experiment

Weka 3.6 supports a number of standard data mining tasks such as data preprocessing, clustering, classification, regression and visualization, all of which help one better understand the nature of data. All of Weka's techniques are predicated on the assumption that the data is available as one flat file or relation, where each data point is described by a fixed number of attributes, normally nominal or numeric. The algorithms tested were Gaussian Process, Linear Regression, Multilayer Perceptron and SMO Regression. While the first two are standard statistical techniques, Multilayer Perceptron is essentially a simple neural network, and SMO refers to Sequential Minimal Optimization, a more advanced algorithm. More information on these algorithms, and the actual source code used by the developers of the software, can be found in the Weka Online User Manual.


Multilayer Perceptron in action

Each algorithm was run on two datasets per stock: one containing daily prices spanning three months (roughly 85-90 instances) and the other spanning four years (roughly 1000 instances). Each instance was made up of four attributes: Open, High, Low and Close. An additional attribute, Adjusted Close, was omitted to avoid complications, as it takes into account transaction costs and other factors we do not have enough information about. After training, testing was done using 10-fold cross validation: each dataset was split into 10 partitions, of which 9 were used for training and the 10th for testing. Every partition got the chance to be tested against, and the results returned were the average over all these partitions. The stocks forecasted were those of Bank of America, Apple and Alphabet.
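The 10-fold procedure described above can be sketched as follows (a generic illustration of the splitting scheme, not Weka's implementation):

```python
def k_fold_splits(instances, k=10):
    """Yield (train, test) pairs: each fold serves as the test set once."""
    folds = [instances[i::k] for i in range(k)]  # round-robin partition
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# 90 daily instances split into 10 folds of 9: train on 81, test on 9
data = list(range(90))
splits = list(k_fold_splits(data))
assert len(splits) == 10
assert all(len(test) == 9 and len(train) == 81 for train, test in splits)
```

The reported error statistics are then averaged over the 10 test folds.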


More information on the results of these tests can be found in the intermediate report.

Key Observations

Here is a quick recap of the key observations made from these experiments:

Firstly, for both the Bank of America and Apple stocks, the MAE and other error statistics decreased significantly on the larger dataset of 1000 instances compared to the smaller set. However, when tests on BoA and Apple were run using 5000 instances, error statistics went down only minimally by comparison. This indicated that while a larger dataset is never a bad thing, more data does not necessarily guarantee more accuracy; there is a fine line here. Thus, to further raise forecasting accuracy, an improvement of a more qualitative rather than quantitative nature is needed.


Next, different algorithms performed better on different stocks, showing that a "one-size-fits-all" solution is not viable. In this case, due to the difference in variance of the data points for BoA and Apple (tech stocks tend to be more volatile than those of financial institutions), a different algorithm performed best in each case. The inference is that if stocks from other industries, such as energy or automobiles, were tested, unique observations could be made in each of those cases as well.

Finally, there are the observations made from the test conducted on the stock of Alphabet (parent company of Google). Results here were relatively unsatisfactory compared to the other stocks, owing to internal corporate decisions, specifically the rebranding as Alphabet and a stock split earlier in the year. The problem was that the algorithms used by Weka were not sophisticated enough to understand these real reasons behind stock fluctuations. It is therefore clear that business decisions made by a firm can impact historical price statistics, and enhancing existing algorithms to reflect such changes is fundamental to any solution.

These observations were essential to my understanding of the problem before implementation. Based on my results, I determined that neural networks were the way to go for this project. Machine learning libraries in Python were the best fit here, although R and MATLAB were other potentially strong choices.


Project Methodology

For this project, neural networks are the solution of choice. Specifically, this approach uses a feed forward neural network trained using backpropagation. While this solution may not be as sophisticated as the more advanced approaches being worked on today, it seemed the perfect direction in which to head as I developed my first neural network. This section details the design of this ANN using Python and the resources used.

Choice of Python as the Programming Language

Python is a high-level, general-purpose, object-oriented interpreted language known for the compact nature of its code. Dynamic in nature, with an emphasis on the readability of complex code, it is a powerful tool for developing sophisticated software. More importantly, however, a multitude of scientific computing and machine learning libraries are available for Python. This was the decisive factor in selecting Python as the programming language for this project. Some popular modules available are PyBrain and scikit-learn, a machine-learning library for Python built on top of the open-source SciPy library. While many libraries were explored, the PyBrain machine learning library and its API were used in developing our ANN. It is important to note that while Python was chosen for this project, R and MATLAB also provide much scope for potential solutions; both offer their own highly helpful libraries and toolkits for tackling this problem. Python was decided upon due to its extensibility and the sheer amount of machine learning work being done in the language today. We proceed now to a discussion of the environment setup.


Environment Setup

Before implementation, the environment for this project had to be set up. After careful consideration, the following tools were decided upon:

Python 2.7.10

The version of Python used in this project is 2.7.10. While the Python 3 series contains the latest versions with the newest features, it is not backward compatible, i.e. many APIs cannot be utilized with it. Version 2.7.10 was therefore selected as the best fit for the tools being used.

IntelliJ IDEA Community Edition 2016.1

In this project, two IDEs were used, serving different purposes. IntelliJ IDEA is an IDE developed by JetBrains, offering an extensive platform to run, debug and test code. Development and debugging of the neural network were done here, as were the actual tests.


Neural Network development on IntelliJ

Exploration of the functionalities available on PyBrain using Canopy


Enthought Canopy

Enthought Canopy is another IDE, geared specifically toward scientific and analytical work in Python. Along with the standard tools, it includes a number of fundamental scientific libraries and allows easy incorporation of third-party libraries, such as PyBrain, into its interface. Working on Enthought Canopy was thus highly informative due to the access gained to integral APIs such as SciPy and NumPy, in addition to simple access to third-party libraries. For this reason, a number of first attempts at building a neural network were made here. Only after identifying the key components of the system was development shifted to IntelliJ IDEA.

SciPy, NumPy and other essential machine learning libraries

A mention has to be made here of essential machine learning libraries such as SciPy and NumPy: despite being readily available in Enthought Canopy, they had to be manually configured in IntelliJ due to the nature of its Python plugin. As is the norm, the packages were downloaded from their respective websites and then added to the IntelliJ project's path using the standard Python installation technique.

PyBrain

The machine learning library most important to this project is PyBrain. It provides flexible, easy-to-understand, yet powerful algorithms for tackling machine learning tasks, and also offers measures to compare the performance of different algorithms against each other. Specifically, in this case, it offers an API to develop a feed forward neural network trained using backpropagation. Before getting into the specifics of how the ANN was implemented, it is important to understand the characteristics of the network in representing the problem of financial forecasting of a stock.


The Concept

It is evident from the discussion of the nature of this problem that neural networks can be used in financial forecasting. An additional advantage of using neural networks is the scope to develop a model that utilizes the best of both technical and fundamental analysis. While historical statistics serve as the foundation for predictive analysis, the network could be enhanced to utilize other useful indicators such as news articles (text mining) and key economic ratios. There is no doubting the predictive power of neural networks, owing to their success in solving complex problems in fields such as medicine and geology.

Neural networks are typically made up of three layers: the input layer, the hidden layer and the output layer. For this problem there will be 4 input nodes and 1 output node. Each training instance of the input vector represents one market day's statistics on the performance of the chosen stock. The input vector is made up of four attributes: Open Price, Market High, Market Low and Volume. The target for the neural network is the Closing Price. The ANN is designed on the basis of this structure.


Model of our neural network.


SupervisedDataSet for this Problem

Supervised learning is the machine learning task of developing a function that describes the nature of some organized input data. Each example pair of training data, consisting of an input vector and an expected output value, is analyzed by a supervised learning algorithm, which can then be extended to forecast future example pairs. It is clear from studying this problem that supervised learning is applicable here, using the four aforementioned inputs and the closing-price target.

The first component needed for the network was a data structure capable of storing these example pairs of stock data for training, in the format [input vector, output]. PyBrain provides the pybrain.datasets package for this, which contains the SupervisedDataSet class for standard supervised learning. In this case, the parameters for this object would be 4 and 1, the dimensions of the input and output vectors, as defined in the supervised.py module. Example pairs of data can then be added to the dataset using the SupervisedDataSet.addSample() method. A multitude of other methods for manipulating this dataset are available here too.

SupervisedDataSet Syntax
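PyBrain implements this storage internally; as a minimal stdlib illustration of the [input vector, output] pairing that a 4-by-1 supervised dataset provides (the class below is a hypothetical stand-in, not PyBrain's code):

```python
class PairDataset(object):
    """Minimal stand-in for a supervised dataset of (input, target) pairs."""
    def __init__(self, indim, outdim):
        self.indim, self.outdim = indim, outdim
        self.samples = []

    def add_sample(self, inp, target):
        # enforce the declared dimensions, as an addSample-style method would
        assert len(inp) == self.indim and len(target) == self.outdim
        self.samples.append((tuple(inp), tuple(target)))

# one market day: Open, High, Low, Volume -> Close (illustrative values)
ds = PairDataset(4, 1)
ds.add_sample((25.10, 25.60, 24.90, 1.2e6), (25.35,))
assert len(ds.samples) == 1
```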


Building a Feed Forward Network using buildNetwork

After deciding to develop a neural network solution, the algorithm to be used during training had to be selected. During the experimentation with Weka, a number of different algorithms were studied and analyzed. For the purpose of developing my first neural network to forecast stock price data, I decided to use the backpropagation algorithm as a starting point. From here, there is much scope in future work for selecting a better-fitting algorithm based on this model's shortcomings. The first step is to create the feed forward network itself using PyBrain's buildNetwork() shortcut. This function can only be used after importing it from the pybrain.tools.shortcuts module.

More information can be found in the buildNetwork function definition.


In data analytics, a feed forward neural network is a standard model for machine learning. Specifically, it is a type of ANN in which the connections between neurones do not form a cycle. According to the Random Walk Theory, changes in commodity value are identically distributed and independent of each other, resulting in a market volatility that makes it impossible to gauge future values. While this acknowledges the improbability of accurate financial forecasts regardless of the approach, it also supports the use of such a network, as the inputs used indeed cannot form a cycle. Of course, with the advances in research today, much more sophisticated networks exist and are easily implementable using PyBrain. But since the focus of this project is to better understand the potential of neural networks for financial forecasting rather than to develop a real-time solution, a feed forward ANN was deemed sufficient.

The parameters required by this function are the dimensions of the input, hidden and output layers of the network. Since the input and output layer sizes need to match the supervised dataset's input and output dimensions, we can use the dataset's indim and outdim attributes here. The size of the hidden layer can be defined as desired; values of 10, 15 and 20 were used in tests. It should be noted that it is very much possible to build more customized networks in PyBrain using more advanced functionality, something that could be explored in future work after gaining more experience. The code snippet below builds our feed forward neural network, specified to model our supervised dataset. (Note: trainingData refers to our SupervisedDataSet object.)
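The original snippet is an image in the source document; in PyBrain the call takes the form net = buildNetwork(trainingData.indim, 10, trainingData.outdim). As a stdlib sketch of what such a shortcut sets up, a 4-10-1 network is essentially two weight matrices plus bias vectors (build_network below is a hypothetical helper, not PyBrain's code):

```python
import random

def build_network(indim, hidden, outdim, seed=0):
    """Return randomly initialized weights for an indim-hidden-outdim net."""
    rnd = random.Random(seed)
    def layer(rows, cols):
        return [[rnd.uniform(-1, 1) for _ in range(cols)]
                for _ in range(rows)]
    return {"w1": layer(hidden, indim), "b1": [0.0] * hidden,
            "w2": layer(outdim, hidden), "b2": [0.0] * outdim}

net = build_network(4, 10, 1)   # 4 inputs, 10 hidden nodes, 1 output
assert len(net["w1"]) == 10 and len(net["w1"][0]) == 4
assert len(net["w2"]) == 1 and len(net["w2"][0]) == 10
```

Training then consists of adjusting these randomly initialized weights.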


Training with the BackpropTrainer

The backpropagation algorithm was first developed in the 1970s and gained popularity in 1986 following a famous study by Rumelhart, Hinton and Williams. It has since become one of the most important machine learning algorithms for neural networks. It is used to train ANNs in conjunction with a selected optimization method, such as gradient descent, by a "backward propagation of errors", i.e. the network learning from its mistakes. An important prerequisite for applying backpropagation is a known output value for each input vector, so that the gradient of the loss function can be continuously calculated and the network can "learn". This is why backpropagation is typically considered a supervised learning method. It works hand in hand with our SupervisedDataSet object, reinforcing the suitability of this approach.
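As a toy illustration of "learning from mistakes" by gradient descent (a single linear neuron on made-up data, rather than the project's full network), each pass nudges the parameters against the error gradient, so the total error shrinks epoch by epoch:

```python
# Fit close ~ w * open + b on toy data by gradient descent (one "neuron").
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]  # (open, close)
w, b, lr = 0.0, 0.0, 0.02

def total_error():
    return sum((w * x + b - y) ** 2 for x, y in data)

before = total_error()
for epoch in range(200):              # epochs of training
    for x, y in data:
        err = (w * x + b) - y         # forward pass and error
        w -= lr * err * x             # backward: step against the gradient
        b -= lr * err
after = total_error()
assert after < before                 # the model "learned" from its errors
```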

For the purposes of this project, the implementation of the trainer logic for backpropagation is treated as a black box for the sake of simplicity. PyBrain's easy-to-use API makes it effortless to instruct the network to apply the library's pre-designed backpropagation algorithm during training, which is why we can do this. It is, however, important to note that after generating results, understanding the inner workings of PyBrain's take on backpropagation would still be key to understanding the limitations of this approach and the reasons predictions turn out the way they do. As discussed, a number of different trainers are available for use, found in the trainers module of the supervised package. The backpropagation algorithm is applied here by initializing the trainer variable as an instance of the BackpropTrainer class.


An insight into the inner workings of the BackpropTrainer in PyBrain

The BackpropTrainer has a number of parameters that can be customized at creation. Only two are compulsory: the instance of the neural network made using buildNetwork() and the dataset containing the training data. Additional parameters such as learningrate, lrdecay, momentum and weightdecay are also available; if not declared, pre-defined default values are used, but the option is there to customize them according to the desired output. These parameters are as follows:

- learningrate: the ratio by which parameters are changed in the direction of the gradient; fundamental to the backpropagation in the model; default value 0.01.


- lrdecay: the factor by which the learning rate is multiplied after each epoch of training; default value 1.0.

- momentum: the ratio by which the gradient of the last timestep is reused; parameters are adjusted with respect to momentum at every step of training; default value 0.

- weightdecay: adjusts the weight decay rate, if applicable; default value 0.

Experimenting with these parameters can change how well the neural network is "trained" on a specific dataset. While different rates were experimented with, the submitted source code demonstrates neural network training for one standard configuration of these parameters, in which only learningrate and momentum are defined.

Code snippet showing the syntax to use a BackpropTrainer
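The snippet referenced above is an image in the source document; the trainer is created with a call of the form trainer = BackpropTrainer(net, trainingData, learningrate=0.01, momentum=0.1). The roles of the four optional parameters can be sketched as one generic update step (an illustration of the standard gradient-descent rules, not PyBrain's exact code):

```python
def update_step(param, grad, prev_step, learningrate=0.01,
                momentum=0.0, weightdecay=0.0):
    """One backprop parameter update with momentum and weight decay."""
    step = -learningrate * (grad + weightdecay * param) + momentum * prev_step
    return param + step, step

# lrdecay multiplies the learning rate after each epoch of training
lr, lrdecay = 0.01, 0.9
p, prev = 1.0, 0.0
for epoch in range(3):
    p, prev = update_step(p, grad=2.0, prev_step=prev,
                          learningrate=lr, momentum=0.1)
    lr *= lrdecay          # the learning rate shrinks each epoch

assert p < 1.0             # the parameter moved against the gradient
```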


Commencing Training and Forecasting Results

The final component of the neural network is the set of instructions governing how training should be run and how long it should last. A number of methods accomplish this, but in different ways:

- train(): trains the network on the dataset for one epoch; can be extended to n epochs using a loop.

- trainEpochs(): trains on the dataset for a given number of epochs, taking the desired number as a parameter.

- trainOnDataset(): trains on a given dataset, taking the selected dataset as a parameter.

- trainUntilConvergence(): trains the network on a dataset until it converges, returning the combination of network parameters that resulted in the least error.

For the source code submitted, the for-loop + train() combination is the one demonstrated.

Additionally, it is possible to make actual forecasts from any input vector of choice. This is done using the activate() method, which can be called on any neural network instance. Here is an example of making a forecast for a single input vector after training:
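The original example is an image in the source document; in PyBrain the pattern is a loop such as for i in range(30): trainer.train(), followed by net.activate([open_, high, low, volume]). As a self-contained stdlib sketch of the forward pass that activate() performs (random, untrained weights, purely illustrative):

```python
import math
import random

rnd = random.Random(1)

# A tiny 4-10-1 feed forward pass standing in for net.activate([...]);
# a trained network's weights would be learned, not random as here.
w1 = [[rnd.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
w2 = [rnd.uniform(-1, 1) for _ in range(10)]

def activate(inputs):
    """Forward one input vector through the network; no cycles involved."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)))
              for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))   # linear output node

# forecast from one market day's [Open, High, Low, Volume] vector
forecast = activate([25.10, 25.60, 24.90, 1.2])
assert isinstance(forecast, float)
```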

This can easily be extended to make any number of predictions, simply by using this method with any viable input vector. A slightly more complex and useful enhancement would be to tabulate an array of the actual values generated in real time, after forecasting. The predictions made by the neural network can then be compared to the actual observed values, continuously measuring how close the ANN got. Such error statistics have immense value in determining whether a neural network is functioning properly, whether a particular solution works, and whether the choice of training algorithm was correct.
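The comparison just described reduces to a simple error statistic; for instance, the mean absolute error between forecasts and observed closes (the numbers below are illustrative, not project results):

```python
predicted = [25.4, 25.9, 26.1, 25.7]   # network forecasts (illustrative)
actual    = [25.3, 26.0, 25.8, 25.9]   # closes observed in real time

# mean absolute error: average distance between forecast and reality
mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
assert abs(mae - 0.175) < 1e-9
```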


Experiments and Results

Here is a discussion of the experiments conducted and the results observed after the completion of neural network development and training.

Choice of Stocks for Training

For this project, 15 stocks from different industries were selected for experiments. Testing our neural network on such a variety of stocks lets us assess a number of things, such as how well the network was designed, how robust the training dataset was, and the limitations of our methodology. The stocks chosen were:

1. General Electric (GE)
2. Bank of America (BAC)
3. JPMorgan Chase (JPM)
4. Ford Motors (F)
5. Morgan Stanley (MS)
6. Visa (V)
7. Nike (NKE)
8. 7 Eleven (SE)
9. Johnson & Johnson (JNJ)
10. Exelon (EXC)
11. Coca Cola (KO)
12. Verizon (VZ)
13. Delta Air Lines (DAL)
14. Southwestern Energy (SWN)
15. Citigroup (C)

The selected stocks were subjected to a number of different experiments. The error statistics calculated for each stock are recorded here, along with some interesting observations.


Data Sources

The main source of all stock training data was Yahoo Finance, via its Historical Prices tab. Daily stock price statistics were collected over a period of one month for each stock. All of this stock data, specifically the five attributes previously discussed, was classified and stored in a database. MySQL was used for this purpose, as it is easily available and fit the purpose. For all experiments, the training dataset for the neural network was queried from this database. In future work, if the volume of training data collected grew, a more sophisticated database could be used instead.

Yahoo Finance Historical Data tab

Collection of data into MySQL database
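The project used MySQL; as a self-contained illustration of the same store-then-query pattern, here is the equivalent using Python's built-in sqlite3 module (the table and column names are illustrative, not the project's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE prices
                (symbol TEXT, day TEXT, open REAL, high REAL,
                 low REAL, volume REAL, close REAL)""")
conn.execute("INSERT INTO prices VALUES "
             "('GE', '2016-03-01', 30.1, 30.5, 29.9, 1.5e6, 30.3)")

# query the training pairs back out, as done before each experiment run
rows = conn.execute("""SELECT open, high, low, volume, close
                       FROM prices WHERE symbol = 'GE'""").fetchall()
assert rows == [(30.1, 30.5, 29.9, 1500000.0, 30.3)]
```

Each returned row maps directly to one (input vector, target) training pair.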


Results

When training was run for each stock, the neural network successfully learned over the course of each epoch, as evidenced by the reduction in total error at every iteration. This shows that our ANN fundamentally works. The total errors recorded for all studied stocks at 10, 20 and 30 epochs are given in the table below.

Stock (Symbol)    Total Error after 10 Epochs    Total Error after 20 Epochs    Total Error after 30 Epochs
GE                0.0478326325128                0.0147788314303                0.0144716662975
BAC               0.64534885432                  0.47972568532                  0.0405268908
JPM               1.88707576785                  1.65382786958                  1.64369566268
F                 0.0223338409577                0.0174371947788                0.0174862366385
MS                0.439415738034                 0.375976379256                 0.367695140949
V                 1.1593345643                   0.656152332006                 0.604255608943
NKE               2.26286753158                  0.284309815023                 0.281531187259
SE                2.08863133494                  0.190362428371                 0.190924320676
JNJ               3.38060957438                  0.116861908606                 0.115396230369
EXC               0.306949234472                 0.309293248468                 0.311354827012
KO                0.629210341658                 0.0622298909726                0.060579857318
VZ                0.485702834569                 0.272830450999                 0.245383766598
DAL               0.918929047424                 0.305389700618                 0.308423169326
SWN               0.735902015951                 0.736245393242                 0.740400645229
C                 1.47925609726                  1.44944038106                  1.46536778051


Generation of total error occurs at every epoch as can be seen from the terminal window.

Performance Analysis

For the stocks tested, a number of interesting observations were made. Firstly, it is clear that beyond 20 epochs the neural network is unable to learn any further: once this threshold is reached, the total error merely oscillates around a fixed value in back-to-back epochs. Next, on experimenting with the learning rate parameter, a learning rate of 0.005 returned the lowest final total error. Decreasing it further in an attempt to improve accuracy caused the network to malfunction, as seen from the odd values generated in later epochs. Otherwise, the designed neural network performs well overall, although specifically on medium-sized datasets in the short term. Long-term predictions do not appear viable, as some test forecasts show.
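The stopping behavior described above, halting once the total error only oscillates around a fixed value, can be sketched as a simple plateau check. The window size and tolerance below are illustrative assumptions, not values taken from the project.

```python
def has_plateaued(errors, window=5, tol=1e-3):
    """Return True when the last `window` per-epoch total errors lie
    within a band narrower than `tol`, i.e. the network has stopped
    learning and further epochs only produce oscillation."""
    if len(errors) < window:
        return False
    recent = errors[-window:]
    return max(recent) - min(recent) < tol

# Hypothetical per-epoch total errors: rapid learning, then oscillation.
errors = [0.64, 0.48, 0.30, 0.1201, 0.1199, 0.1202, 0.1198, 0.1200]
print(has_plateaued(errors))
```

A training loop could call this check after every epoch and stop early, instead of running a fixed 30 epochs.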

To conclude, a mention must be made of the time complexity of this solution. The time complexity of the original backpropagation neural network on a single process is O(W^3), where W is the number of weights used in the network. Moreover, in our approach backpropagation is essentially a non-linear optimization procedure whose convergence can only be guaranteed as the number of training epochs tends to infinity. Finding an optimal set of weights therefore cannot be guaranteed in polynomial time; indeed, training even very small networks to optimality is known to be NP-complete.
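To make the quantity W in the O(W^3) bound concrete, the weight count of a small fully connected network can be computed directly. The five inputs match the features discussed earlier; the hidden-layer size here is an illustrative assumption, not the project's actual topology.

```python
def weight_count(layer_sizes, biases=True):
    """Number of trainable weights in a fully connected feed-forward
    network with the given layer sizes (inputs first, outputs last)."""
    w = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    if biases:
        w += sum(layer_sizes[1:])  # one bias per non-input neuron
    return w

# A 5-input, 3-hidden, 1-output network:
print(weight_count([5, 3, 1]))  # 5*3 + 3*1 + (3 + 1) = 22
```

Even modest growth in layer sizes inflates W, and with it the O(W^3) training cost, quickly.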

Limitations

While this was a satisfactory first attempt at building a neural network, the approach taken has some clear limitations, summarized below:

Firstly, it is clear from experiments that ran beyond 20-30 epochs that the periodic drop in total error stagnates, with the error rate going no lower. This suggests that there is insufficient information for the neural network to "learn" any further. Its inability to make further correlations could be addressed by raising the quality of the data using some helpful financial and economic ratios, discussed later. Increasing the volume of data, i.e. the number of instances available for training, could also offer improvement, though there is a point beyond which this too has no effect.

Additionally, it was observed that once the number of data instances used in training crossed a certain threshold, the forecasts made tended to be quite off track. This suggests that the methodology works best with smaller datasets in the short term, rather than with larger datasets. As a result, backpropagation may not be the best training algorithm for our neural network; it is even possible that PyBrain's implementation of backpropagation is not very sophisticated. This is a clear limitation that can only be fixed by improving upon the existing backpropagation methodology used by PyBrain, or by adopting a different training logic.

Finally, there is the issue that our neural network is too simplistic to determine whether something is wrong with the data before training; even a small mistake in data representation can result in a flawed learning process. Moreover, where there are sudden volatile fluctuations in stock data due to business decisions of the issuing firm, the algorithm must be enhanced so as to reflect the real reasons for these changes. Only then can our neural network be classed as relatively intelligent. This is a tedious improvement that can only be perfected on a trial-and-error basis.


Oscillation of total error around a stationary value, when too many epochs have gone by.


Future Work

Despite a year spent understanding this field, that is by no means enough to develop neural networks and other predictive models; data analysts and scientists spend years studying to qualify for work in this domain. On top of this, stock price prediction is not the best first problem for learning the inner workings of neural networks and data mining, so this is by no means a final solution. While this network and its results were hugely significant personally, for their educational and informative value, the performance analysis shows that there is much to be done before such a model could be applied in real time. Below is a discussion of some future areas to explore for improving the project methodology.

Hybrid Approaches

The approach taken in this project was quite linear, in the sense that it derived its inspiration predominantly from the technical school of thought, implemented using a soft computing technique: historical data for each stock was compiled and used to train the ANN, after which it could forecast results. But every neural network is only as good as the data it was trained on, leaving us with the question of whether this is enough. With the research being done, it has become increasingly agreed that a hybrid approach, combining both technical and fundamental analysis, offers the best chance yet at financial forecasting. Aside from historical data, there are a number of financial and economic indicators that could have a direct or indirect correlation with stock price. Periodic instances of such indicators could therefore improve forecasting accuracy by enhancing the neural network's training phase.


Aside from these ratios, there is also the issue of events that affect stock price but cannot easily be valued as a number and fed into the neural network. From political events to economic catastrophes, many different events can occur and affect stock prices; internal decisions of a firm's bureaucracy are another important factor in this category. Handling all this has become possible today owing to novel technologies such as text mining, which can parse online forums, websites and archives for keywords based on a pre-developed algorithm. Artificial intelligence can thus be used to discover events as they happen in real time, add them to our existing database and utilize them during ANN training, although it would be up to the developer to design a methodology for the ANN to understand this data. In addition to text mining, sentiment analysis has also become a tool of interest in current research. Below is a discussion of these concepts and how they could improve upon what was developed.

Financial Indicators to Improve Accuracy

In trading, it is a common technique to utilize financial indicators, usually plotted on charts, to determine whether the price of a stock will rise or fall. The main aim is to increase the mathematical probability of conducting a successful trade, which can then be incorporated into an entry or exit strategy depending on the situation. These same financial indicators could clearly prove useful during neural network training, before forecasting.


The first step would be to determine which financial indicators to use, as well as the time interval over which their values are collected. After adding this data to our training set, the neural network has more information from which to discover correlations; the richer the inputs, the higher the potential for accurate figures during actual forecasting. After training, the neural network should theoretically be able to minimize its scope for error. Of course, this is highly dependent on how correct the choice of financial indicators is, and on whether they really impact that particular stock's price. This varies on a case-by-case basis and would therefore require an ad-hoc solution at best: trying different indicators by trial and error, then utilizing the best combination possible in a final solution.

According to the research conducted, a number of financial ratios offer much scope for exploration in future work. A brief summary of some of the indicators chosen:

- Moving Average (MA): as discussed, this helps determine whether a trend has started or completed. Also known as the moving mean, it is a key component of ARIMA.

- Moving Average Convergence / Divergence (MACD): an indicator of the momentum of a specific trend, derived from the relationship between two moving averages of the price.

- Bollinger Bands: bands plotted two standard deviations away from a moving average. Since standard deviation is in essence a measure of volatility, these prove highly useful in technical analysis for determining whether a stock has been overbought or oversold.

- Arms Index (TRIN): a technical indicator that studies the rise and fall in trading volumes and their correlation with advancing and declining issues, acting as a gauge of "market sentiment". To be discussed more later with respect to sentiment analysis, it is mainly used in forecasting for futures and is usually recorded over the course of one trading session.

- Relative Strength Index (RSI): another technical indicator, charting the present and historical strength of a stock based on its closing prices over some time interval, by measuring the speed and extent of directional price movements. RSI also helps determine whether market conditions are overbought or oversold.

- VIX: the Volatility Index, better known as VIX, shows the market's expectation of volatility over a one-month period. Quoted as a ticker on the Chicago Board Options Exchange, it is an indicator of market risk, which is why investors call it the "investor fear gauge". As with TRIN, there is scope for using VIX in sentiment analysis.

These are just some financial indicators to consider for future work. While more basic indicators like day of week and time could be used, their potential to raise accuracy is low and these avenues have already been explored. More complex technical and sentiment indicators, however, offer much scope for future improvements by enhancing the training phase.
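As an illustration of how such indicators could be derived from the daily closing prices already stored in the database, here is a minimal sketch of a simple moving average and Bollinger Bands. The prices are invented and the window sizes are the conventional defaults, not parameters from the project.

```python
def sma(prices, n):
    """Simple moving average of the last n prices."""
    return sum(prices[-n:]) / n

def bollinger(prices, n=20, k=2):
    """Return (lower, middle, upper) Bollinger Bands: the n-period
    moving average plus/minus k population standard deviations."""
    window = prices[-n:]
    mid = sum(window) / n
    var = sum((p - mid) ** 2 for p in window) / n
    sd = var ** 0.5
    return mid - k * sd, mid, mid + k * sd

# Hypothetical closing prices for one month of trading days.
closes = [29.1, 29.3, 29.0, 29.4, 29.6, 29.5, 29.8, 30.0, 29.9, 30.2,
          30.1, 30.4, 30.3, 30.6, 30.5, 30.8, 30.7, 31.0, 30.9, 31.2]
lower, mid, upper = bollinger(closes)
print(round(mid, 3))       # 20-day moving average
print(lower < mid < upper)
```

Values like these could be appended to each training instance alongside the five existing inputs, giving the network the extra correlations discussed above.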


Text Mining

Text mining refers to the act of deriving "high quality information from text." Usually, input text from a multitude of sources is parsed intelligently, with key pieces of information retrieved and stored in a database and worthless parts discarded. The useful text can then be studied for patterns usable in predictive analysis, in our case financial forecasting. This is the focus of Natural Language Processing, an area of study in computer science.

Aside from historical data, online news about a stock, or the firm offering it, would be highly useful in forecasting, as it offers information from the fundamental school of thought. As discussed, Yahoo Finance and Google Trends are two such sources. The Yahoo Finance website, for instance, has a "Key Developments" tab from which all the latest important news and events relating to a specific stock can be extracted. As evidenced by our Weka tests, this information could improve financial forecasting accuracy. A prerequisite, of course, would be a methodology to incorporate this news into the forecasting process, but there is no doubt that such information on what is happening at the firm, key business decisions, and market events that could impact stock value would raise accuracy if used correctly.

A good approach to the methodology is suggested by Kim-Georg Aase in his master's thesis. Here, news articles are classified as positive, negative or neutral. A positive article suggests that a particular stock may be moving towards an upward trend; a negative article may suggest a downward trend, while a neutral article suggests a flat trend. After data acquisition, each article is put through document preprocessing, document elimination and document representation. After classification and a final accumulation of results, a large database of news articles is thus labeled, organized and ready for use as input. This can then serve as additional training data, offering the ANN more scope to make connections. A number of tools are available for developing the text mining component of any final solution, such as:

- General Architecture for Text Engineering (GATE): an open source Java suite of tools for natural language processing tasks such as information extraction.

- Natural Language Toolkit (NLTK): an open source suite of libraries for symbolic and statistical natural language processing in Python.

- OpenNLP: Apache's OpenNLP is another open source machine learning toolkit for natural language processing in Java.

- Commercial software such as SAS Text Miner, developed by the SAS Institute.


Natural Language Processing with GATE.

Since Python was used to develop our neural network, NLTK would be the ideal choice. It provides a multitude of easy-to-use interfaces to over 50 lexical resources and an entire suite of libraries for standard text mining tasks such as classification, tokenization, stemming, tagging, parsing and semantic reasoning. It also has extensions for using the Stanford text analysis tools from Python code. While NLP is a topic so vast that it requires a study of its own, incorporating text mining of online resources according to Aase's methodology, using NLTK, would be a powerful enhancement to prioritize in future work. Further research could improve on this methodology as well, though it would be a good place to start.
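As a toy illustration of the labeling step in Aase's pipeline, the sketch below classifies an article as positive, negative or neutral by counting keyword hits. The keyword lists and sample headlines are invented for demonstration; a real solution would use NLTK's tokenizers and a trained classifier rather than hand-picked lexicons.

```python
# Hypothetical keyword lexicons; a real system would learn these.
POSITIVE = {"beat", "growth", "record", "upgrade", "profit"}
NEGATIVE = {"miss", "loss", "downgrade", "lawsuit", "recall"}

def classify_article(text):
    """Label a news article positive, negative or neutral by counting
    keyword hits, mimicking the labeling stage of Aase's approach."""
    words = text.lower().split()
    score = (sum(w in POSITIVE for w in words)
             - sum(w in NEGATIVE for w in words))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_article("GE reports record quarterly profit growth"))
print(classify_article("Regulators announce recall after lawsuit"))
```

Articles labeled this way could be accumulated in the database and supplied to the ANN as the additional fundamental input discussed above.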


Sentiment Analysis

Sentiment analysis, also known as opinion mining, is the process of determining whether the mood associated with a piece of writing is positive, negative or neutral. With the boom of online blogging and social media today, sentiment could serve as an additional indicator in financial forecasting. For example, sentiment analysis of users' tweets about a stock may reveal a positive or negative public feeling towards purchasing it; the price might then be expected to drop if no one wants to buy, or to rise if demand is high. Websites that could be used in this regard specifically for financial forecasting include Twitter, Google Trends and StockTwits.

An important premise of behavioral economics is that people's emotions and moods impact the decisions they make, indicating a clear correlation between public sentiment and the market. For example, in April 2015, Twitter's disastrous quarterly earnings were leaked online in a tweet, following which its stock dropped by an astonishing 20%. A fundamental disadvantage of sentiment analysis is that sources like social media are too vast, with huge quantities of data to analyze. However, recent tools like CityFalcon offer a centralized platform to retrieve, assimilate and organize public sentiment on financial events as they occur in real time. While fundamentally similar in theory to text mining, the methodology used here would have to differ slightly. One place to start is Bollen et al.'s strategy of using Twitter data to assess the public mood before using it in forecasting. There, OpinionFinder and the GPOMS algorithm were first used to classify public sentiment in tweets into six mood dimensions ranging from calm to alert. Then, after a complex causality analysis, the researchers used a Self-Organizing Fuzzy Neural Network to predict future values of the Dow Jones Industrial Average from historical data, achieving 87% accuracy in predicting upward or downward trends in the index. While extensive work is being done in this area today, this would be a good foundation to build upon, as there is always scope for improvement.

In order to put any proposed methodology into action, a number of tools exist, such as:

- Lexalytics' Semantria: Semantria, owned by the sentiment analysis company Lexalytics, offers sentiment and intent analysis via its API and Excel plugin.

- Sysomos Media Analysis Platform (MAP): one of the most popular social media analytics tools available today, MAP can be used to identify key influencers when estimating public sentiment towards a product or brand.

Lexalytic's Intention Analysis in action.


The Sysomos MAP interface.

Many more tools exist, although this is still a novel field; research is ongoing, offering yet another avenue to explore in future work. Incorporating the functionality of any of these tools with our neural network and any text mining extension is highly viable.


References

Enthought Canopy. (n.d.). Retrieved April 2016, from https://www.enthought.com/products/canopy/

Marsland, S. (2009). Machine learning: An algorithmic perspective. Boca Raton: CRC Press.

Nielsen, M. (n.d.). Neural networks and deep learning. Retrieved April 2016, from http://neuralnetworksanddeeplearning.com/

PyBrain documentation. (n.d.). Retrieved April 2016, from http://pybrain.org/docs/index.html

Witten, I. H., Frank, E., & Hall, M. A. (2011). Data mining: Practical machine learning tools and techniques. Burlington, MA: Morgan Kaufmann.

Stock market prediction - Predicting the future - Survey of research papers and patents. (n.d.). thinkPat - Appreciating Innovation. Retrieved April 2016, from http://www.thinkpatcri.com/2013/10/stock-market-prediction-predicting.html