Decision Support Systems Artificial Neural Networks for Data Mining.

Decision Support Decision Support SystemsSystems

Artificial Neural Networks for Artificial Neural Networks for Data MiningData Mining

Modified from Decision Support Systems and Business Intelligence Systems 9E.

1-2

Learning ObjectivesLearning Objectives Understand the concept and definitions of

artificial neural networks (ANN) Learn the different types of neural network

architectures Learn the advantages and limitations of ANN Understand how backpropagation learning

works in feedforward neural networks Understand the step-by-step process of how

to use neural networks Appreciate the wide variety of applications of

neural networks; solving problem types


1-3

Opening Vignette:Opening Vignette:

“Predicting Gambling Referenda with Neural Networks”

Decision situationProposed solutionResultsAnswer and discuss the case questions


1-4

Neural Network ConceptsNeural Network Concepts Neural networks (NN): a brain

metaphor for information processing Neural computing Artificial neural network (ANN) Many uses for ANN for

pattern recognition, forecasting, prediction, and classification

Many application areas finance, marketing, manufacturing, and

so on


1-5

Biological Neural Biological Neural NetworksNetworks

Soma

Axon

Axon

SynapseSynapse

Dendrites

Dendrites Soma

Two interconnected brain cells (neurons)


1-6

Processing Information in Processing Information in ANNANN

w1

w2

wn

x1

x2

xn

.

.

.

Y

Y1

Yn

Y2

Inputs Weights Outputs

.

.

.

Neuron (or PE)

n

iiiWXS

1

)( Sf

SummationTransferFunction

A single neuron (processing element – PE) with inputs and outputs


1-7

Elements of ANNElements of ANN Processing element (PE) Network architecture

Hidden layers Parallel processing

Network information processing Inputs Outputs Connection weights Summation function


1-8

Elements of ANNElements of ANN

(PE)

(PE)

(PE)

(PE)

(PE)

(PE)

(PE)

TransferFunction

( f )

WeightedSum(S)

x1

x2

x3

Y1

InputLayer

Hidden Layer

Output Layer

Neural Network with one Hidden Layer


1-9

Elements of ANNElements of ANN

x1

x2 2211 WXWXY

(PE)

(PE)

Y

(PE)

(PE)

w1

w1

w11

w21

w12

w22

w23

x1

x2

Y1

Y2

Y3

2121111 WXWXY

2221212 WXWXY

2323 WXY

(a) Single neuron (b) Multiple neurons

PE: Processing Element (or neuron)

Summation Function for a Single Neuron (a) and Several Neurons (b)


1-10

Elements of ANNElements of ANN Transformation (Transfer) Function

Linear function Sigmoid (logical activation) function [0 1] Tangent Hyperbolic function [-1 1]

X1 = 3

Processing element (PE)

X2 = 1

X3 = 2

W1 = 0.2

W2 = 0.4

W3 = 0.1

Y = 1.2

Summation function:

Transfer function:

Y = 3(0.2) + 1(0.4) + 2(0.1) = 1.2

YT = 1/(1 + e-1.2) = 0.77

YT = 0.77

Threshold value?


1-11

Neural Network Neural Network ArchitecturesArchitectures

Several ANN architectures exist Feedforward Recurrent Associative memory Probabilistic Self-organizing feature maps Hopfield networks


1-12

Neural Network Neural Network ArchitecturesArchitecturesRecurrent Neural Recurrent Neural NetworksNetworks


1-13

Neural Network Neural Network ArchitecturesArchitectures

Architecture of a neural network is driven by the task it is intended to address Classification, regression, clustering,

general optimization, association Most popular architecture: Feedforward, multi-layered perceptron with

backpropagation learning algorithm

Both for classification and regression type problems


1-14

Learning in ANNLearning in ANN

A process by which a neural network learns the

underlying relationship between input and outputs,

or just among the inputs Supervised learning

For prediction type problems (e.g., backpropagation)

Unsupervised learning For clustering type problems (e.g., adaptive

resonance theory)


1-15

A Taxonomy of ANN A Taxonomy of ANN Learning AlgorithmsLearning Algorithms

Learning Algorithms

Discrete/binary input Continuous Input

Surepvised Unsupervised

· Delta rule· Gradient Descent· Competitive learning· Neocognitron· Perceptor

· Simple Hopefield· Outerproduct AM· Hamming Net

· ART-1· Carpenter /

Grossberg

· ART-3· SOFM (or SOM)· Other clustering

algorithms

Architectures

Supervised Unsupervised

Recurrent Feedforward Extimator Extractor

· Hopefield · SOFM (or SOM)· Nonlinear vs. linear· Backpropagation· ML perceptron· Boltzmann

· ART-1· ART-2

UnsupervisedSurepvised


1-16

A Supervised Learning A Supervised Learning ProcessProcess

Compute output

Is desiredoutput

achieved?

Stoplearning

Adjustweights

Yes

No

ANNModel

Three-step process:

1.Compute temporary outputs

2.Compare outputs with desired targets

3.Adjust the weights and repeat the process


1-17

How a Network LearnsHow a Network Learns Example: single neuron that learns

the inclusive OR operation

Learning parameters:

Learning rate

Momentum


1-18

Backpropagation LearningBackpropagation Learning

Backpropagation of Error for a Single Neuron

w1

w2

wn

x1

x2

xn

.

.

.

Yi

Neuron (or PE)

n

iiiWXS

1

)( Sf

SummationTransferFunction

)(SfY

a(Zi – Yi)error


1-19

Backpropagation LearningBackpropagation Learning The learning algorithm procedure:

1. Initialize weights with random values and set other network parameters

2. Read in the inputs and the desired outputs

3. Compute the actual output4. Compute the error (difference

between the actual and desired output)

5. Change the weights by working backward through the hidden layers

6. Repeat steps 2-5 until weights stabilize


1-20

Development Process of Development Process of an ANNan ANN


1-21

An MLP ANN Structure for An MLP ANN Structure for the Box-Office Prediction the Box-Office Prediction ProblemProblem

1

2

3

4

5

6

7

...

1

2

3

4

5

6

7

8

9

...

MPAA Rating (5)(G, PG, PG13, R, NR)

Competition (3)(High, Medium, Low)

Star Value (3)(High, Medium, Low)

Genre (10)(Sci-Fi, Action, ... )

Technical Effects (3)(High, Medium, Low)

Sequel (2)(Yes, No)

Number of Screens (Positive Integer)

Class 1 - FLOP(BO < 1 M)

Class 2(1M < BO < 10M)

Class 3(10M < BO < 20M)

Class 4(20M < BO < 40M)

Class 5(40M < BO < 65M)

Class 6(65M < BO < 100M)

Class 7(100M < BO < 150M)

Class 8(150M < BO < 200M)

Class 9 - BLOCKBUSTER(BO > 200M)

INPUTLAYER

(27 PEs)

HIDDENLAYER I(18 PEs)

HIDDENLAYER II(16 PEs)

OUTPUTLAYER(9 PEs)


1-22

Testing a Trained ANN Testing a Trained ANN ModelModel

Data is split into three parts Training (~60%) Validation (~20%) Testing (~20%)

k-fold cross validation Less bias Time consuming


1-23

Sensitivity Analysis on Sensitivity Analysis on ANN ModelsANN Models A common criticism for ANN:

The lack of expandability The black-box syndrome! Answer: sensitivity analysis

Conducted on a trained ANN The inputs are perturbed while

the relative change on the output is measured/recorded

Results illustrates the relative importance of input variables


1-24

Sensitivity Analysis on Sensitivity Analysis on ANN ModelsANN Models

D1

Systematically Perturbed

Inputs

Observed Change in Outputs

Trained ANN“the black-box”

For a good exampleSensitivity analysis reveals the most important injury severity factors in traffic accidents


1-25

A Sample Neural Network A Sample Neural Network ProjectProjectBankruptcy PredictionBankruptcy Prediction

A comparative analysis of ANN versus logistic regression (a statistical method)

Inputs X1: Working capital/total assets X2: Retained earnings/total assets X3: Earnings before interest and

taxes/total assets X4: Market value of equity/total debt X5: Sales/total assets


1-26


Data was obtained from Moody's Industrial Manuals Time period: 1975 to 1982 129 firms (65 of which went bankrupt

during the period and 64 nonbankrupt) Different training and testing

propositions are used/compared 90/10 versus 80/20 versus 50/50 Resampling is used to create 60 data

sets


1-27


x1

x2

x3

x4

x5

BR = 1

NBR = 1

Network Specifics Feedforward

MLP Backpropagation Varying learning

and momentum values

5 input neurons (1 for each financial ratio),

10 hidden neurons,

2 output neurons (1 indicating a bankrupt firm and the other indicating a nonbankrupt firm)


1-28

A Sample Neural Network A Sample Neural Network ProjectProjectBankruptcy Prediction - Bankruptcy Prediction - ResultsResults


1-29

Other Popular ANN Other Popular ANN ParadigmsParadigmsSelf Organizing Maps Self Organizing Maps (SOM)(SOM)

Input 1

Input 2

Input 3

First introduced by the Finnish Professor Teuvo Kohonen

Applies to clustering type problems


1-30

Other Popular ANN Other Popular ANN ParadigmsParadigmsSelf Organizing Maps Self Organizing Maps (SOM)(SOM) SOM Algorithm –

1. Initialize each node's weights2. Present a randomly selected input vector

to the lattice3. Determine most resembling (winning)

node4. Determine the neighboring nodes5. Adjusted the winning and neighboring

nodes (make them more like the input vector)

6. Repeat steps 2-5 for until a stopping criteria is reached


1-31

Other Popular ANN Other Popular ANN ParadigmsParadigmsSelf Organizing Maps Self Organizing Maps (SOM)(SOM)

Applications of SOM Customer segmentation Bibliographic classification Image-browsing systems Medical diagnosis Interpretation of seismic

activity Speech recognition Data compression


1-32

Other Popular ANN Other Popular ANN ParadigmsParadigmsHopfield NetworksHopfield Networks

First introduced by John Hopfield

Highly interconnected neurons

Applies to solving complex computational problems (e.g., optimization problems)

. . .

I n p u t

Output


1-33

Applications Types of ANNApplications Types of ANN

Classification Feedforward networks (MLP), radial

basis function, and probabilistic NN Regression

Feedforward networks (MLP), radial basis function

Clustering Adaptive Resonance Theory (ART)

and SOM Association

Hopfield networks


1-34

Advantages of ANNAdvantages of ANN Able to deal with highly nonlinear

relationships Not prone to restricting normality

and/or independence assumptions Can handle variety of problem types Usually provides better results

compared to its statistical counterparts

Handles both numerical and categorical variables


1-35

Disadvantages of ANNDisadvantages of ANN

They are deemed to be black-box solutions, lacking expandability

It is hard to find optimal values for large number of network parameters Optimal design is still an art: requires

expertise and extensive experimentation It is hard to handle large number of

variables Training may take a long time for large

datasets; which may require case sampling


1-36

End of the ChapterEnd of the Chapter

Questions / comments…

Decision Support Systems Artificial Neural Networks for Data Mining.

Documents

Transcript of Decision Support Systems Artificial Neural Networks for Data Mining.