Artificial Neural Networks Lect1: Introduction & neural computation

CS407 Neural Computation Lecturer: A/Prof. M. Bennamoun

Transcript of Artificial Neural Networks Lect1: Introduction & neural computation

Page 1: Artificial Neural Networks Lect1: Introduction & neural computation

CS407 Neural Computation

Lecturer: A/Prof. M. Bennamoun

Page 2: Artificial Neural Networks Lect1: Introduction & neural computation

General Information

Lecturer/Tutor/Lab Demonstrator: A/Prof. M. Bennamoun

Timetable:
– Lectures: Mondays 10am-11:45am, CS Room 1.24
– Tutorials: fortnightly, starting in week 3
– Laboratories: starting in week 3
– Consultations: Thursdays 2-3pm

Assessment:
– Project (weeks 9-13): 40%
– Final Exam (2 hours): 60%

Recommendations: Familiarize yourselves with Matlab and its neural network toolbox.

Page 3: Artificial Neural Networks Lect1: Introduction & neural computation

Text Books

M. Hassoun, “Fundamentals of Artificial Neural Networks”, MIT Press.
– A thorough and mathematically oriented text.

S. Haykin, “Neural Networks: A Comprehensive Foundation”.
– An extremely thorough, strongly mathematically grounded text on the subject. “If you have strong mathematical analysis basics and you love Neural Networks, then you have found your book.”

L. Fausett, “Fundamentals of Neural Networks”.
– “Clear and useful in presenting the topics, and more importantly, in presenting the algorithms in a clear, simple format which makes it very easy to produce a computer program implementing these algorithms just by reading the book.”

Phil Picton, “Neural Networks”, Prentice Hall.
– A simple introduction to the subject. It does not contain algorithm descriptions but includes other useful material.

Page 4: Artificial Neural Networks Lect1: Introduction & neural computation

Today’s Lecture

– Motivation + What are ANNs?
– The Brain
– Brain vs. Computers
– Historical overview
– Applications
– Course content

Page 5: Artificial Neural Networks Lect1: Introduction & neural computation

Motivations: Why ANNs?
“Sensors” or “power of the brain”

[Figure: transverse and sagittal views of the brain]

Page 6: Artificial Neural Networks Lect1: Introduction & neural computation

Computer vs. Brain

Computers are good at:
1. Fast arithmetic
2. Doing precisely what the programmer programs them to do

Computers are not good at:
1. Interacting with noisy data or data from the environment
2. Massive parallelism
3. Fault tolerance
4. Adapting to circumstances

Page 7: Artificial Neural Networks Lect1: Introduction & neural computation

Brain and Machine

• The Brain
– Pattern Recognition
– Association
– Complexity
– Noise Tolerance

• The Machine
– Calculation
– Precision
– Logic

Page 8: Artificial Neural Networks Lect1: Introduction & neural computation

The contrast in architecture

• The Von Neumann architecture uses a single processing unit:
– Tens of millions of operations per second
– Absolute arithmetic precision

• The brain uses many slow, unreliable processors acting in parallel.

Page 9: Artificial Neural Networks Lect1: Introduction & neural computation

Features of the Brain

• Ten billion (10^10) neurons
• On average, several thousand connections per neuron
• Hundreds of operations per second
• Neurons die off frequently (and are never replaced)
• Compensates for problems by massive parallelism

Page 10: Artificial Neural Networks Lect1: Introduction & neural computation

The biological inspiration

• The brain has been extensively studied by scientists.

• Its vast complexity, and the ethical constraints which limit the extent of research, prevent all but a rudimentary understanding.

• Even the behaviour of an individual neuron is extremely complex.

Page 11: Artificial Neural Networks Lect1: Introduction & neural computation

The biological inspiration (…)

• Single “percepts” are distributed among many neurons.

• Localized parts of the brain are responsible for certain well-defined functions (e.g. vision, motion).

• Which features are integral to the brain's performance?

• Which are incidentals imposed by the fact of biology?

Page 12: Artificial Neural Networks Lect1: Introduction & neural computation

Do we want computers which get confused, and make mistakes, …?

Where can neural network systems help?
• Where we can't formulate an algorithmic solution.
• Where we can get lots of examples of the behaviour we require.
• Where we need to pick out the structure from existing data.

http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html

Page 13: Artificial Neural Networks Lect1: Introduction & neural computation

What are “Artificial neural networks”?
http://webopedia.internet.com/TERM/n/neural_network.html

ANNs are a type of artificial intelligence that attempts to imitate the way a human brain works. Rather than using a digital model, in which all computations manipulate zeros and ones, a neural network works by creating connections between processing elements, the computer equivalent of neurons. The organization and weights of the connections determine the output.

Page 14: Artificial Neural Networks Lect1: Introduction & neural computation

What are Artificial Neural Networks? …

(i) Hardware inspired by biological neural networks, e.g. the human brain
(ii) A parallel, distributed computing paradigm (i.e. method)
(iii) An algorithm for learning by example
(iv) Tolerant to errors in data and in hardware
(v) An example of a complex system built from simple parts

Page 15: Artificial Neural Networks Lect1: Introduction & neural computation

Simulations, history & areas of use…

Strictly speaking, a neural network implies a non-digital computer, but neural networks can be simulated on digital computers. The approach is beginning to prove useful in certain areas that involve recognizing complex patterns, such as voice recognition and image recognition.

Page 16: Artificial Neural Networks Lect1: Introduction & neural computation

Definition of an ANN

An ANN is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in 2 respects:
– Knowledge is acquired by the network through a learning process.
– Interconnection strengths known as synaptic weights are used to store the knowledge.

Other terms/names:
• connectionist
• parallel distributed processing
• neural computation
• adaptive networks…

Page 17: Artificial Neural Networks Lect1: Introduction & neural computation

Simple explanation – how NNs work
http://www.zsolutions.com/light.htm

Neural Networks use a set of processing elements (or nodes) loosely analogous to neurons in the brain (hence the name, neural networks). These nodes are interconnected in a network that can then identify patterns in data as it is exposed to the data. In a sense, the network learns from experience just as people do (the case of supervised learning). This distinguishes neural networks from traditional computer programs, which simply follow instructions in a fixed sequential order.

Page 18: Artificial Neural Networks Lect1: Introduction & neural computation

Simple explanation – how NNs work
http://www.zsolutions.com/light.htm

[Figure: the structure of a feedforward neural network]

The bottom layer represents the input layer, in this case with 5 inputs labelled X1 through X5. In the middle is something called the hidden layer, with a variable number of nodes. It is the hidden layer that performs much of the work of the network. The output layer in this case has two nodes, Z1 and Z2, representing the output values we are trying to determine from the inputs. For example, we may be trying to predict sales (output) based on past sales, price and season (inputs).
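As a minimal sketch (not from the slides), the structure just described can be written in a few lines of Python/NumPy. The hidden-layer size of 3, the random weights, and the sigmoid nonlinearity are all assumptions made for illustration:

```python
import numpy as np

def sigmoid(s):
    # One common choice of nonlinearity; the slides do not fix a particular one.
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(0)

# 5 inputs (X1..X5) -> 3 hidden nodes (arbitrary choice) -> 2 outputs (Z1, Z2)
W_hidden = rng.normal(size=(3, 5))  # input-to-hidden weights
W_output = rng.normal(size=(2, 3))  # hidden-to-output weights

x = np.array([0.2, 0.7, 0.1, 0.9, 0.4])  # one input pattern

h = sigmoid(W_hidden @ x)  # hidden layer: weighted sums, then nonlinearity
z = sigmoid(W_output @ h)  # output layer: the values of Z1 and Z2
print(z)
```

With untrained (random) weights the outputs are meaningless; the point is only the flow of data from the inputs, through the hidden layer, to the outputs.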

Page 19: Artificial Neural Networks Lect1: Introduction & neural computation

Simple explanation – hidden layer
http://www.zsolutions.com/light.htm

Each node in the hidden layer is fully connected to the inputs. That means what is learned in a hidden node is based on all the inputs taken together. This hidden layer is where the network learns interdependencies in the model. The following diagram provides some detail into what goes on inside a hidden node (see more details later).

Simply speaking, a weighted sum is performed: X1 times W1, plus X2 times W2, on through X5 and W5. This weighted sum is performed for each hidden node and each output node, and is how interactions are represented in the network. Each summation is then transformed using a nonlinear function before the value is passed on to the next layer.

[Figure: More on the Hidden Layer – detail of what goes on inside a hidden node]
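In code, the weighted sum and nonlinear transformation inside one hidden node might look like the following sketch (illustrative only; the inputs and weight values are made up):

```python
import numpy as np

def hidden_node(x, w):
    # Weighted sum: X1*W1 + X2*W2 + ... + X5*W5
    s = np.dot(w, x)
    # Nonlinear (logistic) transformation before passing the value on
    return 1.0 / (1.0 + np.exp(-s))

x = np.array([0.2, 0.7, 0.1, 0.9, 0.4])    # inputs X1..X5
w = np.array([0.5, -1.2, 0.8, 0.3, -0.7])  # made-up weights W1..W5
print(hidden_node(x, w))
```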

Page 20: Artificial Neural Networks Lect1: Introduction & neural computation

Where does the NN get the weights? (case of supervised learning)
http://www.zsolutions.com/light.htm

Again, the simple explanation... The network is repeatedly shown observations from available data related to the problem to be solved, including both inputs (the X1 through X5 in the diagram above) and the desired outputs (Z1 and Z2 in the diagram). The network then tries to predict the correct output for each set of inputs by gradually reducing the error. There are many algorithms for accomplishing this, but they all involve an iterative search for the proper set of weights (the W1-W5) that will do the best job of accurately predicting the outputs.
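One simple instance of such an iterative search, offered here as an illustration rather than the algorithm the slides describe, is the classic delta (Widrow-Hoff) rule on a single linear output node. The training data and learning rate below are made up:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up training set: 8 observations of 5 inputs, with desired outputs
# generated by a hidden "true" rule that the network should recover.
X = rng.normal(size=(8, 5))
true_w = np.array([0.4, -0.3, 0.9, 0.1, -0.6])
d = X @ true_w  # desired outputs

w = np.zeros(5)  # the weights W1..W5, initially zero
eta = 0.05       # learning rate (arbitrary)

# Repeatedly show the observations, nudging the weights to reduce the error.
for epoch in range(200):
    for x, target in zip(X, d):
        y = np.dot(w, x)      # the network's prediction for this observation
        error = target - y    # how far off the prediction was
        w += eta * error * x  # delta rule: adjust weights to shrink the error

print(np.round(w, 3))  # close to true_w after training
```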

Page 21: Artificial Neural Networks Lect1: Introduction & neural computation

Historical Overview: 40’s, 50’s, 60’s

(i) McCulloch & Pitts (1943): threshold neuron. McCulloch & Pitts are generally recognised as the designers of the first neural network.
(ii) Hebb (1949): first learning rule
(iii) Rosenblatt (1958): Perceptron & learning rule
(iv) Widrow & Hoff (1962): Adaline

Page 22: Artificial Neural Networks Lect1: Introduction & neural computation

Historical Overview: 70’s

(v) Minsky & Papert (1969): Perceptron limitations described, interest wanes (the “death” of ANNs)
(vi) The 1970’s were quiet years

Page 23: Artificial Neural Networks Lect1: Introduction & neural computation

Historical Overview: 80’s, 90’s

(vii) 1980’s: explosion of interest
(viii) Backpropagation discovered (Werbos ’74, Parker ’85, Rumelhart ’86)
(ix) Hopfield (1982): associative memory, fast optimisation
(x) Fukushima (1988): neocognitron for character recognition

Page 24: Artificial Neural Networks Lect1: Introduction & neural computation

Applications of ANNs

– Signal Processing, e.g. adaptive echo cancellation
– Pattern Recognition, e.g. character recognition
– Speech Synthesis (e.g. text-to-speech) & recognition
– Forecasting and prediction
– Control & Automation (neuro-controllers), e.g. broom-balancing
– Radar interpretation
– Interpreting brain scans
– Stock market prediction
– Associative memory
– Optimization, etc.

For more reference, see the Proceedings of the IEEE, Special Issue on Artificial Neural Network Applications, Oct. 1996, Vol. 84, No. 10.

Page 25: Artificial Neural Networks Lect1: Introduction & neural computation

A simple example

[Figure: the numerals 0-9 as input patterns feeding into a pattern classifier]

We want to classify each input pattern as one of the 10 numerals (3 in the above figure).

Page 26: Artificial Neural Networks Lect1: Introduction & neural computation

ANN solution

ANNs are able to perform an accurate classification even if the input is corrupted by noise. [Figure: example of a noisy input pattern that is still classified correctly]

Page 27: Artificial Neural Networks Lect1: Introduction & neural computation

Artificial Neural Networks (Taxonomy)

These neurons connected together form a network. These networks (ANNs) differ from each other according to 3 main criteria:
(1) the properties of the neuron or cell (threshold and activation function),
(2) the architecture or topology of the network, and
(3) the learning mechanism or learning rule (weight calculation), and the way the weights are updated (update rule, e.g. synchronous, continuous).

And of course the type of implementation: software, analog hardware, digital hardware.

Page 28: Artificial Neural Networks Lect1: Introduction & neural computation

Topology/Architecture

There are 3 main types of topologies:
– Single-Layer Feedforward Networks
– Multilayer Feedforward Networks
– Recurrent Networks

Page 29: Artificial Neural Networks Lect1: Introduction & neural computation

Network Architecture

[Figure: example architectures mapping inputs to outputs – a single-layer network; a multiple-layer, fully connected network; a recurrent network without hidden units (using a unit-delay operator); and a recurrent network with hidden units]

Page 30: Artificial Neural Networks Lect1: Introduction & neural computation

Topics to be covered

– Background information
– Threshold gates
– Multilayer networks
– Classification problem
– Learning process
– Correlation matrix
– Perceptron learning rule
– Supervised learning
– Multi-layer perceptron (MLP): backpropagation algorithm
– Unsupervised learning

Page 31: Artificial Neural Networks Lect1: Introduction & neural computation

Topics to be covered (…)

– Hebbian learning rule, Oja’s rule
– Competitive learning
– Instar/Outstar Networks (ART – Adaptive Resonance Theory)
– Self-organizing feature maps (Kohonen’s nets)
– Hopfield networks, stochastic neurons
– Boltzmann machines & their applications
– Recurrent neural nets & temporal NNs

Page 32: Artificial Neural Networks Lect1: Introduction & neural computation

Assumed Background

– Basic linear algebra
– Basic differential calculus
– Basic combinatorics
– Basic probability

Page 33: Artificial Neural Networks Lect1: Introduction & neural computation

References:
1. ICS611 Foundations of Artificial Intelligence, lecture notes, Univ. of Nairobi, Kenya: Learning – http://www.uonbi.ac.ke/acad_depts/ics/course_material-
2. Berlin Chen lecture notes, Normal University, Taipei, Taiwan, ROC: http://140.122.185.120-
3. Lecture notes on Biology of Behaviour, PYB012 Psychology, by James Freeman, QUT.
4. Jarl Giske lecture notes, University of Bergen, Norway: http://www.ifm.uib.no/staff/giske/
5. Denis Riordan lecture notes, Dalhousie Univ.: http://www.cs.dal.ca/~riordan/
6. Artificial Neural Networks (ANN) by David Christiansen: http://www.pa.ash.org.au/qsite/conferences/conf2000/moreinfo.asp?paperid=95

Page 34: Artificial Neural Networks Lect1: Introduction & neural computation

References (continued):
7. Jin Hyung Kim, KAIST Computer Science Dept., CS679 Neural Network lecture notes: http://ai.kaist.ac.kr/~jkim/cs679/detail.htm