CSCI3230 Entropy, Information & Decision Tree

Liu Pengfei

Week 12, Fall 2015


Which feature to choose?


Information Gain

Suppose we have a dataset of people. Which split is more informative?

[Figure: two candidate splits of the dataset. One splits over whether the account exceeds 50K (branches: over 50K / less than or equal to 50K); the other splits over whether the applicant is employed (branches: employed / unemployed).]


Information Gain

Uncertainty/Entropy (informal): measures the level of uncertainty in a group of examples.


Uncertainty

[Figure: three groups of examples, ranging from a very uncertain group, to a less uncertain group, to a group with minimum uncertainty.]


Entropy: a common way to measure uncertainty

• Entropy = −Σi pi log2(pi), where pi is the probability of class i, i.e. the proportion of class i in the set.
• Entropy comes from information theory. The higher the entropy, the less the information content (though some say higher entropy means more information).
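As a minimal sketch, the formula can be computed directly in Python (the function name and the sample distributions below are illustrative, not from the slides):

```python
import math

def entropy(proportions):
    # -sum(p_i * log2(p_i)); terms with p_i = 0 contribute 0 by convention.
    return -sum(p * math.log2(p) for p in proportions if p > 0)

print(entropy([0.5, 0.5]))     # 1.0   -> a very uncertain group
print(entropy([0.75, 0.25]))   # ~0.811
print(entropy([1.0]))          # 0.0   -> minimum uncertainty
```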


Information Gain

We want to determine which attribute in a given set of training feature vectors is most useful for discriminating between the classes to be learned. Information gain tells us how important a given attribute of the feature vectors is. We will use it to decide the ordering of attributes in the nodes of a decision tree.
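The slides do not write the formula out, but the standard definition is Gain(S, A) = E(S) − Σv (|Sv|/|S|)·E(Sv), where Sv is the subset of S whose attribute A has value v. A sketch in Python, assuming the attribute values and class labels are given as parallel lists (the names are mine):

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    # Gain = E(labels) - weighted entropy after splitting on the attribute.
    n = len(labels)
    gain = entropy(labels)
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# The attribute with the largest information_gain goes at the root node.
```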


Example of Entropy

Whether students like the movie Gladiator:

Gender  Major    Like
Male    Math     Yes
Female  History  No
Male    CS       Yes
Female  Math     No
Female  Math     No
Male    CS       Yes
Male    History  No
Female  Math     Yes

E(Like) = −(4/8) log2(4/8) − (4/8) log2(4/8) = 1


Example of Entropy (continued)

Splitting on Major, using the same dataset:

P(Like=Yes | Major=Math) = 0.5, so E(Like | Major=Math) = 1
P(Like=Yes | Major=History) = 0, so E(Like | Major=History) = 0
P(Like=Yes | Major=CS) = 1, so E(Like | Major=CS) = 0


Example of Entropy (continued)

Splitting on Gender, again using the same dataset:

P(Like=Yes | Gender=male) = 0.75
P(Like=Yes | Gender=female) = 0.25

E(Like | Gender=male) = −(3/4) log2(3/4) − (1/4) log2(1/4) ≈ 0.811
E(Like | Gender=female) = −(1/4) log2(1/4) − (3/4) log2(3/4) ≈ 0.811
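Completing the comparison (this arithmetic follows from the two slides but is not shown on them): the whole dataset has E(Like) = 1. Splitting on Major gives weighted entropy (4/8)(1) + (2/8)(0) + (2/8)(0) = 0.5, so Gain = 1 − 0.5 = 0.5. Splitting on Gender gives weighted entropy (4/8)(0.811) + (4/8)(0.811) ≈ 0.811, so Gain ≈ 1 − 0.811 = 0.189. Major is therefore the more informative split and would be placed higher in the decision tree.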


CSCI3230 Introduction to Neural Network I

Liu Pengfei

Week 12, Fall 2015


The Angelina Effect

Angelina Jolie is a carrier of a mutation in the BRCA1 gene. Her risk of breast cancer was put at more than 80 percent, and her risk of ovarian cancer at 50 percent. Her aunt died from breast cancer and her mother from ovarian cancer. She decided to go for surgery and announced her decision to have both breasts removed.

Can we use a drug for the treatment?


Neural Network Project

Topic: Protein-Ligand Binding Affinity Prediction
Goal: Given the structural properties of a drug, you are helping a chemist to develop a classifier which can predict how strongly a drug (ligand) will bind to a target (protein).

The due date will be announced later. Do it alone, or form a group of at most 2 students (i.e. groups of 3 or more students are not allowed).

You can use any one of the following languages: C, C++, Java, SWI-Prolog, CLISP, Python, Ruby or Perl. However, you cannot use data mining or machine learning packages.

Start the project as early as possible!


How to register your group?



What to include in your zip file?

Your zip file should contain the following:
• preprocess.c – your source code (if you use C)
• preprocessor.sh – a script file to compile your source code
• trainer.c – your source code (if you use C)
• trainer.sh – a script file to compile your source code
• best.nn – your neural network

File names are case sensitive!


Grading

Please note that the neural network interfaces/formats are different from previous years.

We adopt a policy of zero tolerance on plagiarism. Plagiarism will be SERIOUSLY punished.

To make the project evaluation easier, we will evaluate your work based on the F-measure.


Model Evaluation

Cost-sensitive measures:

Precision (p) = a / (a + c) = TP / (TP + FP)
Recall (r) = a / (a + b) = TP / (TP + FN)
F-measure (F) = 2rp / (r + p) = 2a / (2a + b + c) = 2TP / (2TP + FN + FP)

Confusion matrix:

                          Predicted Class
                          Class = Yes   Class = No
Actual    Class = Yes     a (TP)        b (FN)
Class     Class = No      c (FP)        d (TN)
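As a sketch, the three measures can be computed directly from the confusion-matrix counts (the example counts below are made up for illustration):

```python
def evaluate(tp, fn, fp):
    # Precision, recall and F-measure from confusion-matrix counts.
    p = tp / (tp + fp)          # precision
    r = tp / (tp + fn)          # recall
    f = 2 * r * p / (r + p)     # F-measure: harmonic mean of p and r
    return p, r, f

print(evaluate(tp=8, fn=2, fp=2))   # (0.8, 0.8, 0.8)
```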


Introduction to Neural Network


Biological Neuron

A neuron is an electrically excitable cell that processes and transmits information through electrical and chemical signals. A chemical signal occurs via a synapse, a specialized connection with other cells.


Artificial Neuron

An artificial neuron is a logic computing unit.

In this simple case, we use a step function as the activation function: only 0 and 1 are possible outputs.

Mechanism:
Input: in = Σi wi·xi (the weighted sum of the inputs)
Output: y = g(in), where g is a step function with threshold θ: g(u) = 1 if u ≥ θ, and 0 otherwise.


Example

If w1 = 0.5, w2 = -0.75, w3 = 0.8, the inputs are 5, 3.2 and 0.1, and a step function g with threshold θ = 0.2 is used as the activation function, what is the output?

Summed input = 5(0.5) + 3.2(-0.75) + 0.1(0.8) = 0.18
Output = g(Summed input)
Since 0.18 < 0.2, Output = 0.
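A minimal sketch of this neuron in Python, reproducing the slide's numbers (the function names are mine):

```python
def step(u, theta):
    # Step activation: fire (1) once the summed input reaches the threshold.
    return 1 if u >= theta else 0

def neuron(weights, inputs, theta):
    summed = sum(w * x for w, x in zip(weights, inputs))
    return step(summed, theta)

# Summed input = 0.18 < 0.2, so the neuron outputs 0.
print(neuron([0.5, -0.75, 0.8], [5, 3.2, 0.1], theta=0.2))   # 0
```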


Artificial Neuron

An artificial neuron is a logic computing unit.

In this case, we use the sigmoid function as the activation function: real values between 0 and 1 are possible outputs.

Mechanism:
Input: in = Σi wi·xi
Output: y = g(in), where g is the sigmoid function g(z) = 1 / (1 + e^(−z)).


Example

If w1 = 0.5, w2 = -0.75, w3 = 0.8, the inputs are again 5, 3.2 and 0.1, and the sigmoid function g is used as the activation function, what is the output?

Summed input = 5(0.5) + 3.2(-0.75) + 0.1(0.8) = 0.18
Output = g(Summed input) = 1 / (1 + e^(−0.18)) = 0.54488
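The same computation with the sigmoid activation, as a sketch:

```python
import math

def sigmoid(z):
    # Logistic activation: maps any real input into (0, 1).
    return 1 / (1 + math.exp(-z))

summed = 5 * 0.5 + 3.2 * -0.75 + 0.1 * 0.8   # 0.18, as on the slide
print(sigmoid(summed))                        # ~0.54488
```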


Logic Gate Simulation

A unit with inputs in {0, 1} fires when Sum(input) > t:

Gate  I1  I2  Total Input  t    Input > t?
AND   0   0   0            1.5  0
AND   0   1   1            1.5  0
AND   1   0   1            1.5  0
AND   1   1   2            1.5  1


Logic Gate Simulation

Gate  I1  I2  Total Input  t    Input > t?
OR    0   0   0            0.5  0
OR    0   1   1            0.5  1
OR    1   0   1            0.5  1
OR    1   1   2            0.5  1


Logic Gate Simulation

For NOT, the single input has weight -1:

Gate  I1  I2   Total Input  t     Input > t?
NOT   0   N/A  0            -0.5  1
NOT   1   N/A  -1           -0.5  0
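All three gates follow the same Sum(input) > t rule; a sketch in Python with the thresholds from the tables (the helper names are mine):

```python
def fires(weights, inputs, t):
    # The unit fires (1) when the weighted input sum exceeds t.
    return 1 if sum(w * x for w, x in zip(weights, inputs)) > t else 0

def AND(i1, i2): return fires([1, 1], [i1, i2], t=1.5)
def OR(i1, i2):  return fires([1, 1], [i1, i2], t=0.5)
def NOT(i1):     return fires([-1], [i1], t=-0.5)

for i1 in (0, 1):
    for i2 in (0, 1):
        print(i1, i2, AND(i1, i2), OR(i1, i2))   # matches the truth tables
print(NOT(0), NOT(1))                            # 1 0
```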


Logic Gate Simulation

For the previous cases, the task can be viewed as a classification problem: separate class 0 and class 1. The neuron simply finds a line to separate the two classes.

AND: I1 + I2 − 1.5 = 0
OR: I1 + I2 − 0.5 = 0
XOR: ?

XOR truth table:

I1  I2  Output
0   0   0
0   1   1
1   0   1
1   1   0


Linear Separability

Two classes ('+' and '−') are linearly separable in two dimensions if we can find weights w1, w2 and a threshold k such that every point x in class '+' satisfies w1·x1 + w2·x2 > k, and every point x in class '−' satisfies w1·x1 + w2·x2 < k.

Two classes that cannot be split this way are linearly inseparable in two dimensions: an example that would need two straight lines to separate its classes is not linearly separable.

http://en.wikipedia.org/wiki/Linear_separability


Artificial Neural Networks

Important concepts:
• What is a perceptron?
• What is a single-layer perceptron?
• What is a multi-layer perceptron?
• What is the feed-forward property?
• What is the general learning principle?


Technical Terms

Perceptron = Neuron
Single-layer perceptron = Single-layer neural network
Multi-layer perceptron = Multi-layer neural network

The presence of one or more hidden layers is what distinguishes a multi-layer perceptron from a single-layer perceptron.

The simplest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes; the inputs are fed directly to the outputs via a series of weights.


Multi-layer Perceptron

Multi-layer: input layer, hidden layer(s), output layer.
Feed-forward: links go in one direction only.

An MLP consists of multiple layers of nodes in a directed graph. Except for the input nodes, each node is a neuron. The multilayer perceptron consists of three or more layers (an input and an output layer with one or more hidden layers).

http://en.wikipedia.org/wiki/Multilayer_perceptron


Feed Forward Property

Given inputs to the input layer (L0):
1. Outputs of neurons in L1 can be calculated.
2. Outputs of neurons in L2 can be calculated, and so on.
3. Finally, outputs of neurons in the output layer can be calculated.
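A sketch of the feed-forward pass for a small network, assuming sigmoid neurons and arbitrary illustrative weights (none of these numbers come from the slides):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def layer(weights, inputs):
    # One layer: each row of weights feeds one neuron.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

def feed_forward(layers, inputs):
    # Propagate the inputs layer by layer: L0 -> L1 -> L2 -> ... -> output.
    for weights in layers:
        inputs = layer(weights, inputs)
    return inputs

network = [
    [[0.5, -0.3], [0.8, 0.2]],   # hidden layer L1: two neurons
    [[1.0, -1.0]],               # output layer: one neuron
]
print(feed_forward(network, [1.0, 0.0]))
```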


General Learning Principle

1. For supervised learning, we provide the model a set of inputs and targets.
2. The model returns the outputs.
3. Reduce the difference between the outputs and targets by updating the weights.
4. Repeat steps 1-3 until some stopping criterion is met.

[Figure: inputs feed a neuron through weights w1, w2, w3; the output is compared against the target.]
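One concrete instance of this loop, for a single sigmoid neuron trained by gradient descent (the slides state the principle only; this particular update rule is my own illustrative choice):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train_step(weights, inputs, target, lr=0.1):
    out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))   # step 2
    delta = (target - out) * out * (1 - out)   # step 3: error times g'(in)
    return [w + lr * delta * x for w, x in zip(weights, inputs)]

weights = [0.5, -0.75, 0.8]
for _ in range(100):                           # step 4: repeat
    weights = train_step(weights, [5, 3.2, 0.1], target=1)
print(weights)
```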


Summary

1. We have learnt the similarity between biological neurons and artificial neurons.
2. We have learnt the underlying mechanism of artificial neurons.
3. We have learnt how artificial neurons compute logic (AND, OR, NOT).
4. We have learnt the meaning of perceptron, single-layer perceptron and multi-layer perceptron.
5. We have learnt how information is propagated between neurons in a multi-layer perceptron (the feed-forward property).
6. We have learnt the general learning principle of artificial neural networks.


Announcements

Written assignment 3 will be released this week.

The neural network project specification will be released this week.

Since there will be no tutorial next Thursday due to the congregation ceremony, please attend one of the other two tutorial sessions on Wednesday.