The Origin of Entropy
Transcript of The Origin of Entropy
The Origin of Entropy
Rick Chang
Agenda
• Introduction
• References
• What is information?
• A straightforward way to derive the form of entropy
• A mathematical way to derive the form of entropy
• Conclusion
Introduction
• We use entropy matrices to measure the dependency between every pair of genes, but why?
• What is entropy?
Introduction – cont.
• I will: try to explain what information and entropy are
• I will not: tell you how entropy is related to GA; I don't know (maybe a future work)
References
• C. E. Shannon, "A Mathematical Theory of Communication," 1948 (Part I, Appendix 2)
• D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, 2003 (Chapters 1 and 4)
• R. G. Gallager, Information Theory and Reliable Communication, 1968 (Chapter 2)
Claude E. Shannon (1916 - 2001)
What is information?
• Ensemble: the outcome x is the value of a random variable, which takes on one of a set of possible values $A_X = \{a_1, \ldots, a_n\}$, having probabilities $P_X = \{p_1, \ldots, p_n\}$, with $p_i = P(x = a_i)$, $p_i \ge 0$, and

$$\sum_{a_i \in A_X} P(x = a_i) = 1$$
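As a concrete illustration, here is a minimal sketch of an ensemble in Python (the variable name and the probabilities are my own, not from the slides):

```python
# A minimal sketch of an ensemble: outcomes a_i with probabilities p_i.
ensemble = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# The defining constraints: every p_i is non-negative and they sum to one.
assert all(p >= 0 for p in ensemble.values())
assert abs(sum(ensemble.values()) - 1.0) < 1e-12
```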
What is information?
What is information?
• Hartley, R. V. L., "Transmission of Information": If the number of messages in the set is finite then this number or any monotonic function of this number can be regarded as a measure of the information produced when one message is chosen from the set, all choices being equally likely.
A straightforward way
• When we try to measure the influence of event y on event x, we may consider the ratio

$$\frac{p(x \mid y)}{p(x)}$$

> 1 : the occurrence of event y increases our belief in event x
= 1 : events x and y are independent
< 1 : the occurrence of event y decreases our belief in event x
A straightforward way – cont.
• We define the information provided about event x by the occurrence of event y as

$$I(x; y) = \log \frac{p(x \mid y)}{p(x)}$$

> 0 : the occurrence of event y increases our belief in event x
= 0 : events x and y are independent
< 0 : the occurrence of event y decreases our belief in event x
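A quick numerical sketch of this definition (the function name and probabilities are illustrative, not from the slides; base-2 logarithm, so the unit is bits):

```python
import math

def event_info(p_x_given_y: float, p_x: float) -> float:
    """Information (in bits) provided about event x by the occurrence of event y."""
    return math.log2(p_x_given_y / p_x)

print(event_info(0.8, 0.5))  # > 0: y increases our belief in x
print(event_info(0.5, 0.5))  # = 0: x and y are independent
print(event_info(0.2, 0.5))  # < 0: y decreases our belief in x
```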
Why use the logarithm?
• It is more convenient:
1. Practically more useful
2. Nearer to our intuitive feeling: we intuitively measure entities by linear comparison
3. Mathematically more suitable: many of the limiting operations are simple in terms of the logarithm
Mutual information
• Mutual information between event x and event y:

$$I(x; y) = \log \frac{p(x \mid y)}{p(x)} = \log \frac{p(x, y)}{p(x)\,p(y)} = \log \frac{p(y \mid x)}{p(y)} = I(y; x)$$
Mutual information – cont.
• Mutual information uses the logarithm to quantify the difference between our belief in event x given event y and our prior belief in event x
• It is the amount of uncertainty about event x that we can resolve after the occurrence of event y
Self-information
• Consider an event y with p(x | y) = 1
• Then I(x; y) is the amount of uncertainty about event x we resolve once we know event x will certainly occur, i.e., the prior uncertainty of event x
• Define the self-information of event x as

$$I(x) = \log \frac{1}{p(x)} = -\log p(x)$$
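A minimal sketch of this definition in Python (base-2 logarithm, so the unit is bits; the example probabilities are my own):

```python
import math

def self_info(p: float) -> float:
    """Self-information I(x) = -log2 p(x): the prior uncertainty of event x, in bits."""
    return -math.log2(p)

print(self_info(0.5))   # 1.0 bit: one fair coin flip of uncertainty
print(self_info(1/8))   # 3.0 bits: rarer events carry more information
print(self_info(1.0))   # 0.0 bits: a certain event resolves no uncertainty
```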
Intuitively
[Figure: a bar representing the information about the system; knowing everything about the system means the whole bar, while our prior knowledge about event x covers only part of it]
Intuitively – cont.
[Figure: the same bar of information about the system; after we know event x will certainly occur, our knowledge grows beyond the prior knowledge about event x]
Intuitively – cont.
[Figure: within the bar of information about the system, the information of event x is exactly the uncertainty of event x]
Conditional self-information
• Similarly, define the conditional self-information of event x, given the occurrence of event y, as

$$I(x \mid y) = \log \frac{1}{p(x \mid y)} = -\log p(x \mid y)$$

• We now have

$$I(x; y) = \log \frac{p(x \mid y)}{p(x)} = \log p(x \mid y) - \log p(x) = I(x) - I(x \mid y)$$
Intuitively – cont.
[Figure: a bar representing the information about event x; knowing everything about event x means knowing it will certainly occur. Our prior knowledge covers part of the bar, and after the occurrence of event y it covers more]
Intuitively – cont.
[Figure: within the bar of information about event x, the part gained from event y is the mutual information between event x and event y]
A straightforward way – cont.
• Like above, define the joint self-information of events x and y as $I(x, y) = -\log p(x, y)$
• Since $p(y \mid x) = p(x, y) / p(x)$, we now have

$$I(x, y) = I(x) + I(y \mid x)$$

• Using $I(x; y) = I(y) - I(y \mid x)$,

$$I(x, y) = I(y \mid x) + I(x) = I(x) + I(y) - I(x; y)$$
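These identities can be checked numerically on a small joint distribution (a sketch; the distribution and names are my own, not from the slides):

```python
import math

# A small joint distribution p(x, y) over two binary events.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p_x = {x: sum(p for (xi, _), p in p_xy.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yi), p in p_xy.items() if yi == y) for y in (0, 1)}

x, y = 0, 0
I_x   = -math.log2(p_x[x])                            # self-information I(x)
I_y   = -math.log2(p_y[y])                            # self-information I(y)
I_xy  = -math.log2(p_xy[(x, y)])                      # joint self-information I(x, y)
I_y_x = -math.log2(p_xy[(x, y)] / p_x[x])             # conditional I(y | x)
I_mut = math.log2(p_xy[(x, y)] / (p_x[x] * p_y[y]))   # mutual information I(x; y)

assert abs(I_xy - (I_x + I_y_x)) < 1e-12        # I(x, y) = I(x) + I(y | x)
assert abs(I_xy - (I_x + I_y - I_mut)) < 1e-12  # I(x, y) = I(x) + I(y) - I(x; y)
```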
A straightforward way – cont.
• From the identities above,

$$I(x) + I(y) - I(x, y) = I(y) - I(y \mid x) = I(x; y)$$

• Averaged over the ensemble, mutual information is non-negative, so in expectation the uncertainty of event y is never increased by knowledge of x; for a single pair of events, I(x; y) can be negative, as noted earlier
From instance to expectation

Averaging each per-event quantity over the ensemble gives the corresponding expected quantity:

| Instance | Average |
| --- | --- |
| I(x; y) | I(X; Y) |
| I(x) | H(X) |
| I(x \| y) | H(X \| Y) |
| I(x, y) | H(X, Y) |
| I(x; y) = I(x) - I(x \| y) | I(X; Y) = H(X) - H(X \| Y) |
| I(x, y) = I(x) + I(y) - I(x; y) | H(X, Y) = H(X) + H(Y) - I(X; Y) |
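A sketch of the averaging step, reusing the illustrative joint distribution from above (not from the slides):

```python
import math

p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p_x = {0: 0.5, 1: 0.5}
p_y = {0: 0.6, 1: 0.4}

def H(dist):
    """Entropy: the expected self-information of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

H_X, H_Y, H_XY = H(p_x), H(p_y), H(p_xy)
H_X_given_Y = H_XY - H_Y   # chain rule: H(X, Y) = H(Y) + H(X | Y)
I_XY = H_X - H_X_given_Y   # I(X; Y) = H(X) - H(X | Y)

assert abs(H_XY - (H_X + H_Y - I_XY)) < 1e-12  # H(X, Y) = H(X) + H(Y) - I(X; Y)
```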
Relationship
[Figure: diagram relating the quantities: H(X, Y) spans H(X) and H(Y); H(X) splits into H(X|Y) and I(X; Y), H(Y) splits into I(X; Y) and H(Y|X), and the I(X; Y) parts overlap]
Entropy
• The entropy of an ensemble is defined to be the average value of the self-information over all events x:

$$H(X) = \sum_{i=1}^{n} p(x_i) \log \frac{1}{p(x_i)}$$

• It is the average prior uncertainty of the ensemble
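A minimal entropy function following this definition (zero-probability outcomes are skipped, since $p \log(1/p) \to 0$ as $p \to 0$; base-2 logarithm, so the unit is bits):

```python
import math

def entropy(probs):
    """H(X) = sum_i p(x_i) * log2(1 / p(x_i)), in bits."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit: a fair coin
print(entropy([0.25] * 4))   # 2.0 bits: four equally likely outcomes
print(entropy([1.0, 0.0]))   # 0.0 bits: a certain outcome
```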
Interesting properties of H(X)
• H = 0 if and only if all the p_i but one are zero, this one having the value unity. Thus only when we are certain of the outcome does H vanish. Otherwise H is positive.
• For a given n, H is a maximum and equal to log(n) when all the p_i are equal, i.e., p_i = 1/n. This is also intuitively the most uncertain situation.
• Any change toward equalization of the probabilities p_1, …, p_n increases H.
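A quick numerical check of all three properties with base-2 logarithms and n = 2 (the probabilities are my own):

$$H(1, 0) = 0 \;<\; H(0.9, 0.1) \approx 0.469 \;<\; H(0.7, 0.3) \approx 0.881 \;<\; H(0.5, 0.5) = \log 2 = 1$$

H vanishes only for the certain outcome, equalizing (0.9, 0.1) toward (0.7, 0.3) increases H, and the uniform distribution attains the maximum log(n).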
A mathematical way
• Can we find a measure of how uncertain we are of an ensemble?
• If there is such a measure, say H(p_1, …, p_n), it is reasonable to require of it the following properties:
1. H should be continuous in the p_i
2. If all the p_i are equal, p_i = 1/n, then H should be a monotonic increasing function of n
3. If a choice be broken down into two successive choices, the original H should be the weighted sum of the individual values of H
A mathematical way – cont.
3. If a choice be broken down into two successive choices, the original H should be the weighted sum of the individual values of H:

$$H\!\left(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}\right) = H\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) + \tfrac{1}{2}\,H\!\left(\tfrac{2}{3}, \tfrac{1}{3}\right)$$

The coefficient 1/2 appears because the second choice only occurs half the time.
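This decomposition can be verified numerically with base-2 logarithms:

$$H\!\left(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}\right) = \tfrac{1}{2}\log 2 + \tfrac{1}{3}\log 3 + \tfrac{1}{6}\log 6 \approx 1.459$$

$$H\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) + \tfrac{1}{2}\,H\!\left(\tfrac{2}{3}, \tfrac{1}{3}\right) \approx 1 + \tfrac{1}{2}(0.918) = 1.459$$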
A mathematical way – cont.
• Theorem: the only H satisfying the three above properties is of the form

$$H = K \sum_{i=1}^{n} p_i \log \frac{1}{p_i}$$
A mathematical way – cont.
• Proof: let $H(\tfrac{1}{n}, \tfrac{1}{n}, \ldots, \tfrac{1}{n}) = A(n)$. From property (3) we can decompose a choice from $s^m$ equally likely possibilities into a series of m choices from s equally likely possibilities and obtain

$$A(s^m) = m\,A(s)$$

[Figure: a tree with branching factor s and depth m reaching $s^m$ leaves; each of the m levels contributes A(s)]
A mathematical way – cont.
• Similarly, $A(t^n) = n\,A(t)$
• We can choose n arbitrarily large and find an m to satisfy

$$s^m \le t^n \le s^{m+1}$$

• Taking logarithms and dividing by $n \log s$,

$$m \log s \le n \log t \le (m+1) \log s \;\Rightarrow\; \frac{m}{n} \le \frac{\log t}{\log s} \le \frac{m}{n} + \frac{1}{n}$$

$$\left| \frac{m}{n} - \frac{\log t}{\log s} \right| \le \frac{1}{n} < \epsilon, \quad \epsilon \text{ arbitrarily small} \quad (1)$$
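For a concrete instance of this bound (my own numbers): with $s = 2$, $t = 3$, $n = 10$ we have $2^{15} \le 3^{10} = 59049 \le 2^{16}$, so $m = 15$ and $\left| \tfrac{m}{n} - \tfrac{\log 3}{\log 2} \right| = |1.5 - 1.585| \approx 0.085 \le \tfrac{1}{10}$.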
A mathematical way – cont.
• From the monotonic property of A(n),

$$A(s^m) \le A(t^n) \le A(s^{m+1})$$

$$m\,A(s) \le n\,A(t) \le (m+1)\,A(s) \;\Rightarrow\; \frac{m}{n} \le \frac{A(t)}{A(s)} \le \frac{m}{n} + \frac{1}{n}$$

$$\left| \frac{m}{n} - \frac{A(t)}{A(s)} \right| \le \frac{1}{n} < \epsilon, \quad \epsilon \text{ arbitrarily small} \quad (2)$$
A mathematical way – cont.
• From equations (1) and (2),

$$\left| \frac{A(t)}{A(s)} - \frac{\log t}{\log s} \right| \le 2\epsilon, \quad \epsilon \text{ arbitrarily small}$$

• We get A(t) = K log(t), where K must be positive to satisfy property (2)
A mathematical way – cont.
• Now suppose we have a choice from n possibilities with commensurable probabilities $p_i = n_i / \sum_j n_j$, where all the $n_i$ are integers
• We can break down a choice from $\sum_i n_i$ equally likely possibilities into a choice from n possibilities with probabilities $p_1, \ldots, p_n$ and then, if the i-th was chosen, a choice from $n_i$ possibilities with equal probabilities
A mathematical way – cont.
• Using property (3) again, we equate the total choice from $\sum_i n_i$ equally likely possibilities as computed by the two methods:

$$K \log \sum_i n_i = H(p_1, \ldots, p_n) + K \sum_i p_i \log n_i$$
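As a sanity check (my own numbers, with K = 1 and base-2 logarithms), take $n_i = (3, 2, 1)$, so $\sum_i n_i = 6$ and $(p_1, p_2, p_3) = (\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6})$:

$$\log 6 \approx 2.585 = H\!\left(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}\right) + \left(\tfrac{1}{2}\log 3 + \tfrac{1}{3}\log 2 + \tfrac{1}{6}\log 1\right) \approx 1.459 + 1.126$$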
A mathematical way – cont.
• Hence

$$H(p_1, \ldots, p_n) = K \left[ \sum_i p_i \log \sum_j n_j - \sum_i p_i \log n_i \right] = -K \sum_i p_i \log \frac{n_i}{\sum_j n_j} = K \sum_i p_i \log \frac{1}{p_i}$$

• If the $p_i$ are not commensurable, they may be approximated by rationals, and the same expression must hold by our continuity assumption (property (1))
• The choice of the coefficient K is a matter of convenience and amounts to the choice of a unit of measure
Conclusion
• We first used an intuitive method to measure the information content of an event or an ensemble
• We explained intuitively why we choose the logarithm
• Mutual information and entropy were introduced
• We showed the relationship between information content and uncertainty
• Finally, we set three assumptions and derived the only form a measure of information content can take, showing that the logarithm must be adopted
Thanks