
IST, Lisbon, 13-12-2016

Applications of Information Theory in Science and in Engineering

Mário A. T. Figueiredo


Some Uses of Shannon Entropy (Mostly) Outside of Engineering and Computer Science


Communications: The Big Picture

[Figure: the classic communication-system block diagram; the talk focuses on the source and source-coder part.]


What is entropy?

Source (discrete and memoryless): $X \in \{1, \dots, N\}$, with $p_i = P(X = i)$

Entropy (bits/symbol): $H(X) = H(p_1, \dots, p_N)$

Shannon’s axioms (1948):

Continuity with respect to $p_1, \dots, p_N$

Monotonicity: if $p_i = 1/N$ for all $i$, uncertainty grows with $N$

Grouping doesn’t change uncertainty: $H(p_1, p_2, p_3, \dots, p_N) = H(p_1 + p_2, p_3, \dots, p_N) + (p_1 + p_2)\, H\!\left(\tfrac{p_1}{p_1 + p_2}, \tfrac{p_2}{p_1 + p_2}\right)$

The unique function satisfying these axioms (up to a multiplicative constant) is the Shannon entropy, a measure of uncertainty, i.e., of lack of information:

$$H(p_1, \dots, p_N) = \sum_{i=1}^{N} p_i \log_2 \frac{1}{p_i}$$
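A minimal Python sketch of this definition (not from the talk), with numeric checks of the uniform-growth and grouping axioms:

```python
import math

def entropy(p):
    """H(p1, ..., pN) = sum_i pi * log2(1/pi), with 0 * log2(1/0) taken as 0."""
    return sum(pi * math.log2(1.0 / pi) for pi in p if pi > 0)

# Uniform case: H(1/N, ..., 1/N) = log2(N), which grows with N.
print(entropy([1 / 4] * 4))   # 2.0 bits
print(entropy([1 / 8] * 8))   # 3.0 bits

# Grouping axiom: merging two outcomes and re-splitting leaves H unchanged.
p1, p2, p3 = 0.5, 0.3, 0.2
lhs = entropy([p1, p2, p3])
rhs = entropy([p1 + p2, p3]) + (p1 + p2) * entropy(
    [p1 / (p1 + p2), p2 / (p1 + p2)]
)
assert abs(lhs - rhs) < 1e-12
```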


What did Shannon use entropy for?

Source → binary coder

The entropy is (essentially) the minimum average number of bits/symbol, if the code is of the “20-questions” type (called instantaneous):

symbol   probability   codeword
a        1/2           1
b        1/4           01
c        1/8           000
d        1/8           001

Optimal strategy/code: Huffman (1952)

[Figure: binary decision tree for this code; symbol a is resolved by the first yes/no question (bit), b by the second, and c and d by the third.]

Optimal lengths: $\ell_i \simeq \log_2 \frac{1}{p_i}$

Easy to extend to Markov sources
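A compact heap-based Huffman coder (a sketch, not Huffman's original formulation): for the dyadic source above, the codeword lengths come out exactly $\log_2(1/p_i)$, although the individual bits may differ from the table (optimal codes are unique only up to relabeling of branches).

```python
import heapq

def huffman(probs):
    """probs: dict symbol -> probability; returns dict symbol -> codeword."""
    # Heap of (probability, tiebreak index, partial code table).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # two least probable subtrees...
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, tiebreak, merged))  # ...are merged
        tiebreak += 1
    return heap[0][2]

code = huffman({"a": 1 / 2, "b": 1 / 4, "c": 1 / 8, "d": 1 / 8})
print(code)  # lengths 1, 2, 3, 3 = log2(1/pi); average 1.75 bits = H
```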


Information and entropy in physics and other sciences

Entropy provides a measure of uncertainty/information

...information is a (the?) core concept in science!

At the very extreme, Wheeler (1989) claims:

"It from bit symbolises the idea (…) that what we call reality arises in the last analysis from the posing of yes-no questions (…); in short, that all things physical are information-theoretic in origin and this is a participatory universe."

“The universe is made of information; matter and energy are only incidental.”


The impact of Shannon’s IT work

Soon after Shannon, biologists started using IT:

S. Dancoff and H. Quastler (1953). "The Information Content and Error Rate of Living Things". Essays on the Use of Information Theory in Biology.

“How many bits have to go in there? And what is the informational content of that which produces these bits?”

The impact of information theory in the biological sciences is enormous!

...impossible to even scratch the surface in 10 minutes!

Sequence logos (Schneider & Stephens, 1990)

Information in evolution (Schneider 2000)
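Sequence logos display, for each alignment column, the information content $R = \log_2 |A| - H$ (for DNA, $\log_2 4 = 2$ bits minus the column entropy). A small illustration with made-up sequences (the small-sample correction of Schneider & Stephens is omitted):

```python
from collections import Counter
import math

def column_information(column):
    """R = 2 - H for a column of DNA bases; conserved columns score near 2."""
    counts = Counter(column)
    n = len(column)
    h = sum((c / n) * math.log2(n / c) for c in counts.values())
    return 2.0 - h

alignment = ["ATGC", "ATGA", "ATCC", "ATGG"]  # hypothetical aligned sites
for i, col in enumerate(zip(*alignment)):
    print(i, "".join(col), round(column_information(col), 3))
```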


(2014)

“For cells, we now know that (…) information is stored in a cell’s inherited genetic material, and is precisely the kind that Shannon described in his 1948 articles.”

“Information is the currency of life. One definition of information is the ability to make predictions with a likelihood better than chance. That’s what any living organism needs to be able to do, because if you can do that, you’re surviving at a higher rate.”


“Shannon’s entropy-based diversity is the standard for ecological communities. The exponentials of Shannon’s and the related “mutual information” excel in their ability to express diversity intuitively, and provide a generalised method of considering microscopic behaviour to make macroscopic predictions, under given conditions.”
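To make the quote concrete: Shannon diversity is the entropy of the relative-abundance distribution (conventionally with natural logarithms), and its exponential is an effective number of equally common species. A sketch with made-up abundances:

```python
import math

abundances = [50, 30, 15, 5]  # hypothetical individuals per species
total = sum(abundances)
p = [a / total for a in abundances]
H = -sum(pi * math.log(pi) for pi in p)  # Shannon diversity (nats)
print(H, math.exp(H))  # exp(H): community behaves like ~3.1 equally common species
```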


Other uses of Shannon’s entropy...


(2014)

“The application of entropy in finance can be regarded as the extension of the information entropy and the probability entropy. It can be an important tool in portfolio selection and asset pricing.”


“As an example of the utility of information theory to the analysis of animal communication systems, we applied a series of information theory statistics to a statistically categorized set of bottlenose dolphin whistle vocalizations.”

“Information theory measures (Shannon 1948; …) provide (…) tools for examining and comparing communication systems across species.”


Yet another application of the Shannon entropy...

“We introduce a new quantity to petrology, the Shannon entropy, as a tool for quantifying mixing as well as the rate of production of hybrid compositions in the mixing system.”


What if we can’t estimate probabilities?

Markov source → Lempel-Ziv coder

Asymptotically optimal, yet adaptive/universal, i.e., requiring no prior knowledge of the source statistics (Lempel & Ziv, 1977, 1978)

LZ coding exploits the redundancy of language by self-parsing: the text is parsed into phrases, each extending a previously seen phrase by one new symbol.

LZ coding thus yields an entropy estimate, without estimating probabilities

LZ coding is also closely related to Kolmogorov complexity
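A crude illustration of the idea, using zlib's DEFLATE (LZ77 plus Huffman coding) as a stand-in for a pure Lempel-Ziv coder: the compressed size, in bits per input symbol, serves as an entropy-rate estimate without any explicit probability model.

```python
import os
import zlib

def bits_per_symbol(data: bytes) -> float:
    """Compressed size in bits per input byte: a rough entropy-rate estimate."""
    return 8 * len(zlib.compress(data, 9)) / len(data)

print(bits_per_symbol(b"abab" * 5000))     # highly redundant: close to 0
print(bits_per_symbol(os.urandom(20000)))  # incompressible: about 8 (plus overhead)
```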


One step further: comparing two sources/texts

Question: was this sequence produced by that source? E.g., was this text produced by that author? (authorship attribution)

Compress a sequence with respect to another (cross-parsing) (Ziv & Merhav, 1993)

Ziv–Merhav (ZM) cross-parsing thus yields a relative entropy estimate (a dissimilarity measure)

Authorship attribution (Pereira-Coutinho & Figueiredo, 2013)
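A minimal sketch of the cross-parsing idea (a simplified variant, not the authors' code): greedily parse x into phrases, each the longest substring of the reference y, so that fewer, longer phrases mean the reference "explains" x well. The normalization c(x|y) log2|y| / |x| is a commonly used cross-entropy estimate; texts from similar sources should score lower.

```python
import math

def cross_parse_count(x: str, y: str) -> int:
    """Phrases when x is greedily parsed into longest substrings of y."""
    i, phrases = 0, 0
    while i < len(x):
        length = 0
        while i + length < len(x) and x[i : i + length + 1] in y:
            length += 1
        i += max(length, 1)  # advance at least one symbol if x[i] is novel
        phrases += 1
    return phrases

def cross_entropy_estimate(x: str, y: str) -> float:
    """c(x|y) * log2(|y|) / |x|, in bits per symbol."""
    return cross_parse_count(x, y) * math.log2(len(y)) / len(x)

# Hypothetical "authors": a1 and a2 share vocabulary, b does not.
a1 = "the cat sat on the mat and the cat ate the rat " * 20
a2 = "the rat sat on the cat and the mat was on the rat " * 20
b = "colorless green ideas sleep furiously every night again " * 20
print(cross_entropy_estimate(a1, a2))  # lower: same "author"
print(cross_entropy_estimate(a1, b))   # higher: different "author"
```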


Summary:

•  The Shannon entropy, as a measure of information, has had a huge impact, both inside and outside of its original field of application (communications)

•  However, beware the bandwagon...

[Image: from L. Thims, 2012]