Transcript of Measurement

Measurement

• Recap: emergence and self-organisation are associated with complexity

• Can we identify systems that are complex?

• Can we distinguish types or levels of complexity?

• Intuitively, complexity is related to the amount of information needed to describe a system
  need to measure information: entropy, information-theoretic measures, …


Useful concepts: logarithms

• definition: $\log_b x = y \iff b^y = x$

• properties: $\lg(xy) = \lg x + \lg y$ ; $\lg(x^y) = y \lg x$ ; $\lg(1/x) = -\lg x$ ; $0 \lg 0 \equiv 0$

• changing base: $\log_c x = \log_b x / \log_b c$ ; e.g. $\lg x = \log_{10} x / \log_{10} 2$

  (here $\lg$ denotes $\log_2$)
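A quick numerical check of the change-of-base identity (a minimal sketch, not from the original slides):

  x = 6;
  lg_direct  = log2(x);               % log base 2 directly
  lg_changed = log10(x) / log10(2);   % via base 10: log_10(x) / log_10(2)
  % both give 2.5850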


Useful concepts: probability (1)

• finite set X of size N, with elements xi

also covers infinite sets (eg integers), but gets trickier

• p(x) = probability that an element drawn at random from X will have value x
  coin: X = { heads, tails }; N = 2; p(heads) = p(tails) = 1/2
  die: X = { 1, 2, 3, 4, 5, 6 }; N = 6; p(n) = 1/6
  uniform probability: p(x) = 1/N

• constraints on the function p :

  $p(x) \ge 0 \;;\quad \sum_{x \in X} p(x) = 1$

• average value of a function f(x) :

  $\langle f(x) \rangle = \sum_{x \in X} p(x)\, f(x)$
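As a concrete example of the average-value formula (a minimal sketch, not from the original slides), the mean value of a fair die:

  f = 1:6;              % values of the die
  p = ones(1,6)/6;      % uniform probabilities
  avg = dot(p, f);      % <f(x)> = sum over x of p(x) f(x) = 3.5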


Useful concepts: probability (2)

• example: sum of the values of 2 dice:

• X = { 2, …, 12 }; N = 11

  +   1   2   3   4   5   6
  1   2   3   4   5   6   7
  2   3   4   5   6   7   8
  3   4   5   6   7   8   9
  4   5   6   7   8   9  10
  5   6   7   8   9  10  11
  6   7   8   9  10  11  12

  x     2     3     4     5     6     7     8     9     10    11    12
  p(x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
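The same distribution can be obtained by convolving the single-die distribution with itself (a small sketch, not part of the original slides):

  p1 = ones(1,6)/6;     % single fair die
  p2 = conv(p1, p1);    % distribution of the sum of two dice, over values 2..12
  % p2 = [1 2 3 4 5 6 5 4 3 2 1]/36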


Information: Shannon entropy H

• Shannon was concerned about extracting information from noisy channels

• Shannon’s entropy measure relates to the number of bits needed to encode elements of X

  $H(X) = -\sum_{x \in X} p(x) \log_2 p(x)$

  H = the amount of “information” (entropy)
  entropy trivially increases with the number of possible states

  For a system with uniform (unbiased) probabilities, p(x) = 1/N, this evaluates to

  $H(X) = -\log_2 p(x) = \log_2 N$

• Matlab code, where p is a Matlab vector of the probabilities:

  p = p(p > 0);            % drop zero-probability states (0 log 0 = 0)
  H = -dot(p, log2(p));    % Shannon entropy in bits

C. E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal 27, 1948


Shannon entropy : examples

• coin: N = 2, H = log2 2 = 1
  you can express the information conveyed by tossing a coin in one bit

• die: N = 6, H = log2 6 ≈ 2.585 : fewer than three bits
  note that H does not have to be an integer number of bits

• 2 dice: N = 11; for uniform probabilities: H = log2 11 ≈ 3.459
  the actual probabilities are not uniform, so we can devise an encoding (“compression”) that uses fewer bits for the more likely values and more bits for the unlikely ones: H ≈ 3.274

$H(X) = -\log_2 p(x) = \log_2 N$ (for uniform p)
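These values can be checked directly in Matlab (a small sketch, not part of the original slides; the two-dice distribution is the one tabulated earlier):

  H = @(p) -dot(p(p>0), log2(p(p>0)));    % entropy of a probability vector, in bits
  H(ones(1,2)/2)                          % coin: 1
  H(ones(1,6)/6)                          % die: ~2.585
  H(ones(1,11)/11)                        % 2 dice, uniform assumption: ~3.459
  H(conv(ones(1,6)/6, ones(1,6)/6))       % 2 dice, actual distribution: ~3.274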


Choosing how to measure entropy

• Any probability distribution can be used to calculate entropy

• Some give more meaningful answers than others

• Consider the entropy of CAs
  look first at entropy of single sites, average entropy over sites, and then at tiles


CA states : average (1)

• entropy of a “tile” of N sites, k states per site

  entropy of a single site i, where p_i(s) = probability that site i is in state s :

  $H_i = -\sum_{s=0}^{k-1} p_i(s) \log_2 p_i(s)$

  use the average over all N sites if tiles of different sizes are to be compared :

  $H = \frac{1}{N}\sum_{i=1}^{N} H_i = -\frac{1}{N}\sum_{i=1}^{N}\sum_{s=0}^{k-1} p_i(s) \log_2 p_i(s)$

  this average entropy is measured in “bits per site”
  e.g. random: H = 1 (for k = 2), independent of N
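A sketch of this average in Matlab (assuming P is an N-by-k matrix whose row i holds the probabilities p_i(s) for site i; the name P is illustrative):

  Hsite = zeros(size(P,1), 1);
  for i = 1:size(P,1)
      q = P(i, P(i,:) > 0);            % drop zero-probability states for this site
      Hsite(i) = -dot(q, log2(q));     % entropy of site i, in bits
  end
  Havg = mean(Hsite);                  % average entropy, in bits per site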


CA states : average (2)

• (a) random : each site randomly on or off
  p_i(s) = ½ ; H_i = 1 ; H = 1 : 1 bit per site

• (b) semi-random : upper sites oscillating, lower sites random
  p_i(s) = ½ ; H_i = 1 ; H = 1 : 1 bit per site

• (c) oscillating
  p_i(s) = ½ ; H_i = 1 ; H = 1 : 1 bit per site

• so the average site entropy is not a measure of any structure


CA states : tile (1)

• whole-tile entropy : k states per site, so k^N states per tile
  p(s) = probability that the whole tile is in state s
  divide by N to get the entropy in “bits per site” :

  $H = -\frac{1}{N}\sum_{s=0}^{k^N - 1} p(s) \log_2 p(s)$

• (a) random : each site randomly on or off
  p(s) = 1/16 ; H = 1 : 1 bit per site, or 4 bits for the tile

N. H. Packard, S. Wolfram. Two-Dimensional Cellular Automata. J. Statistical Physics. 38:901-946, 1985


CA states : tile (2)

• (b) semi-random : upper sites oscillating, lower sites random
  8 states never occur, p = 0
  the other 8 states are equally likely, p = ⅛

  $H = -\tfrac{1}{4}\left(8 \times \tfrac{1}{8}\log_2\tfrac{1}{8}\right) = \tfrac{3}{4}$ : ¾ bit per site, or 3 bits for the tile

• (c) oscillating
  only 2 states occur, each with p = ½

  $H = -\tfrac{1}{4}\left(2 \times \tfrac{1}{2}\log_2\tfrac{1}{2}\right) = \tfrac{1}{4}$ : ¼ bit per site, or 1 bit for the tile

  $H = -\frac{1}{N}\sum_{s=0}^{k^N - 1} p(s) \log_2 p(s)$
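A quick check of these per-site values (a minimal sketch, not from the original slides; it just evaluates the whole-tile formula for 4-site tiles):

  tileH = @(p) -dot(p(p>0), log2(p(p>0))) / 4;   % per-site entropy of a 4-site tile
  tileH(ones(1,16)/16)                           % (a) random: 1 bit per site
  tileH([ones(1,8)/8, zeros(1,8)])               % (b) semi-random: 0.75 bits per site
  tileH([1/2, 1/2, zeros(1,14)])                 % (c) oscillating: 0.25 bits per site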


CA entropy on rules (1)

• N = number of neighbourhoods (entries in rule table)

• p_i^t = proportion of times rule-table entry i is accessed at timestep t

• define the rule-lookup entropy at timestep t to be

  $H^t = -\sum_{i=1}^{N} p^t_i \log p^t_i$

• initial random state : roughly equal probabilities, so maximum entropy
  long-term behaviour depends on the CA class

A. Wuensche. Classifying cellular automata automatically. Complexity 4(3):47-66, 1999
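A sketch of this in Matlab (assuming counts is a length-N vector of how many times each rule-table entry was looked up during timestep t; the name is illustrative):

  p = counts / sum(counts);     % proportion of lookups per rule-table entry
  p = p(p > 0);                 % entries never accessed contribute 0 log 0 = 0
  Ht = -dot(p, log2(p));        % rule-lookup entropy at timestep t, in bits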


CA entropy on rules (2)

• ordered : only a few rules accessed : low entropy

• chaotic : all rules accessed uniformly : high entropy

• complex : non-uniform access : medium entropy
  fluctuations : high entropy variance
  use this to detect complex rules automatically


key properties of entropy H

• H = 0 if one value is “certain” : one of the p(x_i) = 1 (and hence all the other p(x_j) are zero)
  this is the minimum value H can take
  can add zero-probability states without changing H

• if p is uniform (all values equally likely), then H = log2 N
  this is the maximum value H can take : uniform p means “random”, maximum “information”
  so if p is not uniform, H < log2 N

• H is extensive (additive) for independent joint events [next lecture]

• $H = -\sum_{x} p(x)\,\lg p(x)$ is the only function (up to a constant) that is uniform, extensive, and has fixed bounds
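For the uniform case the maximum value follows directly (a one-line check, not from the original slides):

  $H = -\sum_{i=1}^{N} \tfrac{1}{N}\log_2\tfrac{1}{N} = -N \cdot \tfrac{1}{N} \cdot (-\log_2 N) = \log_2 N$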


designing an entropy measure

• design a measure p(x) analogous to a probability

  mathematical properties :
  non-negative; values sum to one

  entropic properties :
  p uniform when maximally disordered / random / chaotic
  one p close to one, the others all close to zero, when maximally ordered / uniform

• use this p to define an associated entropy :

  $H(X) = -\sum_{x \in X} p(x) \log_2 p(x)$

• validate : check it “makes sense” in several scenarios


Example: Entropy of a flock

• N particles in 2D, indexed by i; at time t, positions $(x^i_t, y^i_t)$ and velocities $(\dot{x}^i_t, \dot{y}^i_t)$, taken relative to the flock mean position and velocity

• form the matrix F (T rows, 4N columns) of these coordinates over T timesteps
  readily extendable to 3 (or more!) dimensions

  $F = \begin{pmatrix} x^1_1 & y^1_1 & \dot{x}^1_1 & \dot{y}^1_1 & \cdots & x^N_1 & y^N_1 & \dot{x}^N_1 & \dot{y}^N_1 \\ \vdots & & & & & & & & \vdots \\ x^1_T & y^1_T & \dot{x}^1_T & \dot{y}^1_T & \cdots & x^N_T & y^N_T & \dot{x}^N_T & \dot{y}^N_T \end{pmatrix}$

W. A. Wright, R. E. Smith, M. Danek, P. Greenway. A Generalisable Measure of Self-Organisation and Emergence. ICANN 2001


aside: singular value decomposition (SVD)

• singular value decomposition “factorises” a matrix

  $F = U S V^{T}$

  S is a diagonal matrix; σ is the vector of singular values, constructed from its diagonal elements
  U and V are unitary matrices – “rotations”, orthogonal basis

• these singular values indicate the “importance” of a special set of orthogonal “directions” in the matrix F : the semi-axes of the hyperellipsoid defined by F
  if all directions are of equal importance (a hypersphere), all the singular values are the same
  more important directions have larger singular values
  generalisation of the “eigenvalues” of certain square matrices

  [figure: hyperellipsoid with semi-axes s1, s2, s3]
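In Matlab the pieces can be inspected directly (a small sketch, not from the original slides):

  [U, S, V] = svd(F);          % factorise: F = U*S*V'
  sigma = diag(S);             % vector of singular values, largest first
  err = norm(F - U*S*V');      % reconstruction error, ~0 up to rounding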


svd of a flock

• calculate the vector σ of the singular values of F
  singular values are non-negative : σ_i ≥ 0
  normalise them to sum to one : Σ σ_i = 1
  use them as the “probability” in an entropy measure :

  $H = -\sum_{i=1}^{T} \sigma_i \log_2 \sigma_i$

• Matlab code

sigma = svd(F);                       % singular values of F
sigma = sigma / sum(sigma);           % normalise to sum to one
entropy = -dot(sigma, log2(sigma));   % flock entropy (assumes no zero singular values)
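A sketch of how F might be assembled before running this code (the arrays X, Y, VX, VY are illustrative names, each T-by-N, holding the positions and velocities of the N boids over T timesteps):

  T = size(X,1);  N = size(X,2);                  % T timesteps, N boids
  mx = mean(X,2);  my = mean(Y,2);                % flock mean position at each timestep
  mvx = mean(VX,2);  mvy = mean(VY,2);            % flock mean velocity at each timestep
  F = zeros(T, 4*N);
  for i = 1:N                                     % four columns per boid: x, y, vx, vy
      F(:, 4*i-3 : 4*i) = [X(:,i)-mx, Y(:,i)-my, VX(:,i)-mvx, VY(:,i)-mvy];
  end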


ON ENTROPY OF FLOCKS

• Many ways to analyse flock entropy
  at boid level or at flock level; over time, over space

• Possible questions:
  Can entropy identify the formation of a flock?
  Can entropy distinguish forms of social behaviour?

• Several projects have explored these questions
  identification of flock formation is hard
  the entropy of free boids distorts the results
  might be able to distinguish ordered behaviours from “random” swarms

P. Nash, MEng project, 2008: http://www.cs.york.ac.uk/library/proj_files/2008/4thyr/pjln100/pjln100_project.pdf


Is entropy always the right measure?

• Unpublished experiments on flock characteristics
  YCCSA Summer School, 2010: Trever Pinto, Heather Lacy, Stephen Connor, Susan Stepney, Fiona Polack

• Statistical measures of spatial autocorrelation are at least as good as entropy
  at least for identifying turning and changes in flocking characteristics

• e.g. the C measure, related to Geary's C ratio
  where r_pp is the minimum distance between two distinct boids (nearest-neighbour distance)
  and r_lp is the minimum distance from a random point to a given boid
  for random locations, the distances are equal, C = 1
  clustering is indicated by C > 1

• Spatial autocorrelation can analyse clustering in any 2D space
  e.g. clustering in position or clustering in velocity


Entropy as fitness of interesting systems

• Can we evolve CA states for complex behaviour?
  fitness function uses tile entropy of 3×3 and 15×15 tiles
  low entropy = regular behaviour; high entropy = chaotic behaviour
  want regions of regularity, regions of chaos, and “edge of chaos” regions
  i.e. maximum variation of entropy across the grid

• fitness = variance in entropy H_t over the N_T tiles (mean square deviation from the average) :

  $f = \frac{1}{N_T}\sum_{t}\left(H_t - \bar{H}\right)^2$

D. Kazakov, M. Sweet. Evolving the Game of Life. Proc Fourth Symposium on Adaptive Agents and Multi-Agent Systems (AAMAS-4), Leeds, 2004
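A sketch of this fitness in Matlab (assuming Ht is a vector of the tile entropies already computed; the name is illustrative, and this is not code from the paper above):

  Hbar = mean(Ht);              % average tile entropy
  f = mean((Ht - Hbar).^2);     % mean square deviation from the average = fitness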