
Page 1

The Particle Swarm: Theme and Variations on

Computational Social Learning

James Kennedy
Washington, DC

[email protected]

Page 2

The Particle Swarm

A stochastic, population-based algorithm for problem-solving

Based on a social-psychological metaphor

Used by engineers, computer scientists, applied mathematicians, etc.

First reported in 1995 by Kennedy and Eberhart

Constantly evolving

Page 3

The Particle Swarm Paradigm is a Particle Swarm

A kind of program comprising a population of individuals that interact with one another according to simple rules in order to solve problems, which may be very complex.

It is also an apt description of the process of science.

Page 4

Two Spaces

The social network topology, and
The state of the individual as a point in a Cartesian coordinate system

Moving point = a particle
A bunch of them = a swarm

Note memes: “evolution of ideas” vs. “changes in people who hold ideas”

Page 5

Cognition as Optimization

Cognitive consistency theories, incl. dissonance

Feedforward Neural Nets

Minimizing or maximizing a function result by adjusting parameters

Parallel constraint satisfaction

Particle swarm describes the dynamics of the network, as opposed to its equilibrium properties

Page 6

Mind and Society

Minsky: Minds are simply what brains do.

No: minds are what socialized human brains do.

… solipsism …

Page 7

Dynamic Social Impact Theory

i = f(SIN): impact is a function of the Strength, Immediacy, and Number of sources

Computer simulation – a 2-d cellular automaton (CA)

Each individual is both a target and source of influence

“Euclidean” neighborhoods

Binary, univariate individuals

“Strength” randomly assigned

Polarization

Nowak, A., Szamrej, J., & Latané, B. (1990). From private attitude to public opinion: A dynamic theory of social impact. Psychological Review, 97, 362-376.

Page 8

Particle Swarms: The Population

To understand the particle swarm, you have to understand the interactions of the particles

One particle is the stupidest thing in the world

The population learns

Every particle is a teacher and a learner

Social learning, norms, conformity, group dynamics, social influence, persuasion, self-presentation, cognitive dissonance, symbolic interactionism, cultural evolution …

Page 9

Neighborhoods: Population topology

Gbest

Lbest

Particles learn from one another. Their communication structure determines how solutions propagate through the population.

(All N=20)

Page 10

von Neumann (“Square”) Topology

Regular, easy to understand, works adequately with a variety of versions – perhaps not the best for any version, but not the worst.

Page 11

The Particle Swarm: Pseudocode

Initialize population and constants
Repeat
    Do i = 1 to population size
        CurrentEval_i = eval(x_i)
        If CurrentEval_i < pbest_i then do
            pbest_i = CurrentEval_i
            For d = 1 to Dimension
                p_id = x_id
            Next d
            If CurrentEval_i < pbest_gbest then gbest = i
        End if
        g = best neighbor's index
        For d = 1 to Dimension
            v_id = W*v_id + U(0, AC) × (p_id – x_id) + U(0, AC) × (p_gd – x_id)
            x_id = x_id + v_id
        Next d
    Next i
Until termination criterion
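As a concrete illustration, the loop above might be sketched in Python like this. This is a minimal gbest-topology version; the function and parameter names are illustrative assumptions, and the constants are the Constriction Type 1″ values (W = 0.7298, AC = 1.496) quoted later in the deck.

```python
import random

def canonical_pso(f, dim, n_particles=20, iters=1000,
                  lo=-5.12, hi=5.12, W=0.7298, AC=1.496):
    """Minimize f over [lo, hi]^dim with a gbest-topology particle swarm."""
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p = [xi[:] for xi in x]                  # personal best positions
    pbest = [f(xi) for xi in x]              # personal best values
    gbest = min(range(n_particles), key=lambda i: pbest[i])
    for _ in range(iters):
        for i in range(n_particles):
            cur = f(x[i])
            if cur < pbest[i]:
                pbest[i], p[i] = cur, x[i][:]
                if cur < pbest[gbest]:
                    gbest = i
            g = gbest                        # best neighbor; gbest topology
            for d in range(dim):
                v[i][d] = (W * v[i][d]
                           + random.uniform(0, AC) * (p[i][d] - x[i][d])
                           + random.uniform(0, AC) * (p[g][d] - x[i][d]))
                x[i][d] += v[i][d]
    return p[gbest], pbest[gbest]

best_x, best_val = canonical_pso(lambda x: sum(xi * xi for xi in x), dim=5)
```

An lbest or von Neumann topology would change only how the neighbor index g is chosen; the update itself is unchanged.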

Page 12

Pseudocode Chunk #1

CurrentEval_i = eval(x_i)
If CurrentEval_i < pbest_i then do
    pbest_i = CurrentEval_i
    For d = 1 to Dimension
        p_id = x_id
    Next d
    If CurrentEval_i < pbest_gbest then gbest = i
End if

Can come at top or bottom of the loop
In gbest topology, g = gbest
It's useful to track the population best

Page 13

Pseudocode Chunk #2

g = best neighbor's index
For d = 1 to Dimension
    v_id = W*v_id + U(0, AC) × (p_id – x_id) + U(0, AC) × (p_gd – x_id)
    x_id = x_id + v_id
Next d

Constriction Type 1″: W = 0.7298, AC = W × 2.05 = 1.496
Clerc 2006: W = 0.7, AC = 1.43
Inertia: W might vary with time in (0.4, 0.9), etc.; AC = 2.0 typically

Note three components of velocity

Vmax - not necessary, might help

Q: Are these formulas arbitrary?

Page 14

Some Standard Test Functions

Sphere

Griewank

Rosenbrock

Rastrigin

Schaffer’s f6
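These benchmarks have standard closed forms; here is a sketch of each. All but Rosenbrock have their minimum of 0 at the origin; Rosenbrock's is at (1, …, 1).

```python
import math

def sphere(x):
    return sum(xi * xi for xi in x)

def rastrigin(x):
    return sum(xi * xi - 10 * math.cos(2 * math.pi * xi) + 10 for xi in x)

def rosenbrock(x):
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (1 - x[i]) ** 2
               for i in range(len(x) - 1))

def griewank(x):
    s = sum(xi * xi for xi in x) / 4000
    p = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return s - p + 1

def schaffer_f6(x):
    # Two-dimensional; a classic deceptive function with concentric ripples.
    ss = x[0] ** 2 + x[1] ** 2
    return 0.5 + (math.sin(math.sqrt(ss)) ** 2 - 0.5) / (1 + 0.001 * ss) ** 2
```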

Page 15

Test Functions: Typical Results

Page 16

Step-Size Depends on Neighbors

(Panels: p_i = 0, p_g = 0; p_i = +2, p_g = −2; p_i = +0.1, p_g = −0.1)

Movement of the particle through the search space is centered on the mean of pi and pg on each dimension, and its amplitude is scaled to their difference.

Exploration vs. exploitation: automatic

… “the box” …

Page 17

Search distribution – Scaled to Neighborhood

Q: What is the distribution of points that are tested by the particle?

Previous bests constant at +10

A million iterations

Page 18

“Bare Bones” particle swarm

x = G((pi + pg)/2, abs(pi – pg))

G(mean, s.d.) is Gaussian RNG

Simplified (!)

Works pretty well, but not as well as the canonical version.
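A minimal sketch of one bare-bones move, assuming G(mean, s.d.) is Python's `random.gauss`:

```python
import random

def bare_bones_step(p_i, p_g):
    """One bare-bones move: sample each dimension from a Gaussian whose
    mean is the midpoint of the two bests and whose s.d. is their distance."""
    return [random.gauss((pi + pg) / 2, abs(pi - pg)) for pi, pg in zip(p_i, p_g)]
```

No velocity, no position memory: each new point is drawn fresh from the distribution defined by the two previous bests.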

Page 19

Kurtosis

(Histograms: tails trimmed vs. not trimmed)

Empirical observations with p’s held constant

Peaked -- fat tails

Page 20

Kurtosis

High peak, fat tails

Mean moments of the canonical particle swarm algorithm with previous bests set at +20, varying the number of iterations.

Iterations    Mean      S.D.      Skewness   Kurtosis
1,000          0.0970   37.7303   -0.0617      8.008
3,000          0.0214   41.5281    0.0814     18.813
10,000        -0.0080   41.6614   -0.0679     40.494
100,000        0.0022   41.7229    0.2116    170.204
1,000,000      0.0080   41.3048    0.3808    342.986

Page 21

Bursts of Outliers

(Plot: a particle's trajectory over time; values range roughly from −60 to +50.)

“Volatility clustering” seems to typify the particle’s trajectory

Page 22

Adding Bursts of Outliers to Bare Bones PSO

Center = (p_id + p_gd)/2
SD = |p_id – p_gd|
x_id = G(0, 1)
If Burst = 0 and U(0,1) < PBurstStart then Burst = U(0, maxpower)
Else if Burst > 0 and U(0,1) < PBurstEnd then Burst = 0
End if
If Burst > 0 then x_id = x_id ^ Burst
x_id = Center + x_id × SD
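The burst mechanism above can be sketched per dimension as follows. The parameter defaults are illustrative assumptions, and since raising a negative Gaussian draw to a fractional power is undefined, the sign is preserved via `copysign`, an implementation choice the slide does not specify.

```python
import math
import random

def bursty_sample(center, sd, burst_state,
                  p_start=0.01, p_end=0.05, maxpower=4.0):
    """One bare-bones draw with occasional bursts of outliers: while a
    burst is active, the unit Gaussian sample is raised to a power,
    which fattens the tails of the search distribution."""
    if burst_state == 0 and random.random() < p_start:
        burst_state = random.uniform(0, maxpower)   # start a burst
    elif burst_state > 0 and random.random() < p_end:
        burst_state = 0                             # end the burst
    z = random.gauss(0, 1)
    if burst_state > 0:
        # keep the sign; a negative base to a fractional power is undefined
        z = math.copysign(abs(z) ** burst_state, z)
    return center + z * sd, burst_state
```

The burst state is carried from call to call, so outliers arrive in clusters, mimicking the volatility clustering seen in canonical trajectories.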

(Plots: results on Sphere, Rosenbrock, Rastrigin, Griewank30, Griewank10, and f6. Bubbled line is canonical PS.)

Page 23

“The Box”

Where the particle goes next depends on which way it was already going, the random numbers, and the sign and magnitude of the differences.

The area where it can go is crisply bounded, but the probability inside the box is not uniformly dense.

v_id = W*v_id +
    U(0, AC) × (p_id – x_id) +
    U(0, AC) × (p_gd – x_id)
x_id = x_id + v_id

Page 24

Empirical distribution of means of random numbers from different ranges

Simulate with uniform RNG, trim tails

Page 25

TUPS: Truncated-Uniform Particle Swarm

Start at current position: x(t)

Move a weighted amount in the same direction: W1 × (x(t) – x(t-1))

Find midpoint of extremes, and difference between them, on each dimension (the sides of the box)

Weight that, add it to where you are, call it the “center”

Generate uniformly distributed random number around the center, range slightly less than the length of the side

That’s x(t+1)

Page 26

TUPS: Truncated-Uniform Particle Swarm

x(t+1) = x(t) + W1 × (x(t) – x(t-1)) + W2 × (U(-1,+1) × (width/2.5) + center)

W1 = 0.729; W2 = 1.494

Width is the difference between the highest and lowest (p – x)

Center is width/2

Generates a point less than Width/2 from the center
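A per-dimension sketch of the TUPS update, reading the formula above literally; treating width and center independently on each dimension is an interpretation, not something the slide spells out.

```python
import random

def tups_step(x_t, x_prev, p_i, p_g, W1=0.729, W2=1.494):
    """One TUPS move: persistence plus a uniform draw from a truncated box
    defined by the offsets of the two previous bests from the current position."""
    new = []
    for x, xp, pi, pg in zip(x_t, x_prev, p_i, p_g):
        lo, hi = min(pi - x, pg - x), max(pi - x, pg - x)
        width = hi - lo
        center = (lo + hi) / 2
        new.append(x + W1 * (x - xp)
                     + W2 * (random.uniform(-1, 1) * width / 2.5 + center))
    return new
```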

(Plots: Canonical, TUPS, and FIPS compared over 3,000 iterations on Sphere, Griewank30, Griewank10, Rastrigin, Rosenbrock, and f6.)

Page 27

Binary Particle Swarms

S(x) = 1 / (1 + exp(-x))

v = [the usual]

if rand() < S(v) then x = 1 else x = 0

Transform velocity with sigmoid function in (0 .. 1)

Use it as a probability threshold

Though this is a radically different concept, the principles of particle interaction still apply (because the power is in the interactions).
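A sketch of the binary position update; the velocity update itself is "the usual" canonical rule, so only the position step changes.

```python
import math
import random

def sigmoid(v):
    """Squash a velocity component into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def binary_position(v):
    """Binary PSO position update: each squashed velocity component is
    used as the probability that the corresponding bit is set to 1."""
    return [1 if random.random() < sigmoid(vd) else 0 for vd in v]
```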

Page 28

FIPS -- The “fully-informed” particle swarm (Rui Mendes)

v(t+1) = W1 × v(t) + Σ(rand() × W2/K × (pk – x(t)))

x(t+1)=x(t)+v(t+1)

(K=number of neighbors, k=index of neighbor, W2 is a sum.)

Note that p_i is not a source of influence in FIPS.
Doesn't select a best neighbor.
Orbits around the mean of neighborhood bests.
This version is more dependent on topology.
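The FIPS velocity update might be sketched like this. The slide says only that W2 is a sum; taking W2 = 2.992 (twice the canonical AC) as its total is an assumption here.

```python
import random

def fips_velocity(v, x, neighbor_bests, W1=0.7298, W2=2.992):
    """FIPS velocity update: every neighbor's previous best pulls on the
    particle, with the total acceleration W2 divided evenly among the
    K neighbors. No best neighbor is selected."""
    K = len(neighbor_bests)
    return [W1 * v[d]
            + sum(random.random() * (W2 / K) * (p[d] - x[d])
                  for p in neighbor_bests)
            for d in range(len(x))]
```

Because every neighbor contributes, the topology (who the K neighbors are) matters more here than in best-neighbor versions.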

Page 29

Deconstructing Velocity

x_id(t+1) = x_id(t) + W1 × (x_id(t) – x_id(t-1)) + rand() × W2 × (p_id – x_id(t)) + rand() × W2 × (p_gd – x_id(t))

Because x(t+1) = x(t) + v(t+1)

we know that on the previous iteration,

x(t) = x(t-1) + v(t)

So we can find v(t)

v(t) = x(t) – x(t-1)

and can substitute that into the formula, to put it all in one line:

Page 30

x_id(t+1) = x_id(t) + W1 × (x_id(t) – x_id(t-1)) + Σ(rand() × (W2/K) × (p_kd – x_id(t)))

Generalization and Verbal Representation

… or in words …

NEW POSITION = CURRENT POSITION + PERSISTENCE + SOCIAL INFLUENCE

We can generalize the canonical and FIPS versions:

Page 31

Social Influence

has two components:

Central Tendency, and Dispersion

NEW POSITION= CURRENT POSITION + PERSISTENCE + SOCIAL CENTRAL TENDENCY + SOCIAL DISPERSION

… Hmmm, this gives us something to play with …!

Page 32

NEW POSITION= CURRENT POSITION + PERSISTENCE + SOCIAL CENTRAL TENDENCY + SOCIAL DISPERSION

Gaussian “Essential” Particle Swarm

Note that only the last term has randomness in it – the rest is deterministic

meanp = (p_id + p_gd)/2
disp = |p_id – p_gd| / 2

x_id(t+1) = x_id(t) + W1 × (x_id(t) – x_id(t-1)) + W2 × (meanp – x_id) + G(0,1) × disp

G(0,1) is a Gaussian RNG
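A per-dimension sketch of the Gaussian essential update; as the slide notes, only the dispersion term is stochastic, while persistence and the pull toward the mean of the two bests are deterministic.

```python
import random

def gaussian_essential_step(x_t, x_prev, p_i, p_g, W1=0.7298, W2=1.496):
    """Gaussian 'essential' particle swarm update:
    position + persistence + social central tendency + social dispersion."""
    new = []
    for x, xp, pi, pg in zip(x_t, x_prev, p_i, p_g):
        meanp = (pi + pg) / 2            # social central tendency
        disp = abs(pi - pg) / 2          # social dispersion
        new.append(x + W1 * (x - xp)     # persistence (deterministic)
                     + W2 * (meanp - x)  # pull toward the mean (deterministic)
                     + random.gauss(0, 1) * disp)  # the only random term
    return new
```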

Page 33

Gaussian Essential Particle Swarm

xid (t+1)= xid (t) + W1(xid (t)- xid (t-1)) + W2*(meanp – xid) + G(0,1)*disp

Function      Trials   Canonical   Gaussian
F6            20       0.0015      13E-10
GRIEWANK      20       0.0086      0.0103
GRIEWANK10    20       0.0508      0.045
RASTRIGIN     20       56.862      49.151
ROSENBROCK    20       41.836      44.197
SPHERE        20       88E-16      38E-24

Page 34

Various Probability Distributions

Function      Trials   Canonical   Gaussian   Triangular   Double-Exponential   Cauchy
F6            20       0.0015      13E-10     0.0057       0.001                0.0029
GRIEWANK      20       0.0086      0.0103     0.0275       0.0149               0.0253
GRIEWANK10    20       0.0508      0.045      0.0694       0.0541               0.0768
RASTRIGIN     20       56.862      49.151     140.94       47.26                33.829
ROSENBROCK    20       41.836      44.197     67.894       41.308               42.054
SPHERE        20       88E-16      38E-24     1.70E-18     2.40E-22             1.60E-17

There is clearly room for exploration here.

Page 35

Gaussian FIPS

FIPScenter = mean of (p_kd – x_id)
FIPSrange = mean of |p_id – p_kd|

xid= xid + W1 × (xid(t)-xid(t-1)) + W2 × FIPScenter + G(0,1) × (FIPSrange/2);

Fully-informed – uses all neighbors

Page 36

Gaussian FIPS

Function      Trials   Canonical   Gaussian FIPS
F6            20       0.0015      0.001
GRIEWANK      20       0.0086      0.0007
GRIEWANK10    20       0.0508      0.0215
RASTRIGIN     20       56.862      36.858
ROSENBROCK    20       41.836      40.365
SPHERE        20       88E-16      41E-29

Gaussian FIPS compared to Canonical PSO, square topology.

Page 37

t(38), alpha=0.05

func          t      p-value   rank   inv. rank   New alpha   Sig.
SPHERE        3.17   0.0030    1      6           0.008333    *
GRIEWANK10    2.99   0.0048    2      5           0.010000    *
GRIEWANK      2.64   0.0118    3      4           0.012500    *
RASTRIGIN     2.21   0.0333    4      3           0.016667    .
F6            0.45   0.6583    5      2           0.025000    .
ROSENBROCK    0.15   0.8782    6      1           0.050000    .

Gaussian FIPS vs. Canonical PSO

Ref: Jaccard, J. & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.

Modified Bonferroni correction
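The "New alpha" column follows a rank-ordered rule: the k-th smallest p-value is tested at alpha/(m − k + 1). A sketch, with the Holm-style step-down stopping rule (no test is called significant after the first failure) as an assumption:

```python
def modified_bonferroni(p_values, alpha=0.05):
    """Rank p-values ascending; test the k-th smallest at alpha/(m - k + 1).
    Returns (p, adjusted alpha, significant?) per test, in input order,
    with significance cut off at the first non-significant test."""
    m = len(p_values)
    ranked = sorted(enumerate(p_values), key=lambda t: t[1])
    out = [None] * m
    still_sig = True
    for k, (idx, p) in enumerate(ranked, start=1):
        new_alpha = alpha / (m - k + 1)
        still_sig = still_sig and (p < new_alpha)
        out[idx] = (p, new_alpha, still_sig)
    return out

# The six p-values from the table above, in table order:
res = modified_bonferroni([0.0030, 0.0048, 0.0118, 0.0333, 0.6583, 0.8782])
```

Run on the table's p-values, this reproduces its pattern: the first three tests are significant, the rest are not.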

Page 38

Mendes: Two Measures of Performance

Color and shape indicate parameters of the social network – degree, clustering, etc.

Page 39

Best Topologies

Best-neighbor versions

FIPS versions

Page 40

Worst FIPS Sociometries and Proportions Successful

Page 41

Understanding the Particle Swarm

Lots of variations in particle movement formulas

Teachers and learners

Propagation of knowledge is central

Interaction method and social network topology … interact

It’s simple, but difficult to understand

Page 42

In Sum

There is some fundamental process
• Uses information from neighbors
• Seems to require balance between “persistence” and “influence”

Decomposing a version we know is OK
• We can understand it
• We can improve it

Particle Trajectories
• Arbitrary? Not quite
• Can be replaced by an RNG (the trajectory is not the essence)

How it works
• It works by sharing successes among individuals
• Need to look more closely at the sharing phenomenon itself

Page 43

Send me a note

Jim Kennedy

[email protected]