
Page 1: Course: Applications of Information Theory to Computer Science CSG195, Fall 2008 CCIS Department, Northeastern University Dimitrios Kanoulas.

Course:

Applications of Information Theory to Computer Science

CSG195, Fall 2008, CCIS Department, Northeastern University

Dimitrios Kanoulas

Maximum Entropy Correlated Equilibria

by L. Ortiz, R. Schapire and S. Kakade

Page 2:

Maximum Entropy Correlated Equilibria

Page 3:

Information Theory

Maximum Entropy Correlated Equilibria

Page 4:

Information Theory Algorithmic Game Theory

Maximum Entropy Correlated Equilibria

Page 5:

Algorithmic Game Theory

Page 6:

Definitions

Game Theory:

Studies the behavior of players in competitive and collaborative situations

[Christos Papadimitriou in SODA 2001]

Page 7:

Game (example: Road Intersection)

Problem (Game): Two cars, a red one and a white one [the players of the game], arrive at a road intersection without a traffic light at the same time. Each driver decides to stop (S) or go (G) [the two pure strategies of the game].

Payoffs for the red/white car are defined by the matrix (rows: red car, columns: white car):

                     white car
                     S          G
  red car     S      (1,1)      (0,5)
              G      (5,0)      (-10,-10)

GOAL for each player: Maximize his payoff
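For a small matrix game like this, the pure-strategy equilibria can be found by brute force: a cell is a pure NE exactly when neither player can improve by switching actions. A minimal Python sketch (the function name `pure_nash` is illustrative, not from the slides):

```python
# Payoff matrix of the road-intersection game from the slide:
# (red_action, white_action) -> (red_payoff, white_payoff)
payoff = {
    ("S", "S"): (1, 1), ("S", "G"): (0, 5),
    ("G", "S"): (5, 0), ("G", "G"): (-10, -10),
}
actions = ("S", "G")

def pure_nash(payoff, actions):
    """Return all joint actions where no player gains by unilaterally deviating."""
    eqs = []
    for a_red in actions:
        for a_white in actions:
            u_red, u_white = payoff[(a_red, a_white)]
            red_ok = all(payoff[(d, a_white)][0] <= u_red for d in actions)
            white_ok = all(payoff[(a_red, d)][1] <= u_white for d in actions)
            if red_ok and white_ok:
                eqs.append((a_red, a_white))
    return eqs

print(pure_nash(payoff, actions))  # [('S', 'G'), ('G', 'S')]
```

The search confirms the two pure equilibria: one car stops while the other goes.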

Page 8:

Game (example: Road Intersection)

Equilibrium in a Game:

Each player picks a strategy such that no one wants to unilaterally deviate from it.

Page 9:

Game (example: Road Intersection)

Payoff matrix (rows: red car, columns: white car):

                     white car
                     S          G
  red car     S      (1,1)      (0,5)
              G      (5,0)      (-10,-10)

Nash Equilibria:
(1) The white car stops and the red car goes (pure NE)
(2) The red car stops and the white car goes (pure NE)
(3) Each car stops with probability 5/7 and goes with probability 2/7 (mixed NE; these probabilities make each driver indifferent between S and G)

Page 10:

Game (example: Road Intersection)

John Nash (subject of the movie "A Beautiful Mind")

Page 11:

Game (example: Road Intersection)

There always exists a mixed strategy Nash Equilibrium.

Page 12:

Game (example: Road Intersection)

There is a traffic light that makes a suggestion individually to each car:

SIGNAL (joint distribution over the signals the two cars see):

                          white car
                          Red Light    Green Light
  red car  Red Light      0            0.5
           Green Light    0.5          0

Correlated Equilibrium:
• The suggestion: go if you see a green light and stop if you see a red light.

[A mixture of the two pure NE. Overall, each car goes with probability ½ and stops with probability ½.]
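The incentive constraints of a correlated equilibrium can be checked numerically: for each player, each recommended action, and each possible deviation, obeying the suggestion must be at least as good in expectation. A minimal Python sketch for the traffic-light distribution (helper names are illustrative):

```python
# Road-intersection payoffs: (red_action, white_action) -> (red, white)
payoff = {
    ("S", "S"): (1, 1), ("S", "G"): (0, 5),
    ("G", "S"): (5, 0), ("G", "G"): (-10, -10),
}
# Joint distribution induced by the traffic light (red light means "stop"):
P = {("S", "G"): 0.5, ("G", "S"): 0.5}

def is_correlated_eq(P, payoff, actions=("S", "G")):
    """Check that no player gains by deviating from any recommended action."""
    for player in (0, 1):
        for rec in actions:        # recommended action
            for dev in actions:    # candidate deviation
                gain = 0.0
                for a, p in P.items():
                    if a[player] != rec:
                        continue
                    d = list(a)
                    d[player] = dev
                    gain += p * (payoff[tuple(d)][player] - payoff[a][player])
                if gain > 1e-12:   # profitable deviation found
                    return False
    return True

print(is_correlated_eq(P, payoff))  # True
```

The uniform distribution over all four cells, by contrast, fails the check: a car told to go would sometimes crash, and a car told to stop would sometimes rather go.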

Page 13:

Quote

The general problem of equilibrium computation is fundamental in Computer Science

[Christos Papadimitriou]

Page 14:

Some Definitions

Game:

Players
• A set of n players

Pure Strategies

• Each player i has a set Ai of pure strategies (actions)
• A joint action a = (a1, . . . , an) lies in the joint-action space A = A1 × · · · × An, where player i plays strategy ai
• A-i is the joint-action space of all players except player i, and a-i is a joint action in A-i

Payoffs
• Payoff matrix Mi: A -> Real Numbers gives player i's payoff Mi(a1, . . . , an)
• Player i's expected payoff for playing ai: Σa-i P(a-i|ai) Mi(ai, a-i)

Mixed Strategies

• Each player i may play according to a probability distribution P over Ai, which is called a mixed strategy
• P(ai): the mixed strategy of player i
• P(a-i|ai): the conditional joint mixed strategy of all players except i, given the action of player i
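The expected-payoff formula above translates directly into code. A minimal sketch for the red car in the road-intersection game (names like `expected_payoff` are illustrative):

```python
# Expected payoff of player i for action a_i under P(a_-i | a_i):
#   sum over a_-i of P(a_-i | a_i) * M_i(a_i, a_-i)
def expected_payoff(M_i, a_i, cond):
    """cond maps each opposing action a_-i to P(a_-i | a_i)."""
    return sum(p * M_i[(a_i, a_mi)] for a_mi, p in cond.items())

# Red car's payoffs from the road-intersection matrix.
M_red = {("S", "S"): 1, ("S", "G"): 0, ("G", "S"): 5, ("G", "G"): -10}

# Suppose the white car stops with probability 5/7 regardless of red's action:
cond = {"S": 5 / 7, "G": 2 / 7}
print(expected_payoff(M_red, "S", cond))  # ≈ 0.714 (= 5/7)
print(expected_payoff(M_red, "G", cond))  # ≈ 0.714 (= 5/7)
```

Both actions yield the same expected payoff here, which is exactly the indifference condition that characterizes a mixed equilibrium.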

Page 15:

Some Definitions

Equilibrium:

Every player is “happy” with his [pure or mixed] strategy, meaning that he cannot increase his payoff by unilaterally deviating from it.

Correlated equilibrium (CE): A joint probability distribution P(a1, . . . , an) such that:
• Every player individually receives a “suggestion” drawn from P
• Knowing P, every player is happy with this “suggestion” and does not want to deviate from it

Nash Equilibrium (NE): The special case of CE in which P is a product distribution: P = Π P(ai)
A NE always exists, but the problem of finding one is hard (PPAD-complete) even for a two-player game.

[Chen & Deng]

Page 16:

Good vs. Bad Equilibria

Is the equilibrium “good” or “bad” ?

What if I want to add some properties to my equilibrium ?

Page 17:

Connection between:

Algorithmic Game Theory
AND

Information Theory

Page 18:

Maximum Entropy Correlated Equilibria

[MaxEnt CE]

• In a game we have at least one correlated equilibrium P.

• P is the joint mixed strategy

• Given P, let H(P) = Σa in A P(a) ln(1/P(a)) be its (Shannon) entropy.
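The definition of H(P) translates directly into code. A minimal sketch using natural logarithms, as in the slide (the distributions below are illustrative):

```python
import math

# Shannon entropy (in nats) of a joint mixed strategy P:
#   H(P) = sum over a in A of P(a) * ln(1 / P(a))
def entropy(P):
    return sum(p * math.log(1 / p) for p in P.values() if p > 0)

# Entropy of the traffic-light CE vs. a uniform joint distribution:
ce = {("S", "G"): 0.5, ("G", "S"): 0.5}
uniform = {a: 0.25 for a in [("S", "S"), ("S", "G"), ("G", "S"), ("G", "G")]}
print(entropy(ce))       # ln 2 ≈ 0.693
print(entropy(uniform))  # ln 4 ≈ 1.386
```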

Page 19:

The property of this Equilibrium

Changed Game: A player is willing to negotiate and agree to some form of “joint” strategy with the other players.

BUT
At the same time, the player wants to hide his own behavior as much as he can, by making it difficult to predict.

OR

We want to suggest a joint strategy that satisfies all the players but complicates their prediction of each other's individual strategies.

Page 20:

The property of this Equilibrium

The conditional entropy in information theory provides a measure of the predictability of one random process from another

The larger the conditional entropy, the harder the prediction.[Cover and Thomas]

Ai: the strategy of player i (random variable)
A−i: the strategies of the rest of the players (random variable)

P(ai|a−i): the conditional mixed strategy where player i picks ai given that the rest of the players pick a−i

Page 21:

The property of this Equilibrium

HAi|A−i(P) = − Σa−i in A−i P(a−i) Σai in Ai P(ai|a−i) log P(ai|a−i)

the conditional entropy of the strategy of player i given the strategies of the rest of the players

SO: the larger the conditional entropy, the harder the prediction.
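The conditional entropy above can be computed from a joint distribution by forming the conditionals P(ai|a−i). A two-player Python sketch (function name illustrative):

```python
import math

def conditional_entropy(P, i=0):
    """H(A_i | A_-i) for a two-player joint distribution P over action pairs."""
    # Marginal over the other player's action.
    marg = {}
    for a, p in P.items():
        marg[a[1 - i]] = marg.get(a[1 - i], 0.0) + p
    h = 0.0
    for a, p in P.items():
        if p > 0:
            cond = p / marg[a[1 - i]]  # P(a_i | a_-i)
            h -= p * math.log(cond)
    return h

# The traffic-light CE is fully predictable given the other car's action:
ce = {("S", "G"): 0.5, ("G", "S"): 0.5}
print(conditional_entropy(ce))  # 0.0
```

For this CE the conditional entropy is zero: seeing the other car's action reveals your own exactly, so it is maximally easy to predict, which is what motivates looking for a maximum-entropy CE instead.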

Page 22:

The property of this Equilibrium

MaxEnt CE: the joint mixed strategy P* = arg maxP in CE HAi|A−i(P)

[The probability distribution over joint strategies that is a CE and maximizes this conditional entropy]

MaxEnt CE satisfies all the players and maximizes the hardness of predictions.
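The paper gives dedicated algorithms for this optimization; purely as an illustration of the definition (not the paper's method), a coarse stdlib-only grid search over joint distributions of the road-intersection game can approximate a maximum-entropy CE. As a simplification, the sketch uses the joint entropy H(P) as the objective, and all names are illustrative:

```python
import itertools
import math

payoff = {("S", "S"): (1, 1), ("S", "G"): (0, 5),
          ("G", "S"): (5, 0), ("G", "G"): (-10, -10)}
acts = ("S", "G")
states = [("S", "S"), ("S", "G"), ("G", "S"), ("G", "G")]

def is_ce(P, tol=1e-9):
    """Correlated-equilibrium check: no profitable deviation from any suggestion."""
    for i in (0, 1):
        for rec in acts:
            for dev in acts:
                gain = 0.0
                for a, p in P.items():
                    if a[i] != rec:
                        continue
                    d = list(a)
                    d[i] = dev
                    gain += p * (payoff[tuple(d)][i] - payoff[a][i])
                if gain > tol:
                    return False
    return True

def entropy(P):
    return sum(p * math.log(1 / p) for p in P.values() if p > 0)

# Grid search: enumerate distributions with probabilities in multiples of 1/n,
# keep the CE with the highest entropy.
best, best_h = None, -1.0
n = 40
for c in itertools.product(range(n + 1), repeat=3):
    if sum(c) > n:
        continue
    w = list(c) + [n - sum(c)]
    P = {a: wi / n for a, wi in zip(states, w)}
    if is_ce(P):
        h = entropy(P)
        if h > best_h:
            best, best_h = P, h

print(best_h)  # at least ln 2 ≈ 0.693, since the traffic-light CE is on the grid
```

The best grid CE has strictly more entropy than the traffic-light CE, illustrating that the CE polytope contains harder-to-predict joint strategies than any single pure-NE mixture.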

Page 23:

Additional Info

• MaxEnt CE has some other interesting properties concerning its representation, which is much more compact than that of an arbitrary CE

• The paper proposes two algorithms that converge to a MaxEnt CE and use LP to solve the underlying maximization problem

• There is also another algorithm for computing a MaxEnt CE, in which, at each iteration, each player “learns” from the previous iteration and updates his play. It also converges to a MaxEnt CE [but not to a NE]

Page 24:

Thank You

A mathematician is a device for turning coffee into theorems.
~ Paul Erdős