Rejoinder to the discussion of ‘Adversarial risk analysis: Borel games’


Rejoinder

(wileyonlinelibrary.com) DOI: 10.1002/asmb.891 Published online in Wiley Online Library

Rejoinder to the Discussion of ‘Adversarial risk analysis: Borel games’

We thank Professors Kadane and Polson for their thoughtful reading and insightful comments. We appreciate the feedback and the suggestions.

Regarding Kadane’s discussion, the historical perspective is important. There is a long history in this area, and strategic analysis has drawn the attention of some of the great mathematicians, psychologists, and philosophers. Our work in this area is a small pebble offered to a large intellectual edifice.

As Kadane indicates, the core question, for someone who wants to use maximum expected utility as the solution concept in an adversarial game, is how to develop a marginal distribution over the actions of one’s opponent. Previous literature does not provide much guidance on processes for doing this. We certainly stipulate that there are many possible ways to approach this, but we believe that the mirroring argument has attractive plausibility and flexibility.

The basic strategy in the mirroring argument is to build a specific model for the analysis the opponent is making. Then one uses subjective distributions on all the quantities that are unknown to the decision-maker.

If one believes that the opponent is not doing the same, we call that a zero-order analysis. It is still possible that the decision-maker believes that the opponent is doing sophisticated modeling and analysis, but he does not believe that the opponent is building a model for the decision-maker’s analysis. A first-order analysis supposes that the opponent is strategic, and has built a model for the decision-maker’s thinking; in that case the decision-maker must maximize the expected utility for a structure with a nested model that describes the opponent’s model for the decision-maker. This can go further, if the decision-maker supposes that the opponent has a model that includes the decision-maker’s model for the opponent.

Irresistibly, one is reminded of the dialogue from The Princess Bride:

Vizzini: But it’s so simple. All I have to do is divine from what I know of you: are you the sort of man who would put the poison into his own goblet or his enemy’s? Now, a clever man would put the poison into his own goblet, because he would know that only a great fool would reach for what he was given. I am not a great fool, so I can clearly not choose the wine in front of you. But you must have known I was not a great fool, you would have counted on it, so I can clearly not choose the wine in front of me.
Man in Black: You’ve made your decision then?
Vizzini: Not remotely. Because iocane comes from Australia, as everyone knows, and Australia is entirely peopled with criminals, and criminals are used to having people not trust them, as you are not trusted by me, so I can clearly not choose the wine in front of you.
Man in Black: Truly, you have a dizzying intellect.
Vizzini: Wait till I get going! Now, where was I?

And of course it is easy to satirize this kind of reasoning, but (to a limited and approximate degree) it seems to be the kind of thing that humans do. Stahl and Wilson (1995) report behavioral evidence for this, and in the context of poker, the whole basis for bluffing is contingent upon a nested model for the opponent’s analysis.

Kadane’s discussion mentioned games with simultaneous play, and Polson’s discussion asked for more general guidance on when the ARA approach might be used. So perhaps an example from auctions would be responsive to both.

Suppose that Apollo is engaged in a first-price sealed-bid auction with an opponent, say Daphne. Both are bidding for a first edition of the Theory of Games and Economic Behavior. In a zero-order analysis Apollo might suppose that Daphne’s bid will not take account of Apollo’s strategy. In that case Apollo could develop his subjective distribution F for Daphne’s bid d in many ways, but one natural choice would be to use the empirical c.d.f. of historical data on eBay bids for similar books. Then the maximum expected utility rule would have Apollo make the bid

a0 = argmax_{a∈R+} (a∗ − a)F(a),

where a∗ is the true value of the first edition to Apollo. (Here F(a) is the probability that Apollo’s bid wins, and a∗ − a is the profit he makes from that bid.)
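The zero-order rule can be sketched numerically. The following is a minimal illustration, not code from the paper; the historical bids, Apollo’s value a∗ = 500, and the bid grid are all hypothetical choices.

```python
# Zero-order analysis: maximize (a* - a) * F(a), where F is the
# empirical c.d.f. of hypothetical historical bids for similar books.
import bisect

past_bids = sorted([120, 150, 180, 210, 260, 320, 400])  # hypothetical eBay data

def F(a):
    # empirical c.d.f.: fraction of past bids <= a
    return bisect.bisect_right(past_bids, a) / len(past_bids)

a_star = 500                                  # Apollo's value for the book
grid = range(0, a_star + 1, 10)               # candidate bids
a0 = max(grid, key=lambda a: (a_star - a) * F(a))
print(a0)  # -> 260
```

Note the trade-off the rule captures: bidding higher raises the win probability F(a) but lowers the profit a∗ − a, and the optimum sits where the product peaks.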


Copyright © 2011 John Wiley & Sons, Ltd. Appl. Stochastic Models Bus. Ind. 2011, 27 92–94


In a first-order analysis, Apollo expects that Daphne will be modeling Apollo’s decision process. Depending on what Apollo thinks about her solution concept, he could ascribe any of several possible models to Daphne. One such model assumes that Daphne is also an expected utility maximizer, and that she is using such a model for Apollo’s process.

As Kadane emphasizes, Apollo must find his marginal distribution F for Daphne’s bid. Given that, Apollo maximizes his expected utility by bidding a0 = argmax_{a∈R+} (a∗ − a)F(a). But in this first-order analysis, he uses mirroring to model Daphne’s bid as the solution to d0 = argmax_{d∈R+} (d∗ − d)G(d), where, symmetrically, G is the distribution Daphne believes that Apollo’s bid will follow.

But Apollo cannot duplicate Daphne’s calculation, since he knows neither her value d∗ for the book, nor the value she thinks Apollo puts on the book, nor the value she thinks Apollo believes is her value for the book. As a Bayesian, Apollo must express his uncertainty about all three quantities through distributions.

The notation becomes complicated; the following key is helpful:

• a∗ is Apollo’s value for the book.
• D∗ is Daphne’s value for the book; since it is unknown to Apollo, he assigns it the distribution HD.
• A∗ is the random variable that Apollo thinks Daphne uses to represent Apollo’s value for the book; it has distribution HA.
• F is Apollo’s belief about the distribution of Daphne’s bid.
• G is Apollo’s inference about Daphne’s distribution on Apollo’s bid.
• D0 is Daphne’s bid; A0 is Apollo’s bid from Daphne’s perspective.

These probabilities all belong to Apollo; he imputes the beliefs that Daphne holds. If he is mistaken, he diminishes his chance of maximizing his gain.

To determine his bid a0, Apollo needs F, the distribution of Daphne’s bid. He knows that Daphne’s bid D0 should satisfy D0 = argmax_{d∈R+} (D∗ − d)G(d), where D∗ is Daphne’s value (a random variable, to Apollo) for the book and G is Apollo’s estimate of Daphne’s probability that a bid of d exceeds Apollo’s bid A0.

And, to Daphne, A0 = argmax_{a∈R+} (A∗ − a)F(a), where A∗ is Daphne’s belief about Apollo’s value for the book and F is Apollo’s estimate of Daphne’s probability that a bid of a exceeds her bid D0. Thus D0 ∼ F and A0 ∼ G.

Apollo must find his personal belief about F by solving the system:

argmax_{d∈R+} (D∗ − d)G(d) ∼ F,
argmax_{a∈R+} (A∗ − a)F(a) ∼ G.

The distributions for D∗ and A∗ are HD and HA, respectively. Once Apollo has F, he solves a0 = argmax_{a∈R+} (a∗ − a)F(a) to determine his bid.

To solve this system of equations, one iteratively alternates between the two equations until convergence. We believe that this algorithm converges, and are working to obtain a fixed-point theorem.

We note that this kind of ARA framework allows Apollo to incorporate secret information. For example, suppose Apollo alone knows that the book was owned by Sir Ronald Fisher, with annotations in his hand. In that case, his personal value a∗ is high, but his distribution for Daphne’s value, HD, will concentrate on much smaller values. Apollo would make a slightly larger bid than typical, and expect to win the book and make a handsome profit.
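A numerical sketch of the alternating scheme for the system above may help. This is our illustrative discretization, not the authors’ published algorithm: the priors HD and HA (taken here as Uniform(0.4, 1)), the bid grid, the starting guess for F, and the fixed iteration count are all hypothetical choices.

```python
# Alternate between the two mirroring equations: given F, the bids
# A0 = argmax (A* - a)F(a) define a new G; given G, the bids
# D0 = argmax (D* - d)G(d) define a new F. Repeat until (F, G) settles.
import bisect
import random

random.seed(1)
GRID = [i / 100 for i in range(101)]   # candidate bids on [0, 1]
N = 500                                # Monte Carlo draws from HD and HA

def empirical_cdf(samples):
    s = sorted(samples)
    return lambda x: bisect.bisect_right(s, x) / len(s)

def best_bid(value, cdf):
    # argmax over the grid of expected profit (value - bid) * P(win)
    return max(GRID, key=lambda b: (value - b) * cdf(b))

# Hypothetical priors: HD and HA taken to be Uniform(0.4, 1.0).
D_star = [random.uniform(0.4, 1.0) for _ in range(N)]  # Daphne's value, to Apollo
A_star = [random.uniform(0.4, 1.0) for _ in range(N)]  # Apollo's value, as Daphne models it

F = empirical_cdf([v / 2 for v in D_star])             # crude initial guess for F
for _ in range(10):
    G = empirical_cdf([best_bid(v, F) for v in A_star])  # A0 ~ G given F
    F = empirical_cdf([best_bid(v, G) for v in D_star])  # D0 ~ F given G

a_star = 0.9                            # Apollo's own value for the book
a0 = best_bid(a_star, F)                # Apollo's maximum-expected-utility bid
print(round(a0, 2))
```

The result is a bid strictly between zero and a∗: positive because F puts mass below a∗, and below a∗ because bidding one’s full value earns nothing.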

Similarly, he might know that Daphne knows the provenance of the book but thinks that Daphne believes, falsely, that Apollo does not. In that case HD will concentrate on large values, but Apollo’s belief about what Daphne thinks is his value for the book, HA, will concentrate on small values. Then Apollo should make a substantially larger bid, since he anticipates that Daphne would act as he would, and make a larger bid than typical. He still expects to win the book, but with a smaller profit.

In principle, one could go into an infinite regress: Apollo thinks that Daphne thinks that Apollo thinks that Daphne thinks that . . .. But for most human reasoning, it is probably quite reasonable to stop at the third step, with the distribution HA for A∗, as described in the mirroring analysis.

Obviously, no human (except possibly Vizzini) reasons in this way. Nor do humans use minimaxity, or Bayesian Nash equilibria, or any of the other formal approaches to strategic decisions. Camerer (2002) suggests that behavioral studies indicate that people use heuristic shortcuts and implicitly incorporate a cost for computation when making their mental calculations.

Even so, the outline of the calculation that Apollo makes seems to us similar to the analysis that a contractor might use in bidding on, say, building a bridge. The contractor would know exactly who his competitors were, and a great deal about their financial situations, current obligations, and so forth. The contractor would also have a shrewd guess about what his competitors know about his own business situation. In some heuristic and imprecise way, we believe that an experienced contractor would use some sort of ‘I think that he thinks that I think’ logic in developing his bid.


By a natural segue, this brings us back to Polson’s discussion of the computational methods. For the game with simultaneous play, we find that the main computational question is whether there is a fixed point (F, G) to which the algorithm converges. In another paper, regarding simultaneous routing games, Wang and Banks (2011) develop sufficient conditions to ensure convergence, and the calculations are much like those that Polson suggests. For sequential play, such as the Borel game treated in this paper, the calculations are lengthy and generally tedious, but the underlying principle is straightforward. As stated in the paper, symbolic computation would enable solutions for games with multiple players, multiple rounds of betting, and continuous bets; each of these possibilities is difficult for classical game theory, and the combination seems nearly intractable (cf. Karlin and Restrepo, 1957; Ferguson and Ferguson, 2007).

In closing, we entirely agree with both discussants: the minimax principle is too pessimistic, and a poor description of human behavior. The ARA approach is rooted in expected utility maximization, and thus enjoys the advantages noted in Kadane’s historical survey. Whether the ARA approach is a good reflection of human decision making is arguable; clearly it is not something that anyone actually does, but we feel it may represent the kind of calculation that humans aspire to do. Chess players certainly engage in this kind of ‘if I do this then she’ll do that and I’ll do this ...’ thinking, and if the game had a random component (e.g. fairy chess) then it would closely match the logic in this paper. And the ARA method also allows the decision-maker to directly factor in beliefs about the opponent’s style, in the way that Polson believes that on-line poker players bluff more aggressively than old-school green eyeshade players would do.

DAVID BANKS

FRANCESCA PETRALIA

Department of Statistical Science, Duke University
E-mail: [email protected]; [email protected]

SHOUQIANG WANG

Fuqua School of Business, Duke University
E-mail: [email protected]
