# Temptation and Subjective Feasibility


Temptation and Subjective Feasibility

Madhav Chandrasekher∗

Current Draft: February 26, 2014

Abstract: This paper develops a new model of menu choice induced by temptation. The presence of temptation creates a utility cost in that choices cannot be recovered by maximization of consumption utility. However, in our model the cost of temptation is borne through constraints on the DM's choice domain as opposed to an explicit cost function. The idea is that, within a menu, objects are grouped together by attribute and the DM is only able to select an attribute possessed by a target choice, as opposed to selecting the choice itself. For example, when the DM looks at a dinner menu he can partly control himself by looking at the low-calorie portion of the menu, but within this subset he might be drawn to the most tempting option. The main feature of our model is the manner in which choices are grouped by attributes; we refer to these groupings as categories. Categories are subjective, so the exercise of the paper is to show how they can be elicited and identified from choices over menus.

Keywords: Menu Choice, Preference for Commitment, Temptation.

1 Introduction

This paper revisits the issue of temptation-driven preferences. We begin by recalling the choice environment. In period 1 (the menu choice stage), a decision-maker (DM) commits to some option set (menu) of period 2 choices. In period 2 (the consumption choice stage), the DM makes a selection from the period 1 menu. Choices in period 2 might look anomalous since they need not agree with the period 1 ranking over consumption commitments, i.e. singleton menus. The latter ranking is often labelled the "normative" preference. In cases where the period 2 choices do not agree with the maximizer of the normative preference we will say that the DM is constrained by "temptation". For a typical example, imagine the DM expresses a period 1 preference for a hard deadline to complete a project, e.g. he chooses the option of a hard deadline over a flexible deadline whenever both are offered. This indicates a (normative) preference not to delay. On the other hand, we also observe that the DM chooses to delay completion when he is not offered a firm deadline. One explanation for examples of this kind is that there is some unobservable constraint (temptation) affecting the DM at the point of choice that prevents him/her from choosing the normative maximizer on the menu.

∗Mailing address: Department of Economics, W.P. Carey School of Business, P.O. Box 873806, Tempe, AZ 85287-3806; E-mail address: [email protected].

The decision-theory literature on temptation, after Gul and Pesendorfer (2001) (GP), provides a time-consistent explanation for this conflict between normative preference and period 2 choice. The GP model shows that we can recover this seemingly inconsistent behavior via maximization of an alternative "net welfare" ranking on choices. This ranking is composed from two utility functions u, v representing (resp.) normative and "temptation" preference. The net welfare of a choice is the difference between its normative value and the cost of choosing this option over the v-maximizer (i.e. the most tempting option) on the menu. Choices which maximize net welfare fall into one of two categories. Either the DM pays a utility cost to select something other than a v-maximizer on the menu, or the DM selects the v-maximizer on the menu. In the former case we say that the DM is exerting (costly) self-control and in the latter that he is giving in to temptation. On account of this dichotomy, we refer to this approach to modeling temptation as the "costly self-control" approach.

In this paper, we develop a different model of temptation. The motivation comes from a class of examples where the DM is typically giving in to temptation, but asserts self-control in how severely he gives in. Casual empiricism suggests that there are settings in which this describes how temptation distorts choices: (i) a DM tempted by multiple gambling options mitigates this temptation by choosing low-stakes options (e.g. slot machines), (ii) a flexible completion deadline induces some procrastination but not indefinite delay, (iii) deviations from a diet involve a minor indulgence which only occurs in the presence of more severe temptations, and so on. In each of these cases, background temptations distort the DM's choice even though the selected choice is not by itself particularly tempting – so that the DM can avoid this option when it is offered in isolation. In this sense, we think of the DM's choice as a proxy commitment device, and the act of self-control consists in avoiding the more harmful temptations on the menu by virtue of giving in to the less harmful ones. This type of temptation-induced choice is important. First, in addition to its plausibility, there is some experimental evidence for this behavior as it nests (temptation-driven) versions of two well-documented choice anomalies, viz. the compromise and attraction effects. Second, DMs who respond to temptation in this manner offer more room for welfare-improving intervention than DMs with costly self-control preferences (after Gul and Pesendorfer (2001)).

The latter property is a consequence of two features. First, the menu of temptations induces a choice that the DM would not be tempted enough to choose otherwise. Second, choice distortion takes the form of a nudge rather than a push, e.g. playing the slot machines rather than a high-stakes card game. In these situations, a policy which prohibits tempting options (usually) improves welfare since it removes those options which are inducing the nudge. Recall that an important feature of the GP model is that the DM only expresses a preference for commitment when he is not giving in to temptation. Hence, in the GP model partially prohibitive policies which take away some options can harm DMs who are giving in to temptation because this was less costly than exerting self-control, which intervention now forces upon them.[1] The reason for the different welfare implications is that, for the behavior considered here, changes in welfare can only occur by changing choices. This type of choice behavior is, in general, difficult to capture using a costly self-control approach, after Gul and Pesendorfer (2001).[2] At the same time, temptation must create costs somehow since the DM expresses a preference for commitment. Moreover, the DM is exerting self-control since he is just allowing himself to be nudged by temptation. Thus, we need to find an alternative way to model both the welfare cost of temptation and the fact that the DM is optimally exerting self-control against temptation.

The idea behind our model is that choices are implicit bundles of attributesand these attributes can account for the choice of one object over another. Ob-serving a choice of x over y (more specifically, a menu containing x over a menucontaining y) we should be able to explain this choice as arising from a choice ofan attribute possessed by x over attribute(s) possessed by y. Accordingly, we recastthe DM’s optimization problem over the proxy choice domain of attributes. The de-cision procedure is as follows. In the menu choice phase, the DM evaluates a menuby grouping consumption choices by attributes. If two consumption choices differin some attribute, then the DM can use that attribute as a commitment device toselect one choice over another. We view the (implicit) choice of an attribute as aperiod 1 decision, i.e. when the DM selects a given menu it is as if he commits toonly selecting a consumption choice with a certain attribute. In this sense, the setof attributes is the self-control variable in the model. When these map one-to-one toactual choices, the DM has perfect self-control and we recover the standard model.However, in the given menu there may be more than one consumption choice withthe chosen attribute. This indeterminacy is resolved in period 2 in which the DM,having “chosen” an attribute and anticipating temptation’s influence, realizes thathe may not be able to stay away from tempting options which possess that attribute.Hence, with a worst-case evaluation in mind, the period 1 assessment of a menu isthe value of the least-worst temptation, i.e. equivalently the DM chooses the at-tribute possessed by the least harmful temptation.

To illustrate the procedure, imagine the DM values timely completion of a project but is tempted to shirk when given a flexible deadline. A consumption choice here would be a completion time. The DM can control his desire to shirk by grouping completion times into categories, e.g. a little late, late, very late, etc. In this way he can partially control the temptation to delay. However, these categories do not perfectly discriminate between choices – so that the decision maker cannot separate two choices which offer similar levels of delay. Hence, all else equal, given two completion times that fall into the same group the temptation to shirk wins over and the DM selects – within that group – the completion time which affords maximal delay. That the "level of delay" variable only coarsely maps to choices, and that this is the only attribute that he can use to discriminate between choices, is how temptation creates welfare distortion, viz. the presence of a flexible deadline nudges a DM, who has a normative preference for timely completion, into a choice of incremental delay. More generally, there may be multiple attributes that the DM can use to discriminate among choices. Each attribute serves as a commitment device, but – as in the previous example – there is a limit to how well this controls period 2 choice on account of multiple choices which possess this attribute.

[1] The analysis in Gul and Pesendorfer (2007) shows that price policies (e.g. taxing temptations) can affect choices but can also distort welfare downwards. From this perspective, prohibitive policies are better than taxes since they can (so long as the DM is not giving in) lead to a welfare increase, even though they might not affect choices.

[2] The source of the difficulty is a subtle point that requires more details to do it full justice. We refer the interested reader to section 4.

An attribute in our model is a subset of choices consisting of those elements in consumption space (implicitly) possessing that attribute. Moreover, the space of attributes is a subjective construct. After Gul et al. (2010), the intuition is that it is not so much the physical attributes of the choices that matter – these may well be observable, so that we can just enlarge the objective choice space to take objective attribute descriptions into account. What matters are those physical attributes that affect the DM's choice. Attributes and attribute combinations relevant for choice are subjective objects and must be elicited from choice behavior. The main results of the paper elicit and identify attributes taking preferences over menus as primitive. The menu choice viewpoint is important since our interpretation of the model, viz. where the DM anticipates temptation and solves an optimization to assess the ex ante value of a menu, makes more sense when there is a temporal lag between the point of commitment (when the DM chooses an attribute) and the point of consumption (when he yields to temptation). On the other hand, temptation causes welfare loss only by changing choices (in our model). Hence, it is natural to ask whether we can turn the tables on the typical menu preference exercise and, assuming period 2 choice, try to recover the implicit model over menus that generated this period 2 choice.

While welfare loss only occurs via choice distortion in our model, a critical feature is that there is non-trivial preference for commitment. For example, the menu preference exhibits non-degenerate Set-Betweenness (after Gul and Pesendorfer (2001)). Nevertheless, it turns out that the model can be recovered from an a priori weaker set of observables, viz. if we observe not just period 2 choices but also the normative ranking, i.e. the DM's ranking on singleton menus. These two observables turn out to be sufficient to recover the model over menus. Having mentioned this equivalence, however, we believe the most intuitive way to view the model is as a particular example of optimal self-control in anticipation of temptation (as described above). Hence, when we initially introduce primitives and axioms in the next section we use the language of menu preferences. Section 3 presents our two main results – a representation theorem for the model and identification of the space of attributes. In section 4 we compare with other temptation models and present the results recovering the model from the ex post observables. All proofs are in the appendix.

1.1 Related Literature

This paper is most closely related to the axiomatic literature on temptation and self-control. Our menu preferences are "temptation-driven" in the spirit of Dekel et al. (2009) (DLR). The model is inspired by some of the examples in DLR, which are used to motivate a multiple-temptation generalization of Gul and Pesendorfer (2001). The DLR temptation model is still one of costly self-control, hence it cannot explain the types of temptation we are interested in modeling. However, we should add that our examples are themselves modifications of the DLR examples. The model in Noor and Takeoka (2010) can explain some of our examples (viz. the compromise effect), but not others (viz. aggregation effects). The connections to the Gul and Pesendorfer (2001), Dekel et al. (2009), and Noor and Takeoka (2010) models are examined in section 4.

There is also a "reduced form" representation of the model where the utility on menus can be expressed as maximization of the normative preference on a choice filter, following the recent literature on quasi-rational choice, e.g. Lleras et al. (2008), Ok et al. (2010), de Clippel and Eliaz (2012), Manzini and Mariotti (2012), Masatlioglu et al. (2012), Cherepanov et al. (2013). This group of papers develops models of two-stage constrained optimization which can also be expressed as maximization of a preference against a choice filter.[3] In section 4, we characterize conditions on the choice filter that determine when it is derived from the (ex ante) model over menus.

2 Model

The choice environment is described as follows.

• Let X = {x1, . . ., xn} be an enumeration of the consumption space.

• Let M denote the collection of non-empty subsets of X (menus).

[3] For some of the papers the models explicitly invoke choice filters as a parameter and for others the choice filter is a reduced-form representation.


• Let P(X) be the set of complete, transitive preference relations defined on M.

The behavioral primitive for this paper is a preference relation ≽ ∈ P(X). Henceforth, preference relations will always be assumed to be complete and transitive. For any menu A we take sup(A), inf(A) to be (resp.) the set of (≽)|X-maximal (resp. (≽)|X-minimal) elements in the menu A, where (≽)|X is the ranking on singleton menus. Finally, we denote by (−∞, x) (resp. (x, ∞), [x, ∞), (−∞, x], and so on) the (≽)|X-order interval, (−∞, x) := {y ∈ X : {x} ≻ {y}}.

2.1 Model Description

The model has two parameters: (i) a collection of subsets of the consumption space, denoted C ≡ {Ci}i, with the sole restriction that ∪i Ci = X, and (ii) a cardinal representation, u(·), of the ranking on consumption choices. These are put together to form the following utility on menus:

$$U(A) := \max_{C_i \,:\, C_i \cap A \neq \emptyset}\ \min_{x \in C_i \cap A} u(x)$$

We call this utility on menus the category model, denoted in short via the pair (u, C). Hereafter, we omit the requirement that the maximum is taken only over those i such that Ci ∩ A ≠ ∅.[4] The model is summarized in two steps. In step 1 the DM anticipates temptation and optimally chooses an attribute. This means that he commits to avoid choices which don't possess this attribute, e.g. he commits to avoid high-risk gambles, calorie-rich desserts, excessive delay opportunities, and so on. In step 2, within the set of consumption choices possessing the chosen attribute, the DM gives in to temptation and picks the most tempting alternative.
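The two-step procedure can be sketched directly for a finite consumption space. The function below is our own illustration, not the paper's notation: `u` is a dictionary of consumption utilities and `categories` plays the role of C.

```python
def category_utility(menu, u, categories):
    """Category model U(A): the DM picks the attribute (category) whose
    worst element on the menu is best, anticipating that within the chosen
    category he gives in to temptation (the u-minimizer)."""
    values = [min(u[x] for x in C & menu)   # step 2: worst case inside Ci
              for C in categories
              if C & menu]                  # only categories with Ci ∩ A ≠ ∅
    return max(values)                      # step 1: optimal attribute choice

# A toy instance: three options ranked a over b over c, two categories.
u = {"a": 3, "b": 2, "c": 1}
cats = [{"a", "c"}, {"b"}]
print(category_utility({"a", "b", "c"}, u, cats))  # → 2: committing to "b"
```

Committing to the attribute {b} guarantees utility 2, while the attribute {a, c} exposes the DM to the worst option c; the max-min structure selects the former.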

There are many ways of modeling how the DM succumbs to temptation. For instance, a milder form of succumbing to temptation might have the DM yielding to temptation by maximizing some v (à la Strotz), or, milder yet, a collection of vi's (where the selected vi is allowed to depend on the menu).[5] These are more general ways of giving in to temptation, e.g. than just taking v = −u, were there no restriction on the domain of these functions. However, we evaluate the ex post choice function not on the full choice space, but on the subset of choices which possess a selected attribute – where attributes are themselves choice variables. Hence, with an appropriately enlarged attribute space the manner in which we have described the DM as giving in to temptation, i.e. minimizing u on a chosen attribute domain, is rich enough to nest all of the listed alternatives.[6] The max-min structure does, however, assume an ex ante attitude towards the impact of temptation – since the DM uses a worst-case evaluation to assess menus. In section 4 we find the source of this attitude towards temptation.

[4] This can be avoided if we declare, by fiat, the minimum of a function over the empty set to be some −K where −K < u(x), ∀x.

[5] By this we mean a DM who gives in to temptation via arg max_{x ∈ ∪i A_{vi}} u(x), where A_{vi} = arg max_{y ∈ A} vi(y).

[6] The manner in which the nesting occurs involves some details and is delayed to section 4 (see footnote 14), wherein we compare with the Strotz model and costly self-control models.

Recall the GP model: U(A) = max_{x ∈ A} [u(x) − (max_{y ∈ A} v(y) − v(x))]. In this model, the optimal choice involves a tradeoff between consumption utility, u(x), and self-control cost, max_{y ∈ A} v(y) − v(x). In the category model, the (anticipated) ex post choice can also be expressed as the outcome of a tradeoff. To see this, it is useful to rewrite the model as follows. For each x ∈ X, put

C(x) := {Ci ∈ C : x ∈ sup(Ci)}

That is, C(x) is the (possibly empty) sub-collection of C consisting of sets whose u-maximal element is x. Note that we have C = ∪x∈X C(x). We can rewrite the category model:

$$U(A) = \max_{C_i \in C}\ \min_{z \in C_i \cap A} u(z) = \max_{x \in X}\ \Big[\max_{C_i \in C(x)}\ \min_{z \in C_i \cap A} u(z)\Big]$$

The full maximization problem breaks up into a collection of smaller problems where, in each subproblem, the DM's choice domain is the collection C(x). Notice that x is the unconstrained welfare optimum over all elements in the sets Ci ∈ C(x). The objective of this subproblem is to select the attribute whose associated period 2 choice, i.e. arg min_{x ∈ Ci ∩ A} u(x), has normative rank as close to x as possible. Notice that x may not be choosable in this subproblem (if, for example, each set Ci contains an element y with u(y) < u(x)), in which case the DM treats u(x) as a consumption target and optimally selects Ci to approximately attain u(x). The quantity max_{Ci ∈ C(x)} min_{y ∈ Ci ∩ A} u(y) is the (optimal) value of the subproblem associated to setting a consumption target of u(x). Let zx ∈ A be the choice at which this value is attained. Hence, the utility of a menu represents a DM who trades off the normative rank of a consumption target, u(x), against the gap u(x) − u(zx), which represents the temptation cost due to the shortfall between the utility of the target x and that of the second-best choice zx. Utility of a menu can (vacuously) be written as U(A) = max_{x ∈ A} [u(x) − (u(x) − u(zx))] = max_{x ∈ A} u(zx).
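The decomposition over consumption targets can be checked numerically. The sketch below (names and example instance are ours) verifies that grouping categories by their u-maximal element leaves the menu utility unchanged.

```python
def U(menu, u, cats):
    """Direct category-model value of a menu."""
    return max(min(u[z] for z in C & menu) for C in cats if C & menu)

def U_by_targets(menu, u, cats):
    """Same value via the decomposition: for each target x, restrict to
    C(x) = {Ci : x is u-maximal in Ci}, solve the subproblem, then take
    the best target."""
    best = []
    for x in u:
        Cx = [C for C in cats if x in C and u[x] == max(u[y] for y in C)]
        vals = [min(u[z] for z in C & menu) for C in Cx if C & menu]
        if vals:
            best.append(max(vals))
    return max(best)

u = {"a": 4, "b": 3, "c": 2, "d": 1}
cats = [{"a", "c"}, {"b", "d"}, {"c", "d"}]
for menu in [{"a", "b"}, {"a", "c", "d"}, {"b", "c", "d"}, {"c", "d"}]:
    assert U(menu, u, cats) == U_by_targets(menu, u, cats)
print("decomposition verified")
```

Every category is counted exactly once on the right-hand side (under its u-maximal element), which is why the two computations agree menu by menu.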

We now consider some examples which show how categories constrain the DM into picking a second-best zx when the target is x. In both examples the "choice data" to be explained consists of a pair (u, C), where u is the commitment (normative) preference (i.e. the ranking on singleton menus) and C(·) is a choice correspondence. For each example we construct a category model representation, which yields a menu preference whose (i) singleton ranking agrees with u and (ii) implied second-stage choices agree with C(·). Moreover, by the results in section 4, there is a reversible path connecting menu preferences (satisfying our axioms) and pairs (u, C) satisfying the reformulated versions of these axioms on the domain of ex post choices. The first example is a temptation version of the well-known compromise effect, see Simonson and Tversky (1993). The labels on the consumption choices follow Dekel et al. (2009), who use a similar example to motivate their generalization of the Gul and Pesendorfer (2001) model. Fix a triple x, y, z where x is a healthy (slightly sweet) snack, y is a salty snack, and z is a decadently sweet snack. Let u(·) denote the normative ranking, with u(x) > u(y) > u(z). Consider the following choice behavior:

C({x, y, z}) = {y}, C({x, y}) = {x}, C({x, z}) = {z}.

We find a category model that rationalizes this choice behavior.

Example 1 (Compromise). Put C1 := {y}, C2 := {x, z} and C ≡ {C1, C2}. Let U(·) denote the associated menu utility, i.e. U(A) := max_{Ci} min_{z ∈ A ∩ Ci} u(z). Notice that:

• U({x, z}) = u(z), U({x, y}) = u(x).

• U({x, y, z}) = u(y).

The story driving the example is one of choice induced by "compromise". The DM would not be baited into choosing y over x in the absence of the more decadent option. He only chooses y to mitigate his urge to select the decadent option. In this sense, the choice of y is clearly an act of self-control. Yet, as pointed out in Dekel et al. (2009), this example does not afford a Gul and Pesendorfer (2001) representation. It also does not admit a Dekel et al. (2009) representation, although some more recent extensions of the Gul and Pesendorfer (2001) model, e.g. Noor and Takeoka (2010), can capture this example. With the given story, the sets {C1, C2} comprising the category have the following interpretation. The first set consists of a grouping of all choice objects (snacks) which are salty, and the second is a grouping of all snacks which are sweet (to any degree). In this example, there is only one salty snack insofar as the DM's preferences are concerned. Since the attribute "salty" maps to a singleton choice, by selecting this attribute the DM can avoid the temptation to pick something more (ex ante) harmful so long as y is available.

An important feature of the compromise example is that the presence of the middle option is welfare-improving as it prevents the choice of the more harmful temptation. Now consider a variation of this example. It is similar to compromise in that (i) the normative preference u is the same and (ii) the introduction of a more harmful temptation (z) induces a choice of a lesser temptation (y). However, in contrast to the previous example the presence of y is also harmful – the DM is better off when any temptation is removed since it is the sum total of these temptations that induces the choice distortion. This can be interpreted as a temptation analogue of the well-known attraction effect (a.k.a. the "asymmetric dominance" effect), see Simonson and Tversky (1993) for an experiment in which this effect has been documented. One story motivating the attraction effect is salience. The presence of the dominated option makes the dominating option more salient, which accounts for the choice switch. In our example, the story would be that the presence of a worse and unchosen temptation makes the lesser temptation seem more appealing (as it is still tempting and doesn't seem as harmful in comparison), hence inducing the switch. This is the sense in which we think of the example as a temptation analogue of the attraction effect. This is just one explanation for the attraction effect.[7] There are many recent contributions which together provide a diverse set of rationalizations for this effect, see e.g. Natenzon (2010), Ok et al. (2010), de Clippel and Eliaz (2012), Manzini and Mariotti (2012). Again place the menu preference and second-stage choices in the context of the Dekel et al. (2009) example. Put x = a healthy snack, y = a salty snack, z = a sweet snack. Assume also, for the sake of this example, that x is both a little salty and a little sweet, but not enough on either account to be viewed as unhealthy:[8]

C({x, y, z}) = {y}, C({x, z}) = {x}, C({x, y}) = {x}.

We explain these choices with the following category model representation:

Example 2 (Aggregation). Let C1 := {x, z}, C2 := {x, y}, C ≡ {C1, C2}. Let U(·) denote the associated menu utility. Notice that:

• U({x, z}) = u(x) = U({x, y}).

• U({x, y, z}) = u(y).

We call this "aggregation" since it is a twist on a similar aggregation example in Dekel et al. (2009). The story is that the DM is tempted by salty or sweet snacks when they are on the menu. He can exert self-control by committing to avoid either excessively salty snacks or excessively sweet snacks, just not both at the same time. Grouping choices by whether they are salty or sweet (resp. C2 or C1), he can "choose" ex ante which temptation he will avoid. Anticipating that at the point of consumption he might not be able to control his temptation, by choosing a menu it is as if the DM commits to a choice attribute (as opposed to a consumption choice). The two representations provide an explanation for why the introduction of y is welfare-enhancing in the first example and harmful in the second. Adding y adds an attribute in example 1, hence provides a self-control device which did not exist in y's absence. By contrast, adding y in example 2 is harmful since it shares an attribute with x. Adding it to the choice set dilutes the DM's self-control, as he could have obtained x via attribute choice in y's absence.

[7] Moreover, in our model the attraction is asymmetric as it always pulls away from the u-maximal element.

[8] To be consistent, we could have taken x to be a little of both in the preceding example as well – so that we are looking at the same set of objective choices and only preferences are changing between the two examples. Since attributes only "count" if they can change choices, the fact that x is not grouped with y in the first example says that the DM finds the objective characteristic "saltiness" in choice x irrelevant in determining whether he will or won't choose it when it is available.
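Both examples are small enough to verify mechanically. A short sketch (our own code and labels) showing that adding y raises the menu utility under the compromise categories and lowers it under the aggregation categories:

```python
def U(menu, u, cats):
    """Category-model menu utility: best worst-case over attribute groups."""
    return max(min(u[z] for z in C & menu) for C in cats if C & menu)

u = {"x": 3, "y": 2, "z": 1}                 # u(x) > u(y) > u(z)
compromise  = [{"y"}, {"x", "z"}]            # Example 1's categories
aggregation = [{"x", "z"}, {"x", "y"}]       # Example 2's categories

# Example 1: y is welfare-improving, U({x,z}) = u(z) but U({x,y,z}) = u(y).
assert U({"x", "z"}, u, compromise) == u["z"]
assert U({"x", "y", "z"}, u, compromise) == u["y"]

# Example 2: y is harmful, U({x,z}) = u(x) but U({x,y,z}) = u(y).
assert U({"x", "z"}, u, aggregation) == u["x"]
assert U({"x", "y", "z"}, u, aggregation) == u["y"]
print("examples verified")
```

The asymmetry is visible in the categories themselves: in example 1 adding y adds a fresh attribute ({y}), while in example 2 it enlarges an attribute already containing x ({x, y}).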

2.2 Axioms

The main tool to elicit attributes from menu preference (and also to define the axioms) is a concept dubbed "subjective feasibility". To motivate the idea, let us assume the menu preference admits a category representation and say that x is choosable in A via attribute choice. This happens exactly when the consumption target u(x) can be attained on the sub-collection C(x), so that there is some Ci ∈ C(x) with x ∈ arg min_{y ∈ Ci ∩ A} u(y). A revealed preference implication of this is that the DM can assure himself a consumption utility of at least u(x) when the option set is A, i.e. A ≽ {x}. Consequently, u(x) is also a lower bound on consumption utility for any subset of options (from A) which contains x.

Definition 1. An element x ∈ A is subjectively feasible if whenever A′ ⊆ A and x ∈ A′, then A′ ≽ {x}.

In words, if the DM can always do at least as well as u(x) whenever x is available, then we infer that x is choosable (i.e. u(x) is attainable on the sub-problem C(x) in a putative category representation). We use this definition to state our first axiom, entitled "CRW", i.e. choice reveals welfare. The existential quantifier in the axiom goes away once we express the axioms in terms of ex post choices (see section 4).

Axiom 1: (CRW) Every menu A possesses an x ∈ A with the property that:

• {x} ∼ A.

• If A′ ⊆ A with x ∈ A′, then A′ ≽ A.

These two conditions are supposed to reveal that x is chosen from the menu in the second stage. The first and second conditions say that every menu possesses a welfare-equivalent which is subjectively feasible. Thinking of x as the choice from the menu, the second condition also says that if x is still available in A′ then the DM cannot be worse off, since he has the option of choosing x from A′. We require one more axiom for our representation result. Let A∗ denote the subset of subjectively feasible elements in a menu A. From Definition 1, we have A∗ := {x ∈ A : ∀A′ ⊆ A s.t. x ∈ A′, A′ ≽ {x}}.
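For a finite menu, the feasible set A∗ can be enumerated by brute force straight from Definition 1. The sketch below (our own code) does this for Example 1's category model, where x fails subjective feasibility because of the sub-menu {x, z}.

```python
from itertools import combinations

def feasible_set(A, U):
    """A* from Definition 1: x is subjectively feasible in A iff every
    sub-menu A' of A containing x satisfies U(A') >= U({x})."""
    A = frozenset(A)
    out = set()
    for x in A:
        rest = A - {x}
        submenus = (frozenset(s) | {x}
                    for r in range(len(rest) + 1)
                    for s in combinations(rest, r))
        if all(U(S) >= U(frozenset({x})) for S in submenus):
            out.add(x)
    return out

# Example 1's category model: U stands in for the menu preference.
u = {"x": 3, "y": 2, "z": 1}
cats = [{"y"}, {"x", "z"}]
U = lambda A: max(min(u[w] for w in C & set(A)) for C in cats if C & set(A))

A = {"x", "y", "z"}
print(sorted(feasible_set(A, U)))   # x drops out: U({x, z}) < U({x})
# Strong Reduction holds here: the menu is indifferent to its feasible set.
assert U(frozenset(feasible_set(A, U))) == U(frozenset(A))
```

Here A∗ = {y, z}, matching the representation's second-stage choice of y from the full menu.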

Axiom 2: (Strong Reduction) If A∗ ⊆ A′ ⊆ A, then A′ ∼ A.


The axiom combines two postulates. First, the set of subjectively feasible elements A∗ determines the indifference class of the menu. Intuitively, if these are indeed the only elements that can be chosen on account of temptation, then removing options that were not choosable in the first place shouldn't change the DM's welfare. The second postulate is that if there is a subset A′ sandwiched in between, then everything feasible in A is feasible in the subset A′. In addition, nothing in A′ is feasible that was not already feasible in A, i.e. (A′)∗ = A∗. The first part of this second postulate is itself a sort of contraction consistency condition on feasibility, and the second part is a restricted expansion consistency condition: if the contracted menu A′ has the property that it contains all the feasible elements of A, then anything that is feasible in A′, i.e. x ∈ (A′)∗, remains feasible in A.[9]

3 Main Results

3.1 Representation Theorem

The main results of the paper are that order, CRW, and strong reduction characterize the category model and are strong enough to identify categories from menu preferences.

Theorem 1. A preference ≽ ∈ P(X) satisfies Axiom 1 (CRW) and Axiom 2 (Strong Reduction) if and only if it admits a category representation.

A basic ingredient in the construction of the representation is the notion of an aggregation set. Call a menu A(x) an aggregation set for x if (i) {x} ≻ {y}, ∀y ∈ A(x), (ii) {x} ≻ {x} ∪ A(x), and (iii) {x} ∼ {x} ∪ (A(x)\y), ∀y ∈ A(x). The difficulty in constructing a category model representation is in accommodating the presence of these aggregation sets. When there are no aggregation sets, the recursive construction we sketch below concludes at the first step of the recursion. In this case, we get what we call "no aggregation" categories. These are an interesting subclass of categories, e.g. they nest Strotzian preferences, Set-Betweenness categories (i.e. preferences satisfying CRW and Set-Betweenness), and so on. We relegate discussion of no aggregation categories and related submodels to a supplement.

Turning to the definition, the first two conditions say that x is not subjectively feasible in the menu A(x) ∪ {x}. The last condition says that A(x) ∪ {x} is a minimal (w.r.t. set inclusion) set at which x is not subjectively feasible. We summarize the fact that A(x) is an aggregation set for x with the notation "x →t A(x)". Also say that x →t y if {x} ≻ {y} and {y} ∼ {x, y}. To elicit attributes, we imagine that x is some unobservable bundle of attributes. These attributes are revealed to the analyst as we offer menus to the DM of the form {x} ∪ A(x), where A(x) is an aggregation set. Each attribute that describes x serves as a self-control device against possibly tempting options on the menu. Moreover, when there are no other options on the menu which share that attribute, then by "choosing" this attribute in stage 1 the DM enables the selection of x in period 2. The fact that the DM cannot choose x from the menu {x} ∪ A(x) means that all attributes that describe x are accounted for by elements of A(x), viz. every attribute (relevant for choice) possessed by x is possessed by some element of A(x).

[9] In proposition 2 (section 4) we decouple some of these restrictions by considering choice filters induced by category models, viz. one of the conditions on the filters is a version of this restricted expansion condition and the other is a contraction consistency condition.
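Conditions (i)-(iii) can themselves be checked by brute force given a menu utility. A sketch (our own code) run on Example 2, where x is blocked only by y and z jointly:

```python
from itertools import combinations

def aggregation_sets(x, X, U, u):
    """All aggregation sets A(x): (i) {x} strictly u-better than each {y};
    (ii) {x} strictly preferred to {x} ∪ A(x); (iii) minimality, i.e.
    dropping any single y from A(x) restores indifference."""
    worse = sorted(y for y in X if u[y] < u[x])               # candidates, (i)
    found = []
    for r in range(1, len(worse) + 1):
        for A in map(frozenset, combinations(worse, r)):
            infeasible = U(frozenset({x})) > U(A | {x})       # condition (ii)
            minimal = all(U(frozenset({x})) == U((A - {y}) | {x})
                          for y in A)                         # condition (iii)
            if infeasible and minimal:
                found.append(set(A))
    return found

u = {"x": 3, "y": 2, "z": 1}
cats = [{"x", "z"}, {"x", "y"}]      # Example 2 (aggregation)
U = lambda A: max(min(u[w] for w in C & set(A)) for C in cats if C & set(A))
print(aggregation_sets("x", set(u), U, u))   # the only aggregation set is {y, z}
```

Neither {y} nor {z} alone blocks x (each leaves x with a free attribute), but together they cover both of x's attributes, exactly as the unbundling story requires.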

Were we in the setting of example 2 we would stop here, since the elements of the aggregation set are each described by a single attribute. The general construction is complicated by two facts. First, elements of aggregation sets can themselves be bundles of attributes – so that the underlying space of attributes is larger than A(x). Second, there may be more than one aggregation set for x, and each element of these aggregation sets can itself be an (implicit) bundle of attributes. To take both possibilities into account, we use a recursive procedure. Using the decomposition C ≡ ∪x C(x), we construct a representing C by constructing C(x) "one x at a time". To this end, we start from a (≽)|X-minimal x and induct upwards on (≽)|X-rank. We then use a recursive construction called an "x-tree" to recover the space of attributes that describe x. The tree structure comes out of a repeated "unbundling" process. Let A(x) := {y1, ..., yk} be the aggregation set for x, and create a coarse attribute space for x via {x, y1}, {x, y2}, ..., {x, yk}. Since x is not choosable in {x} ∪ A(x), this reveals that every attribute possessed by x is also possessed by some yi. However, the yi's can themselves be (implicit) bundles of attributes, so that each set {x, yi} represents choices which share a set of attributes. In this sense, the collection {x, yi} is a coarse attribute space for x. For each yi which possesses an aggregation set, we "unbundle" the attributes implicitly grouped together in the single set {x, yi} to form the n sets (put A(yi) = {y_i^1, ..., y_i^n}): {x, yi, y_i^1}, {x, yi, y_i^2}, ..., {x, yi, y_i^n}.

This yields a tree of sets:

                      {x}

           {x, y}            {x, z}

{x, y, u}  {x, y, v}  {x, z, p}  {x, z, q}

Figure 1: An x-tree with A(x) = {y, z}, A(y) = {u, v}, A(z) = {p, q}.

Now iterate the previous step. If any of the y_i^j themselves possess an aggregation set, then the group {x, yi, y_i^j} is also a coarse attribute as it consists of all choices described by the set of attributes which (implicitly) describe y_i^j. We then use the elements of an aggregation set A(y_i^j) to unbundle the coarse attribute {x, yi, y_i^j}.

Repeatedly unbundling until we exhaust all aggregation sets, we obtain a collection of sets C(x). Putting {x} as a root node, the iterative refinement process splits each (unique) predecessor node into several successor nodes, i.e. the splitting procedure yields a tree of successively refined attributes. Calling this “splitting” might seem a misnomer since at subsequent nodes the sets Ci are actually getting larger. The splitting refers to the implicit bundle of attributes shared by elements in the set Ci. As these sets get larger, the shared set of attributes shrinks – eventually to a singleton, which is when the splitting process stops. The entire set of nodes, starting from the root {x} all the way down to the terminal nodes in C(x), is what we refer to as an “x-tree”. Carrying out the x-tree construction for each x we obtain a category {Tx}x∈X which we allege is a representation of the menu preference. The general x-tree construction follows this sketch closely, but it too is complicated by some facts. First, as already noted, choices possess multiple aggregation sets. When we unbundle the coarse attribute {x, yi}, we need to account for the fact that there may be several distinct sets A(yi), each of which represents a bundle of attributes and, hence, each of which is (possibly) culpable for the infeasibility of x in the menu {x, yi}.

Second, the axioms imply “transitive infeasibility”, i.e. if x →t A(x), z →t A(z) for z ∈ A(x), then {x} ≻ {x} ∪ ((A(x)\z) ∪ A(z)). This means that the y-tree is nested within the x-tree whenever {x} ≻ {y} and y ∈ A(x) for some aggregation set A(x). Hence, when we refine, say, {x, yi} we need to account for not just aggregation sets A(yi), but also aggregation sets A(z) for any z ∈ A(yi), and – in turn – aggregation sets A(w) for any w ∈ A(z), z ∈ A(yi), and so on. In other words, when we refine {x, yi} we need to account for the entire yi-tree of successively refined attribute sets corresponding to each element yi ∈ A(x). This forces the construction of categories to be recursive. At the end of the recursion we obtain the collection of sets Tx for each x ∈ X which we refer to as the “tree category”. The key step in showing representability is in showing that if x is subjectively feasible in A then it is choosable in A via attribute choice. In other words, we need to find a set Ci ∈ Tx with x ∈ argmin_{y∈Ci∩A} u(y). Equivalently, we find a path,

ℓ := {C0 (= {x}) → C1 → · · · → Ct (= Ci)}

through the x-tree, where for each coarse attribute Cj on the path x is the only element in A possessing this attribute, i.e. {x} = argmin_{y∈Cj∩A} u(y). Call this an “unobstructed” path.

Construct the first two steps of such a path. Since x is feasible in A, fixing any aggregation set A(x) we know that there is some z ∈ A(x) with z /∈ A. That is, if the DM commits to the coarse attribute represented by z (when the option set is A), then he enables the choice of x. Unbundling the attributes that implicitly describe z, this means that for any aggregation set A(z) there must be some p ∈ A(z) with p /∈ A. Note that we are using transitive infeasibility here, viz. if A(z) is an aggregation set, then {z} ≻ {z} ∪ (A(z)\p) ∪ A(p) – i.e. some subset of (A(z)\p) ∪ A(p) is an aggregation set for z. Let C0 denote the root node {x}, C1 denote the coarse attribute {x, z}, C2 denote the (coarse) attribute {x, z, p}. Since z, p /∈ A we have an unobstructed path through the first two levels of the x-tree:

                      {x}

           {x, y}            {x, z}

{x, y, u}  {x, y, v}  {x, z, p}  {x, z, q}

Figure 2: An unobstructed path through part of the x-tree when A = {x, y, q}.

Successively refine the attribute set {x, z, p} by unbundling p (if necessary), then unbundling elements of aggregation sets of p, and so on. At each unbundling step we find an unobstructed extension of the path ℓ which we concatenate with the existing unobstructed path. Inductively proceeding until aggregation sets are exhausted, we obtain an unobstructed path through the full x-tree, ℓ := C0 → C1 → C2 → · · · → Ct, where Ct is a terminal node of the x-tree.
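The path search just described can be sketched as a depth-first walk through the x-tree. This is only an illustration of the unobstructedness test, not the inductive proof: the tree, the menu, and the utility values below are hypothetical (any strict u respecting x ≻ y ≻ z ≻ u ≻ v ≻ p ≻ q would do), taken from the Figure 1/Figure 2 data.

```python
def build(node, last, agg):
    # recursively attach aggregation-set members to form the subtree below `node`
    kids = [build(node | {m}, m, agg)
            for aset in agg.get(last, []) for m in aset]
    return (node, kids)

def unobstructed_path(tree, A, u, x):
    """Return a root-to-terminal path on which {x} = argmin of u over
    (node ∩ A) at every node, or None if every branch is obstructed."""
    node, kids = tree
    meet = node & A
    # obstruction: some available option shares this coarse attribute
    # and is u-worse than x, so x is not the argmin on node ∩ A
    if any(u[y] < u[x] for y in meet):
        return None
    if not kids:                         # terminal node reached
        return [node]
    for k in kids:
        rest = unobstructed_path(k, A, u, x)
        if rest is not None:
            return [node] + rest
    return None

agg = {'x': [['y', 'z']], 'y': [['u', 'v']], 'z': [['p', 'q']]}
u = {'x': 7, 'y': 6, 'z': 5, 'u': 4, 'v': 3, 'p': 2, 'q': 1}
tree = build(frozenset({'x'}), 'x', agg)
path = unobstructed_path(tree, {'x', 'y', 'q'}, u, 'x')
```

With A = {x, y, q} the branch through {x, y} is obstructed (y ∈ A tempts x), and the search returns the Figure 2 path {x} → {x, z} → {x, z, p}.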

3.2 Identification

To identify categories from preferences we will need to place some restrictions on the class of admissible models. First, we exclude ties in the singleton ranking as these provide a trivial source of multiplicity in the set of model representations. For example, say {x} ∼ {x′} and put x →t y, x′ →t y. Consider the two category models: (u, C(1)), (u, C(2)), where C(1) ≡ {{x, x′, y}}, C(2) ≡ {{x, y}, {x′, y}}. Note that both models represent the same menu preference although the attribute sets {x, y}, {x′, y} are only cosmetically distinct. To avoid this issue, we restrict (for the identification result only) to menu preferences in which the singleton ranking is strict. There are two additional ways in which categories could be non-identified due to attribute sets that are in the model (u, C) yet do not “count” towards the representation. The first of these is analogous to the Dekel et al. (2001) notion of a relevant state and is a property that we call “sharpness”.

Definition 2. A model (u, C) is sharp if for any proper sub-collection C′ ⊊ C, the induced menu preferences ≽(u,C) and ≽(u,C′) are unequal.10

10 The sub-collection is assumed to satisfy ∪i C′i = X, else the statement that ≽ ≠ ≽′ is obvious.


The preferences ≽(u,C), ≽(u,C′) denote the menu preferences underlying the model (u, C) (resp. (u, C′)). Sharpness requires that all attribute sets are necessary in the following sense: if we delete any one of these sets from the model, then we change the menu preference (underlying the new model). The second way in which attribute sets could be superfluous is a little more subtle. To describe it we first require a definition.

Definition 3. Given two models (u, C1), (u, C2) say that (u, C2) is a prolongation of (u, C1) if there is a bijection π : C1 → C2 where for every set C1(i) ∈ C1 we have π(C1(i)) ⊇ C1(i).

When C2 is a prolongation of C1 we will equivalently refer to this by saying C1 is a “retraction” of C2. We will show that sharp categories are identified up to a specific kind of prolongation. Given a pair of sets (A,B) where A ⊆ B (both contained in X) say that A is a lower bound order interval in B if there is some z ∈ B such that (−∞, z] ∩ B := {x ∈ B : {z} ≽ {x}} = A. Now consider two models (u, C2), (u, C1) where C2 prolongs C1 (under bijection π) and, moreover, C1(i) is a lower bound order interval in π(C1(i)) for every C1(i) ∈ C1. This is the second sense in which categories could be non-identified. While sharpness is a criterion for eliminating redundant attribute sets in the representation, the criterion we have just described is a method for determining redundant elements of attribute sets in the representation. To see this, let x ∈ π(C1(i))\C1(i) so that x possesses the attribute represented by elements of the set π(C1(i)). This attribute only “counts” in the representation if there is some menu A where x is choosable precisely by committing to this particular attribute, i.e. x ∈ argmin_{y∈π(C1(i))∩A} u(y). However, since C1 is also a representation of the same menu preference this means there is some C1(j) with x ∈ argmin_{y∈C1(j)∩A} u(y), i.e. x is choosable in A via committing to attribute C1(j) (using the model C1).

Since C2 prolongs C1 and C1(j) is a lower bound order interval in π(C1(j)) ∈ C2, this means that there is more than one way to choose x (in A) via committing to an attribute. Using the model C2, either the DM can choose attribute π(C1(i)) and enable the second stage choice of x or he can choose attribute π(C1(j)) and induce the same second stage choice. In other words, while x may physically share the attribute represented by the set π(C1(i)), this object description is irrelevant for choice. Hence, after reducing the attribute set π(C1(i)) by deleting x and keeping other attributes unchanged we still have a representation. Putting the sharpness and prolongation conditions together is, therefore, tantamount to requiring that both attributes (i.e. sets Ci ∈ C) and object descriptions (i.e. elements of sets Ci) are non-redundant. These are two ways in which one could create multiplicity in the category representations. The question which remains is – are there others? The answer is “no”. Once attribute sets and object descriptions are required to be non-redundant (and the singleton ranking is strict), the category model is identified.
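The two notions just used, prolongation and lower bound order intervals, are finite and easy to check mechanically. The sketch below is illustrative: it brute-forces the bijection π over permutations, and the data are the (hypothetical) numerical utilities for the Example 3 categories that appear later in this section.

```python
from itertools import permutations

def prolongation(C1, C2):
    """Search for a bijection pi : C1 -> C2 with pi(C) ⊇ C for every C in C1."""
    if len(C1) != len(C2):
        return None
    for image in permutations(C2):
        if all(small <= big for small, big in zip(C1, image)):
            return list(zip(C1, image))
    return None

def is_lower_interval(A, B, u):
    """A = (-inf, z] ∩ B for some z ∈ B, i.e. A is the u-lowest part of B."""
    return any(A == frozenset(x for x in B if u[x] <= u[z]) for z in B)

# Hypothetical strict utility encoding x ≻ y ≻ z ≻ p ≻ q.
u = {'x': 5, 'y': 4, 'z': 3, 'p': 2, 'q': 1}
C1 = [frozenset('ypq'),  frozenset('xz'), frozenset('xpq')]
C2 = [frozenset('xypq'), frozenset('xz'), frozenset('xpq')]
pi = prolongation(C1, C2)
```

Here π exists, and each C1(i) is a lower bound order interval in its image, which is exactly the kind of non-identification Theorem 2 permits.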


Theorem 2. Assume (≽)|X is strict. There is a unique, sharp model (u, C∗) such that any other sharp representation, (u, C), is a prolongation of (u, C∗).11 Moreover, any prolongation π : C∗ → C has the property that Ci ∈ C∗ is a lower bound order interval in π(Ci).

We now consider two examples. The first example formally shows that both sharpness and prolongation are necessary for identification. The second illustrates why the identification problem is in general non-trivial. A caution to the reader: the latter example requires some notation which is used in the proof of Theorem 1.

Example 3 (Sharpness ⇏ Uniqueness). Let X = {x ≻ y ≻ z ≻ p ≻ q}. Assume A1(x) = {z, p}, A2(x) = {z, q}, and p →t q, y →t p. Put C1 = {y, p, q}, C2 = {x, z}, C3 = {x, p, q} and consider the category model (u, C) given by C ≡ {C1, C2, C3}. Note that (u, C) is a sharp representation of ≽. Now consider the following prolongation of C. Put C′1 = {x, y, p, q}, C′2 = C2, C′3 = C3 and let C′ ≡ {C′1, C′2, C′3}.

Note that the model (u, C′) also represents ≽ and, importantly, is also sharp. To see this, note that since y is not a member of any aggregation set for x we cannot omit C′3 from the representation. Also note that the model (u, C′) prolongs (u, C) in the manner described in the theorem: the sets comprising C are lower bound order intervals of the sets comprising C′. The object description of x using the model C′ (i.e. the sets in the category C′ containing x) is redundant since whenever x is choosable by committing to the attribute C′1 it is also choosable via committing to C′3. Thus, we need the prolongation refinement in order to pin down the model.

We conclude with a somewhat complex example. The objective is two-fold. First, to show why the tree category does not, by itself, yield a sharp representation. Second, to give an example of a menu preference complex enough to illustrate the non-triviality of the identification, viz. if there are many attribute sets in a given representation, where some are relevant and others are not, how do we find the unique, minimal set of attributes C∗ whose existence is asserted by the preceding theorem? To this end, we note that the proof of the theorem is constructive. It gives an algorithm that locates the collection C∗.

Example 4. Let X = {x, y, z, p, q} where x has aggregation sets {y, z}, {p, q}, {y, q}, and {p, z}. Also put y →t p and z →t q, and assume there are no other constraints on choice. Consider the following category,

C1 := {x, y, p}, C2 := {x, z, q}

where C∗ ≡ {C1, C2}. Notice that the pair (u, C∗) represents ≽. Now compare this with the category we obtain from the algorithm in the theorem, i.e. the construction

11 The unique, minimal category C∗ is constructed explicitly in the proof and is a sharp sub-category of the tree category, (u, T ), constructed in the proof of Theorem 1.


of the “tree” category {Tx}x∈X. It is straightforward to see that the only relevant categories are those in the x-tree. There are 4 levels in the x-tree corresponding to the four x-aggregation sets, A1(x) = {y, z}, A2(x) = {p, q}, A3(x) = {y, q}, A4(x) = {p, z}. A list of the categories on each level is as follows (put Bt(x) := {y ∈ X : x →t y}):

1. L(0) ≡ {C^0_1 = {x}} (since Bt(x) = ∅, the root node of the x-tree contains only the singleton {x}).

2. L(1) ≡ {C^1_1 = {x, y, p}, C^1_2 = {x, z, q}} (the two nodes are obtained by separately attaching Bt(y) ∪ {y} = {y, p} and then Bt(z) ∪ {z} = {z, q} to the initial node of the tree).

3. L(2) ≡ {C^2_1 = {x, y, p}, C^2_2 = {x, y, p, q}, C^2_3 = {x, z, p, q}, C^2_4 = {x, z, q}} (separately attach p and q to each node in the preceding level – note that Bt(p) = ∅ = Bt(q)).

4. L(3) ≡ {C^3_1 = {x, y, p}, C^3_2 = {x, y, p, q}, C^3_3 = {x, y, p, q}, C^3_4 = {x, y, p, q}, C^3_5 = {x, y, z, p, q}, C^3_6 = {x, z, p, q}, C^3_7 = {x, y, p, z, q}, C^3_8 = {x, z, q}} (separately attach Bt(y) ∪ {y} and (resp.) {q} to each node in the preceding level – note that we include replicas in the tree construction).

5. L(4) ≡ {C^4_1 = {x, y, p}, C^4_2 = {x, y, p, z, q}, C^4_3 = {x, y, p, q}, C^4_4 = {x, y, p, z, q}, C^4_5 = {x, y, p, q}, C^4_6 = {x, y, p, z, q}, . . ., C^4_15 = {x, z, p, q}, C^4_16 = {x, z, q}} (the pattern at this stage should be clear, hence we omit the full enumeration of all 16 nodes in level 4 of the tree).
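The extraction of C∗ from the terminal level can be illustrated with a minimal-nodes computation. The sketch below is illustrative only: it takes the distinct sets that appear in the (partial) enumeration of L(4) above – the elided entries repeat these same patterns – and keeps those with no strict subset in the collection.

```python
def minimal_nodes(nodes):
    # keep the sets that have no strict subset elsewhere in the collection
    return {s for s in nodes if not any(t < s for t in nodes)}

# Distinct terminal nodes appearing in the partial enumeration of L(4).
terminals = {frozenset('xyp'), frozenset('xypzq'), frozenset('xypq'),
             frozenset('xzpq'), frozenset('xzq')}
C_star = minimal_nodes(terminals)
```

The minimal nodes come out as {x, y, p} and {x, z, q}, i.e. exactly the category C∗ of Example 4.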

Notice that several of the nodes in the terminal level of the x-tree are identical. However, notice also that the category model we started out with, (u, C∗), is a sub-category of the terminal nodes of the x-tree. In particular, the sets in C∗ are exactly the minimal (w.r.t set inclusion) nodes in the x-tree. This example points to the identification strategy, i.e. where to look for the model (u, C∗) whose existence is announced by the theorem. The first step in the argument shows that any category model (u, C) is a prolongation of some sub-category of the tree category, {Tx}x∈X. We call this the “embedding” step. The intuition is simple: use the decomposition C ≡ ∪x C(x) and embed each C(x) in Tx. By representability, each Ci ∈ C(x) must contain some element from each aggregation set A(x). And, similarly, for each of these elements that it contains it must, in turn, contain some member of their aggregation sets, and so on. Hence, at each level of the x-tree we can find a coarse attribute that is contained in Ci – showing that the category C prolongs a subcategory of {Tx}x∈X. This is the crux of the embedding argument. If the original model is sharp, the embedding procedure finds a sharp submodel of the tree category of which the original model C is a prolongation.


This would nearly seem to deliver identification, but there is a possible indeterminacy in which sharp submodel of the tree category the model (u, C) prolongs. This is where the main use of sharpness comes in and is also the main challenge in identification, viz. we need to show that there is just one sharp submodel of the tree category that retracts (u, C). There isn’t a simple intuition that explains why we get uniqueness here; however, we do at least have a hint as to where to look. As the preceding example suggests, if we look at the submodel consisting of minimal (w.r.t. set inclusion) nodes across all Tx trees, then we have a category representation. In the example, the set of minimal nodes turns out to be the minimal category C∗. In general, the set of minimal nodes will always provide a representation though it need not be a sharp representation. What turns out to be true, however, is that the minimal representation C∗ lies within the set of minimal nodes in the tree category. Moreover, one can extract from the proof an algorithm that always finds the set C∗ within the sub-category of minimal nodes.

4 Comparison between models

4.1 Comparison with temptation models

The main behavioral postulate in the GP model is their Set-Betweenness axiom:

A ≽ B ⇒ A ≽ A ∪ B ≽ B.

GP show that as self-control costs become large the DM’s preference for commitment vanishes. We show that a similar connection holds for the category model. This begins with the observation that the category model nests the Strotz model, i.e. we can express the dual-self model with a corresponding “Strotz category”. Moreover, just as the Strotz model is characterized by the (degenerate) Set-Betweenness axiom when the DM has no preference for commitment, we find that the Strotz model has an alternative axiomatization via a strengthened version of the CRW axiom. The strengthening of CRW also expresses the fact that the DM never has preference for commitment. Akin to the GP setting, where the model collapses to Strotzian choice when self-control costs get large, in our case the category model collapses to Strotzian choice when the space of self-control variables (i.e. the set of attributes) collapses to a singleton (we will be more clear about what we mean by “singleton attribute” in short order).

Definition 4. An element x ∈ A is a Strotzian choice if it satisfies the following two properties:

• {x} ∼ A.

• If A′ ⊆ A and x ∈ A′, then {x} ∼ A′.


The element satisfying these two conditions is the putative choice from the menu. The key feature of the definition is that when the DM is choosing such an x, he never has a preference for commitment. Note that this is exactly where the difference with CRW lies – what is possibly a strict preference in the second condition (under CRW) is always indifference under the definition. In the GP model, this happens precisely when the DM picks the v-maximizer on the menu, i.e. choice is “Strotzian”.12 Let the set of Strotzian choices be denoted as ΣST(A) and define a restriction that we label Strong CRW: If A ≠ ∅, then ΣST(A) ≠ ∅. By contrast, recall the GP axiom that characterizes the Strotz model, denoted Degenerate Set-Betweenness (DSB): For any menus A, B either A ∪ B ∼ A or A ∪ B ∼ B. When we restrict to menu orders, both axioms are equivalent.

Lemma 1. A preference ≽ ∈ P(X) satisfies DSB if and only if it satisfies Strong CRW.

When Set-Betweenness degenerates to DSB, all choice is Strotzian in the GP model and there is never preference for commitment. Similarly, when CRW is strengthened to Strong CRW, all choice is Strotzian in the category model and the DM no longer has preference for commitment. Now we find the avatar of Strotz preferences when expressed through a category model.13 Given a Strotz pair (u, v), we first write down a category C(u,v) where the model (u, C(u,v)) generates the same utility function on menus as the pair (u, v). That is, max_{x∈A_v} u(x) = U(A) = max_{Ci∈C(u,v)} min_{x∈A∩Ci} u(x). Fix (u, v) and define the following sets

Cx := {y ∈ X : v(y) > v(x), u(x) > u(y)} ∪ {x}

and let C(u,v) := {Cx : x ∈ X}.

Observation 1. Fix a Strotz pair (u, v) and a category model (u, C(u,v)). These two pairs generate the same utility function on menus.14
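Observation 1 can be checked numerically on small examples. The sketch below builds the sets Cx from the displayed formula for a hypothetical strict pair (u, v) on a three-element X and compares the two menu utilities over all menus.

```python
from itertools import combinations

def strotz_utility(A, u, v):
    # Strotz value of the menu: u-best element among the v-maximizers of A
    top_v = max(v[x] for x in A)
    return max(u[x] for x in A if v[x] == top_v)

def category_utility(A, cat, u):
    return max(min(u[y] for y in Ci & A) for Ci in cat if Ci & A)

# Hypothetical strict pair (u, v) on X = {a, b, c}.
u = {'a': 3, 'b': 2, 'c': 1}
v = {'a': 1, 'b': 2, 'c': 3}
# The Strotz category: Cx = {y : v(y) > v(x), u(x) > u(y)} ∪ {x}.
strotz_cat = [frozenset({x} | {y for y in u if v[y] > v[x] and u[x] > u[y]})
              for x in u]

menus = [frozenset(m) for r in range(1, 4) for m in combinations('abc', r)]
agree = all(strotz_utility(A, u, v) == category_utility(A, strotz_cat, u)
            for A in menus)
```

For these values the Strotz category is {a, b, c}, {b, c}, {c}, and the two utilities agree on every menu.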

In the Strotz model, there is just a single temptation whose intensity is represented by the v-function. When coded into a category representation, for each x ∈ X there are several sets Ci in the category that might contain x but the collection satisfies the following two conditions15 (put {x} ≻ {y} and C(x) = {Ci : Ci ∈ C, x = sup(Ci)}):

1. x ∈ Ci, y ∈ Cj\Ci ⇒ (−∞, y) ∩ Ci ⊆ Cj.

2. ∀x ∈ X, ∃Cx ∈ C s.t. (−∞, x) ∩ Cx ⊆ Ci, ∀Ci ∈ C(x).

12 The Strotz model is defined by a pair of functions (u, v) which form a utility on menus given by U(A) := max_{x∈A_v} u(x), where A_v := argmax_{x∈A} v(x).

13 It is straightforward to see that Strong CRW implies Strong Reduction, so there must be a category representation.

14 When introducing the category model, we asserted that ex post Strotzian choice (conditional on attribute choice) and multi-Strotzian choice are all sub-classes of the category model. This follows via the embedding we have just described. Fix a triple (u, C, v) and say that, on Ci ∩ A, choices are determined via max_{x∈(Ci∩A)_v} u(x) (as opposed to −u), where A_v = argmax_{x∈A} v(x). Replace each Ci with {C^x_i}_{x∈Ci} where C^x_i := {y ∈ Ci : u(x) > u(y), v(y) > v(x)}. Doing this for each Ci we obtain a category model (u, C′) which represents the same menu preference. A similar argument applies if the ex post choice function were determined via max_{x∈⋃_k (Ci∩A)_{v_k}} u(x), where (u, C, {v_k}_k) is the model. Hence, flexibility in ex post choice can be equivalently captured by flexibility in the space of attributes.

We refer to the latter property as a “single attribute” condition. To explain why, recall the decomposition where we expressed the category model as the maximum of the respective maxima from the sub-problems consisting of sets in C(x), i.e.

U(A) = max_{Ci} min_{x∈Ci∩A} u(x)
     = max_{x∈X} max_{Ci∈C(x)} min_{y∈Ci∩A} u(y).

The interpretation of each sub-problem, max_{Ci∈C(x)} min_{y∈Ci∩A} u(y), is that u(x) is the utility target and the DM chooses the attribute in C(x) that puts him closest to this target. When the category satisfies condition (2) above (which we call a NAG – for “no aggregation” – category in the supplement) this sub-problem is trivial since all sets Ci ∈ C(x) contain a common set (−∞, x) ∩ Cx. In this case the solution to the sub-problem, zx (recall the notation from section 2), always lies in the set Cx ∩ A. Hence, to reach the target u(x) the DM has only one attribute to choose from.

The single attribute condition doesn’t by itself imply that the DM is Strotzian. For example, note that it allows compromise effects, e.g. the category constructed in example 1 satisfies this condition. In this case there is a single attribute describing each element (as categories are partitions), however these attributes are distinct so that the representation cannot be recovered from a single numerical measure as in observation 1. To this end, condition (1) says that if {x} ≻ {y} and y does not tempt x, then all consumption choices that tempt x, i.e. (−∞, y) ∩ Cx, and that could potentially tempt y do indeed tempt y. This condition is clearly necessary in order to preclude representations of the compromise effect. Evidently it is also sufficient to imply that the menu preference induced by the category model must be Strotzian, so that there is a v from which we can recover the categories (using the formula in the observation).16

We now address the max-min structure in the category model and show that a similar structure is present in all models of temptation which share the “Positive Set-Betweenness” axiom. Recall that this is the moniker given by Dekel et al. (2009) to the following half of the Set-Betweenness axiom, A ≽ B ⇒ A ≽ A ∪ B. Fix a cardinal U : M → R and consider the map

umin(x,A) := min{U(A′) : A′ ⊆ A, x ∈ A′}.

It turns out that the formula, U(A) = max_{x∈A} umin(x,A), recovers the utility (as opposed to just the underlying menu preference) on menus if and only if the underlying preference satisfies Positive Set-Betweenness.17 For the GP model, we interpret the number umin(x,A) as the value of the most costly self-control problem faced by the DM when he has the option to choose x. The GP utility can be expressed as U(A) = max_{x∈A} umin(x,A), but for this expression to have meaning we want to compute the kernel umin(x,A) explicitly. The kernel umin(x,A) need not agree with the GP maximand u(x) − (max_{z∈A} v(z) − v(x)). For a menu A, define two subsets A1(x), A2(x) to be (resp.) A1(x) := {z ∈ A : (i) u(z) + v(z) > u(x) + v(x), (ii) u(z) ≤ u(x)}, A2(x) := {z ∈ A : (i) u(z) + v(z) ≤ u(x) + v(x), (ii) u(z) ≤ u(x)}. Note that A1(x) is the set of options in A that are (weakly) normatively worse than x yet are chosen over x head-to-head (i.e. these are the “overwhelming” temptations) and A2(x) consists of those normatively worse elements which lose head-to-head with x.

15 In a supplemental appendix, we show that categories that have these two properties characterize Strotzian menu preferences.

16 The proof of this along with a characterization of categories which satisfy the Set-Betweenness axiom is in the supplement.

Observation 2. Fixing a GP pair (u, v), let U(·) denote the induced utility on menus and umin(x,A) the associated max-min kernel. We have,

umin(x,A) = min{ min_{z∈A1(x)} u(z), u(x) − (max_{z∈A2(x)} v(z) − v(x)) }.18

The value umin(x,A) is determined in two steps. First, consider the set of temptations which make it costly, but not overwhelming, to choose x. This is the GP value of the menu A2(x). Now consider the temptations in A against which it is too costly to choose x. If there aren’t any, then the value of umin(x,A) coincides with the GP value of A2(x). If A1(x) is non-empty then we compare the value of each overwhelming temptation against the most costly choice problem in which x is still chosen, i.e. A2(x), and pick the minimum from these pairwise minima, yielding umin(x,A).

17 Proof: Necessity is straightforward. We check sufficiency. For each x ∈ A let Ax denote the maximal menu (w.r.t. set inclusion) in the set argmin_{A′⊆A, x∈A′} U(A′). By Positive Set-Betweenness (PSB) the unique such maximal element is the union of all sets in the argmin. We now claim that if Ax∗ is ≽-maximal in the set {Ax : x ∈ A}, then Ax∗ ≽ A. To see this, pick any x1 ∈ A and write A = ∪_{x≠x1} Ax ∪ Ax1. By PSB, either Ax1 ≽ A or ∪_{x≠x1} Ax ≽ A. In the former case, we are done. In the latter case, pick any x2 ≠ x1 and consider A′(:= ∪_{x≠x1} Ax) = ∪_{x≠x1,x2} Ax ∪ Ax2. By PSB, either ∪_{x≠x1,x2} Ax ≽ A′ ≽ A or Ax2 ≽ A′ ≽ A. In the latter case, we are done. In the former case, iterate the preceding argument. Eventually we arrive at an x with Ax ≽ A, which implies Ax∗ ≽ A. OTOH, A ≽ Ax, ∀x by definition of the sets Ax. Hence, U(A) = max_{x∈A} umin(x,A).

18 When we take the maximum of umin(x,A) we obtain a pair {x∗, y∗}, where y∗ is either the maximizer of v on A2(x) or the minimizer of u on A1(x), and A ∼ {x∗, y∗}. This is a well-known fact about the GP model (see lemma 2 in Gul and Pesendorfer (2001)) – that every menu is equivalent to a binary submenu that solves the max-min problem, U(A) = max_{x∈A} min_{y∈A} U({x, y}).


Now consider a category pair (u, C) and let U(·) be the utility given by the formula U(A) = max_{Ci} min_{z∈A∩Ci} u(z). For each x ∈ X, recall that C(x) := {Ci ∈ C : x ∈ Ci}. That is, C(x) is the sub-collection of sets in C that contain x.19

Observation 3. Fix a category model (u, C). The associated max-min kernel is given by: umin(x,A) = max_{Ci∈C(x)} min_{z∈Ci∩A} u(z).
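Observation 3 can also be verified by brute force: compute umin directly from its definition as a minimum over submenus and compare with the restricted max-min on the right-hand side. The overlapping category and utility below are hypothetical.

```python
from itertools import combinations

def category_utility(A, cat, u):
    return max(min(u[z] for z in Ci & A) for Ci in cat if Ci & A)

def umin_brute(x, A, cat, u):
    # umin(x, A) = min U(A') over submenus A' of A containing x
    rest = [z for z in A if z != x]
    return min(category_utility(frozenset(s) | {x}, cat, u)
               for r in range(len(rest) + 1)
               for s in combinations(rest, r))

def umin_formula(x, A, cat, u):
    # Observation 3: restrict the outer max to the sets Ci that contain x
    return max(min(u[z] for z in Ci & A) for Ci in cat if x in Ci)

# Hypothetical overlapping category on X = {x, y, z}.
u = {'x': 3, 'y': 2, 'z': 1}
cat = [frozenset('xy'), frozenset('yz'), frozenset('xz')]
menus = [frozenset(m) for r in range(1, 4) for m in combinations('xyz', r)]
agree = all(umin_brute(a, A, cat, u) == umin_formula(a, A, cat, u)
            for A in menus for a in A)
```

The two computations agree on every (x, A) pair for this category, including the overlapping sets where x belongs to more than one attribute.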

Putting C ≡ {Ci}i we can simply reorganize the sets in C as ∪_{x∈X} C(x). It follows that

U(A) = max_{Ci} min_{z∈Ci∩A} u(z)
     = max_{x∈X} [ max_{Ci∈C(x)} min_{z∈Ci∩A} u(z) ]
     = max_{x∈X} umin(x,A).

This shows that the utility on menus for both the GP model and the category model has a max-min representation, where we interpret the latter as expressing a period 1 pessimism towards the welfare impact of temptation. Moreover, as the following observation shows, this interpretation is not sensitive to the selection of a particular utility kernel.

Observation 4. Fix any function u(·, ·) where u(x,A) ≥ u(x,B) whenever A ⊆ B. Let U(A) := max_{x∈A} u(x,A) and put umin(x,A) := min{U(A′) : A′ ⊆ A, x ∈ A′}. Then, u(x,A) ≤ umin(x,A), ∀x ∈ A, ∀A.

In other words, fixing the cardinal utility and interpreting u(x,A) as the value of the sub-problem where x is the consumption target, the number umin(x,A) is an upper bound on this value across all functions u(·, ·) (for every x and for every menu A) which generate the same utility on menus, with the caveat that the kernel u(·, ·) is non-increasing in the menu argument (the caveat being both necessary and sufficient for Positive Set-Betweenness).

Let us conclude by looking at two extensions of the GP model that also invoke Positive Set-Betweenness. The first is the Dekel et al. (2009) extension, which is comprised of an (n + 1)-tuple (u, {v_k}_{k=1}^n) assembled into a menu utility as follows:

U(A) = max_{x∈A} [ u(x) − Σ_{k=1}^n c_k(x,A) ],

where c_k(x,A) := max_{y∈A} v_k(y) − v_k(x). This is actually a “no uncertainty” version of the model they axiomatize – one of their theorems shows that the model with

19 The reader will note that we have switched the definition of the collection C(x), viz. in the discussion in section 2 we defined C(x) to be those attribute sets within which x was the top-ranked singleton. On account of identification, some (or all) of these sets may be redundant if they are nested within y-trees for some {y} ≻ {x}. For this reason, we require the more general definition of C(x).


uncertainty reduces to the displayed one if and only if Positive Set-Betweenness is satisfied. Another extension of Gul and Pesendorfer (2001) is the Noor and Takeoka (2010) model of non-linear costly self-control,

U(A) = max_{x∈A} [ u(x) − ψ(max_{y∈A} v(y)) · (max_{y∈A} v(y) − v(x)) ],

where ψ(·) is an increasing non-negative (distortion) function. As mentioned earlier, our examples 1 and 2 are twists of related examples in Dekel et al. (2009) which they used to motivate their generalization of Gul and Pesendorfer (2001). Dekel et al. (2009) also note that the compromise effect (example 1) eludes representation by their model.

Noor and Takeoka (2010) point out that to capture this effect it is necessary for ex post choice to violate WARP – which is captured in their model by allowing violations of independence (resulting in the distortion term ψ(·)). Neither the Dekel et al. (2009) nor Noor and Takeoka (2010) extension can represent aggregation effects (unless the menu has three elements). To see this, we consider ex post choices and show that these cannot be captured by the implied second stage choices in these models. Fix u(x) > u(y1) ≥ u(y2) ≥ · · · ≥ u(yk), where the yi are temptations. Temptation distorts choices via aggregation when we have:

C({x} ∪ {yi}_{i=1}^k) = {y1}  &  C({x} ∪ A) = {x}, ∀A ⊊ {y1, . . ., yk}.

The Dekel et al. (2009) model cannot capture these choices since second stage choice is generated by maximization of u + Σ_{i=1}^n v_i, and is hence rational. For k ≥ 3, the Noor and Takeoka (2010) model cannot capture this either since it shares with GP the property that every menu is indifferent to the binary subset consisting of the choice from the menu and its v-maximal element (which must be distinct under either compromise or aggregation). This implies that choice from any subset is the same so long as it contains the original choice and the v-maximal element – a requirement that is violated by aggregation.

4.2 Comparison with ex post choice models

While choices from menus are implied by the category model, we don’t directly observe them. However, since temptation distorts welfare only through choices (in our model) it is natural to ask the following converse question: Were we to assume instead that we observe choices from menus (i.e. second stage choice), under what circumstances can we say that these choices were generated by a category model?20

It is important to be able to map ex post choice data back into the class of menu choice models that we presume to explain these choices. Otherwise, layering a menu preference on top of the choice data we want to explain might make the category model seem ad hoc. By recovering the same model from choices from menus, we show that the assignment of menu preferences is determined.21

20 I thank Wolfgang Pesendorfer and an anonymous referee for suggesting this question.

Our analysis in this section uses an extra observable. Most models of ex postchoice only require an ex post choice correspondence, but we will assume a pair(u, C) consisting of a ranking on consumption choices as well as a choice corre-spondence.22 With a temptation story in mind, u(·) is a representation of the DM’scommitment preference. This might be overtly observable in cases where it repre-sents a shared norm, e.g. all DM’s view smoking cigarettes and excessive risk takingas harmful habits. If there is any temporal lag between committing to consume andconsuming, it can also be elicited by offering the DM choices between consump-tion commitments, i.e. singleton menus. However u happens to be revealed, theobjective is to study DM’s who, while sharing the same u, respond differently totemptation and, hence, make different choices. By expressing axioms in terms of expost choices we obtain an equivalent description of the model in terms of “choicefilters”. This allows us to compare choice behavior derived from the category modelwith models from the recent literature on quasi-rational (equiv. boundedly ratio-nal) choice. For example, the models axiomatized in (resp.) Manzini and Mariotti(2007), Lleras et al. (2008), Manzini and Mariotti (2012), Masatlioglu et al. (2012),Cherepanov et al. (2013) fall under the rubric of the following general model:

U(A) = max_{x ∈ Θ(A)} u(x)

The correspondence Θ : A ⇉ A (where Θ(A) ⊆ A) is what is referred to as a choice filter. In Lleras et al. (2008), Masatlioglu et al. (2012), and Cherepanov et al. (2013), the models are expressed in the language of choice filters, with the main differences arising from the conditions imposed on these filters.

For Manzini and Mariotti (2007), Manzini and Mariotti (2012), the filters are a reduced form expression of their original models. The rational shortlist method model (RSM) proposed in Manzini and Mariotti (2007) consists of two binary relations (P1, P2). Fixing a menu A, choice maximizes P2 against the residual set of P1 maximizers on A. Hence, under RSM, the filter on A would be the residual set of P1 maximizers on A. In Manzini and Mariotti (2012) (CTC), the model consists of two relations (≽, P2) where ≽ is a binary relation on menus and P2 is a binary relation on choices. Under CTC, fixing a menu A, in the first stage the DM uses ≽ to only pick elements x ∈ A which lie in ≽-undominated menus (i.e. undominated categories, borrowing the language from Manzini and Mariotti (2012)). Choice is then

21More precisely, what is proven is that the model is recoverable from a pair (u, C) consisting of a commitment ranking on singleton menus and a period 2 choice correspondence.

22For brevity we confuse the ordinal ranking with a cardinal representation u.


obtained by maximizing P2 on the residual elements of A which are drawn from ≽-undominated menus, so that the filter in this case is the set of elements which lie in ≽-undominated menus (contained in A). The category model also induces a choice filter via the formula,

Θ(A) = ∪_i argmin_{x ∈ C_i ∩ A} u(x).

The implied ex post choices would be C_{(u,C)}(A) = argmax_{x ∈ Θ(A)} u(x). Using the description of the category model in terms of the observables (u, C), we find the properties on choice filters which determine when they derive from category models (u, C).23
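To fix ideas, the filter Θ and the implied choice C_{(u,C)} can be computed mechanically for a finite example. Below is a minimal Python sketch; the particular utilities and cells are hypothetical, chosen only to illustrate the two formulas:

```python
def theta(A, u, cells):
    """Theta(A): union over categories C_i of the u-minimizers of C_i ∩ A."""
    feasible = set()
    for cell in cells:
        overlap = cell & A
        if overlap:
            worst = min(u[x] for x in overlap)
            feasible |= {x for x in overlap if u[x] == worst}
    return feasible

def implied_choice(A, u, cells):
    """C_(u,C)(A): maximize u over the feasible set Theta(A)."""
    feasible = theta(A, u, cells)
    best = max(u[x] for x in feasible)
    return {x for x in feasible if u[x] == best}

# Hypothetical data: u ranks x over y over z; z tempts x inside their shared cell.
u = {"x": 3, "y": 2, "z": 1}
cells = [{"x", "z"}, {"y"}]
menu = {"x", "y", "z"}
feasible = theta(menu, u, cells)          # x is blocked: z undercuts it in its cell
chosen = implied_choice(menu, u, cells)   # the milder option y is chosen
```

On the full menu the DM cannot reach x (z is the u-minimizer of the cell {x, z}), so the implied choice is the middle option y – the kind of pattern the category model is designed to capture.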

Definition 5. An element x ∈ A is subjectively feasible if it satisfies the following property. For any A′ ⊆ A with x ∈ A′, u(z) ≥ u(x), ∀z ∈ C(A′).

The interpretation of the condition is identical to the one given in Definition 1. For a given menu A, let A∗ denote the subset of subjectively feasible elements. Recast the axioms as follows.

A1: u(x) = u(y), ∀x, y ∈ C(A).

A2: If A∗ ⊆ A′ ⊆ A, then C(A′) = C(A).

A3: If A′ ⊆ A and C(A) ∩ A′ ≠ ∅, then u(x) ≥ u(y), ∀x ∈ C(A′), ∀y ∈ C(A).

A1 and A3 are together the analogue of CRW and A2 is the analogue of Strong Reduction. The intuition for A1 is that if choices maximize welfare, then it must be that all elements of C(A) yield the same welfare value. A2 is nearly identical to Strong Reduction. The intuition for A3 is that, if choice is constrained by temptation, then when we pass to a submenu but keep at least one of the original choices available, welfare is weakly improved since this choice should still be feasible in the submenu. Let Σ_P(X) denote the subset of menu orders which satisfy Axioms 1 and 2. Let Σ(X) denote the set of pairs (u, C) consisting of (i) a utility u on consumption choices and (ii) a choice correspondence C, where the pair (u, C) satisfies A1–A3. Define a map Λ : Σ_P(X) → Σ(X) as follows. Given ≽, let u be a representation of the order on singleton menus. Second, let C(A) = argmax_{x ∈ A∗} u(x). Put Λ(≽) := (u, C).
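The computation behind Λ and Definition 5 can likewise be sketched in code. Assuming a toy pair (u, C) given as lookup tables (all data below are hypothetical), the subjectively feasible set A∗ and the induced choice are:

```python
from itertools import combinations

def submenus_containing(A, x):
    """All A' ⊆ A with x ∈ A'."""
    rest = [a for a in A if a != x]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            yield frozenset(combo) | {x}

def feasible_set(A, u, C):
    """A*: x is subjectively feasible if every choice from every submenu
    containing x is weakly u-better than x (Definition 5)."""
    return {x for x in A
            if all(u[z] >= u[x]
                   for Ap in submenus_containing(A, x) for z in C[Ap])}

# Hypothetical choice correspondence, consistent with a category model.
u = {"x": 3, "y": 2, "z": 1}
C = {frozenset(s): set(c) for s, c in [
    ("x", "x"), ("y", "y"), ("z", "z"),
    ("xy", "x"), ("xz", "z"), ("yz", "y"), ("xyz", "y"),
]}
A = frozenset("xyz")
A_star = feasible_set(A, u, C)   # {y, z}: x is infeasible (C({x,z}) = {z})
recovered = {x for x in A_star if u[x] == max(u[w] for w in A_star)}
```

Note that `recovered` coincides with C(A): for a pair satisfying A1–A3 the choice is exactly the u-maximizer over A∗, which is the content of the map Λ.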

Proposition 1 (Equivalence of Observables). The sets Σ(X) and Σ_P(X) are bijectively equivalent under the map Λ.

It is possible for two distinct menu preferences with a common singleton ranking to induce the same set of ex post choices. The content of the proposition is that

23I thank Faruk Gul for suggesting this result.


for menu preferences which satisfy Axioms 1–2 this cannot happen. We use this fact to find an intrinsic characterization of choice filters Θ(·) that are induced by category models. Denote a filter induced by a category model (u, C) by Θ_{(u,C)}(·), so that we have Θ_{(u,C)}(A) = ∪_i argmin_{x ∈ C_i ∩ A} u(x). When this equality holds we say that the choice filter comes from a category model. Consider the following restrictions on an abstract filter Θ(·). The first two are direct analogues of expansion/contraction consistency conditions from choice theory and the third condition is exactly the part of Strong Reduction that pertains to expansion consistency (see our discussion of Strong Reduction when we introduced this axiom).

1. (Sen’s α) If A ⊆ B, then Θ(B) ∩ A ⊆ Θ(A).

2. (Restricted Sen’s β, I) If A ⊆ B and min_{x ∈ B\A} u(x) ≥ max_{x ∈ A} u(x), then whenever x, y ∈ Θ(A) and y ∈ Θ(B) we have x ∈ Θ(B) as well.

3. (Restricted Sen’s β, II) If A ⊆ B and Θ(B) ⊆ A, then whenever x, y ∈ Θ(A) and y ∈ Θ(B) we have x ∈ Θ(B) as well.

4. (Worst choices are always feasible, WCF) If z ∈ A and u(y) ≥ u(z), ∀y ∈ A, then z ∈ Θ(A).

The first condition says that if a choice x is feasible in a menu B, i.e. it is an element of Θ(B), then it is still feasible in a subset A. Since we may have removed some temptations in passing to the subset A but could not have added any new temptations, what is temptation-constrained feasible in B remains so in A. The second condition says that whenever we add options that normatively dominate everything in A, then objects feasible in A remain feasible in B (since we usually only think of y as a temptation for x if u(x) > u(y)). The third condition says that if we reach B from A by just adding infeasible elements, then anything feasible in A remains so in B. Again, the intuition is based on temptation: If we add an infeasible element, the cause for its infeasibility is a temptation (or menu of temptations) which was already present in A. In this sense, the restriction is saying that this infeasible element cannot constitute a "new" temptation which was not already present in A.24 Hence, if x ∈ A is feasible in the presence of the temptations in A, it remains feasible in B.
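These conditions are easy to verify exhaustively on small examples. The sketch below builds the filter induced by a hypothetical category model (our own toy data, not from the paper) and checks Sen's α and WCF over all menus, as Proposition 2 predicts:

```python
from itertools import combinations

def theta(A, u, cells):
    """Filter induced by a category model (u, C)."""
    out = set()
    for cell in cells:
        hit = cell & A
        if hit:
            worst = min(u[x] for x in hit)
            out |= {x for x in hit if u[x] == worst}
    return out

def all_menus(X):
    return [set(c) for r in range(1, len(X) + 1) for c in combinations(X, r)]

def sens_alpha(X, u, cells):
    """Sen's alpha: A ⊆ B implies Theta(B) ∩ A ⊆ Theta(A)."""
    return all(theta(B, u, cells) & A <= theta(A, u, cells)
               for A in all_menus(X) for B in all_menus(X) if A <= B)

def wcf(X, u, cells):
    """WCF: every u-worst element of a menu is feasible in it."""
    return all(z in theta(A, u, cells)
               for A in all_menus(X) for z in A
               if all(u[y] >= u[z] for y in A))

u = {"x": 3, "y": 2, "z": 1}
cells = [{"x", "z"}, {"y"}]   # hypothetical categories covering X
X = ["x", "y", "z"]
ok = sens_alpha(X, u, cells) and wcf(X, u, cells)
```

The same harness can be pointed at an arbitrary filter to test whether it could come from a category model for the given u.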

Proposition 2. A filter Θ comes from a category model (u, C) if and only if, fixing this u, it satisfies Sen’s α, the restricted Sen’s β conditions, and WCF.

The Lleras et al. (2008) model imposes Sen’s α on the choice filter, hence filters which arise from category models are a sub-class of this class of filters. Filters which come from categories are also nested in the Manzini and Mariotti (2012),

24This can be interpreted as a version of the “transitivity of infeasibility” property that we mentioned in sketching the elicitation of categories.


Masatlioglu et al. (2012), and Cherepanov et al. (2013) models. In the first of these, the nesting can be seen from functional forms. For the latter two papers, this follows from the behavioral characterization – as second stage choices from the category model satisfy the relaxation of WARP which is the main axiom that characterizes these models. Turning to the RSM model (Manzini and Mariotti (2007)), there are some sub-classes of the category model that admit an RSM representation, e.g. for special cases such as categories induced by Strotz preferences or categories which are partitions there is an equivalent expression in terms of an RSM model – as pointed out in Horan (2011). Once we allow aggregation effects there is no RSM representation.25
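The non-existence claim (argued directly in footnote 25) can also be confirmed by brute force: enumerate every pair of asymmetric relations (P1, P2) on {x, y, z} and test whether two-stage maximization reproduces the attraction-effect choices. A self-contained sketch, using the choice data quoted in the footnote:

```python
from itertools import product

X = ["x", "y", "z"]
PAIRS = [(a, b) for a in X for b in X if a != b]

def asymmetric(rel):
    return all((b, a) not in rel for (a, b) in rel)

def maximal(rel, A):
    """Elements of A not dominated within A by rel."""
    return {a for a in A if all((b, a) not in rel for b in A)}

def rsm_choice(P1, P2, A):
    """Rational shortlist method: maximize P2 over the P1-maximal set."""
    return maximal(P2, maximal(P1, A))

# Attraction-effect data: C({x,y}) = {x}, C({x,z}) = {x}, C({x,y,z}) = {y}.
data = [({"x", "y"}, {"x"}), ({"x", "z"}, {"x"}), ({"x", "y", "z"}, {"y"})]

rationalizable = any(
    all(rsm_choice(P1, P2, A) == c for A, c in data)
    for bits1 in product([0, 1], repeat=len(PAIRS))
    for bits2 in product([0, 1], repeat=len(PAIRS))
    for P1 in [{p for p, b in zip(PAIRS, bits1) if b}]
    for P2 in [{p for p, b in zip(PAIRS, bits2) if b}]
    if asymmetric(P1) and asymmetric(P2)
)
# rationalizable remains False: no (P1, P2) pair generates these choices.
```

The search space is small (at most 64 × 64 candidate pairs), so the exhaustive check runs instantly and mirrors the pencil-and-paper contradiction in the footnote.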

5 Conclusion

This paper develops a new axiomatic model of menu choice induced by temptation. The goal is to model behavior where background (and, in some cases, unchosen) temptations nudge the decision-maker into choosing milder temptations which he would not otherwise have chosen. We explain this behavior with a model of (implicit) attribute choice. The idea is that choice objects are bundles of attributes, and this allows the DM to use attributes as a commitment device to avoid certain temptations, i.e. by “choosing” a certain attribute he avoids temptations which don’t possess that attribute. Since attributes are subjective, the exercise of the paper is to elicit and identify attributes from menu choice data. Moreover, the model turns out to be recoverable from “ex post” observables, viz. observing (i) the ranking on singleton menus (i.e. normative preference) and (ii) choices from menus is sufficient to elicit and identify the attribute-based model (on menus) which generated these choices.

25Consider the implied choices from the temptation-driven attraction effect (Example 2) and towards contradiction allege an RSM-style representation (P1, P2), where P1, P2 is a pair of asymmetric binary relations. Note that C({x, y}) = {x}, C({x, z}) = {x} implies that x survives the elimination round using P1 since when the choice set is either {x, y} or {x, z}, the choice is x. Hence, ¬(zP1x) ∧ ¬(yP1x). OTOH, since C({x, y, z}) = {y}, this means that x is eliminated in the second stage (the choice round), so that (yP2x) ∨ (zP2x). Either case contradicts the fact that x is chosen from {x, y} (resp. {x, z}).


6 Appendix

6.1 Proofs for Section 3

First some preliminaries. Introduce an auxiliary function,

u_min(x, A) := min{U(A′) : A′ ⊆ A, x ∈ A′}.

Here we take U(·) to be any cardinal representation of the menu order ≽. Note that the function u_min(·, ·) yields a family of rankings {≽_A}_{A∈M} that is independent of the choice of U(·). Recall the following axiom, which is “one-half” of the Set-Betweenness axiom introduced in Gul and Pesendorfer (2001): A ≽ B ⇒ A ≽ A ∪ B. Note that this property, referred to as “Positive Set-Betweenness” (PSB) in the literature, is implied by CRW. Introduce two pieces of notation which will be useful in the proof of Theorems 1 and 2. First, let A_x denote the maximal menu (w.r.t. set inclusion) such that U(A_x) = u_min(x, A) (existence follows from PSB). Second, put θ_x(A) := {y : x ≽_A y}. A key property of these “local” menu preferences ≽_A is that they satisfy the following condition:

(∗) A_x = θ_x(A).

To prove this, note that if x ≽_A y, then A_x ≽ A_y. This implies, by PSB, that A_x ≽ A_x ∪ A_y. Maximality then implies A_x = A_x ∪ A_y. Thus, θ_x(A) ⊆ A_x. To check the reverse containment, let y ∈ A_x with y ≽_A x. Then, since y ∈ A_x, we obtain u_min(y, A) = U(A_y) ≤ U(A_x) = u_min(x, A). On the other hand, y ≽_A x implies, by definition, u_min(y, A) ≥ u_min(x, A) so that y ∼_A x.
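For intuition, u_min and the induced local ranking can be computed by enumeration on a small example. A Python sketch, assuming an illustrative menu utility U built from a two-cell category model (our own toy data, not from the paper):

```python
from itertools import combinations

def u_min(x, A, U):
    """u_min(x, A) = min{ U(A') : A' ⊆ A, x ∈ A' }."""
    rest = [a for a in A if a != x]
    return min(U(frozenset(combo) | {x})
               for r in range(len(rest) + 1)
               for combo in combinations(rest, r))

# Illustrative cardinal representation: the category formula for cells {x,z}, {y}.
u = {"x": 3, "y": 2, "z": 1}
cells = [{"x", "z"}, {"y"}]
def U(A):
    return max(min(u[w] for w in c & A) for c in cells if c & A)

A = {"x", "y", "z"}
values = {w: u_min(w, A, U) for w in A}
# The local order >=_A ranks A by these values: y on top, x and z tied below.
```

Here u_min(x, A) < u(x) because x can be dragged down by the submenu {x, z}, while y is never exposed to a worse cell-mate; this is exactly the distortion the orders ≽_A are built to record.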

Let I^1_A, . . ., I^k_A denote a top-down enumeration of the ≽_A-indifference classes and let Σ(A) denote the set of implied choices in A, i.e.

Σ(A) := {x ∈ A : (i) {x} ∼ A, (ii) x ∈ A′ ⊆ A ⇒ A′ ≽ A}.

We check that Σ(A) = inf{y : y ∈ I^1_A}. This connects the set of choices with the revealed relation ≽_A.

Lemma 2. Σ(A) = inf{y : y ∈ I^1_A}.

Proof. First observe that Σ(A) ⊆ I^1_A: Otherwise, if x ∈ Σ(A) ∩ I^j_A for some j > 1, then by CRW we obtain ∪_{j≥2} I^j_A ≽ A. On the other hand, by (∗), A = ∪_{j=1}^k I^j_A ≻ ∪_{j≥2} I^j_A – contradiction. Thus, Σ(A) ⊆ I^1_A. Now for any y ∈ I^1_A consider the menu A′ := {y, I^2_A, . . ., I^k_A} and note that A′ ≽ A, again by definition of ≽_A and (∗). Take any x ∈ Σ(A) so that we obtain: A′ ≽ A ∼ {x}. On the other hand, we claim that u_min(y, A′) = U(A′). To see this, observe that u_min(x, A′) = u_min(x, A), ∀x ∈ A′\y by (∗). Since u_min(y, A) > u_min(x, A), ∀x ∈ A′\y it follows that

u_min(y, A′) ≥ u_min(y, A) > u_min(x, A) = u_min(x, A′), for all x ∈ A′\y.

Therefore, by (∗), we must have u_min(y, A′) = U(A′). Thus, we obtain

U({y}) = u_min(y, {y}) ≥ u_min(y, A′) = U(A′) ≥ U(A) = U({x}).

It follows that {y} ≽ {x}, ∀y ∈ I^1_A, implying that Σ(A) ⊆ inf{y : y ∈ I^1_A}. To show the reverse containment, simply note that if x ∈ inf{y : y ∈ I^1_A} and x ∈ A′ ⊆ A, then by definition of ≽_A and (∗) we have A′ ≽ A.

Proof of Theorem 1. Necessity of Axiom 1 (CRW) is straightforward, hence we omit the argument. For necessity of Axiom 2 (Strong Reduction), let (u, C) be a category model and let ≽ denote the underlying menu preference represented by (u, C). Let A∗ = {x ∈ A : A′ ≽ {x}, ∀A′ ⊆ A, x ∈ A′} be the subjectively feasible subset of A, and let A∗_{(u,C)} = ∪_i argmin_{x ∈ C_i ∩ A} u(x), where we take C ≡ {C_i}. Note that Axiom 2 is implied by the equality

A∗ = A∗_{(u,C)}.

To check the equality, consider the left-to-right inclusion. Let x ∈ A∗ and put Σ(x) = {C_i ∈ C : x ∈ C_i}. We claim that there must be some C_i ∈ Σ(x) such that x ∈ inf(C_i ∩ A). Else, for each C_i ∈ Σ(x) choose some z_i ∈ inf(C_i ∩ A) with {x} ≻ {z_i} and consider the menu A′ := {z_i : z_i ∈ C_i} ∪ {x}. Note that {x} ≻ A′ – contradicting the fact that x ∈ A∗. Thus, there is some C_i ∈ C such that x ∈ inf(C_i ∩ A) ⊆ A∗_{(u,C)}. For the right-to-left inclusion take x ∈ A∗_{(u,C)} and take any set A′ ⊆ A with x ∈ A′. Find C_i ∈ C such that x ∈ inf(C_i ∩ A) and note that this implies x ∈ inf(C_i ∩ A′). It follows that A′ ≽ {x}, implying that x ∈ A∗. Now take A∗ ⊆ A′ ⊆ A and substitute A∗ = A∗_{(u,C)}. Notice that this implies (A′)∗_{(u,C)} = A∗_{(u,C)}. Hence, A′ ∼ (A′)∗_{(u,C)} = A∗_{(u,C)} ∼ A.

We now turn our attention to the sufficiency of the axioms. To each x ∈ X we associate an “x-tree”. The terminal nodes in each x-tree will be the elements of the overall category.

Step 1: Constructing x-trees.
Introduce the following terminology. Fix an index set {1, 2, . . ., N}. An x-tree is a triplet of data ({C^i_j}_{j=1}^{n_i}, {L_i}_{i=1}^{N}, {C^i_j → C^{i+1}_k}), consisting (resp.) of nodes, levels, and branches, with the following structure:

• A collection of nodes C^i_1, C^i_2, . . ., C^i_{n_i} for each index i.

• A collection of levels L(1), . . ., L(N), where each L_i := {C^i_1, C^i_2, . . ., C^i_{n_i}}.

• A collection of branches {C^i_j → C^{i+1}_k} connecting nodes on consecutive levels. Call C^i_j the root of the branch {C^i_j → C^{i+1}_k}.

• Every node C^i_k in level i (for i > 0) has a unique root in level i − 1.

• Every node C^i_k in level i is the root of a branch.

Using these objects, we inductively construct an x-tree as follows. First, make the following simplification. Since representability requires that B_t(x) ∈ C_i whenever x ∈ C_i, we make no distinction between the element {x} and the set {x} ∪ B_t(x). That is, whenever we say x ∈ C_i what we implicitly mean (unless explicitly stated otherwise) is that B_t(x) ∪ {x} ⊆ C_i. Let A_1(x), A_2(x), . . ., A_n(x) enumerate the aggregation sets of x (here we distinguish between {x} and B_t(x) ∪ {x}). The x-tree construction proceeds by double-induction. The outer induction is on the ≽-rank of the element x for which the x-tree has been constructed. The inner induction is on the tree construction for a fixed element x. For a (≽)|_X-minimal element x (in X), let the x-tree be just the singleton node {x}. Taking this as the base step of the outer induction, induct upwards on (≽)|_X-rank to construct an x-tree as follows. Let A_1(x) = {x^1_1, . . ., x^1_k} be the elements of the first aggregation set and take

C^1_i = {x, x^1_i},  L_1 = {C^1_1, C^1_2, . . ., C^1_k}.

Note that the sizes of the aggregation sets A_i(x) need not be the same. For notational brevity, we suppress this dependence – it will make no difference whatsoever for the ensuing arguments. Inductively, assume we have defined nodes and branches (with unique root restriction) for levels {1, 2, . . ., m} (for m ≤ n – where n is the total number of aggregation sets for x) and define level m + 1 as follows. Let {C^m_i}_{i=1}^{N_m} be an enumeration of the nodes that form L_m. For each node C^m_i create |A_{m+1}(x)| branches as follows. Let A_{m+1}(x) = {x^{m+1}_1, x^{m+1}_2, . . ., x^{m+1}_k} and put

C^{m+1}_1 = C^m_1 ∪ {x^{m+1}_1}, C^{m+1}_2 = C^m_1 ∪ {x^{m+1}_2}, . . ., C^{m+1}_k = C^m_1 ∪ {x^{m+1}_k}.

Similarly, put

C^{m+1}_{(i−1)k+1} = C^m_i ∪ {x^{m+1}_1}, C^{m+1}_{(i−1)k+2} = C^m_i ∪ {x^{m+1}_2}, . . ., C^{m+1}_{ik} = C^m_i ∪ {x^{m+1}_k}.

Thus, level L(m + 1) consists of N_m·|A_{m+1}(x)| nodes, C^{m+1}_j, and N_m·|A_{m+1}(x)| branches, {C^m_i → C^{m+1}_j} (where (i − 1)·k + 1 ≤ j ≤ i·k). Inductively proceed until we exhaust all of the aggregation sets {A_1(x), . . ., A_n(x)}. Let L(n) = {C(1), C(2), . . ., C(N)} be an enumeration of the nodes at level L(n). For the next step, find the (≽)|_X-maximal y such that {x} ≻ {y} and for each C(i) with y ∈ C(i) attach a y-tree (which has been constructed by the induction hypothesis). This extends the levels in the original x-tree by the number of levels in the y-tree. For each C(i) in level L(n) that does not contain y we just extend a single branch {C(i) → C^{n+1}(j_i)}, {C^{n+1}(j_i) → C^{n+2}(k_i)}, and so on, for each subsequent level, where we put C(i) = C^{n+1}(j_i) = · · · = C^{n+M}(l_i) (here we take M to be the number of levels in a y-tree). Thus, we obtain a tree with n + M levels. Now continue this procedure. Take a (≽)|_X-maximal z with {y} ≻ {z} and for each C(i) ∈ L(n + M) with z ∈ C(i) attach a z-tree. Iteratively proceed as above. Since X is finite, this process terminates at some level L(N_x). This concludes the construction of the x-tree.26

26If there are no ties in the singleton ranking, this construction is canonical. If there are ties, then we fix a top-down labeling {x_1, . . ., x_n} at the outset and at each step of the recursion pick a (≽)|_X-maximal z. When there are ties, break the tie according to the pre-selected labeling.


Step 2: Check the inequality U_C(·) ≤ U(·).
Having defined the x-tree for each x ∈ X we take the category, C, to be the set of all terminal nodes in the level L(N_x) for every x ∈ X. Taking u to be a representation of the singleton ranking, the claim is that the pair (u, C) represents ≽. Let U(·) be any cardinal representation of ≽ (which extends u) and let U_C(·) denote the utility defined by the category formula for the pair (u, C), where u ≡ U(·)|_X. We show representability by checking the equality U_C(·) = U(·) on all menus. The tree structure of the categories allows for a useful decomposition of the function U_C(·). Let {L(N_y)} denote the set of all terminal levels across all trees. Abusing notation, let T_{y_1}, T_{y_2}, . . ., T_{y_n} be an enumeration of all trees (where |X| = n) and define

U_{T_y}(A) := max_{C_i ∈ L(N_y) : C_i ∩ A ≠ ∅} min_{z ∈ A ∩ C_i} u(z).

As before, we suppress the requirement that we maximize only over terminal nodes C_i which intersect A. Observe that we have the equality

U_C(A) = max_{x ∈ X} U_{T_x}(A).

Thus, the value the category utility assigns to menu A is the maximum of its value across trees. We analyze U_C(·) by analyzing its behavior on a tree-by-tree basis. In particular, we check that for each x we have U_{T_x}(A) ≤ u_min(x, A). If u_min(x, A) = u(x), then this claim is obvious since we clearly have min_{z ∈ C_i ∩ A} u(z) ≤ u(x) for each terminal node C_i of the T_x-tree. Consider the case where u_min(x, A) < u(x). For a given tree T_x, let L(1), . . ., L(N_x) denote its levels. For each level L(i) consider the function

U_{L(i)}(A) := max_{C_j ∈ L(i)} min_{z ∈ A ∩ C_j} u(z).

Note that the menu A contains either a singleton temptation or a non-trivial aggregation set A(x) for x. Consider the latter case and note that by Strong Reduction and the minimality property of A(x), we may find z ∈ sup(A(x)) such that {x} ∪ A(x) ∼ {z}. Let L(i) denote the level at which this aggregation set is introduced into each node of the T_x-tree. We then have

(∗) U_{L(i)}(A) ≥ U_{L(i+1)}(A) ≥ · · · ≥ U_{L(N_x)}(A) = U_{T_x}(A).

Note that U_{L(i)}(A) ≤ u(z), so that U_{T_x}(A) ≤ u(z). This holds for all aggregation sets A(x) ⊆ A.

Now we construct a particular aggregation set A∗(x) with U({x} ∪ A∗(x)) = u_min(x, A). Consider the orders ≽_A underlying the function u_min(·, A).27 Let I_A(x) denote the ≽_A-indifference class of x and let I^1_A, . . ., I^k_A denote a top-down enumeration of those ≽_A-classes where I^1_A := I_A(x). Put

A′(x) = ∪_{j≥1} inf(I^j_A).

Note that {x} ∪ A′(x) ∼ {z} for any z ∈ inf(I^1_A). We would like to claim A′(x) is the desired aggregation set, but it may not be minimal. Consider the collection of all subsets, say A′′(x), of A′(x) which have the property that {x} ∪ A′′(x) ≺ {x}. Find a minimal (w.r.t. set inclusion) such set, call it A∗(x). Since u_min(x, A) = u(z) for z ∈ inf(I^1_A) and inf(I^1_A) ≻ inf(I^2_A) ≻ · · · ≻ inf(I^k_A) (by Lemma 2) we must have sup(A∗(x)) ∩ inf(I^1_A) ≠ ∅. It follows that A∗(x) is an aggregation set for x with the property that U(sup(A∗(x))) = u_min(x, A). Apply the conclusion of the preceding paragraph to the aggregation set A∗(x) to obtain U_{T_x}(A) ≤ u_min(x, A).

27This part of the argument makes essential use of structural properties of these orders ≽_A under axiom CRW, viz. Lemma 2.

Now consider the case where A contains a singleton temptation, i.e. z ∈ B_t(x) with the property that u_min(x, A) = u(z). In this case, z is introduced at the root node of the tree T_x, so that the u-minimum of each terminal node in this tree is trivially bounded above by u(z) when z is on the menu. It follows that, for each x ∈ A, the value of the category function U_C(·) on the x-tree T_x is bounded above by u_min(x, A). In other words, for each x ∈ A we have U_{T_x}(A) ≤ u_min(x, A). For each x ∉ A we consider two cases: either (i) A ∩ C_i = ∅, ∀C_i ∈ L(N_x) or (ii) C_i ∩ A ≠ ∅ for some C_i ∈ L(N_x). In the former case, the function U_{T_x}(A) does not enter into the domain of the maximization U_C(A) = max U_{T_x}(A). In the latter case, each node C_i with C_i ∩ A ≠ ∅ contains a node C′_i ∈ L(N_z) for each z ∈ C_i ∩ A. Note that min_{w ∈ C_i ∩ A} u(w) ≤ min_{w ∈ C′_i ∩ A} u(w) ≤ U_{T_z}(A), where C′_i ∈ L(N_z) and z ∈ C_i ∩ A. Hence, we obtain U_{T_x}(A) ≤ max_{z ∈ A} U_{T_z}(A) ≤ max_{z ∈ A} u_min(z, A). Now recall that U(A) = max_{x ∈ A} u_min(x, A) and putting together the bounds on U_{T_x}(A) for each tree T_x we obtain

U_C(A) = max U_{T_x}(A) ≤ max_{x ∈ A} u_min(x, A) = U(A).

Step 3: Check the inequality U_C(·) ≥ U(·).
We now check the reverse inequality. First introduce some terminology. For notational brevity (for this step alone) we denote level i tree nodes as x_j(i) (the j-th node in level i) and branches are denoted x_j(i) → x_k(i + 1). Fix a menu A and some x ∈ A. Consider the x-tree T_x and consider the set of all directed paths along branches in the tree, Φ := {ℓ : ℓ = {x → x_{i_1}(1) → x_{i_2}(2) → · · · → x_{i_{N_x}}(N_x)}}. Let ℓ = (ℓ_1, ℓ_2, . . ., ℓ_{N_x}) denote the specific nodes which lie along the path ℓ. We say that x is unobstructed in T_x by the menu A if there is a path ℓ ∈ Φ such that ℓ_i ∩ A = {x}, ∀ℓ_i. We claim that x is unobstructed in a menu A if and only if x ∈ A∗. Note that if we prove this, then it follows that the value of the category function U_C(A) on the tree T_x is u(x), which proves U_C(A) ≥ U(A). Hence, we reduce to proving that x is unobstructed in the x-tree T_x at the menu A if and only if x ∈ A∗.


For the “if” part of the claim assume that x is unobstructed. Then clearly the value of the category function on the tree T_x is u(x). It follows that x ∈ A∗. Now for the reverse direction. Proceed by induction on the (≽)|_X-rank of x. That is, for the base step take z ∈ inf(X) and verify that: For any menu A with z ∈ A and z ∈ A∗, the tree T_z is unobstructed by A. Now induct upwards. If x is the lowest ranked singleton in A∗, then the unobstruction claim is obvious. Thus, assume x is not the lowest ranked singleton in A∗ and wlog that x ∼ A∗ (since, if T_x is obstructed in A, it can only be obstructed by elements with (≽)|_X-rank strictly lower than x). Moreover, for the unobstruction claim we actually need only consider menus A of the form A ≡ {x} ∪ A′ where A′ := ∪_{y : {x} ≻ {y}} y. Thus, we shall assume that A has this form.

Let A_1(x), A_2(x), . . ., A_k(x) be an enumeration of the aggregation sets of x. We claim that for each A_i(x), there is some z_i ∈ A_i(x) such that (i) the z_i-tree is unobstructed in A and (ii) z_i ∉ A. Check this via contradiction. Assume A_1(x) is such that for every z ∈ A_1(x) either (i) z is obstructed in A, or (ii) z ∈ A. Then, when z is obstructed in A we have (by the induction hypothesis) z ∉ (A ∪ {z})∗. Let {z_1, . . ., z_n} be elements of A_1(x) that are obstructed by A and let {z_{n+1}, z_{n+2}, . . ., z_m} be the elements of A_1(x) that are unobstructed by A, but for which z_i ∈ A. Since we are alleging {z_1, z_2, . . ., z_n} ∪ {z_{n+1}, . . ., z_m} = A_1(x), we then obtain (A_1(x) ∪ A)∗ ⊆ A∗. It follows (by Strong Reduction) that A∗ ∼ A_1(x) ∪ A. OTOH, by Strong Reduction again, A ∼ A∗ ∼ {x} – contradiction. Thus, for each A_i(x) find z_i such that (i) z_i is unobstructed in A and (ii) z_i ∉ A.

For what follows we will need to concatenate paths from different levels (in the x-tree T_x). Let ℓ(i, j) denote a path connecting a node in level L(i) to a node in level L(j). Recall that the element x had k aggregation sets. Let z_1, . . ., z_k be a list of unobstructed elements chosen respectively from A_1(x), . . ., A_k(x) (and such that z_i ∉ A). Let x(k) denote a node in level L(k) of the x-tree that contains {x, z_1, . . ., z_k}, i.e. we sequentially attach {z_1} ∪ B_t(z_1), {z_2} ∪ B_t(z_2), . . ., {z_k} ∪ B_t(z_k) to the initial node comprised of {x} ∪ B_t(x). Let ℓ(0, k) denote the unobstructed path from level 0 to level k that ends at the node x(k). From the construction of the x-tree, we successively attach y_i-trees for some y_1, y_2, . . ., y_l. Let y_{i_1} = y_1 = z_{i_1}, y_{i_2}, . . ., y_{i_k} be the subsequence of {y_1, y_2, . . ., y_l} where we first attach a z_i-tree in the x-tree construction algorithm. Let L(N_1), . . ., L(N_l) denote the terminal levels of the y_i-trees being attached to the nodes at level L(k). We inductively concatenate the unobstructed path ℓ(0, k) with an unobstructed path ℓ(k, k + N_1) and in turn with an unobstructed path ℓ(k + N_1, k + N_1 + N_2), and so on.

If y_1 ∈ {z_1, . . ., z_k} (call this the “unobstructed set”), there is an unobstructed path ℓ(k, k + N_1) from the node x(k) to some node x(k + N_1) in level L(k + N_1) of the partial tree obtained by concatenating the y_1-tree to the preceding levels


L(1), L(2), . . ., L(k). Concatenate ℓ(0, k) to ℓ(k, k + N_1) to obtain an unobstructed path from level L(0) to level L(k + N_1). If y_1 ∈ B_t(x), then y_1 ∉ A and y_1 must be unobstructed at A. The argument for this mimics the companion case for y_2 (resp. y_3) and is presented in more detail below. Now consider y_2. If y_2 ∉ x(k + N_1), then a copy of the node x(k + N_1) is replicated on every level of the attached y_2-tree. Thus, take ℓ(k + N_1, k + N_1 + N_2) to be the path which concatenates the branches {x(k + N_1) := C_0 → C_1}, {C_1 → C_2 = x(k + N_1)}, . . ., {C_{N_2−1}(= x(k + N_1)) → C_{N_2}(= x(k + N_1))}. If y_2 ∈ x(k + N_1), consider two cases. Either y_2 is in the unobstructed set or it is not. In the former case, replicate the argument for the y_1 case to extend the unobstructed path. If y_2 is not in the unobstructed set, then either (i) y_2 is first introduced at some level L(k + k_y) of the y_1(= z_{i_1})-tree or (ii) y_2 ∈ B_t(x) or B_t(z_1). Consider case (i). Let x(k + k_y) be the unique predecessor node at this level for which there is an unobstructed path starting at x(k + k_y) and terminating at x(k + N_1). Since we attach a y_2-tree at some level of the construction of the y_1-tree, a portion of this path must pass through (unobstructed) a y_2-tree. Denote this path segment as ℓ(k_1, k_2) (where there are N_2 branches that comprise this segment). Note that the y_2-tree is embedded inside the y_1-tree, so that (by the recursiveness of the tree construction) the image of the same path in any embedded y_2-tree is unobstructed (w.r.t. the menu A). Let ℓ(k + N_1, k + N_1 + N_2) be a copy of the path ℓ(k_1, k_2) with root node x(k + N_1), and that passes through the y_2-tree. Consider the concatenation (ℓ(0, k); ℓ(k, k + N_1); ℓ(k + N_1, k + N_1 + N_2)) and note that this concatenation is unobstructed. Let the terminal node of this path be denoted x(k + N_1 + N_2). This argument is summarized in the following schematic.

[Schematic: the root node of the y_1-tree, the node x(k + k_y) at level L(k + k_y) where the y_2-tree is attached (via {y_2}), and the node x(k + N_1) at level L(k + N_1).]

Figure 3: An unobstructed path through the y_1-tree which passes through the nested y_2-tree.

The green segment is the path ℓ(k_1, k_2) – this is an unobstructed path through the nested image of the y_2-tree inside the y_1-tree. This means that when we reach the stage of the reconstruction where we attach the y_2-tree we can use the same


path to extend the concatenation of an unobstructed path. Now consider case (ii). If y_2 ∈ B_t(x), then we claim it must be unobstructed at A (in addition to y_2 ∉ A). Else, y_2 ∉ (A ∪ {y_2})∗ (again, by the induction hypothesis). Hence, (A ∪ {y_2})∗ ⊆ A∗ which implies, by Strong Reduction, that A∗ ∼ A ∪ {y_2} and, in turn, A ∼ A ∪ {y_2}. OTOH, {x} ∼ A and {x} ≻ A ∪ {y_2} – contradiction. Similarly, if y_2 ∈ B_t(z_1) it must be unobstructed in A (in addition to y_2 ∉ A). Else, an analogous argument shows that z_1 is obstructed in A. Hence, in either case we can find an unobstructed path going through the y_2-tree. As in case (i), we concatenate the initial path (ℓ(0, k); ℓ(k, k + N_1)) with an unobstructed path through the attached y_2-tree, call the latter ℓ(k + N_1, k + N_1 + N_2). Note that the concatenation (ℓ(0, k); ℓ(k, k + N_1); ℓ(k + N_1, k + N_1 + N_2)) is unobstructed. Next consider y_3. If y_3 is in the unobstructed set we proceed verbatim as above. Else, either (i) y_3 is first introduced along an unobstructed path in the y_i-tree for some i ≤ 2 or (ii) y_3 ∈ B_t(x), B_t(z_1), or B_t(z_2) (the latter if y_2 = z_2). In case (i), a portion of one of the paths ℓ(k + N_1, k + N_1 + N_2) or ℓ(k, k + N_1) must pass unobstructed through the y_3-tree (at the point in the construction of the y_1 (resp. y_2) tree where the y_3-tree is attached). Let ℓ(k + N_1 + N_2, k + N_1 + N_2 + N_3) denote a replica of this path that goes through the y_3-tree. Continue the concatenation (ℓ(0, k); ℓ(k, k + N_1); ℓ(k + N_1, k + N_1 + N_2); ℓ(k + N_1 + N_2, k + N_1 + N_2 + N_3)) to obtain an unobstructed partial path. Case (ii) is dealt with the same way as in the preceding argument (i.e. case (ii) for y_2). Inductively proceed to extend the concatenated path to obtain a sequence (put k_i = k + Σ_{j=1}^{i} N_j) (ℓ(0, k); ℓ(k, k_1); · · ·; ℓ(k_{l−1}, k_l)). Note that this is a complete path in the x-tree and, moreover, it is unobstructed. It follows that the value of the category function U_C(·) on the tree T_x is u(x), proving that U_C(A) ≥ U(A). This concludes the proof of the theorem.

Proof of Theorem 2. Menus (or, equivalently, nodes) will be bolded to denote the difference between an element and a set of elements. The argument follows two steps, (i) embedding and (ii) pruning. The first step shows that every category model (u, C) that is sharp is a prolongation (of the form described in the theorem) of some sharp sub-category of the tree category. The second step shows that there is a unique sharp sub-category of the tree category.

Step 1: Embedding.
For each x ∈ X consider Σ′(x) = {C′ ∈ C_1 : x ∈ C′, x ∈ sup(C′)}. If Σ′(x) ≠ ∅, with labeling determined by the x-tree construction, let A_1(x), . . ., A_k(x) be the aggregation sets associated with x. We reconstruct a set of paths through the x-tree whose associated set of terminal nodes is contained in nodes in the set Σ′(x). Introduce some notation. Let Φ be the set of all paths in the x-tree and for each ℓ ∈ Φ, let ℓ^{−1}(x_1(1)) denote the set of all terminal nodes which have {x, x_1(1)} as a predecessor node.28 Similarly, for any (partial) path ℓ(1, n) from level 1 to level n, let ℓ^{−1}(ℓ(1, n)) denote the set of all terminal nodes whose associated paths through the tree all share the initial segment ℓ(1, n). Consider the aggregation set A_1(x) and an element x_1(1) ∈ A_1(x). We do not know yet that the sets in Σ′(x) correspond to terminal nodes in the x-tree – this will be our conclusion. Nevertheless, we will apply the notation ℓ^{−1}(ℓ(1, n)) to the sets in Σ′(x). The meaning is the following: For x_1(1) ∈ A_1(x) we take ℓ^{−1}(x_1(1)) to be the set of all elements (terminal nodes) in Σ′(x) that contain the element x_1(1). Similarly, for any x_2(i) ∈ A_2(x) let ℓ^{−1}(x_1(1), x_2(i)) denote all sets in Σ′(x) which contain both x_1(1) and x_2(i). Note that, by representability, every node in Σ′(x) is in ℓ^{−1}(x_1(i)) for some x_1(i) ∈ A_1(x) (there may be some nodes contained in ℓ^{−1}(x_1(i)) for more than one i). Thus, we have a correspondence γ_1 : A_1(x) ⇉ Σ′(x) given by x_1(j) ↦ ℓ^{−1}(x_1(j)). Now iterate this process. For each x_2(i) consider the set ℓ^{−1}(x_1(j), x_2(i)) and note that, by representability, every node in Σ′(x) lies in ℓ^{−1}(x_1(j), x_2(i)) for some pair x_1(j), x_2(i). Consider the correspondence γ_2 : A_1(x) × A_2(x) ⇉ Σ′(x) given by (x_1(i), x_2(j)) ↦ ℓ^{−1}(x_1(i), x_2(j)). Inductively construct correspondences γ_n : ∏_{j=1}^{n} A_j(x) ⇉ Σ′(x). Note that (after k levels) for the next step of the x-tree construction we attach an x_i(j)-tree for some x_i(j) ∈ A_i(x). For any partial path ℓ(1, k) such that ℓ^{−1}(ℓ(1, k)) ≠ ∅ we can extend via the same procedure as above to obtain an extension of the path through the x_i(j)-tree. Observe that each path that passes through a node that contains x_i(j) extends by representability – there must be an element of any aggregation set associated to x_i(j) contained in the node. Also note that if x_i(j) ∉ x(k) for some x(k) ∈ ℓ^{−1}(ℓ(1, k)) then this path is extended without branching until the terminal level of the x_i(j)-tree. Inductively proceeding we obtain a collection of paths in the x-tree.

28Recall that when we put x in the node, this is shorthand for {x} ∪ B_t(x).

Note that every node x′ ∈ Σ′(x) has the property that x′ ∈ ℓ−1(ℓ(1, k)), ∀k ≤ kℓ, for some path in the x-tree of, say, length kℓ. Fix a node x′ ∈ Σ′(x) and let {ℓ1, ℓ2, . . ., ℓn} be an enumeration of paths in the x-tree (let ki denote the length of path ℓi) such that x′ ∈ ℓ−1_i(ℓ(1, k)), ∀k ≤ ki, for all paths ℓi. Also let x(i) denote the terminal node (in the x-tree) of path ℓi. It follows that any node x′ ∈ Σ′(x) contains some terminal node x(i) of the x-tree. We verify that (by retracting if necessary) any x′ in a sharp model (u, C) contains precisely one such x(i). Since (u, C) is a sharp representation, for each set Ci ∈ C one of the following must be true. Either

i. there is a menu A for which arg max_{Cj∈C} min_{z∈Cj∩A} u(z) = Ci, or

ii. Ci ⊄ ∪_{j≠i} Cj.

In either case, we say that "the maximum occurs at x" on Ci if, under case (i), the maximum of the function UC(A) occurs at x ∈ Ci for some menu A, or if x ∉ ∪_{x′′(≠x′)∈C} x′′ (in which case we can take the menu that supports this as a maximum to be A = {x}). Assume now that the maximum occurs at x = sup(x′). We claim that there is only one x(i) ⊆ x′. Else, if there is a distinct pair (x(i), x(j))


with x(i), x(j) ⊆ x′, find z ∈ x(j)\x(i) (or z ∈ x(i)\x(j) if x(j) ⊆ x(i)). Consider the menu A ∪ {z}. Since z ∉ x(i) and A ∼ {x} (and x(i) is a terminal node in the tree category (u, T) – which represents ≽), we must have A ∪ {z} ∼ {x}. On the other hand, UC(A ∪ {z}) < u(x) – contradicting the hypothesis that (u, C) represents ≽ as well. Hence, there is no such pair (x(i), x(j)).

Notice that exactly the same argument shows that the node x(i) coincides with x′. Now assume the maximum occurs not at x, but for some zx ∈ x′ where {x} ≻ {zx} (i.e. in case (i) holds, or zx ∉ ∪_{x′′∈C} x′′ in case (ii) holds). Take zx to be the (≽)|X-maximal such element. Replace x′ in the category C with the lower bound order interval, (−∞, zx] ∩ x′ := x∗. Note that the category C′ := C\x′ ∪ x∗ (i.e. C′ is obtained from C by deleting node x′ and replacing it with x∗; all other nodes are left intact) also represents ≽. Now apply the preceding argument with zx replacing x to find a unique terminal node zx(i) in the zx-tree contained in x∗ ⊆ x′. The preceding argument shows that zx(i) = (−∞, zx] ∩ x′. Hence, zx(i) is a lower bound order interval in x′.29 This shows that every node x′ ∈ C contains a terminal node of the tree category T as a lower bound order interval. To complete the argument we need that no two nodes x′, x′′ ∈ C map to the same terminal node (under the mapping x ↦ (−∞, zx] ∩ x described above) of the tree category as a lower bound order interval. However, this is a consequence of sharpness. If two nodes map to the same (−∞, zx] ∩ x, then both cannot be relevant for the representation, since for any menu where the maximum UC(A) occurs on the node x (resp. x′), it must occur on the common lower bound order interval (−∞, zx] ∩ x. Thus, not both of x, x′ can be relevant for the representation.

This concludes the argument for the embedding step. Let us summarize what has been shown so far. Starting with any sharp model (u, C) we have shown that this model is a prolongation of a sharp model (u, C′), where the latter model is a sub-category of the tree category T. Moreover, each node C′_i in C′ is a lower bound order interval for a unique Cj in C. We can further retract (u, C′) to a (sharp) sub-model of the tree category with the property that for each node C′_i ∈ C′ the (≽)|X-maximal element for which the maximum of the function UC(·) is attained on C′_i (for some menu A) is sup(C′_i).30 Formally, we take

sup{xA ∈ C′_i : (i) A ∼ {xA}, (ii) C′_i = arg max_{C′_j∈C′} min_{z∈C′_j∩A} u(z)}

and reduce (by the preceding argument) to nodes C′_i such that

sup(C′_i) = sup{xA : (i) A ∼ {xA}, (ii) C′_i = arg max_{C′_j∈C′} min_{z∈C′_j∩A} u(z)}.

Put xC equal to this common element and note that C′_i must be a minimal (w.r.t. set inclusion) node in the xC-tree. Now turn to the next step.

29 The same argument for x′ shows that the node x∗ must contain some terminal node of the zx-tree since the model (u, C′) represents ≽.

30 This follows from identical reasoning as given in the preceding paragraph for the embedding argument.

Step 2: Pruning. Fix the tree category representation (u, T) of ≽. Take any two sharp sub-categories (u, C1), (u, C2) which also represent ≽ and have the property that for each C_1^i ∈ C1 (resp. C_2^i ∈ C2) the maximum of UC1(·) (resp. UC2(·)) for some menu A is attained on sup(C_1^i) (resp. sup(C_2^i)). By the preceding remarks, this implies that all nodes in C1, C2 are minimal nodes of the respective x-trees to which the nodes belong. We show that C1 = C2 by downwards induction on Tx-trees. That is, we inductively show that all minimal nodes in the Tx-tree that are present in C1 must also be in C2 and vice-versa. Start with the (≽)|X-maximal x-tree. We claim that for maximal Tx-trees all minimal terminal nodes are present in both C1 and C2. Towards contradiction, assume there is a minimal terminal node, say x∗, that is not in the sub-category (u, C1). Let {x1, . . ., xN} be an enumeration of the minimal nodes in the Tx-tree with x1 ≡ x∗. For each xi (≠ x1) pick some element zi ∈ xi\x1 (by minimality of x∗). Consider the menu A′ := {zi : zi ∈ xi} ∪ {x}. Since the Tx-tree is unobstructed at A′, we must have A′ ∼ {x}. On the other hand, consider the value of the function UC1(A′) (where UC1(·) denotes the menu utility generated by (u, C1)). Note that the only nodes in the category that contain x are the terminal nodes from the x-tree (by (≽)|X-maximality). Since x∗ ∉ C1 and {x} ≻ {zi}, where zi ∈ xi, every node in C1 that contains the singleton x is obstructed by some zi. It follows that UC1(A′) < u(x), which contradicts representability since the Tx-tree is unobstructed at the menu A′. It follows that all minimal terminal nodes for a (≽)|X-maximal x-tree are present in the model (u, C1) (and resp. (u, C2), by the identical argument).

Similar reasoning shows that all minimal terminal nodes are present for any non-embedded Tx-tree.31 Accordingly, break up the x-trees into two groups: group I is the set of non-embedded x-trees and group II is the set of embedded x-trees. Copying the preceding argument, we check that (u, C1) and (u, C2) agree on all trees in group I, i.e. all nodes in C1 from group I trees are in C2 and vice-versa. Next, we induct downwards on (≽)|X-rank to show that all nodes in (u, C1) that come from group II trees are also in (u, C2) and vice-versa. Enumerate group II trees via {x1} ≻ · · · ≻ {xn} and assume we have shown agreement of nodes in C1 and C2 on all Txi-trees for 1 ≤ i ≤ k. Consider the Txk+1-tree and, towards contradiction, say that x∗ is a node in (u, C1) that is not in (u, C2). Consider each Txi-tree in which the Txk+1-tree is embedded. Consider the nodes in this tree which contain the node x∗ and which are present in C2. Formally, denote this as:

κi(x∗) := {yi ∈ C2 : x∗ ⊆ yi}.

31 By a "non-embedded" Tx-tree we mean any Tx-tree such that x does not appear in an aggregation set A(x′) for some {x′} ≻ {x}.


We claim that κi(x∗) is non-empty for some xi in which the Txk+1-tree embeds.

Proceed via contradiction. Consider all the minimal nodes of the xk+1-tree and for each such node xi ≠ x∗ find zi ∈ xi\x∗. Now note that, for any xi-tree in which the xk+1-tree embeds, any terminal node of this tree which contains xk+1 contains a node from the xk+1-tree. If κi(x∗) = ∅ for all xi-trees in which the xk+1-tree embeds, then each of these terminal nodes contains one of the nodes xi ≠ x∗ from the xk+1-tree. Now consider the menu A′ := {zi ∈ xi : xi ≠ x∗} ∪ {xk+1}. Note that xk+1 is unobstructed at A′. However, the category C2 is obstructed, since we are alleging that x∗ ∉ C2 and that κi(x∗) = ∅, ∀i – a contradiction. It follows that κi(x∗) is non-empty for some xi-tree in which the xk+1-tree embeds.

Now pick any yi ∈ κi(x∗). By the induction hypothesis, since yi ∈ C2 we also have yi ∈ C1. Since C1 is sharp, x∗ cannot be a lower bound order interval in yi (as deleting x∗ from C1 would then yield a representation of the same menu preference, implying that x∗ is not relevant for the representation – contradicting sharpness). Hence, for each such yi we can find zi ∈ yi\x∗ with {xk+1} ≻ {zi}; do this for every yi ∈ κi(x∗) and for every i with κi(x∗) ≠ ∅. Also find zi ∈ xi\x∗ for each minimal node other than x∗ of the xk+1-tree. Let A′ := {zi : zi ∈ xi\x∗} ∪ {xk+1}. Also put

Ai = {zi : {xk+1} ≻ {zi}, zi ∈ yi\x∗, yi ∈ κi(x∗)}.

Put these all together to construct the menu:

A := A′ ∪i Ai.

Note that the xk+1-tree is unobstructed at A, so that A ∼ {xk+1}. On the other hand, computing the value of UC2(A), note that the embedded image of the node x∗ in any xi-tree is obstructed by A and, by construction, the nodes in C2 from the xk+1-tree are also obstructed by A. It follows that UC2(A) < u(xk+1) – contradicting representability. Thus, x∗ ∈ C2, so that all nodes from the tree Txk+1 that are in C1 are also in C2. By symmetry, all nodes from this tree that are in C2 are also in C1 – completing the inductive step. It follows that C1 ≡ C2.

N.B. The argument shows that there is a unique sharp sub-category of the tree category with the property that (i) all nodes are minimal and (ii) the highest value for which a strict maximum of the function UC(·) occurs on each of these nodes is the (≽)|X-maximal singleton in the node. Note that it is easy enough to construct such a category from scratch. Namely, start with the tree category. Pass to the sub-category of all minimal nodes – which also yields a representation. Next, pass to any sharp sub-category. Note that the above argument already shows that all such sub-categories are prolongations of some (u, C∗). To find C∗, go node by node and take xCi to be the maximum singleton for which the maximum is attained on node Ci. If xCi ≠ sup(Ci), then replace this node with (−∞, xCi] ∩ Ci. The same arguments as presented above show that this must be a minimal terminal node of the xCi-tree. Inductively proceed until we obtain a sharp sub-category consisting entirely of minimal nodes with xCi = sup(Ci) for all sets Ci in the sub-category.

6.2 Proofs for Sections 4.1-4.2

Proof of Lemma 1. That Strong CRW implies DSB is obvious. We check the converse. Consider the set Σ(A) = {x ∈ A : {x} ∼ A}. We claim that there is some xA ∈ A such that ∀A′ ⊆ A with xA ∈ A′ we have A′ ∼ A. Otherwise, for each x such that {x} ∼ A there is a subset A(x) ⊆ A with x ∈ A(x) and A ≁ A(x). Put Ā := A\∪_{x∈Σ(A)} A(x) and note that A = ∪_{x∈Σ(A)} A(x) ∪ Ā. Since A ≁ A(x) for each x ∈ Σ(A), iterative application of DSB yields A ∼ Ā, which implies (by iterative application of DSB again) that ∃x ∈ Ā such that {x} ∼ A. On the other hand, any such x lies in Σ(A), so x ∈ A(x) while A(x) ∩ Ā = ∅ – a contradiction.

Proof of Observation 1. Fix (u, v) and for each x consider the set Cx := {x} ∪ {y : v(y) > v(x), u(x) > u(y)}. Let C ≡ {Cx}x∈X. Let UC denote the menu utility generated by (u, C) and let U denote the menu utility generated by (u, v) via the Strotz formula. Fix any menu A and let x ∈ Av be a u-maximal element. Note that U(A) = u(x). Moreover, Cx ∩ A = {x}, so that UC(A) ≥ u(x). For the reverse inequality, take any z ∈ A with u(z) > u(x). Then z ∉ Av, so there is some y ∈ Av with v(y) > v(z). Since u(z) > u(x) and x is u-maximal in Av, we have u(z) > u(x) ≥ u(y), so that y ∈ Cz. Hence, UC(A) ≤ u(x). It follows that UC(A) = U(A).
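To make the construction concrete, here is a minimal brute-force sketch in Python. The four-element domain and the numerical utilities u, v are illustrative assumptions (not from the paper); the check simply confirms that the induced category utility reproduces the Strotz value on every non-empty menu.

```python
from itertools import combinations

# Illustrative primitives (assumptions, not from the paper).
X = ["a", "b", "c", "d"]
u = {"a": 4, "b": 3, "c": 2, "d": 1}   # normative utility
v = {"a": 1, "b": 4, "c": 3, "d": 2}   # temptation utility

# The categories of Observation 1: C_x := {x} ∪ {y : v(y) > v(x), u(x) > u(y)}.
C = {x: {x} | {y for y in X if v[y] > v[x] and u[x] > u[y]} for x in X}

def strotz(A):
    """Strotz value of a menu: u-best element among the v-maximal ones."""
    vmax = max(v[z] for z in A)
    return max(u[z] for z in A if v[z] == vmax)

def category_utility(A):
    """U_C(A) = max over categories of min_{z in C_x ∩ A} u(z)."""
    return max(min(u[z] for z in Cx & set(A))
               for Cx in C.values() if Cx & set(A))

# The two menu utilities agree on every non-empty menu.
for k in range(1, len(X) + 1):
    for A in combinations(X, k):
        assert strotz(A) == category_utility(A)
print("Strotz and category utilities agree on all menus")
```

Note that the tie-breaking in `strotz` (u-maximal among v-maximal elements) matches the choice of x in the proof.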

Proof of Observation 2. We first check that U(Ax) ≥ min{min_{z∈A1(x)} u(z), u(x) − (max_{z∈A2(x)} v(z) − v(x))}. Consider the GP kernel, u(z) − (max_{w∈Ax} v(w) − v(z)), and break into two cases: (i) the maximum occurs on x and (ii) the maximum occurs away from x, say at some y ≠ x (so that u(y) + v(y) > u(x) + v(x)). Consider this latter case first and note that U(Ax) ≤ u(y). If U(Ax) = u(y), then we must have u(y) ≤ u(x) (by definition of Ax), so that y ∈ A1(x). It follows that U(Ax) ≥ min_{z∈A1(x)} u(z). Next, note that U(Ax) < u(y) implies that ∃zy ∈ Ax with u(y) + v(y) ≥ u(zy) + v(zy) and v(zy) = max_{z∈Ax} v(z) > v(y). Since u(y) + v(y) > u(x) + v(x), we obtain that u(y) − (v(zy) − v(y)) > u(x) − (v(zy) − v(x)). Note that u(zy) ≤ u(x), else U(Ax) = u(y) − (v(zy) − v(y)) ≥ u(zy) > u(x) (the inequality since y is chosen from Ax). But U(Ax) ≤ u(x) by definition of Ax – a contradiction. If zy ∈ A1(x), then u(zy) + v(zy) > u(x) + v(x), so that U({x, zy}) = u(zy). Moreover, u(y) − (v(zy) − v(y)) ≥ u(zy). Hence, {zy} ∼ {x, zy} ≽ Ax. It follows that, in this case as well, U(Ax) ≥ min_{z∈A1(x)} u(z). To finish this case, we now claim that zy ∉ A2(x). Else, x is chosen from {x, zy}, so that U({x, zy}) = u(x) − (v(zy) − v(x)) < u(y) − (v(zy) − v(y)) = U(Ax). On the other hand, {x, zy} ≽ Ax by definition of the set Ax – a contradiction. Now consider the case where the maximum occurs on x, i.e. x is chosen in Ax. Note that U(Ax) ≤ u(x). If u(x) = U(Ax), then we must have v(x) = max_{z∈A2(x)} v(z). Hence, U(Ax) = u(x) − (max_{z∈A2(x)} v(z) − v(x)) ≥ min{min_{z∈A1(x)} u(z), u(x) − (max_{z∈A2(x)} v(z) − v(x))}. If U(Ax) < u(x), then find zx with u(x) + v(x) ≥ u(zx) + v(zx) and v(zx) = max_{z∈Ax} v(z) > v(x). Note that zx ∈ A2(x), so that U(Ax) = u(x) − (v(zx) − v(x)) ≥ u(x) − (max_{z∈A2(x)} v(z) − v(x)). Hence, U(Ax) ≥ min{min_{z∈A1(x)} u(z), u(x) − (max_{z∈A2(x)} v(z) − v(x))} in this case as well. We now check the reverse inequality: U(Ax) ≤ min{min_{z∈A1(x)} u(z), u(x) − (max_{z∈A2(x)} v(z) − v(x))}. For each z ∈ A1(x) note that {x, z} ∼ {z}. Hence, {z} ∼ {x, z} ≽ Ax, as U(Ax) = min{U(A′) : A′ ⊆ A, x ∈ A′}. Similarly, U(Ax) ≤ U(A2(x)). Moreover, note that U(A2(x)) = u(x) − (max_{z∈A2(x)} v(z) − v(x)), as x is chosen from A2(x) (by definition of A2(x)). Hence, U(Ax) ≤ min{min_{z∈A1(x)} u(z), u(x) − (max_{z∈A2(x)} v(z) − v(x))}.
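The two objects the proof manipulates – the GP kernel value of a menu and the quantity U(Ax) = min{U(A′) : A′ ⊆ A, x ∈ A′} – can be computed by brute force on a small domain. The utilities below and the helper names `gp_utility` and `u_min` are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations

# Illustrative utilities (assumptions, not from the paper).
u = {"a": 3, "b": 2, "c": 1}
v = {"a": 1, "b": 3, "c": 2}

def gp_utility(A):
    """GP kernel value of a menu: max_z [u(z) - (max_w v(w) - v(z))]."""
    vmax = max(v[w] for w in A)
    return max(u[z] - (vmax - v[z]) for z in A)

def u_min(x, A):
    """min of U over all sub-menus of A containing x, i.e. U(Ax)."""
    rest = [z for z in A if z != x]
    return min(gp_utility((x,) + s)
               for k in range(len(rest) + 1) for s in combinations(rest, k))

A = ("a", "b", "c")
print(gp_utility(A), u_min("a", A))  # → 2 2
```

Here the minimum over sub-menus containing "a" is attained already at {a, b}, illustrating that U(Ax) can fall strictly below u(x).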

Proof of Observation 3. Take any set A′ ⊆ A and note that for any Ci ∈ C(x) we have min_{x∈Ci∩A′} u(x) ≥ min_{x∈Ci∩A} u(x). It follows that

U(A′) = max_{Ci∈C} min_{x∈Ci∩A′} u(x) ≥ max_{Ci∈C(x)} min_{x∈Ci∩A′} u(x) ≥ max_{Ci∈C(x)} min_{x∈Ci∩A} u(x).

Hence, it suffices to find a set Ax for which the value U(Ax) is attained on max_{Ci∈C(x)} min_{x∈Ci∩A} u(x). Begin with Ā := ∪_{Ci∈C(x)} (Ci ∩ A). If the value is attained on some Ci ∈ C(x), then this set is our Ax and we are done. Else, if the maximum is attained on some set Ci ∉ C(x), then let z0 be an element on which the value is attained. Note that we must have u(z0) > min_{x∈Ci∩Ā} u(x), ∀Ci ∈ C(x). Consider Ā\z0. The value U(Ā\z0) must either (i) be attained on some Ci ∈ C(x) or (ii) be attained on some Ci ∉ C(x). In case (ii), let z1 be the element that attains the maximum and note that u(z1) > min_{x∈Ci∩(Ā\z0)} u(x), ∀Ci ∈ C(x). Hence, we may remove z1 from the menu without affecting the value of min_{x∈Ci∩(Ā\z0)} u(x), ∀Ci ∈ C(x). Consider Ā\{z0, z1} and repeat the argument. Eventually we must reach a set Ā\{z0, . . ., zk} =: Ax where the maximum occurs on some Ci ∈ C(x). Note that min_{x∈Ci∩Ax} u(x) = min_{x∈Ci∩A} u(x), ∀Ci ∈ C(x). It follows that umin(x, A) = max_{Ci∈C(x)} min_{x∈Ci∩A} u(x).
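The removal argument in this proof is effectively an algorithm, and it can be run literally on a small example. Everything below is an illustrative assumption: the category system `C`, the utility `u`, and the reading of C(x) as the categories containing x. The sketch starts from the union of traces of categories containing x and deletes value-attaining elements that lie only on outside categories.

```python
from itertools import chain

# Hypothetical primitives (assumptions for illustration).
u = {"a": 4, "b": 3, "c": 2, "d": 1}
C = [frozenset({"a", "d"}), frozenset({"b", "c"}), frozenset({"a", "b"})]

def menu_value(B):
    """U(B) = max over categories of min u on the category's trace in B."""
    return max(min(u[z] for z in Ci & B) for Ci in C if Ci & B)

def build_Ax(x, A):
    """Delete value-attaining elements lying on categories outside C(x)
    until the menu value is attained on a category containing x."""
    Cx = [Ci for Ci in C if x in Ci]
    B = set(chain.from_iterable(Ci & set(A) for Ci in Cx))
    while True:
        val = menu_value(B)
        if any(Ci & B and min(u[z] for z in Ci & B) == val for Ci in Cx):
            return frozenset(B), val
        bad = next(Ci for Ci in C
                   if Ci & B and min(u[z] for z in Ci & B) == val)
        B.discard(next(z for z in bad & B if u[z] == val))

# The procedure attains max over C(x) of the min of u on Ci ∩ A, for each x.
A = {"a", "b", "c", "d"}
for x in A:
    target = max(min(u[z] for z in Ci & A) for Ci in C if x in Ci)
    assert build_Ax(x, A)[1] == target
print("removal procedure attains the max-min over categories containing x")
```

Termination is guaranteed by the argument in the proof, not by the code itself: each deleted element leaves the relevant minima untouched, so the loop strictly shrinks the menu until the value lands on a category containing x.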

Proof of Observation 4. We check by induction on the cardinality of A that u(x, A) ≤ umin(x, A). The inequality trivially holds for singleton A, thus let |A| > 1. By hypothesis, u(x, A) ≤ u(x, A\y), ∀y ∈ A\x. By the induction hypothesis, u(x, A\y) ≤ umin(x, A\y). Let Ax be an element of arg min{U(A′) : A′ ⊆ A, x ∈ A′} and note that if ∃y ∈ A such that Ax ⊆ A\y, then umin(x, A) = umin(x, A\y) – which implies u(x, A) ≤ umin(x, A). If there is no such y, then it must be the case that arg min{U(A′) : A′ ⊆ A, x ∈ A′} = {A}. In this case we obtain, on the one hand, umin(x, A) = U(A). On the other hand, U(A) = max_{x∈A} u(x, A), so that u(x, A) ≤ umin(x, A).


6.3 Proofs for Section 4.3

Proof of Proposition 1. Put Λ(≽) := (u, C) and check this pair satisfies A1-A3. A1 and A3 are clearly satisfied from the construction of the pair (u, C). For A2, let us first check the equality A∗ = A∗_(u,C). That is, the subjective feasible set computed relative to ≽ agrees with the feasible set derived from the ex post observables. The left-to-right containment is obvious from the definition. Now check right-to-left. Let x ∈ A∗_(u,C) and note that, by definition, if A′ ⊆ A s.t. x ∈ A′, then u(y) ≥ u(x), ∀y ∈ C(A′), where C(A′) := arg max_{x∈(A′)∗} u(x). If, towards contradiction, x ∉ A∗, then there is some subset A′ ⊆ A where x ∈ A′ yet {x} ≻ A′. This implies that x ∉ (A′)∗. Moreover, by Axiom 1 we know that A′ ∼ {xA′} for some xA′ ∈ (A′)∗. Since, by definition, A′ ≽ {y}, ∀y ∈ (A′)∗, we obtain: {x} ≻ A′ ∼ {xA′} ≽ {y}, ∀y ∈ (A′)∗. Hence, u(x) > u(y), ∀y ∈ (A′)∗, implying that x ∉ A∗_(u,C) – a contradiction. This shows that Λ(≽) satisfies axioms A1-A3.
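The subjective feasible set can be checked mechanically on small examples, since x fails to be in A∗ exactly when some sub-menu containing x is strictly worse than {x}. The utility and the category system below are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical primitives (assumptions for illustration).
u = {"a": 3, "b": 2, "c": 1}
C = [frozenset({"a", "c"}), frozenset({"b"}), frozenset({"c"})]

def U(A):
    """Category menu utility: max over categories of min u on the trace."""
    return max(min(u[z] for z in Ci & set(A)) for Ci in C if Ci & set(A))

def feasible_set(A):
    """A* = {x in A : U(A') >= u(x) for every sub-menu A' of A with x in A'}."""
    A = tuple(A)
    def u_min(x):
        rest = [z for z in A if z != x]
        return min(U((x,) + s) for k in range(len(rest) + 1)
                   for s in combinations(rest, k))
    return {x for x in A if u_min(x) >= u[x]}

# "a" is dragged below u("a") by the sub-menu {"a", "c"}, so it is infeasible.
assert feasible_set(("a", "b", "c")) == {"b", "c"}
```

In this example U({a, c}) = 1 < u(a), so the element "a" is excluded from the feasible set of the full menu, while "b" and "c" survive.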

For the converse, let (u, C) ∈ Σ(X). We want to find an order on menus ≽ ∈ ΣP(X) such that Λ(≽) = (u, C). For each menu A, pick any xA ∈ C(A) and define a utility on menus via the formula U(A) := u(xA). By A1, this is a well-defined function on menus. Let us check that it satisfies Axiom 1 (CRW) and Axiom 2 (Strong Reduction). Consider A′ ⊆ A with xA ∈ A′. Since xA ∈ C(A), this implies C(A) ∩ A′ ≠ ∅. By A3, this implies that u(x) ≥ u(y), ∀x ∈ C(A′), ∀y ∈ C(A). Hence, U(A′) = u(xA′) ≥ u(xA) = U(A), implying Axiom 1 (CRW). To check Axiom 2 (Strong Reduction), consider the two sets A∗_(u,C) and A∗. The latter is the subjective feasible set computed w.r.t. the induced menu preference ≽ (defined via U(A) = u(xA)). The former is the subjective feasible set defined using the observables (u, C). Notice that if (u, C) satisfies A1-A3, then these two sets are identical. Axiom 2 (Strong Reduction) then follows immediately. Hence, ≽ ∈ ΣP(X). To finish, we verify that Λ(·) is a bijection. Since we have already checked surjectivity, it suffices to verify that Λ is injective. Towards contradiction, say that ≽, ≽′ ∈ Σ(X) map to the same (u, C). This implies (≽)|X = (≽′)|X, i.e. the singleton ranking is the same. Note that if ≽ ≠ ≽′, then there must be some menu A such that A∗_≽ ≠ A∗_≽′.32 This implies that there is some x with – wlog – x ∈ A∗_≽\A∗_≽′. Let IA(x) denote the ≽′_A-indifference class of x and consider the menu D := {z ∈ A : (i) {x} ≻ {z}, (ii) x ≽′_A z} ∪ {x}. By Axiom 2 (Strong Reduction) we know that x ∉ D∗_≽′. On the other hand, since x ∈ A∗_≽, we must have x ∈ D∗_≽. Since x = sup(D∗_≽), it follows that x ∈ C(D∗_≽), contradicting the fact that C(D∗_≽) = C(D∗_≽′). Hence, there is no such pair (A∗_≽, A∗_≽′), implying that ≽ = ≽′.

Proof of Proposition 2. Given a filter Θ and a normative ranking u we take C(·) to be C(A) := arg max_{x∈Θ(A)} u(x). We verify that when the filter satisfies Sen's α, the two restricted β conditions, and the worst-elements-are-feasible restriction, then (i) the filter equals the subjective feasible set defined by the pair (u, C) and (ii) the induced pair (u, C) satisfies A1-A3. These two facts imply, using the preceding proposition, that the filter comes from a category model. To this end, we first show that Θ(A) coincides with the feasible set A∗ induced by (u, C). Note that Θ(A) ⊆ A∗ follows from Sen's α. We verify that A∗ ⊆ Θ(A). Towards contradiction, assume that x ∈ A∗, yet x ∉ Θ(A). Note that by feasibility we have max_{z∈Θ(A)} u(z) ≥ u(x). Hence, consider the (allegedly non-empty) subset A1 := {z ∈ Θ(A) : u(z) ≥ u(x)} and put A2 := {x} ∪ {z ∈ A : u(x) > u(z)}. Note that A = (A\A2) ∪ A2, that min_{z∈A\A2} u(z) ≥ u(x), and, moreover, that A1 ⊆ A\A2, so that the latter menu is non-empty. By subjective feasibility we must have x ∈ Θ(A2). Now note that, by WCF, arg min_{z∈A} u(z) ⊆ Θ(A) ∩ A2. By restricted β-I we obtain x ∈ Θ(A) – a contradiction. It follows that Θ(A) = A∗, ∀A. This implies that C(A) := arg max_{x∈Θ(A)} u(x) = arg max_{x∈A∗} u(x). In other words, if we start out with a pair (u, Θ) where Θ satisfies α, the restricted β conditions, and WCF, and consider the induced choice correspondence C(·), then we can recover this choice correspondence by maximizing u(·) on the set A∗.

32 A∗_≽ denotes the feasible set computed under ≽ and A∗_≽′ denotes the feasible set computed under ≽′.

We now check that the axioms A1-A3 hold for the (induced) pair (u, C). Note that axiom A1 follows from the definition of C and A3 follows from Sen's α. We now check A2. Take A∗ ⊆ A′ ⊆ A and assume x ∈ Θ(A′). If x is u-minimal in A′, then since u-minimal elements of A are in A∗ (by WCF):

We now check that the axioms A1−A3 hold for the (induced) pair (u, C). Notethat axiom A1 follows from definition of C and A3 follows from Sen’s α. We nowcheck A2. Put A∗ ⊆ A′ ⊆ A and assume x ∈ Θ(A′). If x is u-minimal in A′, thensince u-minimal elements of A are in A∗ (by WCF):

min_{y∈A} u(y) ≤ min_{y∈A′} u(y) ≤ min_{y∈A∗} u(y) ≤ min_{y∈A} u(y).

The first inequality holds since A ⊇ A′, the second since A′ ⊇ A∗, and the third by WCF again. Hence, by WCF we have x ∈ Θ(A). If x ∉ arg min_{y∈A′} u(y), find z ∈ arg min_{y∈A′} u(y). By WCF, z ∈ Θ(A′) and by the preceding argument this implies z ∈ Θ(A). By restricted Sen's β-II this then implies x ∈ Θ(A) as well (since Θ(A) = A∗ by the previous argument). Hence, A∗ = Θ(A) = Θ(A′) = (A′)∗, implying that C(A) = C(A′).
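On a concrete category model the filter conditions can be verified by brute force. In the sketch below (an illustrative assumption: the utility, the category system, and the filter Θ(A) taken to be the induced feasible set A∗), Sen's α and WCF are checked over all menus of a three-element domain.

```python
from itertools import combinations

# Hypothetical primitives (assumptions for illustration).
u = {"a": 3, "b": 2, "c": 1}
C = [frozenset({"a", "c"}), frozenset({"b"}), frozenset({"c"})]

def U(A):
    """Category menu utility: max over categories of min u on the trace."""
    return max(min(u[z] for z in Ci & set(A)) for Ci in C if Ci & set(A))

def theta(A):
    """Filter induced by the model: the subjective feasible set A*."""
    A = tuple(A)
    def u_min(x):
        rest = [z for z in A if z != x]
        return min(U((x,) + s) for k in range(len(rest) + 1)
                   for s in combinations(rest, k))
    return {x for x in A if u_min(x) >= u[x]}

menus = [m for k in range(1, 4) for m in combinations("abc", k)]
for A in menus:
    # WCF: a u-minimal element of A is always feasible.
    assert min(A, key=lambda z: u[z]) in theta(A)
    for Ap in menus:
        if set(Ap) <= set(A):
            # Sen's alpha: x feasible in A and x in A' ⊆ A ⇒ x feasible in A'.
            for x in theta(A):
                if x in Ap:
                    assert x in theta(Ap)
print("Sen's alpha and WCF hold on this example")
```

This is only a sanity check on one example, of course; the proposition itself is what guarantees these conditions for every filter arising from a category model.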


References

Cherepanov, V., T. Feddersen, and A. Sandroni (2013): "Rationalization," Theoretical Economics, 8, 775–800.

de Clippel, G. and K. Eliaz (2012): "Reason-Based Choice: A Bargaining Rationale for the Attraction and Compromise Effects," Theoretical Economics, 7, 125–162.

Dekel, E., B. Lipman, and A. Rustichini (2001): "Representing Preferences with a Unique Subjective State Space," Econometrica, 69, 591–600.

——— (2009): "Temptation-Driven Preferences," Review of Economic Studies, 76, 937–971.

Fudenberg, D. and D. K. Levine (2006): "A Dual-Self Model of Impulse Control," American Economic Review, 96, 1449–1476.

Gul, F., P. Natenzon, and W. Pesendorfer (2010): "Random Choice as Behavioral Optimization," Mimeo.

Gul, F. and W. Pesendorfer (2001): "Temptation and Self-Control," Econometrica, 69, 1403–1435.

——— (2007): "Harmful Addiction," Review of Economic Studies, 74, 147–172.

Horan, S. (2011): "A Simple Model of Biased Choice," Mimeo.

Lleras, J. S., Y. Masatlioglu, D. Nakajima, and E. Ozbay (2008): "When More is Less: Choice with Limited Consideration," Mimeo.

Manzini, P. and M. Mariotti (2007): "Sequentially Rationalizable Choice," American Economic Review, 97, 1824–1839.

——— (2012): "Categorize then Choose: Boundedly Rational Choice and Welfare," Journal of the European Economic Association, 10, 1141–1165.

Masatlioglu, Y., D. Nakajima, and E. Ozbay (2012): "Revealed Attention," American Economic Review, 102, 2183–2205.

Natenzon, P. (2010): "Random Choice and Learning," Mimeo.

Noor, J. and N. Takeoka (2010): "Menu-Dependent Self-Control," Mimeo.

Ok, E., P. Ortoleva, and G. Riella (2010): "Revealed (P)Reference Theory," Mimeo.

Simonson, I. and A. Tversky (1993): "Context-Dependent Preferences," Management Science, 39, 349–376.
