0011 Learning


Transcript of 0011 Learning

  • 7/31/2019 0011 Learning


    1. Introduction

    This paper deals with consumer learning from the point of view of the relationship between

    the theory of the consumer's optimal choice and the cognitive sciences. The approach is

    consistent with Simon's theory of bounded rationality, and particularly with the aspect of said

    theory that goes under the name of "procedural rationality", used as a concise term to mean the

    determination of a heuristic with a view to achieving satisficing decisions. In the light of this

    conception of rationality, the concept of consumer learning is developed by means of a critical

    discussion both of the Bayesian approach and of the neural approach, the aim being to identify its

    potential for application and its limits.

    In this setting, a subject, j, assumed to represent the generic consumer, uses acts of

    consumption or, in more general terms, the acquisition of information to assess the capacity of a

    given good to satisfy specific needs, because he initially has some doubts as to said assessment.

    In fact, the problem contains aspects of a cognitive order that inevitably influence the way in

    which j makes his decisions. Simon's contribution lies in considering the question of the

    formation of knowledge as being part of the decisional process and therefore as being capable of

    influencing said process, as if it were a system of constraints.

    This work deals with the following aspects: a) the statement of a practical problem which

    prompts the need to derive an adequate representation of the learning process; b) a discussion of

    the relationship of learning in economics and cognitive science; c) a discussion of the concept of

    bounded rationality applied to consumer theory; d) the derivation of a specific "cognitive"

    representation of products; e) a critical discussion of the Bayesian representation of the learning

    process; f) an evaluation of the neural representation of said learning process.

    2. The problem

    Let's take T to be the whole time horizon of a generic agent, j, and let's assume that during the

    initial period, t1, he decides to purchase one unit of a given good or service, a, reserving any

    decision to purchase another unit at a later date, t2. The reason why j implements such a

    strategy lies in the existence of some doubt concerning the adequacy of the good or service being


    provided in terms of satisfying the need for which it is purchased, so for j the consumption made

    at t1 serves to ascertain whether there is an acceptable degree of correspondence between the

    predicted adequacy and the ascertained adequacy of the good or service, a.

    On the matter of whether the consumer is doubtful concerning the ability of a to satisfy

    specific needs, I shall go into more depth in a subsequent section of this paper. For the time

    being, suffice it to note that the existence of any such uncertainty does not necessarily involve

    contractual failings on the part of the supplier, such as those underlying the Akerlof

    model (1970), for instance, but can be traced back to difficulties of a cognitive order that every

    consumer faces. In this context, it is not j's uncertainty concerning his own utility function, as

    hypothesized by Cyert and De Groot (1975), or as described in one of my previous papers

    (Mistri, 1996), that is the problem here. The question considered here poses the need to derive a

    consumer learning method and requires said process to be set within a consistent theoretical

    framework (Brenner, 1999). Intuitively, we might suppose that the most suitable theoretical

    framework for the above-illustrated problem lies in the experimental consumer approach. What

    remains to be filled with an operational content is the concept of the experimental consumer, who

    uses the goods not to maximize his utility function, but to ascertain their adequacy in satisfying

    specific needs. We are thus considering a consumer who expresses cardinal preferences on

    classes of goods; all goods are described by their observable features, which can be represented

    in vectorial terms. For each class of goods, our consumer j derives a value function - in the sense

    used by Luce and Raiffa (1976, p. 220) - which, as the two above-mentioned authors themselves

    point out, is not necessarily a utility function.

    3. Learning and cognitivism

    The introduction mentions a concept of learning that departs from those used in many works

    dedicated specifically to consumer learning, though even as a rough approximation it cannot

    disregard the latter. Besides, it is worth emphasizing that there is no unequivocal learning

    model in economics. The reason for this must be sought in the multiplicity of the theorizations

    existing in the cognitive sciences. The term "learning" is therefore used to indicate a very ample

    class of phenomena that differ from each other and that only have in common the fact that they


    any risk of error.

    4. The consumer and bounded rationality

    From the standpoint taken here, learning is defined both as a process by means of which the

    subject creates classes of goods, and as a process by means of which the consumer j refines his

    classification of the goods. Inevitably this poses the problem of defining a good. In the standard

    sense adopted by Debreu (1959), goods are defined on the basis of their physical nature and are

    distinguished according to their features and their territorial and temporal location. The inclusion

    of the time factor involves introducing a dimension of uncertainty. As a first approximation, let's

    assume that j has a definite order of preferences, ≽j, concerning a set of goods, {ai}, where i =

    1, 2, ..., l, which can be represented in the space R^l_+. At the same time, still following the standard

    scheme approach, the information needed for j can be said to be restricted to the system of

    relative prices, which can be indicated by the vector pi, with i = 1, 2, ..., l.

    Purchases are defined at an initial moment t1; then an instantaneous equilibrium is

    determined for j according to the rules of bounded maximization. In the standard definition of

    goods, having different features makes the goods objectively differ from each other and a suitable

    utility function can be derived for them as a set. Conversely, following the interpretative line

    prevailing in marketing studies, we can define goods on the strength of a set of characteristics

    using multi-attribute analysis, according to which the goods can be considered as equivalent if

    they are found so on the basis of a comparison of their attributive synthetic indexes, or attributes

    which sum up their characteristics as a whole (Lancaster, 1966).

    In an approach that considers multi-attribute goods, it can be assumed that j defines his order

    of preferences for a set of abstract goods, which represent categories or classes of goods against

    which every real good can be compared. This way of defining the goods has cognitive

    foundations, in the sense that a person generally tends to conceptualize and categorize.

    Conceptualization and categorization are the outcome of people's natural tendency to contain the

    amount of information to remember, seeking a substantial cognitive economy. Conceptualization

    helps to facilitate inference; from the consumer's point of view, categorization helps to

    facilitate inferences concerning the ability of a certain class of goods to satisfy a specific need.


    A goods-purchasing action always has an inferential nature, especially if we consider the

    sequence of stages by means of which the purchase/consumption process takes place, beginning

    with a decisional phase. The various phases do not necessarily coincide; it has been said that a

    decision to purchase is an inference on the features of a good that may be consumed in the future.

    This immediately poses the problem of establishing how, in practical terms, j makes these

    conjectures and how this cognitive process can be set in a typically economic conceptual scheme.

    A purchasing project is based, first of all, on a heritage of information accumulated prior to

    such a decision being reached; above all, it goes through the way in which said information is

    classified and represented in the person's memory. What j classifies is the coupling between

    goods features and needs, as they became apparent in the past and as j believes they may be

    manifest in the future. Said coupling gives rise to mental images, which are interior

    representations of the outside reality (Marucci, 1997). Cognition of the mental images appears as

    a useful medium between the activity of perceiving the sensorial input and the knowledge

    systems stored in the "semantic memory", i.e. the memory where the concepts are categorized.

    Recourse to the theory of mental images appears useful for an understanding of the link between

    plans and actions - a link that lies, for instance, at the foundations of the theory of sequential

    decisional processes.

    The mental image becomes a logical pivot between perception, categorization and

    memorization, and seems necessary to explain the behavior of a subject (such as our consumer),

    who is not merely a classifier of goods, but also an elaborator of consumption schemes. In

    essence, we can assume that, in deciding on a consumption plan, the consumer has in mind a

    certain image of the good and of the pleasure that he can gain from it. To simplify the

    description of the decisional process involved, we can say that this is implemented exclusively on

    the basis of the structural features of the goods, as codified in the person's memory. At the same

    time, it is feasible to imagine that the subject tends to simplify the images through categorization,

    reducing the goods to prototypes which become representations of abstract goods (Macchi,

    1989). In fact, we can assume that j draws from various different training processes (e.g. an

    exogenous education towards consumption, imitation, experimentation, the collection of

    information, the opinions of experts, the opinions of opinion leaders, etc. ) and is thus capable of

    building himself a grid of typical features that the good must possess.


    A previous paper (1998) introduced the distinction between genotypical goods and

    phenotypical goods: genotypical goods represent a class of goods with specific general or abstract

    features, while phenotypical goods are variations of the abstract type. Economic theory

    implicitly uses the concept of phenotypical good when it deals with the differentiation between

    similar goods as part of the monopolistic competition approach. At the same time, economic

    theory implicitly uses the concept of genotypical good as part of its standard theory on consumer

    behavior.

    In the present context, the genotypical good can be considered as the prototype that emerges

    from an adequate process of categorization on the part of j. Besides, j's deriving of a prototype is

    consistent with the principle of satisficing behavior. The prototype theory can be linked to a

    simple representation of multi-attribute goods, reminiscent of the representation processes used

    by the neural schemes, i.e. with vectors (Paul Churchland, 1995; Patricia Churchland and T.

    Sejnowski, 1992); on this basis every attribute, or feature, can be considered as a dimension in

    the abstract space of the features; thus a multi-attribute good can be represented by a vector of the

    features,

    [1] x = (x1, x2, ..., xm)

    where the components xi, i = 1, 2, ..., m, are defined on the space of the features.

    The agent j has a definite order of preferences ≽j on a set of abstract goods that can be

    represented in vectorial form {x1, x2,...,xl}, which represent specific classes of goods with which

    the real goods {x1*, x2*,...,xl*} and their features can be compared. So the problem consists in

    establishing whether a given real product, xk*, has such features as will make it belong to the

    specific class typified by the product xk. This is a typical recognition problem in the sense of

    pattern recognition logic. In mathematical terms, the problem involves assigning the specific

    product xk* to its own class, C.

    Let's consider a real good xk*; this will be equivalent to a typical good, xk, provided it

    belongs to the same class, Ck; so we can say that two products, represented vectorially, x1*, x2*,

    are equivalent in terms of features when they both belong to Ck, i.e.

    [2] x1* ≡ x2* ⇔ {x1* ∈ Ck and x2* ∈ Ck}

    Note that the class of equivalence is determined on the basis of the features of the products

    and of their functions.
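    To make the vectorial representation [1] and the equivalence relation [2] concrete, here is a minimal sketch in Python; the feature vectors and the hand-built classes below are hypothetical illustrations, not data from the paper:

```python
# Sketch of the vectorial representation of goods ([1]) and of the
# equivalence relation between real products ([2]).
# The sample feature vectors and classes are hypothetical.

def same_class(x1, x2, classes):
    """Two real goods are equivalent in the sense of [2] when they
    both belong to the same class of equivalence C_k."""
    for members in classes.values():
        if x1 in members and x2 in members:
            return True
    return False

# Each good is a tuple of m observed features, as in [1].
x1_star = (1.0, 0.5, 0.2)
x2_star = (0.9, 0.6, 0.2)
x3_star = (0.1, 0.1, 0.9)

# Classes of equivalence, built by hand here for illustration.
classes = {"C_k": [x1_star, x2_star], "C_h": [x3_star]}

print(same_class(x1_star, x2_star, classes))  # True: both in C_k
print(same_class(x1_star, x3_star, classes))  # False: different classes
```

    Note that two goods with different vectors (x1_star, x2_star) can still be equivalent: membership of the class, not identity of the vectors, is what [2] tests.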


    5. Classes and multi-attribute value functions

    The definition of class of equivalence, as mentioned above, is entirely generic; it can be

    specified by associating each class of equivalence, Ci (where i = 1,2,...,k,...,l, defined on the space

    of the goods classes), with an index of value that correlates the value functions, v, with the

    structures of the perceived features of the single products (Luce and Raiffa, 1976, p. 68). A value

    function, v, associates a real number with every point on the space of the features and can

    represent a cardinally-ordered structure of preferences.

    Assuming that a product is assessed on the basis of the set of its perceived features, it follows

    that a criterion has to be identified with which to obtain a concise representation of v. Decisional

    theory uses the multi-attribute value functions, v, which are linearly additive in their arguments

    (Keeney and Raiffa, 1976; Marshall and Oliver, 1995), e.g.

    [3] v (x1, x2) = v (x1) + v (x2)

    where v (x1) and v (x2) are single-attribute functions (Keeney and Raiffa, 1976, p. 105).

    According to [3], the linear form of v enables the space of the features to be broken down into

    subspaces, each with a single feature, dealing with v (x1) and v (x2) as single-attribute value

    functions, each of which is defined on a specific space of the classes. This operation can prove

    useful in practice, as we shall see.

    Conversely, we are aware that non-additive multi-attribute value functions would be better

    able to grasp the complexity of the process of categorizing the various products, though for the

    purposes of the present work it is probably advisable to restrict ourselves to considering v as

    linearly additive. A linearly additive v can be considered as a linear approximation of a

    corresponding non-linearly additive v. From the cognitive viewpoint, the difference between the

    two functions expresses a different conceptualizing and categorizing method. The cognitive

    sciences themselves are not unequivocal in giving an adequate interpretation of categorization

    processes, because in the simplest cases they are implemented by means of a linear breakdown of

    basic components, while gestaltic phenomena cannot be eliminated in the more complex cases.

    This means that the single parts of the entity that we want to categorize interact with each other,

    emphasizing the role of the structure as a whole, so that goods with a different attributive


    structure may belong to the same class.

    In practice, the consumer is required to solve a problem of "pattern recognition" that involves

    recognizing the relationship between the set of features and the utility that can be gained from

    them. As a result, any two goods can be considered as belonging to the same class, even if their

    features have vectorial structures that are not the same, if their respective v have the same value -

    or values that fall close enough to a "shadow" value. Let's consider [3] and assume that va(x1a,

    x2a) indicates the value function of the product a and that vb(x1b, x2b) indicates the value function

    of the product b; assuming also that x1a ≠ x1b and that x2a ≠ x2b, but that they are such that

    va(x1a, x2a) = vb(x1b, x2b). In this case the two products will belong to the same class, just as

    they will in the obvious case in which x1a = x1b and x2a = x2b.

    From a cognitive point of view, the consumer will recognize the patterns by breaking them

    down into essential parts, according to "features analysis" criteria (Anderson, 1980), assessing

    their fundamental distinctive features. Using this model, the stimuli are considered as

    combinations of elementary distinctive features. The consumer is therefore required to classify

    first the simple attributes, by determining their mono-attributive classes, then the combination of

    said attributes, by determining their pluri-attributive classes. He will memorize these features by

    means of a specific coding procedure. In a subsequent phase, when he must recall the features

    and the sensations they gave him from memory, the consumer must adopt a synthetic assessment.

    A loss of information is implicit in this process of memorizing and recalling from memory, which

    also explains the difficulty that many people have, according to Marshall and Oliver (1995, p.

    291), in comparing objectives with multiple attributes. It follows that using an additive v,

    inasmuch as it is a linear approximation of a non-additive v, represents a satisficing heuristic, the

    use of which can generate uncertainty in the determination of the choices made by j. Given a

    linearly additive v, which takes its values on the space of the classes of equivalence, assuming

    that x = (x1, x2, ..., xm) is a vector of attributes and ki is the weight assigned to the generic

    attribute xi, then

    [4] V(x) = Σi=1..m ki v(xi)
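    A linearly additive value function of the form [4] can be sketched directly; the weights ki and the single-attribute function v chosen below are hypothetical assumptions introduced purely for illustration:

```python
# Sketch of the linearly additive multi-attribute value function [4]:
# V(x) = sum over i of k_i * v(x_i).
# The weights k and the single-attribute function v are assumptions.

def V(x, k, v):
    """Linearly additive value function over the attributes of x."""
    return sum(k_i * v(x_i) for k_i, x_i in zip(k, x))

v = lambda x_i: x_i          # identity single-attribute value, an assumption
k = (0.5, 0.3, 0.2)          # weights of the three attributes
x = (1.0, 0.5, 0.2)          # feature vector as in [1]

print(round(V(x, k, v), 2))  # 0.69 = 0.5*1.0 + 0.3*0.5 + 0.2*0.2
```

    Because v is additive, each attribute contributes independently; a non-additive v would instead let the attributes interact, as in the gestaltic cases discussed above.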

    In t1 the consumer memorizes the vector x, which becomes x1; in t2 he recalls to mind the


    same vector, called x2, so that if

    [5] x2 ≠ x1

    there is a loss of information in t2 due to the effect of the memorization process that took

    place in t1. A taxonomy of the consumer's learning processes can be charted that identifies the

    objectives that are met by these processes; three fundamental approaches to consumer learning

    can be identified, i.e.

    a) j has a defined order of preferences ≽j on the actual goods {xi*} in R^l_+ and reaches his

    decisions in a series of periods, ti, where i = 1, 2, ..., T, and for each period a probability

    distribution can be deduced on the expected conditions of his world. This means that j finds

    himself in a situation of environmental uncertainty and the learning only concerns a refinement of

    his knowledge of the conditions of his world;

    b) j's order of preferences changes with time in a sequence of periods ti. This assumption

    lies, for instance, at the basis of several works by Cyert and De Groot (1975), which assume that

    it is through a process of acquiring new information that the consumer can modify his own utility

    function. This has an important fallout on the inter-temporal consistency of the multi-period

    plans within which j's preferences can be modified from one period to another, consequently

    inducing him to make sub-optimal choices (Woo, 1992);

    c) j has a defined order of preferences ≽j on a set of goods prototypes {xi}, where i = 1, 2, ..., l,

    but he has difficulty in adequately assessing the suitability of (i.e. in classifying) any real

    products, xj*. The experimentation of xj*, consisting in the acquisition of information (also

    through acts of consumption) will enable him to refine his judgement.

    Hypothesis (c) is the only one considered in the present context, where it is assumed that j has

    no difficulty in arranging goods types according to ≽j, defining them on the basis of their

    representative features, whereas he may have difficulty in classifying the actual good xk*. After

    the first period, t1, when j has had the opportunity to verify whether the product xk* has

    exhibited the expected suitability, he will be able to assess whether the actual product comes

    within a given class of representative features, if any.

    Defining the goods class of equivalence enables the matter of learning to be considered in

    terms of pattern recognition; j's learning process concerning the good's ability to satisfy a need


    can consequently be defined as his capacity to classify said good correctly. In operational terms,

    we can say that a classification process is correct if the expected level of the good's value

    function in t1, v(x), coincides with the one ascertained in t2, v(x)*, i.e.

    [6] v(x) = v(x)*

    The idea of class of equivalence contains two specific categorization modalities: one relates to

    the creation of the classes of equivalence concerned, the other involves attributing the goods to

    their single respective classes of equivalence. Both modalities belong to the more general

    learning process and it is on the representation of said process that, as mentioned earlier, the two

    great families of models are divided, one inspired by the cognitivist approach and the other by the

    connectionist or neural approach.

    6. The consumer as a Bayesian classifier: critical considerations

    The cognitivist approach - which is based on the assumption that the subject is a data

    processor - finds formal expression in the Bayesian modeling method. Models of this type have

    been applied to the theory of consumer behavior by Cyert and De Groot (1975) and by

    Kihlstrom, Mirman and Postlewaite (KMP) (1984). In the light of what has been said so far, it is

    assumed that the concept of class has a fundamental role in the consumer's decisions. j's decision

    to consume a good x* depends on his evaluation of the "level" of the good's value function,

    v(x*). So the problem for j consists in refining his assessment, by acquiring information, of

    whether the good or service belongs to one class or another. In Bayesian logic, in t1, j

    estimates the level of v(x*), and he does so on the basis of the information that he possesses at the

    time. As mentioned earlier, the features of a product are represented vectorially and j doesn't

    necessarily know the structure of the vector x* before his act of consumption; j can establish a

    probability distribution of said structure. The approach adopted by KMP consists in deriving a

    consumer's utility function that incorporates a process of Bayesian learning defined on a space

    that is given by a coupling of the space of the goods with that of their features, which are not

    necessarily all known to j in advance.

    Following KMP, we assume that the consumer j will obtain certain services from the product

    represented by the vector of the features x*, which can be indicated as a, so that


    [7] a = x* + ε

    In [7], ε represents a random variable with a known density function. Note that the parameter

    x* is non-random, but is not known in advance. Let's assume that j estimates that x* can only

    take on two values, standing for two different classes, ω1 and ω2. In t1, j has the (subjective)

    probability that x* falls into ω1 or ω2, i.e. p(ω1), p(ω2). These are a priori probabilities. j's

    estimate may change if he acquires information synthesized by the likelihood function, p(x* | ωi),

    where i = 1, 2. It is then easy to complete the Bayesian formula

    [8] p(ωi | x*) = p(x* | ωi) p(ωi) / p(x*)

    where p(x*) is the probability density function of x* and p(ωi | x*) is the a posteriori

    probability. The Bayes classification rule states that:

    [9] if p(ω1 | x*) > p(ω2 | x*) then x* belongs to ω1;

        if p(ω1 | x*) < p(ω2 | x*) then x* belongs to ω2

    In this arrangement we have to assume that the probability function is known, which may be

    scarcely realistic. In fact, j will alter his estimate of the probability levels of ω1 and ω2 on the

    basis of the information he receives, and he should estimate its reliability in probabilistic terms,

    which does not always satisfy the condition of realism for the hypotheses by which the Bayesian

    models would like to be inspired (Salmon, 1995).
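    The classification rule [8]-[9] can be sketched in a few lines of code. The Gaussian form assumed for the likelihoods and every numerical value below are illustrative assumptions, not part of the model discussed here:

```python
# Sketch of the Bayes classification rule [8]-[9] for two classes.
# The Gaussian likelihoods p(x* | w_i) and all numbers below are
# hypothetical assumptions chosen only for illustration.
import math

def gaussian_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def classify(x_star, priors, likelihoods):
    """Return the class with the larger a posteriori probability [8]-[9].
    The normalizing term p(x*) cancels out of the comparison."""
    scores = {c: lik(x_star) * priors[c] for c, lik in likelihoods.items()}
    return max(scores, key=scores.get)

priors = {"w1": 0.6, "w2": 0.4}                      # a priori probabilities
likelihoods = {"w1": lambda x: gaussian_pdf(x, 0.0, 1.0),
               "w2": lambda x: gaussian_pdf(x, 3.0, 1.0)}

print(classify(0.2, priors, likelihoods))  # w1: x* lies near the w1 mean
print(classify(2.9, priors, likelihoods))  # w2: x* lies near the w2 mean
```

    The sketch makes the critical point above visible: both the priors and the form of the likelihood function must be supplied in advance, which is exactly the realism problem raised in the text.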

    7. Towards a neural representation

    Generally speaking, two consumer learning modalities have been identified; in one, j builds

    his classes of equivalence on the basis of which he defines his own stable order of preferences,

    ≽j, and a second one with which he assigns each actual product to its class of equivalence. While the

    Bayesian approach seems unsuitable for representing these two modalities, the neural approach -

    which is a formalized expression of connectionism - seems capable of responding better to the

    need to formalize the consumer learning process thus described. Equation [1] concisely

    represents the vector of the features of any given product; note that [1] is, in a nutshell, consistent

    with the neural modeling method. In [1], the representation of a typical product can be


    considered as isomorphic to the structure of its vectorially-expressed features. It follows that

    learning in [1] can be represented as a transformation of the vector x into a new vector x′

    according to the rule Tx = x′, where T is a suitable transformation. This formula easily explains

    the interest of certain scholars (Salmon, 1995; Fabbri and Orsini, 1983; Beltratti, Margarita and

    Terna, 1996) in opportunities for using neural networks in the field of learning in economics. If

    [1] leads us to think of an isomorphism between the structure of a product's features and its

    vectorial representation, in neural networks this isomorphism is strengthened, as it were, in the

    sense that it can be traced in the formal equivalence between the vectorial representation of the

    product's features and the vectorial representation that is given of the cognitive structures by

    several neural models, and by the PDP (parallel distributed processing) models in particular

    (Rumelhart and McClelland, 1986).

    The vectorial representation of the goods really consists in a vectorial representation of their

    features, since working on the features makes the coding process easier (Floreano, 1996, p. 41).

    Opting to codify the features enables certain difficulties relating to the so-called "local code"

    (consisting in the fact that each input unit, xj, where j = 1, 2, ..., n, corresponds to a specific object)

    to be overcome. Generally speaking, neural networks with a minimal complexity use the

    so-called "distributed coding", in which many units contribute towards representing each object. If

    we assume that the input units codify the objects' features, then every input unit on the network

    will codify the presence or the value of a certain feature. Thus each object activates one or more

    units and each unit is used for one or more objects, so each object is defined by the combinations

    of active units in the network (Floreano, 1996, p. 43).

    Taking the most straightforward hypothesis, i.e. that every input unit represents a feature, the

    weight attributable to each input unit depends on the relative importance assigned to the feature

    with respect to the others, according to the value function logic. Returning to [1], the components

    of the vector of the features can be identified - again on the extremely simplified assumption

    adopted here - with the input signals. Bearing in mind the significance of the weights, the net

    input of a neuron, Ai, is usually represented by

    [10] Ai = Σj=1..N wij xj    where i = 1, 2, ..., n and j = 1, 2, ..., N

    Note that, while wj stands for the weight of the jth input of a unit, wij represents the strength of


    the interconnection between the unit j and a unit i.

    The net input Ai of the ith neuron is the algebraic sum of the products of all the j input

    signals xj and the values of the corresponding synapses wij, from which the threshold value θi of

    the neuron is subtracted. Thus the net input of the neuron will be given by

    [11] Ai = Σj=1..N wij xj - θi

    The response of the neuron, yi, is established by passing the net input through an activation

    function Φ(x) (Floreano, 1996, p. 35):

    [12] yi = Φ(Σj=1..N wij xj - θi)
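    Equations [10]-[12] reduce to a few lines of code; the step activation function and all the numbers below are illustrative assumptions:

```python
# Sketch of a single neuron's response [10]-[12]: net input
# A_i = sum_j w_ij x_j - theta_i, passed through an activation
# function Phi. The step activation and all values are assumptions.

def neuron_output(x, w, theta, phi):
    """Response y_i = Phi(sum_j w_ij x_j - theta_i), as in [12]."""
    net_input = sum(w_j * x_j for w_j, x_j in zip(w, x)) - theta
    return phi(net_input)

step = lambda a: 1 if a > 0 else 0   # a simple threshold activation

x = (1, 0, 1)          # input signals (presence/absence of features)
w = (0.8, 0.4, 0.3)    # synaptic weights
theta = 1.0            # threshold of the neuron

print(neuron_output(x, w, theta, step))  # 1: net input 0.8 + 0.3 - 1.0 > 0
```

    Any other activation function, e.g. a sigmoid, can be passed in place of the step without changing the structure of [12].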

    The general principle is that learning (intended as the organized acquisition of knowledge) in

    the model, i.e. in the neural network, mimics what is thought to happen in the brain when

    something is learnt, i.e. connections are created between neurons and the cortical areas via the

    synapses. The connections may have a "variable geometry", in the sense that the same stimulus

    can give rise to different connections in different people. Knowledge (and consequently also

    recall) of events, situations, objects, etc., is represented in the brain by means of relatively

    durable configurations of synaptic connections and is distributed through said synaptic

    connections. Knowledge is not stored in single units, but is distributed among many different

    units, each of which contributes to the representations of many different elements of knowledge

    (Mazzoni, 1998, p. 324). Rather than storing what they learn in a sort of "private" memory, the

    neural networks store information in the connections between the nodes. In the neural scheme,

    learning thus consists in reinforcing certain connections and extinguishing others.

    The processing of the information takes place in the layers of which the neural network is

    composed. The most straightforward neural network models are those which, like the

    Perceptron, are composed of a layer of incoming units that receive stimuli and information

    from the outside world, and a layer of nodes that process the information and then give a

    representation of it as output. In this latter case, we speak of a layer of outgoing units or outputs.

    At a slightly more complex level, there are models including a layer of hidden units that do some

    essential preliminary information-processing work. The input units represent incoming

    information elements and are activated by the stimulation deriving from information coming


    from the surrounding environment.

    This information makes the units trigger a signal; each input is attributed a relative weight,

    which takes into account the importance of the input signal. The distribution of the weights on

    the connections is due to the fact that some inputs are more important than others in the way in

    which they combine to produce an impulse, and thus have a greater weight. The weight can thus

    be seen as a measure of the strength, or intensity, of the connection. The hidden units then

    receive signals from the input units and the weights of the synaptic connections that define them

    are modified on the basis of said signals, which release more signals to the output units; here

    again, these signals can modify both the weight of the output units and the strength of the

    connections between the hidden units and the output units. The role of learning in the logic of

    the Hebbian networks (which are used in the classification processes) can be expressed as follows

    (Floreano, 1996, p. 66):

- given a neural network with N input nodes and P training pairs, each composed of an

    input vector, xp, and a required response ("target"), tp, the output from the network for

    each input pattern is given by:

    [13]  y = 1 if Σj wij xj > 0

          y = 0 otherwise

This value is compared with the required response, tp, for a given input pattern. If the net's

    response is the same as the required response, the synaptic values are not changed; if, on the other

    hand, there is a difference between output and required response, i.e. an error in the logic of the

    neural networks, the synaptic weights are modified on the basis of the correct response, where

    Δwij represents the correction to attribute to the synaptic weight in question

    [14]  Δwij = η t xj

    where η is a proportionality constant; the value thus obtained is added to the preceding values

    of the synapses.
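The rule in [13] and [14] can be sketched as follows; the names (eta, net_output, train_pair) are mine, not the paper's:

```python
# Hedged sketch of eqs. [13]-[14]: the net fires (y = 1) when the weighted
# input sum is positive; on an error each weight gains eta * t * x_j.
eta = 0.1  # the proportionality constant of eq. [14] (value chosen arbitrarily)

def net_output(w, x):
    # eq. [13]: y = 1 if the weighted sum of the inputs is positive, else 0
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

def train_pair(w, x, t):
    # if the net's response matches the target, the synapses are unchanged;
    # otherwise each weight receives the correction of eq. [14]
    if net_output(w, x) == t:
        return w
    return [wj + eta * t * xj for wj, xj in zip(w, x)]
```

Note that, as stated, the correction vanishes when the target is 0; the more common perceptron form replaces t with the difference (t − y).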

Note that, according to Churchland and Sejnowski (1992, ch. IV), the representation and

    classification of inputs and outputs in a PDP system takes place vectorially. Each neuron can

    take part in the representation of many different elements, and no single neuron represents an

    entire element on its own.


The output indicates one of two possible classes, so that C** → 1 and C* → 0.

    Assuming (x1, x2, ..., xn) as the input values and (w1, w2, ..., wn) as the synaptic weights, without

    any interconnections between the units, so that wij ≡ wi, then

    [15]  y = Σ(i=1..N) wi xi − x0

    where x0 is a threshold value.

As mentioned before, j-Perceptron has to verify whether or not a product belongs to a certain

    class. Training takes place by submitting pairs of input/output examples to him in sequence until

    the network is capable of calculating the function exactly. In other words, let's imagine that there

    are only two product classes, (C*, C**), that divide the space of the goods, C, into two specific

    subspaces. j must assign a generic good, b, to one of the two goods classes. It is also assumed

    that there are only two inputs, x1, x2. Given the values of x1 and x2, j-Perceptron must assign the

    value of 1 to the output if he "believes" that product b belongs to C**, or the value of 0 if he

    "believes" that b belongs to C*.

Let

    [16]  g(x) = Σi wi xi

    be the overall input value; for the output value we shall have

    [17]  y(x) = 1 if g(x) > x0

          y(x) = 0 if g(x) < x0

Assuming that we have only two inputs, then

    [18]  y(x) = w0 + w1 x1 + w2 x2

    where w0 is an ad hoc weight. Solving for y(x) = 0, we shall have

    [19]  x2 = −(w1 / w2) x1 − (w0 / w2)

    which gives rise to a straight line that divides the C region into two sub-regions. For certain

    values of (w0, w1, w2) the output will fall in the C** region and will thus be equal to 1; for other

    values it will fall in the C* region and will equate to 0. The "decision surface" is found on the set

    C. If there are numerous inputs, the decision surface will be composed of a hyperplane; we can


say that "the problem relating to the learning of a Perceptron can be brought down to the correct

    determination of a decision surface" (Carrella, 1995, p. 189). So the learning strategy of a

    j-Perceptron consists in progressively modifying the synaptic weights so as to enable the network

    to proceed with a correct classification, assigning b to the right class. By way of an (extremely

    simple) example, let's imagine that we have a consumer j who has the features of the Perceptron,

    in that his function will be to learn to classify certain goods in their respective classes. The

    neural network he uses will be a network with no hidden levels, with linear outputs from the

    nodes. Errors will be corrected by means of a manual application of the "delta rule", also called

    the least-mean-square error rule, which is based on the principle of modifying the weights of the

    connections in sequence to reduce the difference (or "delta") between the required output and the

    value found at the output neuron. Let's assume that j-Perceptron has to classify a good, x, with

    two features, x1, x2, and that y(x) is the output indicating the product classes, 1 → C**, 0 → C*;

    and let's say that w1, w2 are the corresponding synaptic weights. j-Perceptron's learning process

    will consist in changing the synaptic weights if, for a given output node, the calculated value is not

    the same as the required value. The model presented here is an adaptation of the model

    illustrated by Carrella (1995, p. 163 et seq.).
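The linear decision surface of [18] and [19] can be illustrated with a short sketch; the weights below are arbitrary examples, not values from the paper:

```python
# Sketch of the decision line x2 = -(w1/w2) x1 - (w0/w2), which splits the
# goods space C into C* (output 0) and C** (output 1).
w0, w1, w2 = -0.5, 1.0, 1.0   # illustrative weights

def classify(x1, x2):
    # eq. [18]: y(x) = w0 + w1 x1 + w2 x2, thresholded at 0
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

print(classify(1.0, 1.0))  # good above the line: class C**
print(classify(0.0, 0.0))  # good below the line: class C*
```

For these weights the dividing line is x2 = −x1 + 0.5; goods above it are assigned to C** and goods below it to C*.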

Take the following truth table:

    Table 1

    Good | Feature x1 | Feature x2 | Output y(x)

    x    | 0          | 0          | 1

Each feature can be associated with a value of 1 or 0, which indicates the previously-

    mentioned mono-attributive classes. The following parameters are also needed for the

    application of the learning rule: T (threshold value), arbitrarily assumed as corresponding to 0.1;

    e (error), assumed as corresponding to 0.1; d (percentage of weight correction), assumed as

    corresponding to 0.5.

In the initial phase, the synaptic weights are assigned arbitrarily, in the sense that j-Perceptron

    is uncertain as to how to classify the goods; let's take these weights to be

    [20]  w1 = −0.1 ; w2 = 0.2


Putting w0 = −T in [18], the output neuronode is calculated according to the equation

    [21]  y(x) = w1 x1 + w2 x2 − T

    If y(x) acquires a positive value, the output node will indicate a value of 1; if not, it will

    indicate a value of 0; and, as we know, these values indicate the products' classes.

    For x1 = 0, x2 = 0, T = 0.1, the output value required would be 1. Taking the weights indicated

    in [20] and inserting them in [21], and making the necessary simple calculations, we find that the

    result equates to −0.1 and is therefore negative, so the output value assigned to y(x) will be 0.

Table 2

    Good | Required y | Calculated y

    x    | 1          | 0

    If we compare the two columns, we find that j-Perceptron has failed to classify the good

    correctly, so he must modify the synaptic weights by a proportional amount; if the output value is

    0, when it should be 1, he must increase the weights; in the opposite case, he must reduce them.

    In order to calculate the error, it is best to treat the threshold value as an input, x3 = 1, having a

    weight w3 = -T. Then the equation for determining the weights becomes

    [22] w1x1 + w2x2 + w3x3 > 0

The new weights are obtained from the old weights plus the correction factor, Fc, calculated

    on the old weights. The correction factor will be as follows

    [23]  Fc = (E + e) d

    where E is the error; as we know, e is the value assigned to the error and d is the percentage of

    weight correction. The error, E, is defined as

    [24]  E = 0 − (w1 x1 + w2 x2 + w3 x3)

    For the values assigned before, E = 0.1. Hence

    [25]  Fc = (E + e) d = (0.1 + 0.1) × 0.5 = 0.1

    We can now modify the weights in proportion to the calculated value until we find a system of

    weights capable of representing all the input/output pairs, through an iterative process which, in

    the specific case of our example, can lead to the solution of the problem in a number of cycles.

    Each cycle can be considered as an act of experimental consumption.
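The iterative process above can be sketched in code. The choice of adding the correction factor Fc to every weight on an error is my reading of the text, not an explicit rule from the paper, and the variable names are mine:

```python
# Hedged reconstruction of the worked example of eqs. [20]-[25].
T, e, d = 0.1, 0.1, 0.5            # threshold, error constant, correction rate
w = [-0.1, 0.2, -T]                # w1, w2 from eq. [20]; bias weight w3 = -T
x = [0.0, 0.0, 1.0]                # good x: x1 = 0, x2 = 0, bias input x3 = 1
required = 1                       # required output from Table 2

cycles = 0                         # each cycle = one act of experimental consumption
while True:
    s = sum(wi * xi for wi, xi in zip(w, x))  # eq. [21] (w3 x3 supplies the -T)
    y = 1 if s > 0 else 0
    cycles += 1
    if y == required:
        break
    E = 0 - s                      # eq. [24]
    Fc = (E + e) * d               # eq. [23]; first cycle: (0.1 + 0.1) * 0.5 = 0.1
    w = [wi + Fc for wi in w]      # output was 0 instead of 1: increase the weights
```

Run as written, the first correction factor reproduces the 0.1 of [25], and the network reaches the required classification in three cycles.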


    9. Conclusions

Given the simplified structure of the model used here, we can represent every feature of a

    product as an input and assume that every training cycle will correspond to the acquisition of a

    new item of information, so that j is capable of modifying the structure representing the features

    of a product in his memory.

    In practice, categorization processes, even for a single attribute, can be described by highly

    complex neural networks, so that the coupling of different attributes necessarily leads to the

    construction of neural networks that are far more complex than the example considered here.

    Nonetheless, this is a useful exercise for the purpose of understanding the fact that product

    classification, through the acquisition of suitable information, involves "internal" modeling

    processes on the consumer's cognitive structure. Learning can thus be represented by said

    processes.

    Clearly, the use of neural modeling can hardly cover all learning processes. It does grasp a

    part of said processes, however, i.e. the ones characterized by the need to classify certain

    patterns, as in the case of consumer products.


    REFERENCES

AKERLOF G., "The Market for "Lemons": Quality Uncertainty and the Market Mechanism",

    Quarterly Journal of Economics, 1970, 84, pp. 488-500

    ALLAIS M., "Determination of Cardinal Utility according to an Intrinsic Invariant Model", in L.

    Daboni, A. Montesano and M. Lines, eds., Recent Developments in the Foundations of

    Utility and Risk Theory, Dordrecht: D. Reidel Publishing Company, 1986, pp. 83-120

    ANDERSON J.R., Cognitive Psychology and its Implications, New York: Freeman & Co., 1980

    BELTRATTI A., MARGARITA S. and TERNA P., Neural Networks for Economic and

    Financial Models, London: International Thompson Computer Press, 1996

    BRENNER T., Modelling Learning in Economics, Cheltenham (UK): Edward Elgar, 1999

    CARRELLA G., L'Officina Neurale, Milano: Franco Angeli Editore, 1995

    CHURCHLAND P.S. and SEJNOWSKI T.J., The Computational Brain, Cambridge, MA: The

    MIT Press, 1992

    CHURCHLAND P., The Engine of Reason, the Seat of the Soul, Cambridge, MA: The MIT

    Press, 1995

    CYERT R. and DE GROOT M.H., "Adaptive Utility", in R.H. Day, T. Groves, eds., Adaptive

    Economic Models, London: Academic Press, 1975, pp. 223-46

    DEBREU G., Theory of Value, New York: Wiley, 1959

    FABBRI G. and ORSINI R., Reti Neurali per le Scienze Economiche, Padova: Franco Muzzio

    Editore, 1993

    FLOREANO D., Manuale sulle Reti Neurali, Bologna: Il Mulino, 1996

    von HAYEK F., The Sensory Order. An Inquiry into the Foundations of Theoretical Psychology,

    London: Routledge & Kegan Paul, 1952

    KIHLSTROM R., MIRMAN L. and POSTLEWAITE A., "Experimental Consumption and the

    Rothschild Effect", in M. Boyer, R. Kihlstrom, eds., Bayesian Models in Economic Theory,

    Amsterdam: North Holland, 1984, pp. 279-302

    KEENEY R.L. and RAIFFA H., Decisions with Multiple Objectives: Preferences and Value

    Tradeoffs, New York: Wiley, 1976


    PATTERN RECOGNITION

    AND

    CONSUMER LEARNING

    Maurizio Mistri

    (Department of Economics, University of Padua)

    ABSTRACT

    This paper deals with the topic of consumer learning as an extension of the experimental

consumer approach. With respect to said approach, however, learning is dealt with as a process

    of product categorization and classification. For this purpose, the goods are described on the basis

    of their features and are represented vectorially by means of value functions. It is easily

    demonstrated how said methodology enables the use of neural networks as an analytical and

    logical instrument. In the last part of the paper, a simple example is given of the application of

    the neural network layout to describe a consumer called upon to classify certain products.

JEL classification: D12, D83

    Keywords: consumption, consumer behavior, consumer learning