0011 Learning


Transcript of 0011 Learning

  • 7/31/2019 0011 Learning


    1. Introduction

    This paper deals with consumer learning from the point of view of the relationship between

    the theory of the consumer's optimal choice and the cognitive sciences. The approach is

    consistent with Simon's theory of bounded rationality, and particularly with the aspect of said

    theory that goes under the name of "procedural rationality", used as a concise term to mean the

    determination of a heuristic with a view to achieving satisficing decisions. In the light of this

    conception of rationality, the concept of consumer learning is developed by means of a critical

    discussion both of the Bayesian approach and of the neural approach, the aim being to identify its

    potential for application and its limits.

    In this setting, a subject, j, assumed to represent the generic consumer, uses acts of

    consumption or, in more general terms, the acquisition of information to assess the capacity of a

    given good to satisfy specific needs, because he initially has some doubts as to said assessment.

    In fact, the problem contains aspects of a cognitive order that inevitably influence the way in

    which j makes his decisions. Simon's contribution lies in considering the question of the

    formation of knowledge as being part of the decisional process and therefore as being capable of

    influencing said process, as if it were a system of constraints.

    This work deals with the following aspects: a) the statement of a practical problem which

    prompts the need to derive an adequate representation of the learning process; b) a discussion of

    the relationship of learning in economics and cognitive science; c) a discussion of the concept of

    bounded rationality applied to consumer theory; d) the derivation of a specific "cognitive"

    representation of products; e) a critical discussion of the Bayesian representation of the learning

    process; f) an evaluation of the neural representation of said learning process.

    2. The problem

    Let's take T to be the whole time horizon of a generic agent, j, and let's assume that during the

    initial period, t1, he decides to purchase one unit of a given good or service, a, reserving any

    decision to purchase another unit at a later date, t2. The reason why j implements such a

    strategy lies in the existence of some doubt concerning the adequacy of the good or service being


    provided in terms of satisfying the need for which it is purchased, so for j the consumption made

    at t1 serves to ascertain whether there is an acceptable degree of correspondence between the

    predicted adequacy and the ascertained adequacy of the good or service, a.

    On the matter of whether the consumer is doubtful concerning the ability of a to satisfy

    specific needs, I shall go into more depth in a subsequent section of this paper. For the time

    being, suffice it to note that the existence of any such uncertainty does not necessarily involve

    contractual failings on the part of the supplier, such as those underlying the Akerlof

    model (1970), for instance, but can be traced back to difficulties of a cognitive order that every

    consumer faces. In this context, it is not j's uncertainty concerning his own utility function, as

    hypothesized by Cyert and De Groot (1975), or as described in one of my previous papers

    (Mistri, 1996), that is the problem here. The question considered here poses the need to derive a

    consumer learning method and requires said process to be set within a consistent theoretical

    framework (Brenner, 1999). Intuitively, we might suppose that the most suitable theoretical

    framework for the above-illustrated problem lies in the experimental consumer approach. What

    remains to be filled with an operational content is the concept of the experimental consumer, who

    uses the goods not to maximize his utility function, but to ascertain their adequacy in satisfying

    specific needs. We are thus considering a consumer who expresses cardinal preferences on

    classes of goods; all goods are described by their observable features, which can be represented

    in vectorial terms. For each class of goods, our consumer j derives a value function - in the sense

    used by Luce and Raiffa (1976, p. 220) - which, as the two above-mentioned authors themselves

    point out, is not necessarily a utility function.

    3. Learning and cognitivism

    The introduction mentions a concept of learning that departs from those used in many works

    dedicated specifically to consumer learning, though even as a rough approximation it cannot

    disregard the latter. Besides, it is worth emphasizing that there is no unequivocal learning

    model in economics. The reason for this must be sought in the multiplicity of the theorizations

    existing in the cognitive sciences. The term "learning" is therefore used to indicate a very ample

    class of phenomena that differ from each other and that only have in common the fact that they


    any risk of error.

    4. The consumer and bounded rationality

    From the standpoint taken here, learning is defined both as a process by means of which the

    subject creates classes of goods, and as a process by means of which the consumer j refines his

    classification of the goods. Inevitably this poses the problem of defining a good. In the standard

    sense adopted by Debreu (1959), goods are defined on the basis of their physical nature and are

    distinguished according to their features and their territorial and temporal location. The inclusion

    of the time factor involves introducing a dimension of uncertainty. As a first approximation, let's

    assume that j has a definite order of preferences, ≽j, concerning a set of goods, {ai}, where i =

    1, 2, ..., l, which can be represented in the space R^l_+. At the same time, still following the standard

    scheme approach, the information needed for j can be said to be restricted to the system of

    relative prices, which can be indicated by the vector pi, with i = 1, 2, ..., l.

    Purchases are defined at an initial moment t1; then an instantaneous equilibrium is

    determined for j according to the rules of bounded maximization. In the standard definition of

    goods, having different features makes the goods objectively differ from each other and a suitable

    utility function can be derived for them as a set. Conversely, following the interpretative line

    prevailing in marketing studies, we can define goods on the strength of a set of characteristics

    using multi-attribute analysis, according to which the goods can be considered as equivalent if

    they are found so on the basis of a comparison of their attributive synthetic indexes, or attributes

    which sum up their characteristics as a whole (Lancaster, 1966).

    In an approach that considers multi-attribute goods, it can be assumed that j defines his order

    of preferences for a set of abstract goods, which represent categories or classes of goods against

    which every real good can be compared. This way of defining the goods has cognitive

    foundations, in the sense that a person generally tends to conceptualize and categorize.

    Conceptualization and categorization are the outcome of people's natural tendency to contain the

    amount of information to remember, seeking a substantial cognitive economy. Conceptualization

    helps to facilitate inference; from the consumer's point of view, categorization helps to

    facilitate inferences concerning the ability of a certain class of goods to satisfy a specific need.


    A goods-purchasing action always has an inferential nature, especially if we consider the

    sequence of stages by means of which the purchase/consumption process takes place, beginning

    with a decisional phase. The various phases do not necessarily coincide; it has been said that a

    decision to purchase is an inference on the features of a good that may be consumed in the future.

    This immediately poses the problem of establishing how, in practical terms, j makes these

    conjectures and how this cognitive process can be set in a typically economic conceptual scheme.

    A purchasing project is based, first of all, on a heritage of information accumulated prior to

    such a decision being reached; above all, it goes through the way in which said information is

    classified and represented in the person's memory. What j classifies is the coupling between

    goods features and needs, as they became apparent in the past and as j believes they may be

    manifest in the future. Said coupling gives rise to mental images, which are interior

    representations of the outside reality (Marucci, 1997). Cognition of the mental images appears as

    a useful medium between the activity of perceiving the sensorial input and the knowledge

    systems stored in the "semantic memory", i.e. the memory where the concepts are categorized.

    Recourse to the theory of mental images appears useful for an understanding of the link between

    plans and actions - a link that lies, for instance, at the foundations of the theory of sequential

    decisional processes.

    The mental image becomes a logical pivot between perception, categorization and

    memorization, and seems necessary to explain the behavior of a subject (such as our consumer),

    who is not merely a classifier of goods, but also an elaborator of consumption schemes. In

    essence, we can assume that, in deciding on a consumption plan, the consumer has in mind a

    certain image of the good and of the pleasure that he can gain from it. To simplify the

    description of the decisional process involved, we can say that this is implemented exclusively on

    the basis of the structural features of the goods, as codified in the person's memory. At the same

    time, it is feasible to imagine that the subject tends to simplify the images through categorization,

    reducing the goods to prototypes which become representations of abstract goods (Macchi,

    1989). In fact, we can assume that j draws from various different training processes (e.g. an

    exogenous education towards consumption, imitation, experimentation, the collection of

    information, the opinions of experts, the opinions of opinion leaders, etc. ) and is thus capable of

    building himself a grid of typical features that the good must possess.


    A previous paper (1998) introduced the distinction between genotypical goods and

    phenotypical goods: genotypical goods represent a class of goods with specific general or abstract

    features, while phenotypical goods are variations of the abstract type. Economic theory

    implicitly uses the concept of phenotypical good when it deals with the differentiation between

    similar goods as part of the monopolistic competition approach. At the same time, economic

    theory implicitly uses the concept of genotypical good as part of its standard theory on consumer

    behavior.

    In the present context, the genotypical good can be considered as the prototype that emerges

    from an adequate process of categorization on the part of j. Besides, j's deriving of a prototype is

    consistent with the principle of satisficing behavior. The prototype theory can be linked to a

    simple representation of multi-attribute goods, reminiscent of the representation processes used

    by the neural schemes, i.e. with vectors (Paul Churchland, 1995; Patricia Churchland and T.

    Sejnowski, 1992); on this basis every attribute, or feature, can be considered as a dimension in

    the abstract space of the features; thus a multi-attribute good can be represented by a vector of the

    features,

    [1] x = (x1, x2, ..., xm)

    where the components xi, i = 1, 2, ..., m, are defined on the space of the features.

    The agent j has a definite order of preferences ≽j on a set of abstract goods that can be

    represented in vectorial form {x1, x2,...,xl}, which represent specific classes of goods with which

    the real goods {x1*, x2*,...,xl*} and their features can be compared. So the problem consists in

    establishing whether a given real product, xk*, has such features as will make it belong to the

    specific class typified by the product xk. This is a typical recognition problem in the sense of

    pattern recognition logic. In mathematical terms, the problem involves assigning the specific

    product xk* to its own class, C.

    Let's consider a real good xk*; this will be equivalent to a typical good, xk, provided it

    belongs to the same class, Ck; so we can say that two products, represented vectorially, x1*, x2*,

    are equivalent in terms of features when they both belong to Ck, i.e.

    [2] x1* ≡ x2* ⇔ {x1* ∈ Ck and x2* ∈ Ck}

    Note that the class of equivalence is determined on the basis of the features of the products

    and of their functions.
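    To make the vectorial representation [1] and the equivalence relation [2] concrete, here is a minimal sketch in Python; the feature vectors and the hand-built classes below are hypothetical illustrations, not data from the paper:

```python
# Sketch of the vectorial representation of goods ([1]) and of the
# equivalence relation between real products ([2]).
# The sample feature vectors and classes are hypothetical.

def same_class(x1, x2, classes):
    """Two real goods are equivalent in the sense of [2] when they
    both belong to the same class of equivalence C_k."""
    for members in classes.values():
        if x1 in members and x2 in members:
            return True
    return False

# Each good is a tuple of m observed features, as in [1].
x1_star = (1.0, 0.5, 0.2)
x2_star = (0.9, 0.6, 0.2)
x3_star = (0.1, 0.1, 0.9)

# Classes of equivalence, built by hand here for illustration.
classes = {"C_k": [x1_star, x2_star], "C_h": [x3_star]}

print(same_class(x1_star, x2_star, classes))  # True: both in C_k
print(same_class(x1_star, x3_star, classes))  # False: different classes
```

    Note that two goods with different vectors (x1_star, x2_star) can still be equivalent: membership of the class, not identity of the vectors, is what [2] tests.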


    5. Classes and multi-attribute value functions

    The definition of class of equivalence, as mentioned above, is entirely generic; it can be

    specified by associating each class of equivalence, Ci (where i = 1,2,...,k,...,l, defined on the space

    of the goods classes), with an index of value that correlates the value functions, v, with the

    structures of the perceived features of the single products (Luce and Raiffa, 1976, p. 68). A value

    function, v, associates a real number with every point on the space of the features and can

    represent a cardinally-ordered structure of preferences.

    Assuming that a product is assessed on the basis of the set of its perceived features, it follows

    that a criterion has to be identified with which to obtain a concise representation of v. Decisional

    theory uses the multi-attribute value functions, v, which are linearly additive in their arguments

    (Keeney and Raiffa, 1976; Marshall and Oliver, 1995), e.g.

    [3] v (x1, x2) = v (x1) + v (x2)

    where v (x1) and v (x2) are single-attribute functions (Keeney and Raiffa, 1976, p. 105).

    According to [3], the linear form of v enables the space of the features to be broken down into

    subspaces, each with a single feature, dealing with v (x1) and v (x2) as single-attribute value

    functions, each of which is defined on a specific space of the classes. This operation can prove

    useful in practice, as we shall see.

    Conversely, we are aware that non-additive multi-attribute value functions would be better

    able to grasp the complexity of the process of categorizing the various products, though for the

    purposes of the present work it is probably advisable to restrict ourselves to considering v as

    linearly additive. A linearly additive v can be considered as a linear approximation of a

    corresponding non-linearly additive v. From the cognitive viewpoint, the difference between the

    two functions expresses a different conceptualizing and categorizing method. The cognitive

    sciences themselves are not unequivocal in giving an adequate interpretation of categorization

    processes, because in the simplest cases they are implemented by means of a linear breakdown of

    basic components, while gestaltic phenomena cannot be eliminated in the more complex cases.

    This means that the single parts of the entity that we want to categorize interact with each other,

    emphasizing the role of the structure as a whole, so that goods with a different attributive


    structure may belong to the same class.

    In practice, the consumer is required to solve a problem of "pattern recognition" that involves

    recognizing the relationship between the set of features and the utility that can be gained from

    them. As a result, any two goods can be considered as belonging to the same class, even if their

    features have vectorial structures that are not the same, if their respective v have the same value -

    or values that fall close enough to a "shadow" value. Let's consider [3] and assume that va(x1a,

    x2a) indicates the value function of the product a and that vb(x1b, x2b) indicates the value function

    of the product b; assuming also that x1a ≠ x1b and that x2a ≠ x2b, but that they are such that

    va(x1a, x2a) = vb(x1b, x2b). In this case the two products will belong to the same class, just as

    they will in the obvious case in which x1a = x1b and x2a = x2b.

    From a cognitive point of view, the consumer will recognize the patterns by breaking them

    down into essential parts, according to "features analysis" criteria (Anderson, 1980), assessing

    their fundamental distinctive features. Using this model, the stimuli are considered as

    combinations of elementary distinctive features. The consumer is therefore required to classify

    first the simple attributes, by determining their mono-attributive classes, then the combination of

    said attributes, by determining their pluri-attributive classes. He will memorize these features by

    means of a specific coding procedure. In a subsequent phase, when he must recall the features

    and the sensations they gave him from memory, the consumer must adopt a synthetic assessment.

    A loss of information is implicit in this process of memorizing and recalling from memory, which

    also explains the difficulty that many people have, according to Marshall and Oliver (1995, p.

    291), in comparing objectives with multiple attributes. It follows that using an additive v,

    inasmuch as it is a linear approximation of a non-additive v, represents a satisficing heuristic, the

    use of which can generate uncertainty in the determination of the choices made by j. Given a

    linearly additive v, which takes its values on the space of the classes of equivalence, assuming

    that x = (x1, x2, ..., xm) is a vector of attributes and ki is the weight assigned to the generic

    attribute xi, then

    [4] V(x) = Σi=1..m ki v(xi)
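    A linearly additive value function of the form [4] can be sketched directly; the weights ki and the single-attribute function v chosen below are hypothetical assumptions introduced purely for illustration:

```python
# Sketch of the linearly additive multi-attribute value function [4]:
# V(x) = sum over i of k_i * v(x_i).
# The weights k and the single-attribute function v are assumptions.

def V(x, k, v):
    """Linearly additive value function over the attributes of x."""
    return sum(k_i * v(x_i) for k_i, x_i in zip(k, x))

v = lambda x_i: x_i          # identity single-attribute value, an assumption
k = (0.5, 0.3, 0.2)          # weights of the three attributes
x = (1.0, 0.5, 0.2)          # feature vector as in [1]

print(round(V(x, k, v), 2))  # 0.69 = 0.5*1.0 + 0.3*0.5 + 0.2*0.2
```

    Because v is additive, each attribute contributes independently; a non-additive v would instead let the attributes interact, as in the gestaltic cases discussed above.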

    In t1 the consumer memorizes the vector x, which becomes x1; in t2 he recalls to mind the


    same vector, called x2, so that if

    [5] x2 ≠ x1

    there is a loss of information in t2 due to the effect of the memorization process that took

    place in t1. A taxonomy of the consumer's learning processes can be charted that identifies the

    objectives that are met by these processes; three fundamental approaches to consumer learning

    can be identified, i.e.

    a) j has a defined order of preferences ≽j on the actual goods {xi*} in R^l_+ and reaches his

    decisions in a series of periods, ti, where i = 1, 2, ..., T, and for each period a probability

    distribution can be deduced on the expected conditions of his world. This means that j finds

    himself in a situation of environmental uncertainty and the learning only concerns a refinement of

    his knowledge of the conditions of his world;

    b) j's order of preferences changes with time in a sequence of periods ti. This assumption

    lies, for instance, at the basis of several works by Cyert and De Groot (1975), which assume that

    it is through a process of acquiring new information that the consumer can modify his own utility

    function. This has an important fallout on the inter-temporal consistency of the multi-period

    plans within which j's preferences can be modified from one period to another, consequently

    inducing him to make sub-optimal choices (Woo, 1992);

    c) j has a defined order of preferences ≽j on a set of goods prototypes {xi}, where i = 1, 2, ..., l,

    but he has difficulty in adequately assessing the suitability of (i.e. in classifying) any real

    products, xj*. The experimentation of xj*, consisting in the acquisition of information (also

    through acts of consumption) will enable him to refine his judgement.

    Hypothesis (c) is the only one considered in the present context, where it is assumed that j has

    no difficulty in arranging goods types according to ≽j, defining them on the basis of their

    representative features, whereas he may have difficulty in classifying the actual good xk*. After

    the first period, t1, when j has had the opportunity to verify whether the product xk* has

    exhibited the expected suitability, he will be able to assess whether the actual product comes

    within a given class of representative features, if any.

    Defining the goods class of equivalence enables the matter of learning to be considered in

    terms of pattern recognition; j's learning process concerning the good's ability to satisfy a need


    can consequently be defined as his capacity to classify said good correctly. In operational terms,

    we can say that a classification process is correct if the expected level of the good's value

    function in t1, v(x), coincides with the one ascertained in t2, v(x)*, i.e.

    [6] v(x) = v(x)*

    The idea of class of equivalence contains two specific categorization modalities: one relates to

    the creation of the classes of equivalence concerned, the other involves attributing the goods to

    their single respective classes of equivalence. Both modalities belong to the more general

    learning process and it is on the representation of said process that, as mentioned earlier, the two

    great families of models are divided, one inspired by the cognitivist approach and the other by the

    connectionist or neural approach.

    6. The consumer as a Bayesian classifier: critical considerations

    The cognitivist approach - which is based on the assumption that the subject is a data

    processor - finds formal expression in the Bayesian modeling method. Models of this type have

    been applied to the theory of consumer behavior by Cyert and De Groot (1975) and by

    Kihlstrom, Mirman and Postlewaite (KMP) (1984). In the light of what has been said so far, it is

    assumed that the concept of class has a fundamental role in the consumer's decisions. j's decision

    to consume a good x* depends on his evaluation of the "level" of the good's value function,

    v(x*). So the problem for j consists in refining his assessment, by acquiring information, of

    whether the good or service belongs to one class or another. In Bayesian logic, in t1, j

    estimates the level of v(x*), and he does so on the basis of the information that he possesses at the

    time. As mentioned earlier, the features of a product are represented vectorially and j doesn't

    necessarily know the structure of the vector x* before his act of consumption; j can establish a

    probability distribution of said structure. The approach adopted by KMP consists in deriving a

    consumer's utility function that incorporates a process of Bayesian learning defined on a space

    that is given by a coupling of the space of the goods with that of their features, which are not

    necessarily all known to j in advance.

    Following KMP, we assume that the consumer j will obtain certain services from the product

    represented by the vector of the features x*, which can be indicated as a, so that


    [7] a = x* + ε

    In [7], ε represents a random variable with a known density function. Note that the parameter

    x* is non-random, but is not known in advance. Let's assume that j estimates that x* can only

    take on two values, standing for two different classes, ω1 and ω2. In t1, j has the (subjective)

    probability that x* falls into ω1 or ω2, i.e. p(ω1), p(ω2). These are a priori probabilities. j's

    estimate may change if he acquires information synthesized by the likelihood function, p(x* | ωi),

    where i = 1, 2. It is then easy to complete the Bayesian formula

    [8] p(ωi | x*) = p(x* | ωi) p(ωi) / p(x*)

    where p(x*) is the probability density function of x* and p(ωi | x*) is the a posteriori

    probability. The Bayes classification rule states that:

    [9] if p(ω1 | x*) > p(ω2 | x*) then x* belongs to ω1;

        if p(ω1 | x*) < p(ω2 | x*) then x* belongs to ω2

    In this arrangement we have to assume that the probability function is known, which may be

    scarcely realistic. In fact, j will alter his estimate of the probability levels of ω1 and ω2 on the

    basis of the information he receives, and he should estimate its reliability in probabilistic terms,

    which does not always satisfy the condition of realism for the hypotheses by which the Bayesian

    models would like to be inspired (Salmon, 1995).
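    The classification rule [8]-[9] can be sketched in a few lines of code. The Gaussian form assumed for the likelihoods and every numerical value below are illustrative assumptions, not part of the model discussed here:

```python
# Sketch of the Bayes classification rule [8]-[9] for two classes.
# The Gaussian likelihoods p(x* | w_i) and all numbers below are
# hypothetical assumptions chosen only for illustration.
import math

def gaussian_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def classify(x_star, priors, likelihoods):
    """Return the class with the larger a posteriori probability [8]-[9].
    The normalizing term p(x*) cancels out of the comparison."""
    scores = {c: lik(x_star) * priors[c] for c, lik in likelihoods.items()}
    return max(scores, key=scores.get)

priors = {"w1": 0.6, "w2": 0.4}                      # a priori probabilities
likelihoods = {"w1": lambda x: gaussian_pdf(x, 0.0, 1.0),
               "w2": lambda x: gaussian_pdf(x, 3.0, 1.0)}

print(classify(0.2, priors, likelihoods))  # w1: x* lies near the w1 mean
print(classify(2.9, priors, likelihoods))  # w2: x* lies near the w2 mean
```

    The sketch makes the critical point above visible: both the priors and the form of the likelihood function must be supplied in advance, which is exactly the realism problem raised in the text.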

    7. Towards a neural representation

    Generally speaking, two consumer learning modalities have been identified; in one, j builds

    his classes of equivalence on the basis of which he defines his own stable order of preferences,

    ≽j, and a second one with which he assigns each actual product to its class of equivalence. While the

    Bayesian approach seems unsuitable for representing these two modalities, the neural approach -

    which is a formalized expression of connectionism - seems capable of responding better to the

    need to formalize the consumer learning process thus described. Equation [1] concisely

    represents the vector of the features of any given product; note that [1] is, in a nutshell, consistent

    with the neural modeling method. In [1], the representation of a typical product can be


    considered as isomorphic to the structure of its vectorially-expressed features. It follows that

    learning in [1] can be represented as a transformation of the vector x into a new vector x′

    according to the rule Tx = x′, where T is a suitable transformation. This formula easily explains

    the interest of certain scholars (Salmon, 1995; Fabbri and Orsini, 1983; Beltratti, Margarita and

    Terna, 1996) in opportunities for using neural networks in the field of learning in economics. If

    [1] leads us to think of an isomorphism between the structure of a product's features and its

    vectorial representation, in neural networks this isomorphism is strengthened, as it were, in the

    sense that it can be traced in the formal equivalence between the vectorial representation of the

    product's features and the vectorial representation that is given of the cognitive structures by

    several neural models, and by the PDP (parallel distributed processing) models in particular

    (Rumelhart and McClelland, 1986).

    The vectorial representation of the goods really consists in a vectorial representation of their

    features, since working on the features makes the coding process easier (Floreano, 1996, p. 41).

    Opting to codify the features enables certain difficulties relating to the so-called "local code"

    (consisting in the fact that each input unit, xj, where j = 1, 2, ..., n, corresponds to a specific object)

    to be overcome. Generally speaking, neural networks with a minimal complexity use the

    so-called "distributed coding", in which many units contribute towards representing each object. If

    we assume that the input units codify the objects' features, then every input unit on the network

    will codify the presence or the value of a certain feature. Thus each object activates one or more

    units and each unit is used for one or more objects, so each object is defined by the combinations

    of active units in the network (Floreano, 1996, p. 43).

    Taking the most straightforward hypothesis, i.e. that every input unit represents a feature, the

    weight attributable to each input unit depends on the relative importance assigned to the feature

    with respect to the others, according to the value function logic. Returning to [1], the components

    of the vector of the features can be identified - again on the extremely simplified assumption

    adopted here - with the input signals. Bearing in mind the significance of the weights, the net

    input of a neuron, Ai, is usually represented by

    [10] Ai = Σj=1..N wij xj    where i = 1, 2, ..., n and j = 1, 2, ..., N

    Note that, while wj stands for the weight of the jth input of a unit, wij represents the strength of


    the interconnection between the unit j and a unit i.

    The net input Ai of the ith neuron is the algebraic sum of the products of all the j input

    signals xj and the values of the corresponding synapses wij, from which the threshold value θi of

    the neuron is subtracted. Thus the net input of the neuron will be given by

    [11] Ai = Σj=1..N wij xj - θi

    The response of the neuron, yi, is established by passing the net input through an activation

    function Φ(x) (Floreano, 1996, p. 35):

    [12] yi = Φ(Σj=1..N wij xj - θi)
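    Equations [10]-[12] reduce to a few lines of code; the step activation function and all the numbers below are illustrative assumptions:

```python
# Sketch of a single neuron's response [10]-[12]: net input
# A_i = sum_j w_ij x_j - theta_i, passed through an activation
# function Phi. The step activation and all values are assumptions.

def neuron_output(x, w, theta, phi):
    """Response y_i = Phi(sum_j w_ij x_j - theta_i), as in [12]."""
    net_input = sum(w_j * x_j for w_j, x_j in zip(w, x)) - theta
    return phi(net_input)

step = lambda a: 1 if a > 0 else 0   # a simple threshold activation

x = (1, 0, 1)          # input signals (presence/absence of features)
w = (0.8, 0.4, 0.3)    # synaptic weights
theta = 1.0            # threshold of the neuron

print(neuron_output(x, w, theta, step))  # 1: net input 0.8 + 0.3 - 1.0 > 0
```

    Any other activation function, e.g. a sigmoid, can be passed in place of the step without changing the structure of [12].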

    The general principle is that learning (intended as the organized acquisition of knowledge) in

    the model, i.e. in the neural network, mimics what is thought to happen in the brain when

    something is learnt, i.e. connections are created between neurons and the cortical areas via the

    synapses. The connections may have a "variable geometry", in the sense that the same stimulus

    can give rise to different connections in different people. Knowledge (and consequently also

    recall) of events, situations, objects, etc., is represented in the brain by means of relatively

    durable configurations of synaptic connections and is distributed through said synaptic

    connections. Knowledge is not stored in single units, but is distributed among many different

    units, each of which contributes to the representations of many different elements of knowledge

    (Mazzoni, 1998, p. 324). Rather than storing what they learn in a sort of "private" memory, the

    neural networks store information in the connections between the nodes. In the neural scheme,

    learning thus consists in reinforcing certain connections and extinguishing others.

    The processing of the information takes place in the layers of which the neural network is

    composed. The most straightforward neural network models are those which, like the

    Perceptron, are composed of a layer of incoming units that receive stimuli and information

    from the outside world, and a layer of nodes that process the information and then give a

    representation of it as output. In this latter case, we speak of a layer of outgoing units or outputs.

    At a slightly more complex level, there are models including a layer of hidden units that do some

    essential preliminary information-processing work. The input units represent incoming

    information elements and are activated by the stimulation deriving from information coming


    from the surrounding environment.

    This information makes the units trigger a signal; each input is attributed a relative weight,

    which takes into account the importance of the input signal. The distribution of the weights on

    the connections is due to the fact that some inputs are more important than others in the way in

    which they combine to produce an impulse, and thus have a greater weight. The weight can thus

    be seen as a measure of the strength, or intensity, of the connection. The hidden units then

    receive signals from the input units and the weights of the synaptic connections that define them

    are modified on the basis of said signals, which release more signals to the output units; here

    again, these signals can modify both the weight of the output units and the strength of the

    connections between the hidden units and the output units. The role of learning in the logic of

    the Hebbian networks (which are used in the classification processes) can be expressed as follows

    (Floreano, 1996, p. 66):

- given a neural network with N input nodes and P training pairs, each composed of an

    input vector, xp, and a required response ("target"), tp, the output from the network for

    each input pattern is given by:

    [13]  y = 1 if Σj wij xj > 0

          y = 0 otherwise

This value is compared with the required response, tp, for a given input pattern. If the net's

    response is the same as the required response, the synaptic values are not changed; if, on the other

    hand, there is a difference between output and required response, i.e. an error in the logic of the

    neural networks, the synaptic weights are modified on the basis of the correct response, where

    Δwij represents the correction to attribute to the synaptic weight in question

    [14]  Δwij = η t xj

    where η is a proportionality constant; the value thus obtained is added to the preceding values

    of the synapses.
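The rule in [13] and [14] can be sketched as follows; the names (eta, net_output, train_pair) are mine, not the paper's:

```python
# Hedged sketch of eqs. [13]-[14]: the net fires (y = 1) when the weighted
# input sum is positive; on an error each weight gains eta * t * x_j.
eta = 0.1  # the proportionality constant of eq. [14] (value chosen arbitrarily)

def net_output(w, x):
    # eq. [13]: y = 1 if the weighted sum of the inputs is positive, else 0
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

def train_pair(w, x, t):
    # if the net's response matches the target, the synapses are unchanged;
    # otherwise each weight receives the correction of eq. [14]
    if net_output(w, x) == t:
        return w
    return [wj + eta * t * xj for wj, xj in zip(w, x)]
```

Note that, as stated, the correction vanishes when the target is 0; the more common perceptron form replaces t with the difference (t − y).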

Note that, according to Churchland and Sejnowski (1992, ch. IV), the representation and

    classification of inputs and outputs in a PDP system takes place vectorially. Each neuron can

    take part in the representation of many different elements, and no single neuron represents an

    entire element on its own.


The output indicates one of two possible classes, so that C** → 1 and C* → 0.

    Assuming (x1, x2, ..., xn) as the input values and (w1, w2, ..., wn) as the synaptic weights, without

    any interconnections between the units, so that wij ≡ wi, then

    [15]  y = Σ(i=1..N) wi xi − x0

    where x0 is a threshold value.

As mentioned before, j-Perceptron has to verify whether or not a product belongs to a certain

    class. Training takes place by submitting pairs of input/output examples to him in sequence until

    the network is capable of calculating the function exactly. In other words, let's imagine that there

    are only two product classes, (C*, C**), that divide the space of the goods, C, into two specific

    subspaces. j must assign a generic good, b, to one of the two goods classes. It is also assumed

    that there are only two inputs, x1, x2. Given the values of x1 and x2, j-Perceptron must assign the

    value of 1 to the output if he "believes" that product b belongs to C**, or the value of 0 if he

    "believes" that b belongs to C*.

Let

    [16]  g(x) = Σi wi xi

    be the overall input value; for the output value we shall have

    [17]  y(x) = 1 if g(x) > x0

          y(x) = 0 if g(x) < x0

Assuming that we have only two inputs, then

    [18]  y(x) = w0 + w1 x1 + w2 x2

    where w0 is an ad hoc weight. Solving for y(x) = 0, we shall have

    [19]  x2 = −(w1 / w2) x1 − (w0 / w2)

    which gives rise to a straight line that divides the C region into two sub-regions. For certain

    values of (w0, w1, w2) the output will fall in the C** region and will thus be equal to 1; for other

    values it will fall in the C* region and will equate to 0. The "decision surface" is found on the set

    C. If there are numerous inputs, the decision surface will be composed of a hyperplane; we can


say that "the problem relating to the learning of a Perceptron can be brought down to the correct

    determination of a decision surface" (Carrella, 1995, p. 189). So the learning strategy of a

    j-Perceptron consists in progressively modifying the synaptic weights so as to enable the network

    to proceed with a correct classification, assigning b to the right class. By way of an (extremely

    simple) example, let's imagine that we have a consumer j who has the features of the Perceptron,

    in that his function will be to learn to classify certain goods in their respective classes. The

    neural network he uses will be a network with no hidden levels, with linear outputs from the

    nodes. Errors will be corrected by means of a manual application of the "delta rule", also called

    the least-mean-square error rule, which is based on the principle of modifying the weights of the

    connections in sequence to reduce the difference (or "delta") between the required output and the

    value found at the output neuron. Let's assume that j-Perceptron has to classify a good, x, with

    two features, x1, x2, and that y(x) is the output indicating the product classes, 1 → C**, 0 → C*;

    and let's say that w1, w2 are the corresponding synaptic weights. j-Perceptron's learning process

    will consist in changing the synaptic weights if, for a given output node, the calculated value is not

    the same as the required value. The model presented here is an adaptation of the model

    illustrated by Carrella (1995, p. 163 et seq.).
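The linear decision surface of [18] and [19] can be illustrated with a short sketch; the weights below are arbitrary examples, not values from the paper:

```python
# Sketch of the decision line x2 = -(w1/w2) x1 - (w0/w2), which splits the
# goods space C into C* (output 0) and C** (output 1).
w0, w1, w2 = -0.5, 1.0, 1.0   # illustrative weights

def classify(x1, x2):
    # eq. [18]: y(x) = w0 + w1 x1 + w2 x2, thresholded at 0
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

print(classify(1.0, 1.0))  # good above the line: class C**
print(classify(0.0, 0.0))  # good below the line: class C*
```

For these weights the dividing line is x2 = −x1 + 0.5; goods above it are assigned to C** and goods below it to C*.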

Take the following truth table:

    Table 1

    Good | Feature x1 | Feature x2 | Output y(x)

    x    | 0          | 0          | 1

Each feature can be associated with a value of 1 or 0, which indicates the previously-

    mentioned mono-attributive classes. The following parameters are also needed for the

    application of the learning rule: T (threshold value), arbitrarily assumed as corresponding to 0.1;

    e (error), assumed as corresponding to 0.1; d (percentage of weight correction), assumed as

    corresponding to 0.5.

In the initial phase, the synaptic weights are assigned arbitrarily, in the sense that j-Perceptron

    is uncertain as to how to classify the goods; let's take these weights to be

    [20]  w1 = −0.1 ; w2 = 0.2


Putting w0 = −T in [18], the output neuronode is calculated according to the equation

    [21]  y(x) = w1 x1 + w2 x2 − T

    If y(x) acquires a positive value, the output node will indicate a value of 1; if not, it will

    indicate a value of 0; and, as we know, these values indicate the products' classes.

    For x1 = 0, x2 = 0, T = 0.1, the output value required would be 1. Taking the weights indicated

    in [20] and inserting them in [21], and making the necessary simple calculations, we find that the

    result equates to −0.1 and is therefore negative, so the output value assigned to y(x) will be 0.

Table 2

    Good | Required y | Calculated y

    x    | 1          | 0

    If we compare the two columns, we find that j-Perceptron has failed to classify the good

    correctly, so he must modify the synaptic weights by a proportional amount; if the output value is

    0, when it should be 1, he must increase the weights; in the opposite case, he must reduce them.

    In order to calculate the error, it is best to treat the threshold value as an input, x3 = 1, having a

    weight w3 = -T. Then the equation for determining the weights becomes

    [22] w1x1 + w2x2 + w3x3 > 0

The new weights are obtained from the old weights plus the correction factor, Fc, calculated

    on the old weights. The correction factor will be as follows

    [23]  Fc = (E + e) d

    where E is the error; as we know, e is the value assigned to the error and d is the percentage of

    weight correction. The error, E, is defined as

    [24]  E = 0 − (w1 x1 + w2 x2 + w3 x3)

    For the values assigned before, E = 0.1. Hence

    [25]  Fc = (E + e) d = (0.1 + 0.1) × 0.5 = 0.1

    We can now modify the weights in proportion to the calculated value until we find a system of

    weights capable of representing all the input/output pairs, through an iterative process which, in

    the specific case of our example, can lead to the solution of the problem in a number of cycles.

    Each cycle can be considered as an act of experimental consumption.
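The iterative process above can be sketched in code. The choice of adding the correction factor Fc to every weight on an error is my reading of the text, not an explicit rule from the paper, and the variable names are mine:

```python
# Hedged reconstruction of the worked example of eqs. [20]-[25].
T, e, d = 0.1, 0.1, 0.5            # threshold, error constant, correction rate
w = [-0.1, 0.2, -T]                # w1, w2 from eq. [20]; bias weight w3 = -T
x = [0.0, 0.0, 1.0]                # good x: x1 = 0, x2 = 0, bias input x3 = 1
required = 1                       # required output from Table 2

cycles = 0                         # each cycle = one act of experimental consumption
while True:
    s = sum(wi * xi for wi, xi in zip(w, x))  # eq. [21] (w3 x3 supplies the -T)
    y = 1 if s > 0 else 0
    cycles += 1
    if y == required:
        break
    E = 0 - s                      # eq. [24]
    Fc = (E + e) * d               # eq. [23]; first cycle: (0.1 + 0.1) * 0.5 = 0.1
    w = [wi + Fc for wi in w]      # output was 0 instead of 1: increase the weights
```

Run as written, the first correction factor reproduces the 0.1 of [25], and the network reaches the required classification in three cycles.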


    9. Conclusions

Given the simplified structure of the model used here, we can represent every feature of a

    product as an input and assume that every training cycle will correspond to the acquisition of a

    new item of information, so that j is capable of modifying the structure representing the features

    of a product in his memory.

    In practice, categorization processes, even for a single attribute, can be described by highly

    complex neural networks, so that the coupling of different attributes necessarily leads to the

    construction of neural networks that are far more complex than the example considered here.

    Nonetheless, this is a useful exercise for the purpose of understanding the fact that product

    classification, through the acquisition of suitable information, involves "internal" modeling

    processes on the consumer's cognitive structure. Learning can thus be represented by said

    processes.

    Clearly, the use of neural modeling can hardly cover all learning processes. It does grasp a

    part of said processes, however, i.e. the ones characterized by the need to classify certain

    patterns, as in the case of consumer products.


    REFERENCES

AKERLOF G., "The Market for "Lemons": Quality Uncertainty and the Market Mechanism",

    Quarterly Journal of Economics, 1970, 84, pp. 488-500

    ALLAIS M., "Determination of Cardinal Utility according to an Intrinsic Invariant Model", in L.

    Daboni, A. Montesano and M. Lines, eds., Recent Developments in the Foundations of

    Utility and Risk Theory, Dordrecht: D. Reidel Publishing Company, 1986, pp. 83-120

    ANDERSON J.R., Cognitive Psychology and its Implications, New York: Freeman & Co., 1980

    BELTRATTI A., MARGARITA S. and TERNA P., Neural Networks for Economic and

    Financial Models, London: International Thompson Computer Press, 1996

    BRENNER T., Modelling Learning in Economics, Cheltenham (UK): Edward Elgar, 1999

    CARRELLA G., L'Officina Neurale, Milano: Franco Angeli Editore, 1995

    CHURCHLAND P.S. and SEJNOWSKI T.J., The Computational Brain, Cambridge, MA: The

    MIT Press, 1992

    CHURCHLAND P., The Engine of Reason, the Seat of the Soul, Cambridge, MA: The MIT

    Press, 1995

    CYERT R. and DE GROOT M.H., "Adaptive Utility", in R.H. Day, T. Groves, eds., Adaptive

    Economic Models, London: Academic Press, 1975, pp. 223-46

    DEBREU G., Theory of Value, New York: Wiley, 1959

    FABBRI G. and ORSINI R., Reti Neurali per le Scienze Economiche, Padova: Franco Muzzio

    Editore, 1993

    FLOREANO D., Manuale sulle Reti Neurali, Bologna: Il Mulino, 1996

    von HAYEK F., The Sensory Order. An Inquiry into the Foundations of Theoretical Psychology,

    London: Routledge & Kegan Paul, 1952

    KIHLSTROM R., MIRMAN L. and POSTLEWAITE A., "Experimental Consumption and the

    Rothschild Effect", in M. Boyer, R. Kihlstrom, eds., Bayesian Models in Economic Theory,

    Amsterdam: North Holland, 1984, pp. 279-302

    KEENEY R.L. and RAIFFA H., Decisions with Multiple Objectives: Preferences and Value

    Tradeoffs, New York: Wiley, 1976


    PATTERN RECOGNITION

    AND

    CONSUMER LEARNING

    Maurizio Mistri

    (Department of Economics, University of Padua)

    ABSTRACT

    This paper deals with the topic of consumer learning as an extension of the experimental

consumer approach. With respect to said approach, however, learning is dealt with as a process

    of product categorization and classification. For this purpose, the goods are described on the basis

    of their features and are represented vectorially by means of value functions. It is easily

    demonstrated how said methodology enables the use of neural networks as an analytical and

    logical instrument. In the last part of the paper, a simple example is given of the application of

    the neural network layout to describe a consumer called upon to classify certain products.

JEL classification: D12, D83

    Keywords: consumption, consumer behavior, consumer learning