Post on 17-Mar-2016
The econometric discrete dependent variable multinomial Logit model
Eleftherios Giovanis
This paper examines consumers' preferences in the local furniture market of the Prefecture of Serres. We apply a multinomial logit model to estimate the probability of buying furniture in the following four-month period. We also analyze the demographic characteristics of consumers and conclude that they play a major role among the other factors. The questionnaire analyzed here is a subset of the original one: the initial questionnaire contained too many questions, which would have made the analysis very long. We therefore restrict attention to the most important factors that influence consumers' choices.
Introduction
According to the findings of the sector study carried out by ICAP (COMCENTER, 2007), the majority of furniture production units in Greece are small, usually family-run, and lack automated production. Medium and large units are estimated to hold about 30% of the market. The study concludes that Greek enterprises show declining export activity, while market share has shifted towards supermarkets and towards importing enterprises that operate through franchising. Another conclusion of the study is that furniture purchases and consumption are directly linked to disposable income. The problem that emerges is that a significant part of the disposable income of Greek households is absorbed by loan repayments, which delays the replacement of existing furniture.

Domestic furniture consumption grew during the period 1998-2006 at an average annual rate of 4.6%. In 2006, living room furniture is estimated to have accounted for 47.0% of the total domestic market, bedroom furniture for 27.0% and dining room furniture for 26.0% (COMCENTER, 2007).
Jonkers (2006), in a report prepared in collaboration with the CBI Market Survey, finds that one of the major threats to the Greek domestic furniture market is that the Greek economy depends heavily on furniture imports, which compete mainly on low prices. This creates opportunities for exporters from developing countries, because imports from these countries are growing faster than imports from developed countries. According to Jonkers, the best opportunities are in living and dining room furniture, where domestic production is declining. In the same survey, imports increased by 80% in value between 2001 and 2005, while exports increased by only 14%. The main developing-country exporters are China with € 56 million, Turkey with € 28.8 million, Indonesia with € 16.4 million, Vietnam with € 10.2 million, India with € 4.9 million and Malaysia with € 4.2 million, followed by smaller suppliers such as Albania, Egypt and South Africa. The largest destination for furniture exports is Cyprus, followed by Bulgaria, Germany and Romania.
The most important firm in Serres, and one of the most important in Greece, is “DROMEAS” ABEEA, which was established in 1979 and is located in the Industrial Area outside Serres, about 80 km northeast of Thessaloniki. Among the firm's achievements are the supply of 10,000 seats for the waiting area of Manila's airport and 4,000 seats for two airports in Egypt. Smaller projects bear its stamp in the UK, Saudi Arabia and Australia. The firm also undertook 40.0% of the furniture production required by the Olympic Committee (Interwood, 2007). Other furniture firms and shops operating in the Prefecture of Serres include “BLACK RED WHITE”, “Fratzana”, “Kioutsoukis” and “ARREDO”, as well as shops such as “NEOSET”, “SATO” and others.
The main purpose of this project is to present some of the most important Logit models that can be used in marketing survey research and to choose the best available model. This choice is not unique: it depends on the kind of product or service, the questionnaire and sample design, the kind of market, the city or country, and the demographic characteristics of the area where the research takes place.
Data
The data were obtained from a marketing survey carried out by telephone interview on 12-15 February 2008 by the firm “Analysis Center”. The sample consists of 387 households in the Prefecture of Serres, in the region of Macedonia, Greece. In the first stage the sample design was random; in the second stage the data were weighted by age and sex. Although the survey refers to households, we also record sex, because we want to test hypotheses about differences in opinions and preferences between the two sexes. The weights are based on demographic data provided by the National Statistical Service of Greece. Weighting by urban status is not necessary, because the research covers the city of Serres and the capitals of the regional municipalities, so only the urban population is concerned. Note that if the first-stage sample had not been random but stratified (for example by industries in a specific sector, by a particular age category of a particular sex, or by a specific geographical region), the weighted models would create problems such as understated standard errors and, consequently, erroneous interpretation of significance tests. For confidentiality reasons we cannot name the firm that commissioned this marketing research; our aim is simply to give a guide to different approaches to the estimation of Logit models and to the interpretation of the results.
Methodology
We first explain why we use the Logit rather than the Probit model. In most applications the two models give quite similar results; the main difference is that the logistic distribution has slightly fatter tails, as we can see in figure 2.1. There is no compelling reason to choose one model over the other, and many researchers prefer the Logit model because of its mathematical simplicity (Gujarati, 2004).
Figure 1 Probit and logit cumulative distributions
In our model we estimate the probability of buying furniture in the next four-month period as a function of the kind of furniture that consumers generally prefer to buy, the criteria by which they choose a shop, the amount of money they intend to spend, and demographic data such as sex, age, income and profession. The multinomial logit model in its general theoretical form is:
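$$
\begin{aligned}
L_i = {} & \alpha + b_1 C_1 + b_2 C_2 + b_3 C_3 + b_4 C_4 + b_5 C_5 + b_6 \mathrm{Crit}_1 + b_7 \mathrm{Crit}_2 \\
& + b_8 \mathrm{Crit}_3 + b_9 \mathrm{Crit}_4 + b_{10} \mathrm{Mon}_2 + b_{11} \mathrm{Mon}_3 + b_{12} \mathrm{Price} + b_{13} \mathrm{Variety} \\
& + b_{14} \mathrm{loy} + b_{15} \mathrm{Inf}_1 + b_{16} \mathrm{Inf}_2 + b_{17} \mathrm{Inf}_3 + b_{18} \mathrm{Inf}_4 + b_{19} \mathrm{Inf}_6 + b_{20} \mathrm{Sex} \\
& + b_{21} \mathrm{age} + b_{22} \mathrm{id} + b_{23} \mathrm{Inc}_1 + b_{24} \mathrm{Inc}_2 + b_{25} \mathrm{Inc}_3 + b_{26} \mathrm{Inc}_4 + b_{27} \mathrm{Pf}_1 \\
& + b_{28} \mathrm{Pf}_2 + b_{29} \mathrm{Pf}_3 + b_{30} \mathrm{Pf}_4 + b_{31} \mathrm{Pf}_5 + b_{32} \mathrm{Pf}_6 + b_{33} \mathrm{Pf}_8
\end{aligned}
$$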
where α is a constant. C1 is a dummy variable that refers to question 1 (table 3.1), with C1 = 1 for living rooms and C1 = 0 otherwise. Similarly, C2, C3, C4 and C5 equal 1 for dining rooms, children's furniture, garden furniture and bedrooms respectively and zero otherwise; office furniture is the remaining category. Crit is a dummy variable that refers to question 2 (table 3.2), with Crit1 = 1 for price and Crit1 = 0 otherwise, and Crit2, Crit3 and Crit4 equal to 1 for quality, variety and trade name respectively and zero otherwise. Mon2 is a dummy variable that refers to question 3 (table 3.3), with Mon2 = 1 for 250-600 € and Mon2 = 0 otherwise, and Mon3 = 1 for ≥ 600 € and Mon3 = 0 otherwise. The variables “Price” and “Variety” are quantitative and refer to question 4 (table 3.4). Loy is a dummy variable that refers to question 5 (table 3.5), with loy = 1 for Serres and loy = 0 otherwise. Inf1 is a dummy variable that refers to question 6 (table 3.6), with Inf1 = 1 for TV and Inf1 = 0 otherwise, and likewise for the other Inf variables. Sex is a dummy variable with Sex = 1 for male and Sex = 0 for female. Age is a quantitative variable and is presented in table 3.10.

The variable “id” is a dummy variable equal to 1 when the consumer lives in the Municipality of Serres and equal to 0 when the consumer lives in the regional municipalities of the Serres Prefecture. We include this variable to examine whether consumers have homogeneous preferences across locations or whether there is heterogeneity among them. We could make the analysis more complicated by clustering the main geographic regions into groups, but we assume that preferences are homogeneous, because in question 2 the “location” criterion accounts for only 0.9% of the answers and therefore does not play a crucial role in consumer choices.

The Inc variables are income dummies, with Inc1 = 1 for income < 500 € and Inc1 = 0 otherwise. The same procedure is followed for the other income variables, Inc2, Inc3 and Inc4, which are presented in table 3.8. Finally, the Pf variables (table 3.8) are profession dummies, with Pf1 = 1 for those employed in the rural sector and Pf1 = 0 otherwise. The same procedure is followed for the other employment variables, Pf2, Pf3, Pf4, Pf5, Pf6 and Pf8. The dependent polytomous variable is Li, which refers to question 7 (table 3.7): Li = 1 for those who answered YES, Li = 2 for those who answered NO and Li = 3 for those who answered MAY BE.

For a dependent variable with S categories, S-1 equations are calculated, one for each category relative to the reference category. In multinomial logistic regression, one category of the dependent variable is chosen as the comparison category; here it is Li = 3. The probability is defined as
$$
\Pr(y_i = j) = \frac{\exp(X_i \beta_j)}{1 + \sum_{k=1}^{J} \exp(X_i \beta_k)} \tag{1}
$$
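and, writing yij = 1 if individual i chooses outcome j and yij = 0 otherwise, the log likelihood function for individual i can be written as
$$
\log L_i = \sum_{j=1}^{J} y_{ij}\, X_i \beta_j - \log\Big(1 + \sum_{j=1}^{J} \exp(X_i \beta_j)\Big) \tag{2}
$$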
where, for the ith individual, yi is the observed outcome (the dependent variable) and Xi is a vector of explanatory variables, categorical or not, while j denotes a particular outcome and J refers to all outcomes except the base category. The unknown parameters βj are estimated by maximum likelihood (Bartels, Boztug & Muller, 1999). The explanatory variables in relation (1) do not carry the subscript t, because they are the same for each choice j. With this model we intend to explain how an unordered set of outcomes applies to the different individuals in our sample, where the probabilities of all outcomes depend on the same characteristics (Davidson & MacKinnon, 1999). A simple estimation example is shown in the results section. The multinomial Logit relies on the assumption of independence from irrelevant alternatives (IIA), which requires that the disturbances are independent and homoscedastic (Greene, 2002). Because the dependent variable has three outcomes, we take outcome 3 (MAY BE) as the base reference category and estimate equations for the other two outcomes. The probability for outcome 1 (YES) is
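$$
\Pr(y_i = 1) = \frac{\exp(X_i \beta_1)}{1 + \sum_{k=1}^{J} \exp(X_i \beta_k)} \tag{3}
$$
for outcome 2 (NO)
$$
\Pr(y_i = 2) = \frac{\exp(X_i \beta_2)}{1 + \sum_{k=1}^{J} \exp(X_i \beta_k)} \tag{4}
$$
and finally the probability for outcome 3 (MAY BE) is
$$
\Pr(y_i = 3) = \frac{1}{1 + \sum_{k=1}^{J} \exp(X_i \beta_k)} \tag{5}
$$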
(Davidson & MacKinnon, 1999).

A final point is that from question 4 we use only the variables price and variety. The reason is that consumers seem to respond to several characteristics in the same way, which means that, for instance, price and quality can be treated as a single, grouped variable. In this way we reduce the number of variables and avoid the multicollinearity problem. To group the variables of question 4 we use cluster analysis with an agglomerative hierarchical method, which begins with all variables separate, six in our case, each forming its own cluster. In the first step, the two variables closest together are joined. In the next step, either a third variable joins the first two, or two other variables join together into a different cluster. This process continues until all clusters are joined into one; we decided to keep two groups, as this is more sensible for our data. First we need a similarity measure between the variables, for which we use the common correlation coefficient
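$$
r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - \left(\sum x\right)^2\right]\left[n \sum y^2 - \left(\sum y\right)^2\right]}} \tag{6}
$$
The objective of Ward's clustering method is to minimize the sum of squared deviations from the mean value (Žiberna et al, 2004)
$$
ESS = \sum_{i} \sum_{j} \sum_{k} \left(X_{ijk} - \bar{x}_{ik}\right)^2 \tag{7}
$$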
The results of Ward's clustering method are presented in figure 2.2. The first group consists of price, quality, service and service after shopping, and the second group consists of variety and delivery. The next step is to take the average of each group in order to obtain the new variables.
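As an illustration of the similarity measure in relation (6), the following minimal Python sketch computes the correlation coefficient from raw sums and turns it into a distance of the form 1 - |r|. The function names and the three short rating vectors are hypothetical and are not the survey data; the exact scaling used for the similarity axis of the dendrogram is not stated in the text, so the distance below is only an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Correlation coefficient computed with the raw-sums formula of relation (6)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    )

def abs_corr_distance(x, y):
    """Assumed distance between two variables: 0 when |r| = 1, 1 when r = 0."""
    return 1.0 - abs(pearson_r(x, y))

# Hypothetical 1-5 ratings from six respondents (NOT the survey data).
price = [5, 4, 3, 5, 2, 4]
quality = [4, 4, 3, 5, 1, 4]
delivery = [1, 5, 2, 1, 5, 3]

d_price_quality = abs_corr_distance(price, quality)
d_price_delivery = abs_corr_distance(price, delivery)
```

With these made-up ratings, price is closer to quality than to delivery, so an agglomerative procedure would join price and quality first.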
Figure 2 Ward's clustering method. Dendrogram with Ward linkage and absolute correlation coefficient distance (vertical axis: similarity, from -3.70 to 100.00; variables: price, quality, service, service after shopping, variety, delivery).
The second method for grouping the variables is principal components. First we find the covariance matrix of the six variables above. Then we find the eigenvalues of the covariance matrix, reported in table 2.1. There are two components with eigenvalues greater than one. Table 2.2 presents the eigenvector of the first principal component, and we conclude again that price, service, quality and service after shopping can be treated as one variable, and variety and delivery as another.
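The retention rule described above can be checked with a short Python sketch that takes the six eigenvalues reported in the eigenvalue table, computes the proportion and cumulative proportion of variance, and keeps the components with eigenvalue greater than one. The variable names are illustrative only.

```python
# Eigenvalues as reported in the eigenvalue table of the paper.
eigenvalues = [2.2600, 1.1140, 0.9610, 0.6817, 0.5238, 0.4595]

total = sum(eigenvalues)

# Share of total variance explained by each component.
proportions = [ev / total for ev in eigenvalues]

# Running sum of the shares.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Components retained under the "eigenvalue greater than one" rule.
retained = [ev for ev in eigenvalues if ev > 1.0]
```

Under this rule two components are retained, and together they account for a little over half of the total variance.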
The first estimation method is the frequency-weighted multinomial logistic regression, with weights based on age. The survey was conducted at the household level, but age is an important weighting variable: the age difference within couples is usually small, so age carries significant information about the household. The category of 30-50 years old is the most frequent one, especially in the city, so it receives a greater weight than categories such as 18-24 years old or 65 years and over. Couples in the 30-50 category are also more likely to buy furniture, for reasons such as marriage, replacement because of deterioration or renovation, or purchases for children who will live in another house or in another city for studies, work or marriage. The probability is:
$$
\Pr(y_i = j) = \frac{\exp(X_i \beta_j W_i)}{1 + \sum_{k=1}^{J} \exp(W_i X_i \beta_k)} \tag{8.a}
$$
while (8.a) can be written as
$$
\Pr(y_i = j) = \frac{\exp\left(\dfrac{n-j}{m-j}\, X_i \beta_j\right)}{1 + \sum_{k=1}^{J} \exp\left(\dfrac{n-k}{m-k}\, X_i \beta_k\right)} \tag{8.b}
$$
where n is the number of observations, j is the specific outcome, J denotes all the outcomes except the base category, and m is the number of cases (Langholz & Goldstein, 2001). For example, if three persons aged 30 (so that the number of cases m equals three) choose outcome 2 (NO), the question is what the probability of this outcome is, given their answers to the questions and their demographic data.
The second model is the weighted robust multinomial Logit, which uses the same weights as the weighted multinomial Logit model. The problem with the previous model is that maximum likelihood estimation (MLE) and Rao's score test can be misleading under model misspecification caused by misclassification errors or by extreme data points (outliers) in the sample (Pia & Feser, 2000). Pregibon (1982) suggests some tools that remove such data from the sample, but because this procedure is iterative it can leave the analyst with a considerably reduced sample. “Robust” here refers to the well-known Huber-White sandwich variance estimator. Probabilities are defined as in (8.a). The Huber-White variance estimator is
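$$
V_E = \frac{1}{n}\,\big[\hat{H}(\hat{\beta}_E)\big]^{-1}\,\hat{\Phi}(\hat{\beta}_E)\,\big[\hat{H}(\hat{\beta}_E)\big]^{-1} \tag{9}
$$
where
$$
\hat{H}(\hat{\beta}_E) = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial^2 \log g(y_i \mid x_i, \hat{\beta}_E)}{\partial \hat{\beta}_E \, \partial \hat{\beta}_E'} \tag{9.a}
$$
is the Hessian matrix and
$$
\hat{\Phi}(\hat{\beta}_E) = \frac{1}{n} \sum_{i=1}^{n} \left[\frac{\partial \log g(y_i \mid x_i, \hat{\beta}_E)}{\partial \hat{\beta}_E}\right] \left[\frac{\partial \log g(y_i \mid x_i, \hat{\beta}_E)}{\partial \hat{\beta}_E}\right]' \tag{9.b}
$$
while if $\hat{\beta}_E$ is the true MLE estimator then $V_E$ simplifies to $\{-[H(\hat{\beta}_E)]\}^{-1}$ (Greene, 2002).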
Note that, in the case we study, these standard errors are robust to certain misspecifications of the distribution of the dependent variable and not to heteroscedasticity. We make this claim because the assumption that the disturbances are independent and homoscedastic is confirmed by Hausman's test, which we analyze in the next part of the project.
The third method is the replication method with jackknife standard errors. The jackknife is a nonparametric technique for estimating the standard error of a statistic. The procedure systematically recomputes the statistic, leaving out one observation at a time from the sample. Each subsample thus consists of n − 1 observations, formed by deleting a different observation from the sample. The jackknife estimator and its standard error are then calculated from these truncated subsamples (Greene, 2002). For example, suppose θ is the parameter of interest and let $\hat{\theta}_{(1)}, \hat{\theta}_{(2)}, \ldots, \hat{\theta}_{(n)}$ be estimates of θ based on the n subsamples, each of size n − 1. The jackknife estimator of θ is given by (Wolter, 2007)
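$$
\hat{\theta}_J = \frac{\sum_{i=1}^{n} \hat{\theta}_{(i)}}{n} \tag{10}
$$
and the jackknife estimate of the standard error of $\hat{\theta}_J$ is
$$
\hat{\sigma}_{\hat{\theta}_J} = \left[\frac{n-1}{n} \sum_{i=1}^{n} \big(\hat{\theta}_{(i)} - \hat{\theta}_J\big)^2\right]^{1/2} \tag{11}
$$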
The t-statistic can be defined as
$$
\hat{t} = \frac{\hat{\theta}_J - \hat{\theta}_{(i)}}{\left[\dfrac{1}{n(n-1)} \sum_{i=1}^{n} \big(\hat{\theta}_{(i)} - \hat{\theta}_J\big)^2\right]^{1/2}} \tag{12}
$$
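The leave-one-out procedure behind relations (10) and (11) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `jackknife`, the choice of the sample mean as the statistic, and the short list of ages are all hypothetical and do not reproduce the estimation of the multinomial Logit coefficients.

```python
from math import sqrt

def jackknife(sample, statistic):
    """Return the jackknife estimate (10) and its standard error (11)."""
    n = len(sample)
    # Recompute the statistic n times, each time deleting one observation.
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    theta_j = sum(leave_one_out) / n
    std_error = sqrt((n - 1) / n * sum((t - theta_j) ** 2 for t in leave_one_out))
    return theta_j, std_error

def mean(values):
    return sum(values) / len(values)

# Hypothetical ages of eight respondents (NOT the survey data).
ages = [23, 31, 35, 42, 47, 52, 58, 66]
theta_j, std_error = jackknife(ages, mean)
```

For the sample mean, the jackknife estimate coincides with the ordinary mean and the jackknife standard error coincides with the usual s/√n, which makes this case a convenient check of the formulas.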
Results
To our knowledge, no equivalent study is available in the literature against which we could compare our results. Marketing research firms deal with these questions, but their results are not publicly available. From the results presented in tables 1-3 in the appendix, we reject the simple weighted Logit model because of the large number of statistically insignificant variables, even though table 4 and the Hausman test indicate that the independence from irrelevant alternatives (IIA) hypothesis holds. We also reject the weighted Logit model with robust Huber-White standard errors because of the presence of heteroscedasticity and hence the violation of the IIA assumption. We therefore accept as the best estimation the weighted multinomial Logit with jackknife standard errors, which also satisfies the IIA assumption. To predict the probability that a consumer will buy, will not buy, or is not sure about buying in the next four-month period, we use the following probabilities.
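$$
\Pr(y_i = 1) = \frac{\exp(L_1)}{1 + \sum_{T=1}^{2} \exp(L_T)}, \qquad
\Pr(y_i = 2) = \frac{\exp(L_2)}{1 + \sum_{T=1}^{2} \exp(L_T)}
$$
and
$$
\Pr(y_i = 3) = \frac{1}{1 + \sum_{T=1}^{2} \exp(L_T)}
$$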
For example, suppose a consumer chose living rooms in question 1; her main criterion for buying from a furniture shop is price; she is female; she intends to spend 250-600 €; she rates all the characteristics of her previous shopping (price, quality and the others) with 5; she is 30 years old; she prefers Serres as the shopping region; she prefers to be informed by leaflets; her income is 1001-1500 €; her profession is businessman; and she lives in the Municipality of Serres. Then, by Table 3 in the appendix, the multinomial Logit with jackknife standard errors gives
L1 = -24.646 + 1.977 - 2.446 + 5*0.837 + 5*0.415 - 30*0.042 + 0.783 - 3.635 - 2.05 + 26.610 - 1.998 + 8.491 = 8.093
for outcome 1 and
L2 = -22.896 - 18.582 - 1.496 + 5*0.541 + 5*0.344 + 30*0.048 + 0.269 - 4.091 + 1.604 + 32.296 - 1.266 + 16.870 = 8.559
for outcome 2
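$$
\Pr(y_i = 1) = \frac{\exp(L_1)}{1 + \sum_{T=1}^{2} \exp(L_T)} = \frac{\exp(8.093)}{1 + \exp(8.093) + \exp(8.559)} = \frac{3271.48}{8485.94} = 38.55\%
$$
$$
\Pr(y_i = 2) = \frac{\exp(L_2)}{1 + \sum_{T=1}^{2} \exp(L_T)} = \frac{\exp(8.559)}{1 + \exp(8.093) + \exp(8.559)} = \frac{5213.46}{8485.94} = 61.44\%
$$
and
$$
\Pr(y_i = 3) = \frac{1}{1 + \sum_{T=1}^{2} \exp(L_T)} = \frac{1}{8485.94} = 0.01\%
$$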
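The conversion from the two linear predictors to the three outcome probabilities can be reproduced with a minimal Python sketch. The function name `mnl_probabilities` is hypothetical; the inputs 8.093 and 8.559 are the values of L1 and L2 from the worked example, and the base outcome (MAY BE) is the one whose linear predictor is normalized to zero.

```python
from math import exp

def mnl_probabilities(linear_predictors):
    """Probabilities for the non-base outcomes, followed by the base outcome."""
    exps = [exp(value) for value in linear_predictors]
    denominator = 1.0 + sum(exps)
    probabilities = [e / denominator for e in exps]
    probabilities.append(1.0 / denominator)  # base outcome
    return probabilities

# L1 and L2 from the worked example; outcome 3 (MAY BE) is the base.
p_yes, p_no, p_maybe = mnl_probabilities([8.093, 8.559])
```

The three values sum to one by construction, with roughly 0.39 for YES, 0.61 for NO and about 0.0001 for MAY BE.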
Performance test of the proposed model
The next step is to apply a Monte-Carlo simulation to evaluate the performance of the model we have presented. The expected coefficient value can be defined as (Janke, 2002)
$$
\bar{X} = \frac{1}{N} \sum_{i=1}^{N} f(X_i) \tag{13}
$$
where $\langle X \rangle$ is the expectation value and the estimator $\bar{X}$ is a random number fluctuating around the theoretical expected value. The variance is
$$
\sigma^2_{\bar{X}} = \langle \bar{X}^2 \rangle - \langle \bar{X} \rangle^2 \tag{14}
$$
from which we obtain the standard error $\sigma / \sqrt{N}$. The formula of the standard error matters, because it shows that the standard error of a Monte-Carlo simulation decreases with the square root of the sample size. For example, to reduce the error by 50% we must quadruple the number of random draws. As we already know, from relation (1)
$$
\pi_j \equiv \Pr(y_i = j) = \frac{\exp(X_i \beta_j)}{1 + \sum_{k=1}^{J} \exp(X_i \beta_k)} \tag{15}
$$
We can therefore draw a predicted value y from a multinomial distribution with parameters equal to πj and n = 1. We simulated the model with 500 sets of parameters and then used relations (13) and (14) to find the mean estimated parameters and their standard errors. We simulate our estimates because our sample is finite, so the parameter estimates are never certain (Tomz et al, 2000) and may not be reliable and efficient. More specifically, the program draws simulations of the parameters from their asymptotic sampling distribution, with mean equal to the vector of the estimated parameters and variance equal to the variance-covariance matrix of the estimates (Tomz et al, 2000). From the results of table 7 we conclude that our model performs fairly well, because the coefficients estimated by Monte-Carlo simulation are very close to the estimated coefficients of the weighted multinomial Logit model with jackknife standard errors.
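A minimal Python sketch of the simulation idea, for a single coefficient, is given below. It is a simplification under stated assumptions: the coefficient is drawn independently from a normal distribution (the full procedure would draw the whole parameter vector jointly using the variance-covariance matrix), the function names are hypothetical, and the inputs 0.837 and 0.0640 are the Price coefficient and its standard error for outcome 1 in the jackknife table of the appendix.

```python
import random
from math import sqrt

def simulate_coefficient(estimate, std_error, n_draws, seed):
    """Draw a coefficient from an (assumed) normal sampling distribution."""
    rng = random.Random(seed)
    return [rng.gauss(estimate, std_error) for _ in range(n_draws)]

def mc_mean_and_se(draws):
    """Mean as in relation (13) and standard error sigma / sqrt(N) from (14)."""
    n = len(draws)
    mean = sum(draws) / n
    variance = max(sum(d * d for d in draws) / n - mean ** 2, 0.0)
    return mean, sqrt(variance / n)

draws_500 = simulate_coefficient(0.837, 0.0640, 500, seed=1)
draws_2000 = simulate_coefficient(0.837, 0.0640, 2000, seed=1)

mean_500, se_500 = mc_mean_and_se(draws_500)
mean_2000, se_2000 = mc_mean_and_se(draws_2000)
```

Comparing `se_500` with `se_2000` illustrates the square-root rule: four times as many draws roughly halve the Monte-Carlo standard error.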
Conclusions
We applied three different multinomial Logit models to the marketing research survey conducted in the Prefecture of Serres for the furniture market. The aim of the research was to estimate the probability of buying furniture in the next four-month period, based on the questionnaire and the demographic characteristics of potential consumers. We found that the simple weighted multinomial Logit suffers from many statistically insignificant variables, which points to a possible multicollinearity problem. The weighted multinomial Logit with Huber-White robust standard errors, on the other hand, exhibits heteroscedasticity and violates the IIA hypothesis. We therefore prefer the weighted multinomial Logit with jackknife standard errors. We applied a simple Monte-Carlo simulation and concluded that the proposed model is a good option in our case. Other estimation approaches, such as Principal Components (PC) logit or the bootstrap, are also available, but their estimates are quite similar to those of the model proposed here, so we do not present those results. These methods are still worth mentioning, because in other cases they might give considerably better estimates.
References
COMCENTER (2007), “The highly-fragmented furniture market in Greece”, I.C.A.P.
Bartels K., Boztug Y. & Muller M. (1999), “Testing the multinomial logit model”, working paper, University Potsdam, Humboldt-University at Berlin, Germany.
Davidson R. & MacKinnon G.J. (1999), “Econometric theory and methods”, Oxford University Press, New York, pp. 460-462.
Greene H.W. (2003), “Econometric Analysis”, Fifth edition, Prentice Hall, New Jersey, U.S.A., pp. 518-521, 724, 924.
Gujarati D. (2004), “Basic Econometrics”, Fourth edition, McGraw-Hill, U.S.A., pp. 614-615.
Interwood magazine (2007), “Dromeas presentation”, pp. 12-21.
Janke W. (2002), “Statistical Analysis of Simulations: Data Correlations and Error Estimation”, John von Neumann Institute for Computing, Julich, NIC Series, Vol. 10, pp. 423-445.
Jonkers J. (2006), “The domestic furniture market in Greece”, CBI MARKET SURVEY, Centre for the promotion of imports from developing countries, The Netherlands.
Langholz B. & Goldstein L. (2001), “Conditional logistic analysis of case-control studies with complex sampling”, Biostatistics, 2(1), 63-84.
Pia M. & Feser V. (2000), “Robust Logistic Regression for Binomial Responses”, working paper, University of Geneva.
Pregibon D. (1982), “Resistant fits for some commonly used logistic models with medical applications”, Biometrics, 38, 485-498.
Tomz M., Wittenberg J. & King G. (2000), “Making the Most of Statistical Analyses: Improving Interpretation and Presentation”, American Journal of Political Science, Vol. 44, No. pp. 341–355.
Wolter M.K. (2007), “Introduction to Variance Estimation”, Statistics for Social and Behavioural Sciences, Second edition, Springer, 151-153.
Žiberna A., Kejžar N. & Golob P. (2004), “A Comparison of Different Approaches to Hierarchical Clustering of Ordinal Data”, Metodološki zvezki, 1(1), 57-73.
TABLE 1 EIGENVALUES
| Component | 1 | 2 | 3 | 4 | 5 | 6 |
| Eigenvalue | 2.2600 | 1.1140 | 0.9610 | 0.6817 | 0.5238 | 0.4595 |
| Proportion | 0.377 | 0.186 | 0.160 | 0.114 | 0.087 | 0.077 |
| Cumulative | 0.377 | 0.562 | 0.722 | 0.836 | 0.923 | 1.000 |

TABLE 2. 1st PC factor
| Variable | PC1 |
| Price | 0.441 |
| Service | 0.489 |
| Quality | 0.527 |
| Variety | 0.205 |
| Delivery | 0.177 |
| Service after shopping | 0.463 |
TABLE 3 1. From which furniture category do you generally intend to buy?
| Category | Percent |
| Living rooms | 54.2 |
| Dining rooms | 11.2 |
| Children furniture | 9.3 |
| Garden furniture | 2.8 |
| Bedrooms | 19.2 |
| Office furniture | 3.3 |

TABLE 4 2. Which are the main criteria of buying from a furniture shop?
| Criterion | Percent |
| Price | 56.2 |
| Quality | 31.2 |
| Variety | 8.7 |
| Trade name | 3.0 |
| Location | 0.9 |

TABLE 5 3. How much money do you intend to give?
| Amount | Percent |
| ≤ 250 € | 19.2 |
| 250-600 € | 26.8 |
| ≥ 600 € | 54.0 |
TABLE 6 4. Mark between 1 and 5 (5 is the best and 1 is the worst) the following characteristics you faced in your previous furniture shopping. Entries under ratings 1 to 5 are percentages.
| Characteristic | 1 | 2 | 3 | 4 | 5 | Mean | St. deviation |
| Service | 0.7 | 3.2 | 19.4 | 41.2 | 35.5 | 4.07 | 0.86 |
| Price | 4.8 | 5.5 | 19.0 | 31.5 | 39.2 | 3.95 | 1.11 |
| Quality | 7.8 | 6.5 | 22.6 | 30.4 | 32.7 | 3.74 | 1.20 |
| Variety | 2.4 | 6.7 | 14.1 | 24.7 | 52.1 | 4.17 | 1.06 |
| Delivery | 35.3 | 11.7 | 6.7 | 9.5 | 36.8 | 3.0 | 1.76 |
| Service after shopping | 4.7 | 5.9 | 23.9 | 22.7 | 42.8 | 3.93 | 1.15 |
TABLE 7 5. For the specific shopping do you prefer the Serres shops or other regions?
| Region | Percent |
| Serres | 74.5 |
| Thessaloniki | 17.4 |
| Drama | 4.2 |
| Bulgaria | 0.3 |
| Other region | 3.6 |

TABLE 8 6. How would you like to be informed about the furniture products?
| Source | Percent |
| TV | 24.8 |
| Radio | 1.4 |
| Newspapers-magazines | 11.0 |
| Leaflets | 53.2 |
| Phone contact | 1.7 |
| Internet | 7.9 |

TABLE 9 7. Will you buy furniture in the following four-month period?
| Answer | Percent |
| YES | 19.5 |
| NO | 66.0 |
| MAY BE | 14.5 |
TABLE 10 Income distribution and profession activity
| Income | Percent |
| <500 € | 11.2 |
| 501-1000 € | 33.2 |
| 1001-1500 € | 29.7 |
| 1501-2000 € | 12.5 |
| >2000 € | 13.4 |

| Profession | Percent |
| Rural Sector | 6.1 |
| Public Sector Employee | 16.9 |
| Private Sector Employee | 16.3 |
| Businessman | 11.8 |
| Student | 3.9 |
| Household | 20.8 |
| Unemployed | 6.1 |
| Pensioner | 18.1 |

TABLE 11 Sex
| Sex | Percent |
| MALE | 46.5 |
| FEMALE | 53.5 |

TABLE 12 Age
| Statistic | Value |
| Mean | 47.0 |
| St. Deviation | 14.4 |
| Std. Error of Mean | 0.78 |
TABLE 1 Weighted multinomial Logit model
| Variable | Market = 1: Coef. (st. error) | z | Market = 2: Coef. (st. error) | z |
| C1 | -24.19117 (2258.291) | -0.01 | -22.16686 (2258.291) | -0.01 |
| C2 | -24.3191 (2258.291) | -0.01 | -19.07691 (2258.291) | -0.01 |
| C3 | -25.38924 (2258.291) | -0.01 | -21.85798 (2258.291) | -0.01 |
| C4 | -22.60833 (1907391) | -0.00 | 10.60145 (1369884) | 0.00 |
| C5 | -24.28615 (2258.291) | -0.01 | -21.49782 (2258.291) | -0.01 |
| CRIT1 | 2.309676* (.184604) | 12.51 | -17.626 (2258.291) | -0.01 |
| CRIT2 | -1.402899 (.) | . | -19.67238 (2258.291) | -0.01 |
| CRIT3 | -5.21230* (.3421758) | -15.23 | -24.18795 (2258.291) | -0.01 |
| CRIT4 | 22.21736 (2258.291) | 0.01 | 2.673512 (.) | . |
| MON2 | -1.20423* (.3642494) | -3.31 | -1.06954* (.3539236) | -3.02 |
| MON3 | -1.69661* (.348464) | -4.87 | -.6975685* (.347466) | -2.01 |
| Price | 1.564757* (.1200148) | 13.04 | 1.16457* (.1035494) | 11.25 |
| Variety | .2101363* (.0860052) | 2.44 | .3212617* (.0649514) | 4.95 |
| LOY | .8190109* (.1819961) | 4.50 | .2840219 (.1461295) | 1.94 |
| INF1 | -.0253447 (.3857449) | -0.07 | -1.556478* (.3376938) | -4.61 |
| INF2 | 20.11875 (.) | . | 20.32117* (.3704731) | 54.85 |
| INF3 | -3.28183* (.3923674) | -8.36 | -3.631752* (.3446816) | -10.54 |
| INF4 | -2.743014* (.3507851) | -7.82 | -3.368385* (.3205891) | -10.51 |
| INF5 | 22.83438 | . | 16.97649* (.3756762) | 45.19 |
| Sex | -2.679168* (.1854444) | -14.45 | -1.333601* (.1519156) | -8.78 |
| Age | -.0599014* (.0087728) | -6.83 | .0313198* (.0069501) | 4.51 |
| id | -1.906601* (.2293723) | -8.31 | -1.112417* (.1915824) | -5.81 |
| INC1 | -5.837068* (.4140417) | -14.10 | -.8446702* (.3298488) | -2.56 |
| INC2 | -4.76893* (.2847018) | -16.75 | -.1101128 (.2469728) | -0.45 |
| INC3 | -2.029722* (.2858869) | -7.10 | 1.624504* (.2553209) | 6.36 |
| INC4 | -7.152252* (.3930177) | -18.20 | -1.967171* (.2919626) | -6.74 |
| PF1 | 28.05087 (2258.291) | 0.01 | 29.74646* (.2330322) | 127.65 |
| PF2 | 24.8759 (2258.291) | 0.01 | 29.64446* (.2612) | 113.49 |
| PF3 | 28.19031 (2258.291) | 0.01 | 29.98161* (.2875892) | 104.25 |
| PF4 | 28.53592 (2258.291) | 0.01 | 33.55159* (.5056188) | 66.36 |
| PF5 | 21.98089 (5227809) | 0.00 | 63.10384 (4506271) | 0.00 |
| PF6 | 26.89052 (2258.291) | 0.01 | 29.36098* (.2271631) | 129.25 |
| PF8 | 30.49286 (2258.291) | 0.01 | 31.04017 (.) | . |
| constant | 3.03079 | . | 11.004 (.) | . |
Log likelihood = -3154.662; Pseudo R2 = 0.4113
Note: Market = 3 is the base outcome; standard errors in parentheses; * denotes significance at the 5% level; z denotes z-statistics.
TABLE 2
Weighted multinomial Logit model with Huber-White robust standard errors Market = 1 Coef. z Market = 1 Coef. z Market = 2 Coef. z Market = 2 Coef. z C1 -24.1911*
(1.311869) -18.44 INF4 -2.743014*
(.2599947) -10.55 C1 -22.16686
(.) . INF4 -3.368385*
(.1704796) -19.76
C2 -24.3191* (1.079975)
-22.52 INF5 22.83438 (.)
. C2 -19.07691 (.)
. INF5 16.97649* (.3390611)
50.07
C3 -25.3892* (1.43201)
-17.73 Sex -2.679168* (.1868817)
-14.34 C3 -21.85798* (1.377216)
-15.87 Sex -1.333601* (.1386592)
-9.62
C4 -22.6083* (1.588602)
-14.23 Age -.0599014* (.009422)
-6.36 C4 10.60145 (.)
. Age .0313198* (.007253)
4.32
C5 -24.28615 (.)
. id -1.906601* (.2064382)
-9.24 C5 -21.49782 (.)
. id -1.112417* (.1512474)
-7.35
CRIT1 2.309676* (.2087961)
11.06 INC1 -5.837068* (.3919177)
-14.89 CRIT1 -17.626* (1.776402)
-9.92 INC1 -.8446702* (.336651)
-2.51
CRIT2 -1.402899 (.)
. INC2 -4.76893* (.2479698)
-19.23 CRIT2 -19.67238 (.)
. INC2 -.1101128 (.2396833)
-0.46
CRIT3 -5.21230* (.3268497)
-15.95 INC3 -2.029722* (.2660761)
-7.63 CRIT3 -24.18795 (.)
. INC3 1.624504* (.2580996)
6.29
CRIT4 22.21736 (.)
. INC4 -7.152252* (.3635541)
-19.67 CRIT4 2.673512 (.)
. INC4 -1.967171* (.3080401)
-6.39
MON2 -1.20423* (.2631651)
-4.58 PF1 28.05087* (2.772297)
10.12 MON2 -1.06954* (.2122592)
-5.04 PF1 29.74646* (.297904)
99.85
MON3 -1.69661* (.247516)
-6.85 PF2 24.8759 (.)
. MON3 -.6975685* (.2015474)
-3.46 PF2 29.64446* (.2209823)
134.15
Price 1.564757* (.1194442)
13.10 PF3 28.19031* (.7815903)
36.07 Price 1.16457* (.1153233)
10.10 PF3 29.98161* (.2531853)
118.42
Variety .0859671* (.0859671)
2.44 PF4 28.53592* (3.422268)
8.34 Variety .3212617* (.0571129)
5.63 PF4 33.55159* (.3761956)
89.19
LOY .8190109* (.1962168)
4.17 PF5 21.98089 (.)
. LOY .2840219 (.1456701)
1.95 PF5 63.10384* (.4796068)
131.57
INF1 -.0253447 (.3071136)
-0.08 PF6 26.89052* (2.056111)
13.08 INF1 -1.556478* (.1797354)
-8.66 PF6 29.36098* (.2161207)
135.85
INF2 20.11875 (.)
. PF8 30.49286 (.)
. INF2 20.32117* (.3147879)
64.56 PF8 31.04017 (.)
.
INF3 -3.28183* (.3378955)
-9.71 constant 3.03079 . INF3 -3.631752* (.2628416)
-13.82 constant 11.004 .
Log likelihood -3154.662 Pseudo R2= 0.4113
Note: .(market=3 is the base outcome) , st. errors in parentheses, * denotes significant in 5% level, z denotes z-statistics
TABLE 3 Weighted multinomial Logit model with jackknife robust standard errors
Market = 1  Coef.               t       Market = 1  Coef.               t       Market = 2  Coef.               t       Market = 2  Coef.               t
C1          -24.646 (.4233)     -58.22  INF4        -3.635 (.2786)      -13.05  C1          -22.986 (.3469)     -66.25  INF4        -4.091 (.1582)      -25.85
C2          -23.977 (.6381)     -37.57  INF5        21.162 (.4137)      51.15   C2          -18.925 (.4808)     -39.35  INF5        15.380 (.2446)      62.87
C3          -25.565 (.4561)     -56.05  Sex         -2.870 (.1950)      -14.71  C3          -22.645 (.3828)     -59.16  Sex         -1.583 (.1492)      -10.61
C4          -24.050* (298.313)  -0.08   Age         -.042 (.0090)       -4.68   C4          16.071* (200.5572)  0.08    Age         .048 (.0067)        7.14
C5          -25.671 (.4825)     -53.19  id          -1.998 (.1749)      -11.42  C5          -23.192 (.3508)     -56.10  id          -1.266 (.1358)      -9.32
CRIT1       1.977 (.3763)       5.25    INC1        -8.056 (.4807)      -16.76  CRIT1       -18.582 (.3514)     -52.87  INC1        -3.207 (.4115)      -7.80
CRIT2       -1.745 (.3833)      -4.55   INC2        -4.573 (.2258)      -20.25  CRIT2       -20.684 (.2961)     -69.84  INC2        -.0002* (.218)      0.00
CRIT3       -5.741 (.3854)      -14.90  INC3        -2.050 (.2235)      -9.17   CRIT3       -25.852 (.3686)     -70.12  INC3        1.604 (.2173)       7.38
CRIT4       21.855 (.4987)      43.82   INC4        -6.932 (.3836)      -18.07  CRIT4       1.453 (.4287)       3.39    INC4        -1.630 (.3110)      -5.24
MON2        -2.110 (.3318)      -6.36   PF1         25.621 (.3814)      67.16   MON2        -2.121 (.2937)      -7.22   PF1         28.177 (.3529)      79.84
MON3        -2.446 (.2924)      -8.37   PF2         22.681 (.3734)      60.74   MON3        -1.496 (.2394)      -6.25   PF2         28.477 (.2871)      99.16
Price       0.837 (.0640)       13.07   PF3         26.076 (.3689)      70.67   Price       0.541 (.0510)       10.59   PF3         28.921 (.3169)      91.24
Variety     .415 (.0867)        4.79    PF4         26.610 (.5446)      48.86   Variety     .344 (.0393)        8.75    PF4         32.296 (.4236)      76.24
LOY         .783 (.2189)        3.58    PF5         16.517* (307.3844)  0.05    LOY         .269* (.1756)       1.53    PF5         66.619* (120.3684)  0.55
INF1        0.107* (.3587)      0.30    PF6         25.329 (.3955)      64.04   INF1        -0.916 (.2278)      -4.02   PF6         28.703 (.3357)      85.50
INF2        17.416 (.4322)      40.29   PF8         28.457 (.4560)      62.40   INF2        17.625 (.3227)      54.61   PF8         29.865 (.4010)      74.46
INF3        -3.645 (.3517)      -10.36  constant    8.491 (.8963)       9.47    INF3        -3.846 (.2502)      -15.37  constant    16.870 (.6426)      26.25

Log likelihood -3154.662, Pseudo R2 = 0.4113
Note: Market = 3 is the base outcome. Standard errors are in parentheses; * denotes insignificance at the 5% level; t denotes t-statistics.
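For readers unfamiliar with jackknife standard errors, the sketch below illustrates the common delete-one formula on a simple estimator (a sample mean). It is a generic illustration under stated assumptions, not the routine used to produce Table 3: the function name `jackknife_se` and the sample values are hypothetical, and survey weights are ignored.

```python
import math
from statistics import mean
from typing import Callable, List, Sequence


def jackknife_se(data: Sequence[float],
                 estimator: Callable[[List[float]], float]) -> float:
    """Delete-one jackknife standard error of `estimator` evaluated on `data`."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Re-estimate n times, each time leaving one observation out.
    leave_one_out = [
        estimator([x for j, x in enumerate(data) if j != i]) for i in range(n)
    ]
    centre = mean(leave_one_out)
    # Jackknife variance: (n - 1) / n times the sum of squared deviations
    # of the leave-one-out estimates from their mean.
    variance = (n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up numbers, not survey data
se_mean = jackknife_se(sample, mean)
print(se_mean)
```

For the sample mean the jackknife standard error coincides with the usual s / sqrt(n), which gives a quick sanity check; for a multinomial logit the same loop would re-estimate the whole model with each observation removed in turn.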
TABLE 4 Hausman's specification test for the weighted multinomial logit model
         Coefficients Coefficients (b-B)                             Coefficients Coefficients (b-B)
         (b) partial  (B) all      Difference   W                    (b) partial  (B) all      Difference   W
C1       27.582       25.257       2.325        0.098       INF3     -43.365      -29.976      -13.388      3.10E+09
C2       27.274       24.948       2.325        0.202       INF4     1.258        1.333        -0.075       0.064212
C3       -19.141      -9.513       -9.628       1.25E+04    INF5     -0.027       -0.031       0.004        0.003141
C4       26.550       24.588       1.962        0.103       Sex      1.860        1.112        0.747        0.120582
C5       19.831       20.511       -0.680       2.676       Age      -0.763       0.844        -1.608       0.154244
CRIT1    21.965       22.558       -0.593       2.674       Id       -1.173       0.110        -1.283       0.18215
CRIT2    26.933       27.073       -0.140       2.667       INC1     -2.742       -1.624       -1.118       0.214111
CRIT3    -27.040      -12.659      -14.381      2.05E+04    INC2     1.481        1.967        -0.485       0.185999
CRIT4    -0.811       1.069        -1.880       2.690       INC3     -34.677      -36.725      2.044        0.026322
MON2     -1.474       0.697        -2.172       2.687       INC4     -34.536      -36.620      2.084        0.068746
MON3     -1.2410      -1.164       -0.076       0.031       PF1      -34.874      -36.957      2.082        .
Price    -0.334       -0.321       -0.013       0.028       PF2      -34.330      -36.337      2.006        .
Variety  0.064        -0.284       0.348        0.091       PF3      -36.476      -38.0162     1.539        0.037052
LOY      2.971        1.556        1.414        0.205       PF4      27.582       25.257       2.325        0.098427
INF1     4.271        3.631        0.639        0.160       PF6      27.274       24.948       2.325        0.20215
INF2     4.284        3.368        0.916        0.192       INF2     4.284        3.368        0.916        0.192653
Test: H0: difference in coefficients not systematic; Pr = 1.0000; *H0 is not rejected (H1 is rejected).
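The test summarised above compares two sets of estimates, labelled (b) and (B) in the table. As a minimal sketch, assuming the standard Hausman form H = (b - B)' [V_b - V_B]^{-1} (b - B), the statistic can be computed as follows. The covariance matrices are not reported in the paper, so the two-coefficient inputs below are made up purely for illustration, and the helper names `solve` and `hausman_statistic` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, rhs: List[float]) -> List[float]:
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("V_b - V_B is singular; the statistic is undefined")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def hausman_statistic(b: List[float], big_b: List[float],
                      v_b: Matrix, v_big_b: Matrix) -> float:
    """H = (b - B)' [V_b - V_B]^{-1} (b - B)."""
    diff = [x - y for x, y in zip(b, big_b)]
    v_diff = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(v_b, v_big_b)]
    return sum(d * s for d, s in zip(diff, solve(v_diff, diff)))


# Made-up two-coefficient example (not values from the paper).
b_partial = [1.2, -0.5]
b_all = [1.0, -0.4]
v_partial = [[0.05, 0.0], [0.0, 0.03]]
v_all = [[0.04, 0.0], [0.0, 0.02]]

h = hausman_statistic(b_partial, b_all, v_partial, v_all)
CHI2_CRIT_5PCT_2DF = 5.991  # 5% critical value of chi-square with 2 d.f.
reject_h0 = h > CHI2_CRIT_5PCT_2DF
print(h, reject_h0)
```

Under H0 the statistic is asymptotically chi-square with degrees of freedom equal to the number of coefficients compared. In finite samples V_b - V_B need not be positive definite, in which case the statistic can be negative or undefined and the reported probability should be read with caution.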
TABLE 5
Hausman's specification test for the weighted multinomial logit model with Huber-White robust standard errors
         Coefficients Coefficients (b-B)                             Coefficients Coefficients (b-B)
         (b) partial  (B) all      Difference   W                    (b) partial  (B) all      Difference   W
C1       27.582       25.254       2.325        .145        INF3     -43.364      -29.976      -13.388      .317
C2       27.274       24.948       2.325        .200        INF4     1.258        1.333        -.075        .047
C3       -19.141      -9.513       -9.628       .           INF5     -.027        -.031        .004         .
C4       26.550       24.588       1.962        .066        Sex      1.860        1.112        .747         .113
C5       19.831       20.511       -.680        .263        Age      -.763        .844         -1.608       .103
CRIT1    21.965       22.558       -.593        .159        Id       -1.173       .110         -1.283       .190
CRIT2    26.933       27.073       -.140        .           INC1     -2.742       -1.624       -1.118       .254
CRIT3    -27.040      -12.659      -14.381      .           INC2     1.481        1.967        -.485        .145
CRIT4    -.811        1.069        -1.880       .113        INC3     -34.676      -36.724      2.044        .150
MON2     -1.474       .697         -2.172       .108        INC4     -34.536      -36.620      2.084        .162
MON3     -1.241       -1.164       -.076        .           PF1      -34.874      -36.957      2.082        .
Price    -.334        -.321        -.013        .022        PF2      -34.330      -36.337      2.006        .173
Variety  .064         -.284        .348         .062        PF3      -36.476      -38.016      1.539        .
LOY      2.971        1.556        1.414        .164        PF4      27.582       25.257       2.325        .145
INF1     4.271        3.631        .639         .184        PF6      27.274       24.948       2.325        .200
INF2     4.284        3.368        .9160        .160        INF2     -43.364      -29.976      -13.388      .317

Test: H0: difference in coefficients not systematic; Pr = 0.0000; *Reject H0
TABLE 6
Hausman's specification test for the weighted multinomial Logit model with Jackknife standard errors
         Coefficients  Coefficients  (b-B)                                Coefficients  Coefficients  (b-B)
         (b) partial   (B) all       Difference    W                      (b) partial   (B) all       Difference    W
C1       27.58222      25.25704      2.325183      .2701009    INF3       4.271423      3.631752      .6396704
C2       24.15519      22.16708      1.988104      .           INF4       4.284463      3.368385      .9160783
C3       27.27403      24.94816      2.32587       .2703368    INF5       -43.36495     -29.97649     -13.38846
C4       -19.14167     -9.513089     -9.628579     .           Sex        1.258049      1.333601      -.0755513
C5       26.55099      24.588        1.962987      .1360996    Age        -.0270006     -.0313198     .0043192
CRIT1    19.83164      20.51188      -.680234      .2246666    Id         1.86003       1.112417      .747613
CRIT2    21.96522      22.55825      -.593033      .1694605    INC1       -.7636035     .8446702      -1.608274
CRIT3    26.93368      27.07383      -.1401496     .1340384    INC2       -1.173496     .1101128      -1.283609
CRIT4    -27.04027     -12.65906     -14.38121     .           INC3       -2.742845     -1.624504     -1.11834
MON2     -.8112801     1.06954       -1.88082      .1148009    INC4       1.481673      1.967171      -.485498
MON3     -1.474704     .6975685      -2.172272     .1102908    PF1        -34.67769     -36.72252     2.044834
Price    -1.241008     -1.16457      -.0764379     .           PF2        -34.53645     -36.62053     2.084071
Variety  -.3347701     -.3212617     -.0135083     .0229542    PF3        -34.8748      -36.95767     2.082875
LOY      .0642479      -.2840219     .3482699      .0649966    PF4        -38.96175     -40.52765     1.565903
INF1     2.971465      1.556478      1.414987      .1675352    PF6        -82.42541     -72.06926     -10.35615
INF2     -48.46552     -33.32116     -15.14435     .           INF2 INF3  4.271423      3.631752      .6396704
Test: H0: difference in coefficients not systematic; Pr = 0.1126; *H0 is not rejected (H1 is rejected).
TABLE 7
MONTE-CARLO SIMULATION
Market = 1  Coef.                 Market = 1  Coef.                 Market = 2  Coef.                 Market = 2  Coef.
C1          -24.635 (.3857)       INF4        -3.6452 (.2741)       C1          -22.9653 (.330)       INF4        -4.085 (.1412)
C2          -23.9114 (.6307)      INF5        21.1507 (.4331)       C2          -18.8615 (.4767)      INF5        15.39 (.2485)
C3          -25.5435 (.4383)      Sex         -2.86 (.1940)         C3          -22.6257 (.369)       Sex         -1.5663 (.1462)
C4          -28.4515* (284.75)    Age         -.04276 (.0090)       C4          3.852* (201.2064)     Age         .04814 (.0067)
C5          -25.6378 (.4543)      id          -2.00327 (.1818)      C5          -23.1686 (.3477)      id          -1.2552 (.1385)
CRIT1       1.9965 (.3759)        INC1        -8.0812 (.4886)       CRIT1       -18.5965 (.3454)      INC1        -3.2441 (.4225)
CRIT2       -1.726 (.3801)        INC2        -4.5711 (.2262)       CRIT2       -20.6993 (.2911)      INC2        -.0062* (.2252)
CRIT3       -5.7675 (.3623)       INC3        -2.033 (.2134)        CRIT3       -25.8888 (.3595)      INC3        1.6044 (.2132)
CRIT4       21.8952 (.5212)       INC4        -6.9244 (.3731)       CRIT4       1.4347 (.4523)        INC4        -1.6485 (.3170)
MON2        -2.1460 (.3175)       PF1         25.6271 (.3897)       MON2        -2.1625 (.2904)       PF1         28.1815 (.36078)
MON3        -2.46375 (.2921)      PF2         22.656 (.3735)        MON3        -1.5261 (.2363)       PF2         28.4753 (.2908)
Price       .8361 (.0650)         PF3         26.06 (.3732)         Price       .544 (.053)           PF3         28.9281 (.3250)
Variety     .4138 (.009)          PF4         26.6119 (.5485)       Variety     .3448 (.0397)         PF4         32.3052 (.4336)
LOY         .7924 (.2149)         PF5         26.4515* (296.0381)   LOY         .2823* (.1715)        PF5         69.009* (120.3516)
INF1        0.1179* (.360)        PF6         25.3177 (.4018)       INF1        -.8961 (.2233)        PF6         28.7143 (.3418)
INF2        17.3932 (.4343)       PF8         28.4231 (.4448)       INF2        17.602 (.3371)        PF8         29.8485 (.4049)
INF3        -3.6455 (.3493)       constant    8.5198 (.8868)        INF3        -3.8446 (.2320)       constant    16.8616 (.645)

Note: Market = 3 is the base outcome. Standard errors are in parentheses; * denotes insignificance at the 5% level.
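The design of the Monte Carlo exercise is not described alongside the table, so the following is only a generic sketch of the two ingredients such an exercise typically needs: choice probabilities from a multinomial logit with a base outcome, and random draws of outcomes from those probabilities. The function names and the linear-predictor values are hypothetical and are not taken from the paper.

```python
import math
import random
from typing import List


def mnl_probabilities(linear_predictors: List[float]) -> List[float]:
    """Choice probabilities for a multinomial logit with a base outcome.

    `linear_predictors` holds x'beta_j for each non-base outcome. The base
    outcome has its linear predictor normalised to zero; its probability is
    returned as the last element.
    """
    shift = max(0.0, max(linear_predictors))  # subtract the max for stability
    exps = [math.exp(v - shift) for v in linear_predictors]
    base = math.exp(-shift)
    total = base + sum(exps)
    return [e / total for e in exps] + [base / total]


def simulate_choices(probs: List[float], n_draws: int, seed: int = 0) -> List[int]:
    """Draw outcomes (0-based positions in `probs`) with the given probabilities."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=n_draws)


# Hypothetical linear predictors for Market = 1 and Market = 2;
# Market = 3 is the base outcome, as in the tables.
probs = mnl_probabilities([0.5, -0.2])
draws = simulate_choices(probs, n_draws=10_000)
shares = [draws.count(j) / len(draws) for j in range(len(probs))]
print(probs)
print(shares)
```

With a large number of draws the simulated shares settle close to the model probabilities. A full Monte Carlo study would re-estimate the model on each simulated sample and summarise the resulting coefficient estimates.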