Modelling project
Uploaded by razvan-gabriel-ion
1. Download data for at least 4 economic variables that you think are related.
Make sure they have the same frequency (daily, weekly, monthly or yearly data) and the
same number of observations: you need contemporaneous values for all variables. The
sample should include at least 30 observations. Compute means, standard deviations,
skewness and kurtosis for all the variables and build histograms to characterize their
distributions. Describe and graph the data. Show your sources.
Variables:
Y  = Retail and food services sales, mil. $
X1 = Total disposable personal income, bil. $
X2 = Consumer price index (CPI)
X3 = Interest rate, %

Month        Y          X1          X2        X3
2007-Jan.    329,736    11,640.70   202.416   8.25
Feb.         324,287    11,713.80   203.499   8.25
Mar.         375,294    11,788.20   205.352   8.25
Apr.         359,619    11,815.80   206.686   8.25
May          392,640    11,843.00   207.949   8.25
Jun.         377,695    11,858.10   208.352   8.25
Jul.         373,405    11,906.90   208.299   8.25
Aug.         388,846    11,931.90   207.917   8.25
Sept.        354,721    12,024.50   208.49    7.75
Oct.         369,434    12,065.10   208.936   7.5
Nov.         378,619    12,132.00   210.177   7.5
Dec.         427,225    12,227.20   210.036   7.25
2008-Jan.    343,616    12,258.00   211.08    6
Feb.         345,010    12,294.00   211.693   6
Mar.         374,208    12,349.20   213.528   5.25
Apr.         370,352    12,336.50   214.823   5
May          399,773    12,522.10   216.632   5
Jun.         380,389    12,524.10   218.815   5
Jul.         385,985    12,409.70   219.964   5
Aug.         385,211    12,462.60   219.086   5
Sept.        353,128    12,468.90   218.783   5
Oct.         352,847    12,435.00   216.573   4
Nov.         338,774    12,376.10   212.425   4
Dec.         388,025    12,257.70   210.228   3.25
2009-Jan.    313,864    12,160.20   211.143   3.25
Feb.         303,504    12,072.20   212.193   3.25
Mar.         333,230    12,047.30   212.709   3.25
Apr.         334,767    12,110.50   213.24    3.25
May          353,263    12,310.80   213.856   3.25
Jun.         349,960    12,189.00   215.693   3.25
Jul.         353,617    12,148.30   215.351   3.25
Aug.         359,221    12,173.80   215.834   3.25
Sept.        330,260    12,169.70   215.969   3.25
Oct.         344,716    12,178.70   216.177   3.25
Nov.         345,700    12,237.40   216.33    3.25
Dec.         408,576    12,300.70   215.949   3.25
2010-Jan.    321,550    12,324.30   216.687   3.25
Feb.         317,961    12,337.20   216.741   3.25
Mar.         369,339    12,389.40   217.631   3.25
Apr.         366,002    12,478.50   218.009   3.25
May          375,699    12,532.80   218.178   3.25
Jun.         369,031    12,540.00   217.965   3.25
Jul.         372,451    12,559.80   218.011   3.25
Aug.         373,373    12,622.10   218.312   3.25
Sept.        355,549    12,619.30   218.439   3.25
Sources: www.census.gov (retail and food services sales), www.bea.gov (personal income), www.bls.gov (CPI), www.bankofcanada.ca (interest rate).
Means:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
$\bar{Y}$ = 360454.9333: the mean of retail and food services sales in the USA from January 2007 to September 2010 is 360454.9333 (million dollars).
$\bar{X}_1$ = 12225.40222: the mean of personal income in the USA from January 2007 to September 2010 is 12225.40222 (billion dollars).
$\bar{X}_2$ = 213.4701333: the mean of the consumer price index in the USA from January 2007 to September 2010 is 213.4701333.
$\bar{X}_3$ = 4.95: the mean of the interest rate in the USA from January 2007 to September 2010 is 4.95 (%).
Standard deviation:
$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$
$s_Y$ = 26036.6843: the dispersion of retail and food services sales in the USA around the mean value is 26036.6843.
$s_{X_1}$ = 250.7240547: the dispersion of personal income in the USA around the mean value is 250.7240547.
$s_{X_2}$ = 4.54299426: the dispersion of the consumer price index in the USA around the mean value is 4.54299426.
$s_{X_3}$ = 2.027929979: the dispersion of the interest rate in the USA around the mean value is 2.027929979.
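As a cross-check on the figures above, the mean and standard deviation of the interest rate series can be recomputed directly from the data table. The sketch below is only an illustration in Python (the project itself was done in Excel); it assumes the sample form of the standard deviation, with n - 1 in the denominator, which is the form that reproduces the reported 2.027929979.

```python
import math

# Interest rate series (X3) from the data table, January 2007 to September 2010.
rates = [8.25] * 8 + [7.75, 7.5, 7.5, 7.25, 6, 6, 5.25] + [5] * 6 + [4, 4] + [3.25] * 22

n = len(rates)
mean = sum(rates) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in rates) / (n - 1)
std_dev = math.sqrt(variance)

print(n, round(mean, 2), round(std_dev, 5))
```

The same few lines apply unchanged to the other three columns once their values are typed in as lists.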
Skewness:
= 1/n*
Sales (Y): 0.034228706. The skewness is positive, which indicates an asymmetric tail extending towards more positive values, although the value is so close to zero that the distribution is nearly symmetric.
Personal income (X1): -0.491795728. The skewness is negative, which indicates a distribution with an asymmetric tail extending towards more negative values.
CPI (X2): -0.629726058. The skewness is negative, which indicates a distribution with an asymmetric tail extending towards more negative values.
Interest rate (X3): 0.705329021. The skewness is positive, which indicates a distribution with an asymmetric tail extending towards more positive values.
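The skewness figure for the interest rate can be recomputed from the data table. The sketch below assumes the sample-adjusted formula, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations, which is the one Excel's SKEW function uses; that this is the formula behind the reported values is an inference from recomputing them, not something the report states.

```python
import math

# Interest rate series (X3) from the data table, January 2007 to September 2010.
rates = [8.25] * 8 + [7.75, 7.5, 7.5, 7.25, 6, 6, 5.25] + [5] * 6 + [4, 4] + [3.25] * 22


def sample_skewness(values):
    """Sample-adjusted skewness (assumed to match Excel's SKEW)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed


skew = sample_skewness(rates)
print(round(skew, 4))
```

A positive result here reflects the long run of months at 3.25% with a smaller number of much higher values in 2007.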
Kurtosis:
k(Y) = 0.016183697: positive kurtosis indicates a relatively peaked distribution of sales, although the value is very close to zero.
k(X1) = -0.426401663, k(X2) = -0.505730192, k(X3) = -1.19736224: negative kurtosis indicates a relatively flat distribution for these three economic indicators: personal income, CPI and interest rate.
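The kurtosis values above are measured relative to zero, so they behave like excess kurtosis. A minimal sketch, assuming the sample excess-kurtosis formula used by Excel's KURT function (an inference from recomputing the interest rate figure, not stated in the report):

```python
import math

# Interest rate series (X3) from the data table, January 2007 to September 2010.
rates = [8.25] * 8 + [7.75, 7.5, 7.5, 7.25, 6, 6, 5.25] + [5] * 6 + [4, 4] + [3.25] * 22


def sample_excess_kurtosis(values):
    """Sample excess kurtosis (assumed to match Excel's KURT)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


kurt = sample_excess_kurtosis(rates)
print(round(kurt, 4))
```

A value below zero means the distribution is flatter than a normal distribution with the same variance.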
[Figure: Histogram for Y. Frequency and cumulative % by bin. Bin limits: 303504, 324124.1667, 344744.3333, 365364.5, 385984.6667, 406604.8333, More.]
[Figure: Histogram for X1. Frequency and cumulative % by bin.]
[Figure: Histogram for X2. Frequency and cumulative % by bin. Bin limits: 202.416, 205.3406667, 208.2653333, 211.19, 214.1146667, 217.0393333, More.]
[Figure: Histogram for X3. Frequency and cumulative % by bin.]
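The bin limits listed for the Y and X2 histograms are consistent with splitting the range between the smallest and largest observation into six equal intervals. The sketch below illustrates that reading; the function name `bin_limits` and the choice of six intervals are assumptions made for the example, taken from the spacing of the listed values.

```python
def bin_limits(minimum, maximum, intervals=6):
    """Return the limits minimum, minimum + w, ..., minimum + (intervals - 1) * w."""
    width = (maximum - minimum) / intervals
    return [minimum + k * width for k in range(intervals)]


# Minimum and maximum of each series, read from the data table.
sales_limits = bin_limits(303504, 427225)    # Y: Feb. 2009 and Dec. 2007
cpi_limits = bin_limits(202.416, 219.964)    # X2: Jan. 2007 and Jul. 2008

print([round(v, 4) for v in sales_limits])
print([round(v, 7) for v in cpi_limits])
```

Everything above the last limit falls into the "More" bin.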
2. Select the dependent variable and build a multiple regression model that makes
economic sense. Run a battery of regressions of the dependent variable on all
combinations of one, two and three other variables. Create the ANOVA table in each
case. What do you observe? Comment on the values you obtained for the coefficients.
Compute the regression statistics in Excel by minimizing the sum of squared residuals
(using Solver), and using Regression in the Data Analysis tool-pack. Verify that you
obtained the same values for the coefficients irrespective of the method used. Create a
summary table of the results and interpret it.
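The requirement that both methods give the same coefficients follows from the fact that they solve the same problem: the least-squares formulas and a numerical minimizer both look for the coefficients with the smallest sum of squared residuals. The sketch below illustrates this for one independent variable. It uses a small hypothetical data set and plain gradient descent as a stand-in for Solver; neither is part of the project.

```python
def ols_closed_form(x, y):
    """Least-squares intercept and slope from the textbook formulas."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def ols_by_minimization(x, y, learning_rate=0.05, steps=5000):
    """Minimize the mean squared residual numerically (stand-in for Solver)."""
    n = len(x)
    b0 = b1 = 0.0
    for _ in range(steps):
        residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
        grad_b0 = -2 * sum(residuals) / n
        grad_b1 = -2 * sum(r * xi for r, xi in zip(residuals, x)) / n
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Hypothetical data, chosen only to keep the arithmetic small.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

exact = ols_closed_form(x, y)
numeric = ols_by_minimization(x, y)
print(exact, numeric)
```

With data on very different scales, such as sales in the hundreds of thousands next to an interest rate below 10, a numerical minimizer may need rescaled variables or better starting values before it reaches the same answer.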
Combinations of one:
Model 1.1. Dependent variable: y = Sales
Independent variable: x = Personal income
y = 114575.8743 (b0) + 20.11214474 (b1) * x
For a personal income of 0, sales would be around 115000. From an economic point
of view, however, the coefficient b0 has no relevance.
The coefficient b1 tells us that each additional unit of personal income adds an average of
about 20 to sales.
Model 1.2. Dependent variable: y = Sales
Independent variable: x = Consumer price index
y = 15.38986188 + 0.016202352x
For a CPI of 0, the value of sales would be around 15. From an economic point of view, however, the coefficient b0 has no relevance.
The coefficient b1 tells us that each additional unit of CPI adds an average of about
0.02% to the value of sales.
Model 1.3. Dependent variable: y = Sales
Independent variable: x = Interest rate
y = 342427.657 + 3641.873998x
For an interest rate of 0, the value of sales would be around 343000.
The coefficient b1 tells us that if the interest rate increases by 1 unit, it adds an average
of about 3642 to the value of sales.
We compared the value of Significance F for the three regression models
and observed that Model 1.2 is the best of the one-variable models because it
is the only one with Significance F < 0.05.
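Significance F is the p-value attached to the F statistic in the ANOVA table, so the comparison above rests on how the total variation in sales splits into a part explained by the regression and a residual part. The sketch below builds that decomposition for a one-variable regression. The data are hypothetical and the function name is an assumption for the example; the p-value itself is left out because the Python standard library has no F distribution.

```python
def anova_simple_regression(x, y):
    """ANOVA decomposition for y = b0 + b1 * x fitted by least squares."""
    n, k = len(x), 1  # k = number of independent variables
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]

    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_regression = ss_total - ss_residual

    ms_regression = ss_regression / k
    ms_residual = ss_residual / (n - k - 1)
    return {
        "SS regression": ss_regression,
        "SS residual": ss_residual,
        "SS total": ss_total,
        "F": ms_regression / ms_residual,
        "R squared": ss_regression / ss_total,
    }


# Hypothetical data, chosen only to keep the arithmetic small.
table = anova_simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in table.items():
    print(name, round(value, 4))
```

A larger F means the regression explains more variation per degree of freedom than is left in the residuals, which is what drives Significance F below 0.05.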
Combinations of two
Model 2.1. Dependent variable: y = Sales
Independent variables: x1 = Personal income; x2 = CPI
y = 156310.0338 + 64.04961164 x1 - 2711.795584 x2
For a personal income and a CPI of 0, sales would be around 157000. The value of 64 for b1 means that if personal income increases by one unit while CPI remains constant, sales will increase by approx. 64. The value of -2712 for b2 means that if CPI increases by one unit while personal income remains constant, sales will decrease by approx. 2712.
Model 2.2. Dependent variable: y = Sales
Independent variables: x1 = CPI; x2 = Interest rate
y = -652741.257 + 4479.369565 x1 + 11512.03473 x2
For a CPI and an interest rate of 0, sales would be around -652742. The value of 4480 for b1 means that if CPI increases by one unit while the interest rate remains constant, sales will increase by approx. 4480. The value of 11512 for b2 means that if the interest rate increases by one unit while CPI remains constant, sales will increase by approx. 11512.
Model 2.3. Dependent variable: y = Sales
Independent variables: x1 = Personal income; x2 = Interest rate
y = -614274.5472 + 10040.83217 x1 + 75.66437034 x2
For a personal income and an interest rate of 0, sales would be around -614280. The value of 10041 for b1 means that if personal income increases by one unit while the interest rate remains constant, sales will increase by approx. 10041. The value of 75 for b2 means that if the interest rate increases by one unit while personal income remains constant, sales will increase by approx. 75.
We compared the value of Significance F for the three regression models
and observed that Model 2.2 and Model 2.3 are the relevant two-variable models
because both have Significance F < 0.05.
Combinations of three
Model 3.1. Dependent variable: y = Sales
Independent variables: x1 = Personal income; x2 = CPI; x3 = Interest rate
y = -722949.6462 + 56.23933624 x1 + 1594.680709 x2 + 11199.87269 x3
The value of 56 for b1 means that if personal income increases by one unit while CPI and the interest rate remain constant, sales will increase by approx. 56. The value of 1595 for b2 means that if CPI increases by one unit while personal income and the interest rate remain constant, sales will increase by approx. 1595. The value of 11200 for b3 means that if the interest rate increases by one unit while personal income and CPI remain constant, sales will increase by approx. 11200.
Because Significance F for this model is lower than 0.05, we can say this model is
a relevant one.
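To see how the Model 3.1 equation is used, the fitted value for one month can be worked out by plugging that month's personal income, CPI and interest rate into it. The sketch below does this for September 2010, the last row of the data table; picking that month is just an illustration.

```python
# Coefficients of Model 3.1 as reported above.
b0, b1, b2, b3 = -722949.6462, 56.23933624, 1594.680709, 11199.87269

# September 2010 values from the data table.
income, cpi, rate = 12619.30, 218.439, 3.25
actual_sales = 355549

# Fitted value: each coefficient times its variable, plus the constant term.
predicted_sales = b0 + b1 * income + b2 * cpi + b3 * rate
residual = actual_sales - predicted_sales

print(round(predicted_sales, 2), round(residual, 2))
```

The fitted value is higher than the actual September figure, so the residual for that month is negative.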
[Figure: Y (Retail and food services sales) and All-period Average, monthly from 2007-Jan.]
[Figure: Y (Retail and food services sales) and 3 month MA, monthly from 2007-Jan.]
[Figure: Y (Retail and food services sales) and ES alpha=0.2, monthly from 2007-Jan.]
[Figure: Y (Retail and food services sales) and ES alpha=0.3, monthly from 2007-Jan.]
[Figure: X2 (Consumer price index) and All-period Average, monthly from 2007-Jan.]
[Figure: X2 (Consumer price index) and 3 month MA, monthly from 2007-Jan.]
[Figure: X2 (Consumer price index) and ES alpha=0.2, monthly from 2007-Jan.]
[Figure: X2 (Consumer price index) and ES alpha=0.3, monthly from 2007-Jan.]
Summary Table Regression
("-" means the variable is not included in the model.)

                                  Model 1.1  Model 1.2  Model 1.3  Model 2.1  Model 2.2  Model 2.3  Model 3.1
Constant term                     115000     15         343000     157000     -652742    -614280    -722950
Coefficient for Personal Income   20         -          -          64         -          10041      56
Coefficient for CPI               -          0.02       -          -2712      4480       -          1595
Coefficient for Interest rate     -          -          3642       -          11512      75         11200
Summary Table (using Solver)
("-" means the variable is not included in the model.)

                                  Model 1.1  Model 1.2  Model 1.3  Model 2.1  Model 2.2  Model 2.3  Model 3.1
Constant term                     0.002454   -263970.1  -2345620   0.028035   26.00248   0.033276   0.018979
Coefficient for Personal Income   28.32852   -          -          27.84473   -          28.30651   27.83183
Coefficient for CPI               -          -53687091  -          27.68454   1563.78    -          27.12872
Coefficient for Interest rate     -          -          324285.6   -          1594.853   27.6845    31.8367
Comparing the two tables above, we can see that the values of the coefficients differ
from one method to the other: Regression gave one set of values and Solver another.
Since both methods minimize the same sum of squared residuals, they should agree, so the
discrepancy suggests that the Solver runs did not reach the minimum.
From the summary tables we can also observe that the value of the regression
coefficient associated with a given independent variable is different in each model.
The values depend on which independent variables are included in the model.
We consider that Model 3.1 is the one we should rely on because it takes into
consideration the largest number of factors (independent variables) that can influence
sales.
3. For two of the variables previously chosen, redo the analysis we did in class:
1. Compute the all-period average, 3 month Moving Average and Exponential Smoothing
with alpha = 0.2 and alpha = 0.3. 2. Decide which method you could use for
forecasting, using the precision coefficients. 3. Compute seasonality indexes and the
trend. 4. Use the Winters model to compute forecasts for 5 months into the future for the
two variables. Use the minimization of the sum of squared residuals to find the
exponential smoothing coefficients.
The two variables we have chosen are: Retail and food services sales and
Consumer Price Index.
1) The values we obtained for SALES in October using the four methods are
shown in the following table:

          All-period average   3 month MA   ES with alpha=0.2   ES with alpha=0.3
October   360455               367124       362635.615          364451.6811

The values we obtained for CPI in October using the four methods are shown
in the following table:

          All-period average   3 month MA   ES with alpha=0.2   ES with alpha=0.3
October   213                  218          217.7               218.05
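The 3 month MA forecasts in the tables above can be reproduced from the last three observations of each series, and exponential smoothing follows the recursion F(t+1) = alpha * A(t) + (1 - alpha) * F(t). The sketch below shows both. Starting the smoothing from the first observation is an assumption, since the report does not say how its ES series was initialized, so the ES function is only applied to a short hypothetical series.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead ES forecast; assumes the first forecast equals the first observation."""
    forecast = series[0]
    for actual in series:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# July, August and September 2010 values from the data table.
sales_ma = moving_average_forecast([372451, 373373, 355549])
cpi_ma = moving_average_forecast([218.011, 218.312, 218.439])
print(round(sales_ma), round(cpi_ma))

# Hypothetical two-point series to show the recursion step by step.
toy_es = exponential_smoothing_forecast([10, 20], alpha=0.5)
print(toy_es)
```

A larger alpha puts more weight on the most recent observation, which is why the alpha=0.3 forecasts sit closer to the latest values than the alpha=0.2 ones.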
2) SALES

        All-period average   3 month MA    ES with alpha=0.2   ES with alpha=0.3
MAD     21995.43             20694         21454.1             20509.2
MSE     738611259.2          753598800.7   727653365.6         703430711.2
MAPE    6.18%                5.78%         5.96%               5.71%
OCT.    360455               367124        362635.615          364451.6811

As we can see from the table above, the lowest values of all three precision coefficients
are obtained with exponential smoothing with alpha=0.3. This method fitted the
historical data best, so we take its value, 364451.6811, as the best forecast for
October.

CPI

        All-period average   3 month MA   ES with alpha=0.2   ES with alpha=0.3
MAD     4.07                 1.5          2.42                1.92
MSE     21.82                4.09         8.44                5.71
MAPE    1.89%                0.7%         1.13%               0.9%
OCT.    213                  218          217.7               218.05

As we can see from the table above, the lowest values of all three precision coefficients
are obtained with the three month Moving Average. This method fitted the
historical data best, so we take its value, 218, as the best forecast for
October.
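The three precision coefficients compare a method's past forecasts with the values that were actually observed: MAD is the mean absolute error, MSE the mean squared error, and MAPE the mean absolute error as a percentage of the actual value. A minimal sketch with hypothetical numbers (not taken from the project data):

```python
def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared forecast error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical actual values and forecasts for three periods.
actual = [100, 110, 120]
forecast = [90, 115, 120]

print(mad(actual, forecast), round(mse(actual, forecast), 4), round(mape(actual, forecast), 4))
```

Because MSE squares the errors, it penalizes a few large misses more heavily than MAD does, so the three measures can rank methods differently.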
3)