Decision and Risk Analysis Business Forecasting and Regression Analysis Kiriakos Vlahos Spring 2000.
DRA/KV
Decision and Risk Analysis
Business Forecasting and Regression Analysis
Kiriakos Vlahos
Spring 2000

Session overview
• Why do we need forecasting?
• Overview of forecasting techniques
• The components of time series
– Trend
– Seasonality
– Cycles
– Randomness
• Trend curves
• Causal forecasting and regression analysis
• Judgemental forecasting
• Scenario planning

All forecasts are wrong
Those who claim to forecast the future are all lying even if, by chance, they are later proved right

Forecasting is ...
Forecasting is like trying to drive a car blindfolded and following directions given by a person who is looking out of the back window

Forecasting in business
Forecasting in business is like sex in society, we have to have it, we cannot get along without it, everyone is doing it, one way or another, but nobody is sure he is doing it the best way.
G W Plossl, Last Frontiers for Profits

Forecasting in organisations
• Marketing
– Sales, prices, social and economic trends
• Production
– Demand, costs, employment and machinery requirements
• Finance
– Costs, sales, capital expenditure, economic climate
• R&D
– Technological developments, new products
• Top management
– Total sales, costs, pricing, economic trends, competitors’ positioning

Formal vs. informal forecasting
• Forecasting is a very common activity
• The majority of forecasting is informal
• Why do we need formal forecasting?
– Coping with complexity
– Coping with growth
– Coping with change
– Need for auditability and justification
Formal forecasting provides a vehicle for communication about the forecast and a basis for systematic improvement.

Characteristics of forecasting problems
• Time horizon
– short-term
– long-term
• Data patterns
– Seasonality
– Trend
– Cycles
– Randomness
• Cost
• Complexity
• Accuracy

Data patterns - Trend
Medium to long term movements
Upward or downward
e.g.
[Figure: example series with a trend, 1978 to 1987]

Data patterns - Cycles
Long-term irregular movements, e.g.
Government debt since the American revolution

Data patterns - Seasonality
Regular periodic oscillations.
They can be monthly, quarterly, etc. e.g.
[Figure: turnover (£m), Jan-84 to Jan-88]
Additive or multiplicative

Data patterns - Random oscillations
Unsystematic oscillations around a constant mean.
No trend, cycle or seasonality
[Figure: random series; vertical axis from 85 to 115]

Classification of forecasting methods
Forecasting methods
• Quantitative
– Time series: trend curves, seasonal decomposition, exponential smoothing
– Causal forecasting: regression, econometric forecasting
– Tracking signals
• Scenario planning
• Judgemental
– Individual
– Group: committees, Delphi, market surveys, role playing

Regression overview
• Why understanding relationships is important
• Visual tools for analysing relationships
• Correlation
– Interpretation
– Pitfalls
• Regression
– Building models
– Interpreting and evaluating models
– Assessing model validity
– Data transformations
– Use of dummy variables

Why analysing relationships is important
• Development of theory in the social sciences and empirical testing
• Finance e.g.
– How are stock prices affected by market movements?
– What is the impact of mergers on stockholder value?
• Marketing e.g.
– How effective are different types of advertising?
– Do promotions simply shift sales without affecting overall volume?
• Economics e.g.
– How do interest rates affect consumer behaviour?
– How do exchange rates influence imports and exports?

Sales vs. advertising
[Figure: scatter plot of sales (units) against advertising (£000)]

Estimating betas
The slope of this line is called the beta of the stock and is an estimate of its market risk.

Scatter plots
• What are they?
A graphical tool for examining the relationship between variables
• What are they good for?
For determining
• whether variables are related
• the direction of the relationship
• the type of relationship
• the strength of the relationship

Correlation
• What is it?
A measure of the strength of linear relationships between variables
• How to calculate?
a) Calculate standard deviations s_x, s_y
b) Calculate the correlation using the formula
$$r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{(N-1)\, s_x s_y}$$
• Possible values
From -1 to 1
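The two calculation steps can be sketched in a few lines of Python. This is a minimal illustration, not the course's own material: the function name `pearson_r` and the data are made up, and the result should agree with Excel's CORREL on the same inputs.

```python
import math

def pearson_r(x, y):
    """Sample correlation: sum((x_i - x_bar)(y_i - y_bar)) / ((N - 1) * s_x * s_y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Step a) sample standard deviations (divisor N - 1)
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    # Step b) sum of cross-products, scaled by (N - 1) * s_x * s_y
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / ((n - 1) * s_x * s_y)

# Hypothetical data, for illustration only (not from the course)
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 6.0, 8.0, 10.0]            # exactly linear in advertising
r_up = pearson_r(advertising, sales)          # perfect positive relationship
r_down = pearson_r(advertising, [-v for v in sales])   # perfect negative relationship
print(r_up, r_down)
```

A perfectly linear series gives a value at one end of the -1 to 1 range; real data fall in between.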

Interpreting the correlation

Correlation pitfalls
• Correlation measures only linear relationships
• Existence of a relationship does not imply causality
• Even if there exists a causal relationship, the direction may not be obvious
Correlation and Causality
Many nations see improving communications as vital to boost overall economy. A 1% increment in telephone density yields an increment of about 0.1% in per-capita GNP, according to a 1983 OECD-ITU study.
AT&T advertisement in Fortune Dec 97

Ferric Processing
What are the factors influencing production costs?
[Diagram: candidate factors feeding into production costs, each marked "?": capacity, plant age, plant location, other plant features]
Predicting production cost is important for the negotiation of 5-year contracts with steel companies

Visual inspection
a) Construct scatter plot
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month)]
b) Calculate correlation (Excel function CORREL)
The correlation between cost and capacity is -0.84
c) Candidate model
Cost = a + b Capacity

Simple Linear Regression
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month) with the fitted line]
Simple regression estimates a linear equation, which corresponds to a straight line that passes through the data.
Regression model
Cost = 25.2 - 4.4 Capacity
• Cost: dependent variable
• 25.2: constant or intercept
• -4.4: coefficient or slope
• Capacity: independent or explanatory variable

Least squares
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month) with the regression line and residuals]
Residuals
• Residuals are the vertical distances of the points from the regression line
• In least squares regression
– The sum of squared residuals is minimised
– The mean of residuals is zero
– Residuals are assumed to be randomly distributed around the mean according to the normal distribution
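The first two properties can be checked numerically. The sketch below fits a line with the textbook closed-form formulas; the helper names and the data are hypothetical (this is not the Ferric data set, which is not reproduced in these slides).

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical (capacity, cost/ton) pairs; NOT the Ferric data set
capacity = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
cost = [26.0, 20.0, 18.5, 16.0, 15.5, 14.0]

a, b = fit_line(capacity, cost)
residuals = [yi - (a + b * xi) for xi, yi in zip(capacity, cost)]
sse = sum(e ** 2 for e in residuals)
mean_residual = sum(residuals) / len(residuals)

def sse_for(a_try, b_try):
    """Sum of squared residuals for an arbitrary candidate line."""
    return sum((yi - (a_try + b_try * xi)) ** 2 for xi, yi in zip(capacity, cost))

print(f"a={a:.3f}, b={b:.3f}, SSE={sse:.3f}, mean residual={mean_residual:.2e}")
# Nudging the fitted line should never reduce the sum of squares
print(sse <= sse_for(a + 0.1, b), sse <= sse_for(a, b - 0.1))
```

The normality assumption is not something the fit enforces; it has to be judged from the residuals themselves.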

Excel output
Regression Statistics
Multiple R           0.84
R Square             0.70
Adjusted R Square    0.66
Standard Error       2.33
Observations         10

ANOVA
             df    SS       MS       F       Significance F
Regression   1     100.65   100.65   18.47   0.00
Residual     8     43.59    5.45
Total        9     144.23

             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept    25.19          1.86             13.55    0.00      20.91       29.48
Capacity     -4.40          1.02             -4.30    0.00      -6.77       -2.04

• Read equation
• Observe adjusted R²
• Observe statistics s_b and s
The standard error s is simply the st. deviation of the residuals (a measure of variability)
R² is the most widely used measure of goodness of fit.
It can be interpreted as the proportion of the variance of the dependent variable explained by the model. Use the adjusted R², which accounts for the number of observations relative to the number of explanatory variables.
$$R^2 = 1 - \frac{s^2}{s_y^2} = 1 - \frac{\text{residual variance}}{\text{dependent variable variance}}$$
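As a cross-check, the goodness-of-fit figures can be recomputed from the ANOVA table in the Excel output. The sketch below uses only the sums of squares and degrees of freedom printed there; the variable names are illustrative. Note that the ratio of sums of squares gives R Square, while the ratio of variances (each sum of squares divided by its degrees of freedom) gives the adjusted value.

```python
import math

# Values copied from the ANOVA table in the Excel output
ss_residual = 43.59
ss_total = 144.23
n_obs = 10
n_explanatory = 1                                  # Capacity only

df_residual = n_obs - n_explanatory - 1            # 8
df_total = n_obs - 1                               # 9

r_squared = 1 - ss_residual / ss_total
residual_variance = ss_residual / df_residual      # s^2, the MS of the Residual row
dependent_variance = ss_total / df_total           # s_y^2
adjusted_r_squared = 1 - residual_variance / dependent_variance
standard_error = math.sqrt(residual_variance)      # s

print(round(r_squared, 2), round(adjusted_r_squared, 2), round(standard_error, 2))
```

Rounded to two decimals these reproduce the 0.70, 0.66 and 2.33 reported under Regression Statistics.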

Hypothesis testing
Does a relationship between capacity and cost really exist? If we draw a different sample, would we still see the same relationship?
Or in stats jargon
Is the slope significantly different from zero?
[Figure: y against x with a horizontal line, b=0]
b=0 implies no relationship between x and y
Hypothesis testing
Test whether b=0

t-values and p-values
[Figure: distribution of the estimate of the slope if b=0, centred on 0; the estimate b lies t-value * s_b away from 0, and the tail area beyond it is the p-value]
s_b is the st. deviation of the slope estimate b
t-value = b/s_b
The p-value is the probability, if the true slope were zero, of getting a slope estimate at least as far from zero as b.
Equivalent tests (5% significance level)
|t-value| > 2 (approximately)
p-value < 0.05
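To make the two tests concrete, the sketch below recomputes the t-value for the Capacity slope from the rounded Excel figures and approximates the two-sided p-value by integrating the Student t density numerically. The function names are illustrative, the integration is a simple stand-in for what a statistics package does, and 8 degrees of freedom (10 observations minus 2 estimated coefficients) is assumed. Because the inputs are rounded, the t-value can differ slightly from the -4.30 printed by Excel.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + u * u / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_value, df, steps=2000):
    """P(|T| >= |t_value|), using Simpson's rule on the density from 0 to |t_value|."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3           # P(0 <= T <= |t_value|)
    return 1 - 2 * central_area

b, s_b = -4.40, 1.02       # slope and its standard error from the Excel output
df = 10 - 2                # observations minus estimated coefficients

t_value = b / s_b
p_value = two_sided_p_value(t_value, df)
print(f"t = {t_value:.2f}, two-sided p = {p_value:.4f}")
print("p < 0.05:", p_value < 0.05, "| rule of thumb |t| > 2:", abs(t_value) > 2)
```

Both checks point the same way here; with few observations the exact 5% cut-off for |t| is somewhat above 2, which is why the rule of thumb is only approximate.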

Checking residuals
Residuals should be random. Any systematic pattern indicates that our model is incomplete.
Problematic patterns
• Autocorrelated residuals
• Heteroscedasticity

Ferric - Residuals
[Figure: line fit plot, actual and predicted cost/ton against capacity]
[Figure: residual plot, residuals against capacity]
Are residuals random? Can you see any pattern?

Combining theory and judgement
The relationship appears to be non-linear.
We can fit non-linear relationships by introducing suitable transformations, e.g.
$y = a e^{bx}$ becomes $\ln(y) = \ln(a) + bx$
[Figures: y against x, and ln(y) against x]
What transformation is appropriate for the Ferric data?
Use judgement e.g.
Total Cost (TC) = Fixed Cost (FC) + Variable Cost
TC = FC + Unit Cost (UC) * Quantity (Q)
TC/Q = FC/Q + UC, i.e.
Average Cost = b/Q + a
This suggests that average cost is a linear function of the inverse of capacity, 1/Q.
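A minimal sketch of the idea: regress average cost on 1/Q instead of Q. The data are generated from made-up values UC = 12 and FC = 8 purely for illustration, so the transformed fit recovers them exactly, while a straight line in Q leaves systematic error. The helper `fit_line` is an assumption of this sketch, not part of the course material.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical plants: cost built exactly from UC = 12 and FC = 8 (made-up values)
capacity = [0.5, 1.0, 2.0, 2.5, 4.0]
avg_cost = [12 + 8 / q for q in capacity]

# Transformed model: Average Cost = a + b * (1/Q)
inverse_capacity = [1 / q for q in capacity]
a, b = fit_line(inverse_capacity, avg_cost)

# Untransformed straight line in Q, for comparison
a_raw, b_raw = fit_line(capacity, avg_cost)
raw_sse = sum((y - (a_raw + b_raw * q)) ** 2 for q, y in zip(capacity, avg_cost))

print(round(a, 6), round(b, 6), round(raw_sse, 3))
```

Here a plays the role of the unit cost and b the fixed cost, matching the Average Cost = b/Q + a form.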

Transforming the data
Regression Statistics
Multiple R           0.97
R Square             0.95
Adjusted R Square    0.94
Standard Error       0.98
Observations         10

             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept    11.75          0.60             19.53    0.00      10.36       13.13
1/Capacity   7.93           0.67             11.88    0.00      6.39        9.46

[Figure: cost/ton ($) against capacity (000 tons/month)]
[Figure: line fit plot, actual and predicted cost/ton against 1/Capacity]

Model comparison
• High adjusted R²
• All coefficients significant
– t-values or p-values
• Low standard error
• No pattern in residuals
• Is model supported by theory?
• Does the model make sense?

Criteria                       First model   Transformed model
High adjusted R²               66%           94%
All coefficients significant   Yes           Yes
Low residual st. dev. (s)      2.33          0.98
No pattern in residuals        No            Yes
Equation makes sense           Yes (?)       Yes

The transformed model is better:
Cost = 11.75 + 7.93 * (1/Capacity)

Forecasting & confidence intervals
• If capacity is 2 what is the forecast for cost?
– Cost = 11.75 + 7.93 (1/2) = 15.71
• Approximate 95% confidence interval:
15.71 ± 2 * s
where s=0.98 is the standard error
• The greater the number of observations the better the approximation
• More accurate intervals can be calculated using statistical packages
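The arithmetic can be sketched as follows, using the rounded coefficients of the transformed model and s = 0.98. The function names are illustrative. With rounded coefficients the point forecast comes out as 15.715, which the slide reports as 15.71.

```python
def forecast_cost(capacity, intercept=11.75, slope=7.93):
    """Point forecast from the transformed model: Cost = 11.75 + 7.93 * (1/Capacity)."""
    return intercept + slope / capacity

def approx_interval(point, s=0.98, multiplier=2.0):
    """Rough 95% interval: point forecast plus or minus 2 standard errors."""
    return point - multiplier * s, point + multiplier * s

point = forecast_cost(2.0)
low, high = approx_interval(point)
print(f"forecast = {point:.3f}, approx. 95% interval = ({low:.3f}, {high:.3f})")
```

The interval is roughly 13.8 to 17.7; it ignores the uncertainty in the estimated coefficients, which is one reason a statistical package gives somewhat wider intervals.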

Confidence intervals
[Figure: plot of fitted model, COST against 1/CAPACITY, with two sets of interval bands]
Statgraphics gives two sets of intervals.
• Outer bands are prediction intervals for an individual plant
• Inner bands are confidence intervals for the average cost from all plants. They can be viewed as the confidence intervals for the regression line.

Is plant age important?
Multiple regression
Cost = a + b(1/Capacity) + c Year + e

Correlation matrix
             Cost/ton   Year       1/Capacity
Cost/ton     1
Year         -0.74237   1
1/Capacity   0.9728     -0.67071   1

Regression analysis
Regression Statistics
Multiple R           0.98
R Square             0.96
Adjusted R Square    0.95
Standard Error       0.90
Observations         10

             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept    542.01         326.41           1.66     0.14      -229.83     1313.84
Year         -0.27          0.16             -1.62    0.15      -0.66       0.12
1/Capacity   7.03           0.82             8.58     0.00      5.09        8.97

Is this a good model?

Multicollinearity
[Figure: scatter plot of cost/ton ($) against capacity (000 tons/month), points labelled by year (81 to 87)]
Multicollinearity appears when explanatory variables are highly correlated.
Effects:
• Including Year adds little information, hence fit does not improve much
• Parameter estimates become unreliable
Remedial action:
• Remove one of the correlated variables
Moral:
• Check for correlations between explanatory variables
Other inappropriate models
Influential observations and outliers
Clustering of data

Dummy variables
Bond purchases and national income
Year   B      Y      W
1933   2.6    2.4    0
1934   3.0    2.8    0
1935   3.6    3.1    0
1936   3.7    3.4    0
1937   3.8    3.9    0
1938   4.1    4.0    0
1939   4.4    4.2    0
1940   7.1    5.1    1
1941   8.0    6.3    1
1942   8.9    8.1    1
1943   9.7    8.8    1
1944   10.2   9.6    1
1945   10.1   9.7    1
1946   7.9    9.6    0
1947   8.7    10.4   0
1948   9.1    12.0   0
1949   10.1   12.9   0
War years (1940 to 1945): W = 1
Regression equation: B = 1.29 + 0.68Y + 2.3W
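One way to read the fitted equation: the dummy W shifts the line up by its coefficient in war years and leaves the slope on income unchanged. The sketch below simply evaluates the slide's equation for a few rows of the table; the function name is illustrative and nothing is re-estimated.

```python
def predict_bonds(income, war):
    """Fitted equation from the slide: B = 1.29 + 0.68*Y + 2.3*W (W is 1 in war years, else 0)."""
    return 1.29 + 0.68 * income + 2.3 * war

# Rows taken from the bond purchases table: (year, B, Y, W)
rows = [(1939, 4.4, 4.2, 0), (1940, 7.1, 5.1, 1), (1945, 10.1, 9.7, 1), (1946, 7.9, 9.6, 0)]
for year, actual, income, war in rows:
    print(year, "actual:", actual, "fitted:", round(predict_bonds(income, war), 2))

# Same income, war vs. peace: the gap is exactly the dummy coefficient
war_effect = predict_bonds(5.0, 1) - predict_bonds(5.0, 0)
print("shift in war years:", round(war_effect, 2))
```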
Regression checklist
• Visually inspect the data (scatter plots)
• Calculate correlations
• Develop and fit sensible model(s)
• Assess and compare the model(s)
– Significance of variables (t-values, p-values)
– adjusted R2
– standard error (s)
– residual plots
• autocorrelation
• heteroscedasticity
• Normality
• Outliers, influential observations
– Does the model make sense?
• If you are satisfied use the model for
– developing business insights
– forecasting

Trend curves
• Also known as growth/decay curves
• Most common curves
– Linear
– Quadratic
– Exponential
– Logarithmic
– S-curves
Fitting trend curves
Transform the original data so that a linear equation of the form y=a+bx arises. Then apply regression analysis.
Example:
$$Y_t = a b^t \quad\Rightarrow\quad \log(Y_t) = \log(a) + t \log(b)$$
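A minimal sketch of the exponential case, assuming the curve Y_t = a b^t: take logs, fit a straight line in t, then transform the intercept and slope back. The series is made up (level 2.0 growing 20% per period) so that the fit recovers the parameters exactly; the helper names are illustrative.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_exponential_trend(t, y):
    """Fit Y_t = a * b**t by regressing log(Y_t) on t."""
    log_a, log_b = fit_line(t, [math.log(v) for v in y])
    return math.exp(log_a), math.exp(log_b)

# Hypothetical series: level 2.0 growing 20% per period (made-up numbers)
periods = list(range(10))
series = [2.0 * 1.2 ** t for t in periods]

a, b = fit_exponential_trend(periods, series)
forecast_next = a * b ** 10            # extrapolate one period beyond the data
print(round(a, 4), round(b, 4), round(forecast_next, 3))
```

With real data the log-scale fit minimises squared errors in log(Y), not in Y, so forecasts on the original scale deserve a sanity check.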

Credit card turnover
[Figure: Visa turnover (£bn), 1978 to 1987, actual and predicted values, exponential growth curve]
How would you use such a curve for forecasting?
What role does judgement play in trend projection?

Other trend curves (S curves)
• Simple modified exponential: $Y_t = a + b c^t$, $b < 0$
• Logistic curve: $Y_t = \frac{1}{1 + b e^{-ct}}$, $c > 0$
• Gompertz curve: $Y_t = a e^{-b e^{-ct}}$, $b > 0$, $c > 0$
• Logarithmic parabola: $Y_t = a e^{bt + ct^2}$, $c < 0$
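To show the characteristic shape, the sketch below evaluates two of these curves, assuming their commonly used parameterisations: logistic Y_t = 1/(1 + b e^(-ct)) with c > 0, and Gompertz Y_t = a e^(-b e^(-ct)) with b, c > 0. These forms and the parameter values are assumptions made for illustration and may differ in detail from the slide's notation.

```python
import math

def logistic(t, b, c):
    """Logistic curve Y_t = 1 / (1 + b*exp(-c*t)), c > 0; saturates at 1."""
    return 1.0 / (1.0 + b * math.exp(-c * t))

def gompertz(t, a, b, c):
    """Gompertz curve Y_t = a * exp(-b*exp(-c*t)), b, c > 0; saturates at a."""
    return a * math.exp(-b * math.exp(-c * t))

# Hypothetical parameter values, chosen only to show the S shape
times = range(0, 21, 4)
logistic_path = [logistic(t, b=20.0, c=0.5) for t in times]
gompertz_path = [gompertz(t, a=1.0, b=5.0, c=0.3) for t in times]

for t, lo, go in zip(times, logistic_path, gompertz_path):
    print(t, round(lo, 3), round(go, 3))
```

Both paths start near zero, rise fastest in the middle and flatten towards a ceiling, which is why a judgement about the saturation level matters so much when these curves are used for forecasting.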

Trend and seasonality
Quarterly data
Time   Sales   q1   q2   q3   q4
1      37.2    0    0    0    1
2      15.7    1    0    0    0
3      11      0    1    0    0
4      26.6    0    0    1    0
5      28.9    0    0    0    1
6      12      1    0    0    0
7      6.6     0    1    0    0
8      20.9    0    0    1    0
9      23.5    0    0    0    1
[Figure: sales ($m) against quarters, 0 to 40]
Regression with seasonal dummy variables
Sales = a + b Time + c q2 + d q3 + e q4
Include q1 in the model?

Multiple regression with seasonal dummies
Regression Statistics
Multiple R           0.95
R Square             0.90
Adjusted R Square    0.88
Standard Error       2.96
Observations         36

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   14.78          1.31             11.30    0.00      12.11       17.44
q2          -3.75          1.40             -2.69    0.01      -6.60       -0.91
q3          8.53           1.40             6.10     0.00      5.68        11.38
q4          15.66          1.40             11.23    0.00      12.82       18.51
Time        -0.25          0.05             -5.17    0.00      -0.34       -0.15

Equation: ?
Interpretation: ?
[Figure: Time line fit plot, sales and predicted sales against Time (0 to 40)]
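One possible reading of the output, offered as a sketch rather than the intended answer: the coefficients give Sales = 14.78 - 0.25 Time - 3.75 q2 + 8.53 q3 + 15.66 q4, with q1 as the baseline quarter absorbed into the intercept. The code below evaluates that equation; the quarter mapping assumes, as in the quarterly data table, that Time 1 falls in q4, and the function names are illustrative.

```python
# Coefficients copied from the regression output
COEF = {"intercept": 14.78, "time": -0.25, "q2": -3.75, "q3": 8.53, "q4": 15.66}

def quarter_of(time):
    """Quarter for a period index, following the data table where Time 1 falls in q4."""
    return (time + 2) % 4 + 1

def predict_sales(time):
    """Sales = 14.78 - 0.25*Time - 3.75*q2 + 8.53*q3 + 15.66*q4 (q1 is the baseline)."""
    seasonal = COEF.get(f"q{quarter_of(time)}", 0.0)   # q1 has no dummy, so its effect is 0
    return COEF["intercept"] + COEF["time"] * time + seasonal

# Forecasts for the four quarters after the 36 observations
forecasts = {t: round(predict_sales(t), 2) for t in range(37, 41)}
print(forecasts)
```

Each dummy coefficient measures the gap between that quarter and q1, and the Time coefficient says sales fall by about 0.25 ($m) per quarter once seasonality is accounted for.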

Econometric modelling
Regression Analysis
Sales = f(GNP, price, advertising)
Econometric Modelling
Sales = f(GNP, price, advertising)
advertising = f(sales_{t-1})
production cost = f(sales, labour cost, materials cost)
price = f(production cost, price of substitutes)
Exogenous and endogenous variables
Simultaneous parameter estimation

The CEF model
• The CEF model of the UK economy
– Agents
• Individuals
• Banks
• Other financial institutions
• Government
• Overseas agents
– Markets
• Market for goods and services
• Market for labour
• Market for capital goods
• Agents interact in each market influencing supply and demand, which in turn determine price and quantities.
– 500 equations (!)

Judgemental forecasting
• Individual
– Subjective probability assessment
• Group forecasting
– Sales force method
– Executive committees
– Expert panels
– Delphi method
• (Feedback, reassessment)
• Problems in judgemental forecasting
– Bias
– Anchoring
– Conservatism/Optimism
– Overconfidence
• Combining forecasts

Forecasting change

Crude oil price forecasts
The dangers of straight line forecasts

Energy forecasting in West Germany
Energy consumption - forecasts vs. actual data
From Diefenbacher and Johnson, “The politics of Energy Forecasting”
Persistence of mental models!

Airline industry forecasts

Forecasting & Planning
• Traditional view of forecasting
– The past explains the future
– Passive or adaptive attitude towards the future
• Modern view
– Active and creative approaches to forecasting
– Making things happen

Scenario planning
“It is impossible to forecast the future and it may be dangerous to do so”
Use of scenarios in planning
Develop a small number of internally consistent and credible views of how the world will look in the future, that present testing conditions for the business.
The future will of course be different from all of these views/scenarios, but if the company is prepared to cope with any of them, it will be able to cope with the real world.

Scenarios in Shell
[Diagram: timeline from early 1985 to early 1986 with stages Scenario design, Strategic plan, Event, Result]
• Oil shock scenario: Shell analyse the impact of a $15/bbl price on cashflows and investment plans
• Re-evaluation of up-stream plans and cash-flow position of the operating companies
• Oil price falls from $28/bbl to $10/bbl
Advantages of Scenario Planning
• Challenge preconceived ideas and single point forecasts
• Explore a wide range of uncertainties
• Encourage an active and creative attitude to the future
• Provide a background for specific project evaluation
• Provide a vehicle for communication between the different parts of the organisation

Forecasting - Summary
• All forecasts are wrong!
• Never trust single point forecasts
• Data patterns
– Trend, seasonality, cyclicality, randomness
• Time-series forecasting
– Trend curves
• Causal forecasting
– Regression
• Judgemental forecasting
• Scenario planning

Preparation for Regression workshop
• Read the note on Regression Analysis
• Work on the “Tutorial on Regression Analysis using Excel”
• Practise creating descriptive statistics and histograms in Excel (ExcelStats.xls)
• Select your workshop partner
• In preparation for the exam work on regression exercises