Quantitative Methods in
Forecasting
Assoc. Prof. Christian Tanushev, Ph.D.
9 February 2015
Overview
• Review of Qualitative Methods of Forecasting
• Types of Cross Sectional Study
• Statistical average: mode, median, mean
• Variance and Standard deviation
• Smoothing Forecasting Methods
– “Moving Averages”
– Exponential Smoothing
– Regression models
2
Forecasting - definitions
• Forecasting is the process of estimation in
unknown situations
• A forecast is a prediction based on knowledge
of past behavior.
• Normally, the prediction is expressed as a
probability. The prediction is an assertion of
likelihood that an outcome will take place.
3
Salvador Dali, Explosion Of Faith
In A Cathedral, 1974
4
Boian Donev, Pegasus, 1995
5
The Knowledge Spiral
6
Types of Knowledge
• Mode of knowing: Knowing by experience
– Example: "The benches in this lecture-hall are not comfortable."
– Area of validity: Pieces of knowledge are detached and valid only in one case
– Mode of presentation: The essential sense of "tacit" knowledge cannot be explained verbally
– Method of teaching the knowledge: Cannot be taught. Can be learned only by own experience
• Mode of knowing: Knowing how to do, "know-how"
– Example: "I know how to design a good TV couch."
– Area of validity: Knowledge can be applied in several instances
– Mode of presentation: Skill of trade. Many important points of these cannot be presented verbally
– Method of teaching the knowledge: The master shows how the thing is done; the student imitates the master
• Mode of knowing: Conceptual (theoretical) knowing
– Example: "A suitable seat height for British grown-ups is 44 cm."
– Area of validity: Knowledge can be applied to all instances of the same type. It contains mainly general rules
– Mode of presentation: The knowledge can be expressed in words and exact models, and it can be printed as a handbook
– Method of teaching the knowledge: Lectures and reading of textbooks
7
Cross Sectional Study (No time perspective)
• One single case will be studied, or a few similar cases. Holistic view: Case study
– Exploratory Case Study (chemical elements)
– Case Study Based on Earlier Theory (Periodic Table of Elements of Mendeleev)
– Normative Case Study (nuclear decay)
• A few different cases will be studied. Holistic view: Comparative Study
– Descriptive Comparison
– Normative Comparison (sample)
• A large number of different cases will be studied: Classification
– Exploratory Classification
– Classification into Given Classes
– Normative Classification
• Variables or measurements from a large number of cases are analyzed: Quantitative Analysis
– Analyzing Individual Variables
– Analyzing Relationships between Variables
– Normative Study of Variables
8
Case Study
• The most usual target in case studies is to describe the object or phenomenon - not only its external appearance but also its internal structure
9
Model
10
Exploratory Case Study
• Exploratory study, in other words not basing the study on any earlier model or theory, is usually laborious, slow and uncertain, so usually you will want to avoid such an approach if you can. The normal method is to start with a thorough search of literature.
• If you can distinguish any historical evolution that has taken place around the object, it can help you to detect a dynamic invariance in the phenomenon.
• In the case that your material consists of several similar objects or cases, your target normally becomes to find out what is common to all the cases: what is the static invariance in them. When studying artifacts it could be a typical form, pattern or proportion. When studying people it could be their prevalent attitude, widespread taste or a typical behavior.
11
Case Study Based on Theory
• It is common that in the beginning of exploratory study you will take a holistic look at the objects. It means that you start by gathering as much information about the objects as possible, and postpone the task of cutting away unnecessary data until you get a better picture about what is necessary.
• Any object can be looked at from several different viewpoints, either from the angles of various established sciences or just from miscellaneous practical points of view.
12
Normative Case Study
• An important use of normative case study is to
guide the development of a new version of an
existing model of a product
13
Descriptive Comparison
• In the initial phases of the study you only reach descriptive answers to the question what the object is and what it is like, and from this basis you can then try to explain or answer the question why the object is as it is.
• In comparative analysis you can apply all the usual types of explanation: by earlier events, by later events, and contextual explanation. It can be useful to make a table.
14
Normative Comparison
• In normative analysis one of the principal
criteria is evaluative like "satisfaction",
"usefulness" etc., and the aim of the study is to
point out the best (in this respect) among the
alternatives that are being studied.
• The final aim perhaps is not only to find the
best, but also to improve it or similar objects
later on
15
Exploratory Classification
• The goal in classification is always the same:
to reveal the systematic structure, invariance,
that exists in all the cases (population) that you
study.
16
Classification into Given Classes
• Fuzzy classification is a method which aims at placing all the cases or specimens in one or other of the classes even if the "fit" is not perfect
– Cluster analysis, which is best suited when the classification is to be made on the basis of numerical data, and
– Typology, which is appropriate for all other kinds of material. Typology is a method of classification where each class is formed around a "typical" or "pure" exemplar.
17
Linnean Classification System
• Kingdom: Monera
– Structural organization: Small, simple single prokaryotic cell (nucleus is not enclosed by a membrane); some form chains or mats
– Method of nutrition: Absorb food
– Types of organisms: Bacteria, blue-green algae, and spirochetes
– Named species: 4,000
– Total species (estimate): 1,000,000
• Kingdom: Protista
– Structural organization: Large, single eukaryotic cell (nucleus is enclosed by a membrane); some form chains or colonies
– Method of nutrition: Absorb, ingest, and/or photosynthesize food
– Types of organisms: Protozoans and algae of various types
– Named species: 80,000
– Total species (estimate): 600,000
• Kingdom: Fungi
– Structural organization: Multicellular filamentous form with specialized eukaryotic cells
– Method of nutrition: Absorb food
– Types of organisms: Funguses, molds, mushrooms, yeasts, mildews, and smuts
– Named species: 72,000
– Total species (estimate): 1,500,000
• Kingdom: Plantae
– Structural organization: Multicellular form with specialized eukaryotic cells; do not have their own means of locomotion
– Method of nutrition: Photosynthesize food
– Types of organisms: Mosses, ferns, woody and non-woody flowering plants
– Named species: 270,000
– Total species (estimate): 320,000
• Kingdom: Animalia
– Structural organization: Multicellular form with specialized eukaryotic cells; have their own means of locomotion
– Method of nutrition: Ingest food
– Types of organisms: Sponges, worms, insects, fish, amphibians, reptiles, birds, and mammals
– Named species: 1,326,239
– Total species (estimate): 9,812,298
18
Normative Classification
• A cross tabulation is normative when one of its dimensions expresses an evaluation
19
Product A Product B Product C
Good 81 % 34 % 9 %
Average 4 % 36 % 60 %
Bad 15 % 30 % 31 %
Analyzing Individual Variables
• Often the preliminary exploration starts with a
single variable.
• Before you submit data to analysis, it will often
be useful to perform some preliminary
operations. These may include:
– Removal of data which are obviously erroneous or
irrelevant.
– Normalizing or reducing your data, which means that you eliminate the influence of some well known but uninteresting factor.
20
Types of Scales
• Nominal Scale. Any numbers used are mere labels
• Ordinal Scale. Numbers indicate the relative position of items
• Interval scale. Numbers indicate the magnitude of difference between items, no absolute zero point.
• Ratio scale. Numbers indicate magnitude of difference and there is a fixed zero point
21
Mode
• An average is a statistic which characterizes the typical value of your data and eliminates the random scattering of values
• A mode is the value that occurs the most frequently in a data set or a probability distribution. The mode of the sample { 1, 1, 2, 2, 2, 4, 4, 5, 6 } is 2.
– The sample { -4, -2, -2, 0, 2, 2, 3 } has two modes, -2 and 2: a bimodal distribution
22
Median
• A median is described as the number separating the higher half of a sample, a population, or a probability distribution, from the lower half.
• It is a non-algebraic, positional average: it takes the value of the unit in the middle of the ordered statistical series.
– Median of the set { 1, 1, 2, 2, 2, 4, 4, 5, 6 }: there are 9 observations (repeated values are counted), an odd number, so we pick the middle (fifth) observation and the median is 2.
– Median of the set { 1, 2, 4, 5 }: there are 4 observations, an even number, so the median is half the sum of the second and third observations: (2+4)/2 = 3.
• The average price of housing is usually reported as the median.
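The mode and median rules above can be checked with Python's standard `statistics` module. This is only a small sketch on the sample sets from the slides; the variable names are illustrative.

```python
from statistics import median, mode, multimode

sample = [1, 1, 2, 2, 2, 4, 4, 5, 6]
bimodal = [-4, -2, -2, 0, 2, 2, 3]

sample_mode = mode(sample)          # the most frequent value
bimodal_modes = multimode(bimodal)  # every value tied for most frequent
sample_median = median(sample)      # 9 observations: the fifth one in sorted order
even_median = median([1, 2, 4, 5])  # 4 observations: mean of the two middle ones

print(sample_mode, bimodal_modes, sample_median, even_median)
```

Note that repeated values count as separate observations when the middle position is located.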
23
Median Graph
24
Statistical Arithmetic Mean
• Non weighted arithmetic mean (average)
• Weighted arithmetic mean
• The expected value of a random variable, which is also called the population mean
25
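Arithmetic mean
• Statistical set: 7, 12, 17, 24, 35 has a mean 19
• If we assume the probabilities relative to these values as follows 0.1; 0.4; 0.2; 0.2; 0.1, then the weighted average will be 17.2
26
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{7 + 12 + 17 + 24 + 35}{5} = \frac{95}{5} = 19$$
$$\bar{x}_w = \frac{\sum_{i=1}^{n} x_i * w_i}{\sum_{i=1}^{n} w_i} = \frac{7*0.1 + 12*0.4 + 17*0.2 + 24*0.2 + 35*0.1}{0.1 + 0.4 + 0.2 + 0.2 + 0.1} = \frac{17.2}{1} = 17.2$$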
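As a quick check of the two averages, here is a minimal sketch in plain Python using the values and weights from the slide; the function names are illustrative.

```python
values = [7, 12, 17, 24, 35]
weights = [0.1, 0.4, 0.2, 0.2, 0.1]


def arithmetic_mean(xs):
    """Non-weighted mean: sum of the values divided by their count."""
    return sum(xs) / len(xs)


def weighted_mean(xs, ws):
    """Weighted mean: sum of value*weight divided by the sum of the weights."""
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)


simple = arithmetic_mean(values)
weighted = weighted_mean(values, weights)
print(simple, round(weighted, 2))
```

Because the weights here sum to 1, the weighted mean is also the expected value of a random variable that takes these values with these probabilities.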
Normal Distribution
• If your studies involve people, your measurements quite often turn out to be distributed according to a certain curve, the so called Gauss curve
• One of its properties is that about 68% of all measurements differ from the mean by no more than one standard deviation, and about 95% by no more than two standard deviations
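The 68% and 95% figures can be reproduced from the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Share of a normal population lying within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


one_sigma = share_within(1)
two_sigma = share_within(2)
print(round(one_sigma, 4), round(two_sigma, 4))
```

The result does not depend on the particular mean and standard deviation, only on the number of standard deviations k.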
27
Averages in Economic Forecasting
28
Weighted Average
(11,98=5*6+6*18+7*20+...+32*1)
Median (The value of the 100-th
from 199 observations)
Dispersion (Statistical Variation)
• Once you have calculated the average value, it is often useful to describe how far the individual values are scattered around the average. To this end, you may choose among a variety of statistics
• In connection with the arithmetic mean you will
often want to calculate the standard deviation
29
Variance Formulas
• Dispersion is a measure of the variability or spread of a variable or a probability distribution
• The formulas for a population and a sample
will differ
30
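Population:
$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
Sample:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$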
Variance of a Random Variable
• The variance of a random variable is one measure of statistical dispersion, averaging the squared distance of its possible values from the expected value (mean). Whereas the mean is a way to describe the location of a distribution, the variance is a way to capture its scale or degree of being spread out.
31
$$\sigma^2 = \sum_{i=1}^{n} (r_i - E(r))^2 * \mathrm{Pr}_i$$
$$E(r) = \sum_{i=1}^{n} r_i * \mathrm{Pr}_i$$
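To make the definition concrete, the sketch below computes the expected value and the variance of a discrete random variable. It reuses the values 7, 12, 17, 24, 35 and the probabilities 0.1, 0.4, 0.2, 0.2, 0.1 from the arithmetic mean slide; treating them as the outcomes of one random variable is an assumption made only for illustration.

```python
outcomes = [7, 12, 17, 24, 35]
probabilities = [0.1, 0.4, 0.2, 0.2, 0.1]


def expected_value(rs, ps):
    """E(r): each possible value weighted by its probability."""
    return sum(r * p for r, p in zip(rs, ps))


def variance(rs, ps):
    """Probability-weighted average of squared distances from E(r)."""
    mean = expected_value(rs, ps)
    return sum(p * (r - mean) ** 2 for r, p in zip(rs, ps))


mean_r = expected_value(outcomes, probabilities)
var_r = variance(outcomes, probabilities)
print(round(mean_r, 2), round(var_r, 2))
```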
Standard Deviation
• The positive square root of the variance, called the standard deviation, has the same units as the original variable and can be easier to interpret for this reason
• The standard deviation is a measure of risk, it evaluates the probable deviation of the real values from the expected values.
32
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$
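The difference between the population and the sample formulas can be seen on a small data set. This sketch applies the standard library's `statistics` functions to the set 7, 12, 17, 24, 35 used earlier in the slides; choosing that set here is an assumption for illustration.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [7, 12, 17, 24, 35]

population_variance = pvariance(data)  # squared deviations divided by n
sample_variance = variance(data)       # squared deviations divided by n - 1
population_sd = pstdev(data)           # square root of the population variance
sample_sd = stdev(data)                # square root of the sample variance

print(population_variance, sample_variance)
print(round(population_sd, 2), round(sample_sd, 2))
```

The sample variance is always the larger of the two, because the same sum of squared deviations is divided by n - 1 instead of n.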
Diachronic, or historical, study
• Holistic study of the evolution of individuals or specimens: Analyzing Development
– Describing Development
– Explaining Development
– Normative Study of Development
• Study of the development of variables: Study of Time Series
33
Describing Development
• Individual development. Temporal view of an
industrial product
• Development of a class.
• A time series is a sequence of values of a variable collected over a period of time, usually at even intervals.
– The curve is the most usual presentation for time
series.
– Time is normally presented on the horizontal x-axis.
34
Explaining Development
• Often a mere description of the changes in the object
of study does not suffice, and the researcher is asked
to uncover also the reasons and/or effects of the
changes.
• The reasons can be taken either from:
– the past (causal explanation).
– the concurrent context
– the future (i.e. from the intentions of people)
35
Normative Study of Development
• The focal point is to evaluate specific criteria: utility, effectiveness, functionality, safety, beauty, eco-friendliness, price
36
Dominant goal in the design theory of architecture: Style of architecture
• Beauty: Doric, Ionian and Corinthian styles
• Religious salvation: The Gothic style
• Beauty (the classical goal restored): Renaissance, baroque, rococo, neo-classical style
• Individualism: l'Art Nouveau and other personal styles like Gaudi's
• Utility: Functionalism
Beauty – Doric Order
37
Beauty – Ionic Order
38
Beauty – Corinthian Order
39
Religious Salvation – Gothic Style
40
Beauty - Renaissance
41
Individualism – Gaudi Personal Style
42
Utility – Functionalism
43
Time Series
44
Jefferson Memorial
45
Lincoln Memorial
46
Time Series
• A closer look at the variation of a time series often reveals several components, each with its own regularities that can be analyzed. The most usual of these components are:
– A trend is a linear direction of development over a period of time.
– A periodic variation is a cyclical variation that recurs in a similar form.
– Conjuncture (business-cycle) variation occurs repeatedly in the same way as a periodic variation, but its length and form vary.
– Random variation is usually eliminated by means of the moving average method.
47
Time Series - Graph
48
Extrapolation
• Extrapolation is the most usual method of forecasting. It is based on the assumption that present development will continue in the same direction and with unvarying speed (or alternatively, with steadily growing or diminishing speed, i.e. a logarithmic extrapolation).
49
[Figure: extrapolation of d over time periods t = 1 to 5]
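A minimal sketch of the "same direction, unvarying speed" case follows. The series and the function name are hypothetical, chosen only to show the mechanics; they are not taken from the slides.

```python
def extrapolate_linear(series, steps):
    """Continue a series assuming its average change per period persists."""
    change_per_period = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + change_per_period * k for k in range(1, steps + 1)]


# Hypothetical observations for periods t = 1..5
observed = [10, 12, 14, 16, 18]
next_two = extrapolate_linear(observed, 2)
print(next_two)
```

The forecast is only as good as the assumption behind it: if the speed of development changes, a straight-line continuation drifts away from reality.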
Smoothing Methods
• Smoothing techniques are applied to a time series of historical (observed in practice) data to calculate a forecast of a future value of the series.
• The basic notion inherent in smoothing
methods is that there is some pattern to the
series of data and that this pattern will continue
in the future; consequently, past events are
drawn upon to predict or forecast future events
50
Three Basic Methods
• The accuracy of the forecast with smoothing
techniques depends primarily on
– the cohesiveness of the series of historical data and
– how far in the future the forecast is made.
• Smoothing Techniques
– “Moving Averages”
– Exponential Smoothing
– Regression models
51
Moving Averages
The method of moving averages uses the average
of some specified historical period to forecast
the value of a future period.
Moving average for 3, 5, 9, 21 periods...
52
$$F_{t+1} = \frac{A_t + A_{t-1} + A_{t-2} + \dots + A_{t-n+1}}{n}$$
where $F_{t+1}$ is a forecasted value for the next period
$A_t$ to $A_{t-n+1}$ = real observed values for the period
n = number of periods used to calculate the moving averages
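The moving average forecast can be sketched in a few lines of Python. The series below is the "Value" column of the moving average example in these slides (2007 to 2018); the function name is illustrative.

```python
# Observed values for 2007..2018
values = [31.6, 30.5, 31.8, 34.2, 36.3, 39.3, 41.7, 50.0, 46.8, 43.7, 52.1, 63.3]


def moving_average_forecasts(series, n):
    """Forecast each period as the mean of the n observations before it.

    The first forecast is for position n; the last one is for the period
    right after the end of the series.
    """
    return [sum(series[i - n:i]) / n for i in range(n, len(series) + 1)]


ma3 = moving_average_forecasts(values, 3)  # forecasts for 2010..2019
ma5 = moving_average_forecasts(values, 5)  # forecasts for 2012..2019
print([round(f, 1) for f in ma3])
print([round(f, 1) for f in ma5])
```

A longer window smooths more strongly but reacts more slowly to recent changes in the series.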
A Double Moving Average
• A Double Moving Average is calculated from already calculated moving averages for different periods of time
53
$$F''_{t+1} = \frac{MA'_t + MA'_{t-1} + MA'_{t-2} + \dots + MA'_{t-n+1}}{n}$$
where $F''_{t+1}$ is a forecasted value
$MA'_t$ to $MA'_{t-n+1}$ = calculated moving averages for the period
n = number of periods used to calculate the moving averages
Example of a “moving average” (MA)
54
Year  Value  3-years MA  Double 3-years MA  5-years MA
2007  31.6   -           -                  -
2008  30.5   -           -                  -
2009  31.8   -           -                  -
2010  34.2   31.3        -                  -
2011  36.3   32.2        -                  -
2012  39.3   34.1        -                  32.9
2013  41.7   36.6        32.5               34.4
2014  50.0   39.1        34.3               36.7
2015  46.8   43.7        36.6               40.3
2016  43.7   46.2        39.8               42.8
2017  52.1   46.8        43.0               44.3
2018  63.3   47.5        45.6               46.9
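The double moving average column can be reproduced by applying the same averaging step twice. This is a sketch under the convention the table appears to use, namely that the entry for a year averages the n entries before it; the helper name is illustrative.

```python
# Observed values for 2007..2018
values = [31.6, 30.5, 31.8, 34.2, 36.3, 39.3, 41.7, 50.0, 46.8, 43.7, 52.1, 63.3]


def moving_average(series, n):
    """Entry i is the mean of the n entries before position i.

    Positions without n usable earlier entries are left as None.
    """
    result = []
    for i in range(len(series)):
        window = series[max(0, i - n):i]
        usable = len(window) == n and None not in window
        result.append(sum(window) / n if usable else None)
    return result


ma3 = moving_average(values, 3)      # first value appears for 2010
double_ma3 = moving_average(ma3, 3)  # first value appears for 2013

for year, single, double in zip(range(2007, 2019), ma3, double_ma3):
    print(year, single and round(single, 1), double and round(double, 1))
```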
Model of “A Double Moving
Average ”
55
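$$F_{t+1} = \alpha + \beta * t$$
$$\alpha = 2 MA'_{t+1} - MA''_{t+1}$$
$$\beta = \frac{2}{n - 1} (MA'_{t+1} - MA''_{t+1})$$
where $F_{t+1}$ is a forecasted value
$\alpha$, $\beta$ = coefficients
$MA'_{t+1}$ and $MA''_{t+1}$ = evaluation of the first and second moving averages
n = number of years used to calculate the moving average
t = number of periods of time in the future which we forecast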
Model’s Forecast
56
Level: $\alpha = 2 MA'_{t+1} - MA''_{t+1}$
Trend: $\beta = \frac{2}{n - 1} * (MA'_{t+1} - MA''_{t+1})$
Forecast: $F_{t+1} = \alpha + \beta * P$, here with P = 1 period ahead

Year  Value  3-yrs. MA  Double 3-yrs. MA  Level α  Trend β  Forecast
2007  28.2   -          -                 -        -        -
2008  31.6   -          -                 -        -        -
2009  30.5   -          -                 -        -        -
2010  31.8   30.10      -                 -        -        -
2011  34.2   31.30      -                 -        -        -
2012  36.3   32.17      -                 -        -        -
2013  39.3   34.10      31.19             37.01    2.91     39.92
2014  41.7   36.60      32.52             40.68    4.08     44.76
2015  50.0   39.10      34.29             43.91    4.81     48.72
2016  46.8   43.67      36.60             50.73    7.07     57.80
2017  43.7   46.17      39.79             52.54    6.38     58.92
2018  52.1   46.83      42.98             50.69    3.86     54.54
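The level, trend and forecast columns can be recomputed as follows. This is a sketch that assumes, as the table values suggest, that the moving average for a year is the mean of the n preceding observations and that the forecast is one period ahead; the function names are illustrative.

```python
# Observed values for 2007..2018
values = [28.2, 31.6, 30.5, 31.8, 34.2, 36.3, 39.3, 41.7, 50.0, 46.8, 43.7, 52.1]


def trailing_mean(series, i, n):
    """Mean of the n observations before position i."""
    return sum(series[i - n:i]) / n


def double_ma_forecast(series, i, n, periods_ahead=1):
    """Return (level, trend, forecast) for position i; requires i >= 2 * n."""
    first_ma = trailing_mean(series, i, n)
    # Second moving average: mean of the n first-order averages before i
    second_ma = sum(trailing_mean(series, j, n) for j in range(i - n, i)) / n
    level = 2 * first_ma - second_ma
    trend = 2 / (n - 1) * (first_ma - second_ma)
    return level, trend, level + trend * periods_ahead


for i in range(6, len(values)):
    level, trend, forecast = double_ma_forecast(values, i, n=3)
    print(2007 + i, round(level, 2), round(trend, 2), round(forecast, 2))
```

Small differences in the last digit against the table can come from rounding the intermediate averages.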
Exponential Moving Average
57
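$$F_t = \alpha * A_{t-1} + (1 - \alpha) * F_{t-1}$$
$$F_t = F_{t-1} + \alpha (A_{t-1} - F_{t-1})$$
where
$F_t$ = the value of forecast (exponential MA) for period t
$A_t$ = the actual value for period t
$\alpha = \frac{2}{n + 1}$ = weighting coefficient
n = number of time periods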
Exponential Moving Average Forecast
Year  Value  Forecast α = 0.2  Forecast α = 0.5
2007  28.2   -                 -
2008  31.6   28.20             28.20
2009  30.5   28.88             29.90
2010  31.8   29.20             30.20
2011  34.2   29.72             31.00
2012  36.3   30.62             32.60
2013  39.3   31.75             34.45
2014  41.7   33.26             36.88
2015  50.0   34.95             39.29
2016  46.8   37.96             44.64
2017  43.7   39.73             45.72
2018  52.1   40.52             44.71
58
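The two forecast columns follow from the recursion forecast = α * previous actual + (1 - α) * previous forecast. The sketch below assumes, as the table does, that the first forecast (2008) is simply the first actual value (2007); the function name is illustrative.

```python
# Observed values for 2007..2018
actuals = [28.2, 31.6, 30.5, 31.8, 34.2, 36.3, 39.3, 41.7, 50.0, 46.8, 43.7, 52.1]


def exponential_forecasts(series, alpha):
    """Forecasts for the second period onward, seeded with the first actual."""
    forecasts = [series[0]]
    for previous_actual in series[1:-1]:
        forecasts.append(alpha * previous_actual + (1 - alpha) * forecasts[-1])
    return forecasts


slow = exponential_forecasts(actuals, 0.2)  # forecasts for 2008..2018
fast = exponential_forecasts(actuals, 0.5)
for year, f_slow, f_fast in zip(range(2008, 2019), slow, fast):
    print(year, round(f_slow, 2), round(f_fast, 2))
```

A larger α gives more weight to the most recent observation, so the forecast follows the series more closely; a smaller α smooths more.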
EMA – a Trigger in Technical
Analysis
59
[Figure: price chart with a 12-day EMA (shorter and faster) and a 26-day EMA (longer and slower); the chart marks a Sell Signal and a Buy Signal]
Regression Analysis
• Explore if a certain variable is causally
dependent on one or more other variables
60
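$$y_i = \alpha + \beta * x_i + \varepsilon_i$$
$y_i$ = dependent variable
$x_i$ = independent variable
$\alpha$ = cross point with the y axis
$\beta$ = the angle of the slope of the line
$\varepsilon_i$ = forecast error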
Linear Trend in Time:
Calculating α and β
61
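$$\beta = \frac{n \sum ty - \sum t \sum y}{n \sum t^2 - (\sum t)^2}$$
$$\alpha = \frac{\sum y - \beta \sum t}{n}$$
$$y_i = \alpha + \beta * t_i$$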
Linear Trend Equation Example
62
Week (t)  t²  Sales (y)  ty
1         1   150        150
2         4   157        314
3         9   162        486
4         16  166        664
5         25  177        885
Σt = 15   Σt² = 55   Σy = 812   Σty = 2499
(Σt)² = 225
Linear Trend Calculation
63
$$\beta = \frac{5*2499 - 15*812}{5*55 - 225} = \frac{12495 - 12180}{275 - 225} = 6.3$$
$$\alpha = \frac{812 - 6.3*15}{5} = 143.5$$
$$y = 143.5 + 6.3 * t$$
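The same calculation can be written as a short Python function. This sketch uses the weekly sales data from the example; the function name and the week 6 forecast are added for illustration.

```python
weeks = [1, 2, 3, 4, 5]
sales = [150, 157, 162, 166, 177]


def linear_trend(t, y):
    """Least-squares intercept (alpha) and slope (beta) of y = alpha + beta * t."""
    n = len(t)
    sum_t, sum_y = sum(t), sum(y)
    sum_ty = sum(a * b for a, b in zip(t, y))
    sum_t2 = sum(a * a for a in t)
    beta = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
    alpha = (sum_y - beta * sum_t) / n
    return alpha, beta


alpha, beta = linear_trend(weeks, sales)
week_6_forecast = alpha + beta * 6  # extend the trend line one week ahead
print(round(alpha, 1), round(beta, 1), round(week_6_forecast, 1))
```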
Linear Regression
64
Year  Value
2009  4
2010  8
2011  10
2012  12
2013  13
2014  14
2015  14
2016  17.14 (forecast)
2017  18.75 (forecast)
Trend line: y = 1.607x + 4.2861
[Figure: values for 2009-2017 with the fitted linear trend line]
Forecast Accuracy
• Forecast Error - difference between the actual value and predicted value for a given time period: e_t = A_t – F_t
• Mean Absolute Deviation (MAD)
– Average absolute error
• Mean Squared Error (MSE)
– Average of squared error
• Mean Absolute Percent Error (MAPE)
– Average absolute percent error
65
MAD, MSE, and MAPE
66
$$MAD = \frac{\sum_{t=1}^{n} |A_t - F_t|}{n}$$
$$MSE = \frac{\sum_{t=1}^{n} (A_t - F_t)^2}{n}$$
$$MAPE = \frac{\sum_{t=1}^{n} \frac{|A_t - F_t|}{A_t} * 100}{n}$$
Calculation of MAD, MSE, MAPE
67
Period  Actual  Forecast  (A-F)  |A-F|  (A-F)^2  (|A-F|/A)*100
1       217     215       2      2      4        0.92
2       213     216       -3     3      9        1.41
3       216     215       1      1      1        0.46
4       210     214       -4     4      16       1.90
5       213     211       2      2      4        0.94
6       219     214       5      5      25       2.28
7       216     217       -1     1      1        0.46
8       212     216       -4     4      16       1.89
Total                     -2     22     76       10.26
MAD = 2.75
MSE = 9.50
MAPE = 1.28
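The three accuracy measures can be computed directly from the Actual and Forecast columns. A minimal sketch in plain Python follows; the variable names are illustrative.

```python
actual = [217, 213, 216, 210, 213, 219, 216, 212]
forecast = [215, 216, 215, 214, 211, 214, 217, 216]

errors = [a - f for a, f in zip(actual, forecast)]
n = len(errors)

mad = sum(abs(e) for e in errors) / n  # mean absolute deviation
mse = sum(e * e for e in errors) / n   # mean squared error
# mean absolute percent error, each error relative to its actual value
mape = sum(abs(e) / a * 100 for e, a in zip(errors, actual)) / n

print(mad, mse, round(mape, 2))
```

MSE penalizes large errors more heavily than MAD, while MAPE expresses the error relative to the size of the actual values.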
Regression Log Function
68
Year  Value
2009  4
2010  8
2011  10
2012  12
2013  13
2014  14
2015  14
2016  15.3 (forecast)
2017  16.0 (forecast)
Trend line: y = 5.3577ln(x) + 4.1892
[Figure: values for 2009-2017 with the fitted logarithmic trend line]
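A logarithmic trend can be fitted as an ordinary linear regression of y on ln(x). The sketch below assumes that the years are numbered x = 1 for 2009 up to x = 7 for 2015, which reproduces the coefficients shown on the slide; the function name is illustrative. The linear fit is included for comparison.

```python
import math

periods = [1, 2, 3, 4, 5, 6, 7]  # assumed coding: 1 = 2009, ..., 7 = 2015
values = [4, 8, 10, 12, 13, 14, 14]


def fit_line(xs, ys):
    """Least-squares intercept and slope of y = alpha + beta * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta


# Logarithmic model: regress y on ln(x)
log_alpha, log_beta = fit_line([math.log(x) for x in periods], values)
log_2016 = log_alpha + log_beta * math.log(8)
log_2017 = log_alpha + log_beta * math.log(9)

# Linear model on the same data, for comparison
lin_alpha, lin_beta = fit_line(periods, values)
lin_2016 = lin_alpha + lin_beta * 8
lin_2017 = lin_alpha + lin_beta * 9

print(round(log_beta, 4), round(log_alpha, 4), round(log_2016, 1), round(log_2017, 1))
print(round(lin_beta, 4), round(lin_alpha, 4), round(lin_2016, 2), round(lin_2017, 2))
```

The logarithmic curve flattens as x grows, so its forecasts for 2016 and 2017 are lower than those of the straight line fitted to the same data.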
Summary
• Quantitative methods of forecasting provide us
with an expected value of a certain indicator.
• These measures help us to obtain a clearer picture of the future
• Excellent knowledge of statistics helps us to be
more precise in forecasting economic variables
69