Time Series Analysis with R
Jinseog Kim
Department of Statistics and Information Science
Dongguk University
E-mail:[email protected]
1 Time series analysis in R
1.1 R packages for time series analysis
• stats: (R base package)
• datasets: The R Datasets Package
• forecast(forecasting)
• tseries: Time series analysis and computational finance
• fma: Data sets from "Forecasting: Methods and Applications" by Makridakis, Wheelwright & Hyndman (1998)
> library(datasets)
> class(LakeHuron)
[1] "ts"
> plot(LakeHuron, ylab="Water level (in feet)", xlab="Year")
> lag.plot(LakeHuron, lags=4, diag.col = "forest green", do.lines=FALSE)
Figure 1: lag.plot (four panels, lag 1 to lag 4, each plotting LakeHuron against its lagged values; both axes run from about 576 to 582)
WWWusage (Internet Usage per Minute): a time series of the numbers of users connected to the Internet through a server every minute.
> work <- diff(WWWusage)
> par(mfrow = c(2,1));
> plot(WWWusage);
> plot(work)
lynx: Annual numbers of lynx trapped in the Mackenzie River district of northwest Canada, 1821-1934.
#plot(lynx)
#tsdisplay(lynx)
dlynx <- diff(lynx)
par(mfrow = c(2,1));
ts.plot(lynx);
plot(dlynx)
Figure 2: lynx plot (top panel: lynx against Time; bottom panel: dlynx against Time; 1820 to 1920 on the time axis)
random walk: Simulation data
# steps of +1 or -1 with probability 0.5 each
rw <- rbinom(100, 1, prob=0.5)*2-1
rw <- cumsum(rw)
par(mfrow = c(2,1))
ts.plot(rw,
main="random walk from independent Bernoulli distribution")
# steps drawn from a standard normal distribution
rw <- rnorm(100)
rw <- cumsum(rw)
ts.plot(rw,
main="random walk from independent normal distribution")
[Figure: two panels, "random walk from independent Bernoulli distribution" and "random walk from independent normal distribution", each plotting rw against Time (0 to 100)]
rw <- matrix(ncol=10, nrow=100)
for(i in 1:10)
{
x <- rnorm(100)
rw[,i] <- cumsum(x)
}
# plot all ten paths once, after the loop has filled every column
ts.plot(rw,
main="random walk from independent normal distribution",
col=1:10)
filter: Linear filter
filter(x, filter, method = c("convolution", "recursive"), sides = 2, circular = FALSE, init)
With method = "recursive": $y_t = x_t + f_1 y_{t-1} + \cdots + f_p y_{t-p}$
plot(dlynx)
# moving sums over 6 and 10 points, divided by the window length to give moving averages
lines(filter(dlynx, rep(1,6))/6, col=2)
lines(filter(dlynx, rep(1,10))/10, col=3)
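As a minimal sketch of the two filter methods, the recursion above can be checked by hand on a short series. The values of x and f1 below are made up for illustration and do not come from the slides.

```r
# Illustration values (assumed, not from the slides)
x  <- c(1, 2, 3, 4, 5, 6)
f1 <- 0.5

# Recursive filter: y_t = x_t + f1 * y_{t-1}, with the default initial value 0
y_rec <- stats::filter(x, f1, method = "recursive")

# The same recursion written out by hand
y_manual <- numeric(length(x))
prev <- 0
for (t in seq_along(x)) {
  y_manual[t] <- x[t] + f1 * prev
  prev <- y_manual[t]
}

# Convolution filter with equal weights 1/3: a centred 3-term moving average
# (the first and last values are NA because the window runs off the series)
y_ma <- stats::filter(x, rep(1/3, 3), method = "convolution", sides = 2)
```

Calling stats::filter explicitly avoids clashes with other packages that also export a function named filter.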
1.2 Functions for fitting time series models
• ar: Fit Autoregressive Models to Time Series
• arima: ARIMA Modelling of Time Series
1.3 Functions for smoothing or filtering time series
• tsSmooth: Use Fixed-Interval Smoothing
• filter: Linear Filtering on a Time Series
• HoltWinters: Holt-Winters Filtering
• KalmanLike: Kalman Filtering
• KalmanRun: Kalman Filtering
• KalmanSmooth: Kalman Filtering
1.4 Functions for forecasting from fitted time series models
• predict.ar: Fit Autoregressive Models to Time Series
• predict.Arima: Forecast from ARIMA fits
• predict.arima0: ARIMA Modelling of Time Series - Preliminary Version
• predict.HoltWinters: prediction function for fitted Holt-Winters models
• KalmanForecast: Kalman Filtering
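A minimal sketch of how the fitting and prediction functions above fit together, using the lh dataset from the datasets package. The ARIMA order and the forecast horizon of 5 are assumptions chosen only for illustration.

```r
library(datasets)

# Fit an AR model (order selected by AIC) and an ARIMA(1,0,0) model to lh
fit_ar    <- ar(lh)
fit_arima <- arima(lh, order = c(1, 0, 0))   # order is an assumed choice

# predict() dispatches to predict.ar and predict.Arima respectively;
# both return a list with point forecasts ($pred) and standard errors ($se)
p_ar    <- predict(fit_ar, newdata = lh, n.ahead = 5)
p_arima <- predict(fit_arima, n.ahead = 5)

p_arima$pred
p_arima$se
```

The methods are normally reached through the generic predict() rather than by calling predict.ar or predict.Arima directly.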
1.5 Functions for analyzing time series
• lag: Lag a Time Series
• acf: Auto-Correlation Function Estimation
• ccf: Cross Correlation Function Estimation
• pacf: Partial Autocorrelation Function Estimation
1.6 Functions for analyzing time series (plots)
• lag.plot: Time Series Lag Plots
• ts.plot: Plot Multiple Time Series
• tsdiag: Diagnostic Plots for Time-Series Fits
• plot.acf: Plot Autocovariance and Autocorrelation Functions
1.7 Functions for statistical tests in time series analysis
• Box.test: Box-Pierce and Ljung-Box Tests
• PP.test: Phillips-Perron Test for Unit Roots
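A minimal sketch of both tests, assuming an AR(1) fit to lh as the model whose residuals are checked. The lag of 10 and the choice of datasets are illustrative assumptions.

```r
library(datasets)

# Ljung-Box test on the residuals of an (assumed) AR(1) fit;
# fitdf = 1 removes one degree of freedom for the estimated AR coefficient
fit <- arima(lh, order = c(1, 0, 0))
bt  <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)

# Phillips-Perron test of the null hypothesis that the series has a unit root
pp <- PP.test(LakeHuron)

bt
pp
```

A large p-value from Box.test gives no evidence of remaining autocorrelation in the residuals; a small p-value from PP.test is evidence against a unit root.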
1.8 Functions for creating and manipulating time series objects
• ts: Time-Series Objects
• ts.intersect: Bind Two or More Time Series
• ts.plot: Plot Multiple Time Series
• ts.union: Bind Two or More Time Series
• tsp: Tsp Attribute of Time-Series-like Objects
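A minimal sketch of these functions on a small quarterly series. The data values and the start date are made up for illustration.

```r
# A quarterly series starting in 2000 Q1 (values assumed for illustration)
x <- ts(c(5, 7, 9, 11, 13, 15), start = c(2000, 1), frequency = 4)

# tsp gives the start time, end time and frequency
tsp(x)

# Shift the time base forward by one quarter
y <- stats::lag(x, -1)

# ts.union keeps every time point (padding with NA);
# ts.intersect keeps only the time points common to both series
u <- ts.union(x, y)
i <- ts.intersect(x, y)
```

Here u covers 2000 Q1 to 2001 Q3, while i covers only the overlap from 2000 Q2 to 2001 Q2.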
2 Time series models
Random variables in a time series model:
$\{X_t\}$, where $t$ denotes time, $t \ge 0$.
Moving average models (MA): assuming $a_t \sim \text{iid}(0, \sigma^2)$, consider the following model.
$X_t = a_t + \phi a_{t-1} + \phi^2 a_{t-2} + \phi^3 a_{t-3} + \cdots$
The equation above can be written as follows.
$X_t = \phi X_{t-1} + a_t.$
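The step between the two forms can be written out as follows. This short derivation assumes $|\phi| < 1$ so that the infinite sum converges, a condition the slide does not state at this point.

```latex
\begin{aligned}
\phi X_{t-1} &= \phi a_{t-1} + \phi^2 a_{t-2} + \phi^3 a_{t-3} + \cdots \\
X_t - \phi X_{t-1} &= a_t \\
X_t &= \phi X_{t-1} + a_t
\end{aligned}
```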
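2.1 Basic concepts
• Stationarity: weak stationarity
– $E(X_t) = \mu$: the mean is constant for all $t$.
– $\mathrm{Var}(X_t) = \sigma^2$
– $\mathrm{Cov}(X_t, X_{t-h}) = \sigma(h)$.
Note: if $X_t$ and $Y_t$ are each stationary time series, then $aX_t + bY_t$ is also a stationary time series (provided the two series are jointly stationary, for example when they are independent).
Check the stationarity of the following model:
$X_t = e_t + 0.4 e_{t-1}$, $e_t \sim WN(0, \sigma^2)$.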
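One way to approach this check numerically is sketched below. It assumes Gaussian white noise with $\sigma = 1$, a sample size of 500 and a fixed seed, none of which is specified on the slide. For this model the mean is 0, the variance is $(1 + 0.4^2)\sigma^2$, the lag-1 autocovariance is $0.4\sigma^2$, and all higher-lag autocovariances are 0, so none of them depends on $t$.

```r
# Theoretical autocorrelations of X_t = e_t + 0.4 e_{t-1} at lags 0..3
rho <- ARMAacf(ma = 0.4, lag.max = 3)

# Lag-1 autocorrelation computed by hand: 0.4 / (1 + 0.4^2)
rho_1 <- 0.4 / (1 + 0.4^2)

# Simulated series (Gaussian noise, sigma = 1 and n = 500 are assumptions)
set.seed(1)
e <- rnorm(501)
x <- e[-1] + 0.4 * e[-501]

# Sample autocorrelations, to compare with rho
acf(x, lag.max = 3, plot = FALSE)
```

The sample autocorrelation at lag 1 should be close to rho_1, and those at lags 2 and 3 should be close to 0.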
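• Autocovariance:
$\mathrm{Var}(X_t) = \sigma(0)$
$\mathrm{Cov}(X_t, X_{t-h}) = \sigma(h)$
$\mathrm{Cov}(X_{t-h}, X_t) = \sigma(-h)$, $\sigma(h) = \sigma(-h)$
$|\sigma(h)| \le \sigma(0)$
• Autocorrelation:
$\rho_h = \mathrm{corr}(X_t, X_{t-h}) = \sigma(h)/\sigma(0)$
$\rho_0 = 1$
$\rho_h = \rho_{-h}$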
2.2 Examples of stationary time series
• White noise: white noise process
$a_t \sim (0, \sigma^2)$
• $X_t = \phi X_{t-1} + a_t$, $|\phi| < 1$
• $X_t = \theta a_{t-1} + a_t$, $|\theta| < 1$
• Random walk: random walk process
$X_t = X_{t-1} + a_t$, where $a_t \sim (0, \sigma^2)$.
Assuming $X_0 = 0$, $X_t = (X_{t-2} + a_{t-1}) + a_t = \cdots = a_1 + a_2 + \cdots + a_t$.
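Unlike the other items in this list, the random walk does not satisfy the constant-variance condition of weak stationarity. A short calculation, assuming the $a_t$ are uncorrelated with mean 0 and variance $\sigma^2$:

```latex
\begin{aligned}
E(X_t) &= E(a_1) + E(a_2) + \cdots + E(a_t) = 0 \\
\mathrm{Var}(X_t) &= \mathrm{Var}(a_1) + \mathrm{Var}(a_2) + \cdots + \mathrm{Var}(a_t) = t\sigma^2
\end{aligned}
```

The variance grows with $t$, whereas the differenced series $X_t - X_{t-1} = a_t$ is white noise.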
2.3 Fitting ARIMA models
2.3.1 AR models
We are now going to load up some of the datasets and try to fit ARIMA models to them.
• Exercise 1. Load up the dataset beavers from R, and then analyse
the temperature data in the dataframe beaver1: plot it, then inspect
the ACF, then try fitting AR models using the ar command in its
various forms; ar.yw, ar.burg, ar.mle. Using the first 80 observations,
predict ahead the remaining 34: I had stored the temperatures in a
time-series object y, so I did
> new <- y[1:80]
> # F2 is assumed to be an AR model already fitted with one of the ar functions
> pr <- predict(F2, newdata=new, n.ahead=34)
> plot(y)
> lines(pr$pred, col="red")
Comment on what you have found.
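One possible way to set up Exercise 1 from start to finish is sketched below. It assumes that y holds the temp column of beaver1 and that the model is fitted with ar.yw on the first 80 observations only; the exercise leaves both choices open, and ar.burg or ar.mle can be substituted.

```r
library(datasets)

# Temperature series from beaver1 (114 readings)
y <- ts(beaver1$temp)

# Inspect the series and its autocorrelation function
plot(y)
acf(y)

# Fit on the first 80 observations (assumed choice: Yule-Walker, order by AIC)
new <- y[1:80]
fit <- ar.yw(new)

# Predict the remaining 34 observations and overlay them in red
pr <- predict(fit, newdata = new, n.ahead = 34)
plot(y)
lines(pr$pred, col = "red")
```

Because new is a plain vector indexed 1 to 80, the predictions are indexed 81 to 114 and line up with the last 34 points of y.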
• Exercise 2. Load up the dataset austres, and plot it. What do you
see? Take differences, and inspect the ACF. Try fitting various AR
models. If da is the difference sequence, try the following commands:
> var(da)
> F1 <-ar.yw(da)
> F1$ar
> F1$var.pred
> F2 <-ar.burg(da)
> F2$ar
> F2$var.pred
> acf(F2$resid,na.action=na.omit,lag.max=30)
> acf(F1$resid,na.action=na.omit,lag.max=30)
> F3 <-ar.burg(da, aic=FALSE, order.max=4)
> F3$ar
> F3$var.pred
> acf(F3$resid,na.action=na.omit,lag.max=30)
There are three models fitted here; which would you prefer to use and
why?
• Exercise 3. Load up the dataset treering, and plot it. Does it appear
that the data should be transformed? Do there appear to be outlying
values? Inspect the ACF. Does this suggest a possible model for the
data? Try some of the following.
> var(treering)
> F1 <-ar.yw(treering); F2 <-ar.burg(treering)
> F1$ar; F2$ar
> F1$var.pred; F2$var.pred
> F3 <-ar.burg(treering, aic=FALSE, order.max=3)
> F3$ar
> F3$var.pred
• Exercise 4. Following the lines of the earlier examples, find an appropriate
model for the data in the R dataset lh. If you choose something
other than an AR(1) model, compare your choice with an AR(1) and
explain why you think your choice is to be preferred.
• Exercise 5. See what you make of the lynx data.
• Exercise 6. And lastly, find some model to fit the sunspot.month data.