Time Series analysis using ARMA models can be done as follows:
Plot the data / ACF and examine the characteristics
Data model (variance change, trend, seasonality)
Transform and Eliminate / Estimate the components
Test the residuals if IID
Select p and q for the ARMA(p,q) using PCAF and ACF
Select the best
Estimation of the parameters
ARMA(p,q) are stationary time series models given by:
\[
\Phi(B) X_t = \Theta(B) Z_t, \qquad Z_t \sim WN(0,\sigma^2)
\]
where \(\Phi(B) = 1-\phi_1X_{t-1} - \ldots - \phi_pX_{t-p}\) and \(\Theta(B) = 1+\theta_1Z_{t-1} + \ldots + \theta_qZ_{t-q}\)
Simulate ARMA models
MA(1) example
\[
X_t = 0.8Z_{t-1} + Z_t; Z_t \sim WN(0,1)
\]
library(itsmr)
ma1 = arima.sim(model = list(ar=c(),ma=c(0.8)),n=200)
plot.ts(ma1)
# Innovations estimator - MA(q)
ia(ma1,q=1)
$phi
[1] 0
$theta
[1] 0.7553325
$sigma2
[1] 0.9501531
$aicc
[1] 562.2551
$se.phi
[1] 0
$se.theta
[1] 0.07071068
AR(1) example
\[
X_t = 0.4 X_{t-1} + Z_t; Z_t \sim WN(0,1)
\]
library(itsmr)
ar1 = arima.sim(model = list(ar=c(0.4),ma=c()),n=200)
plot.ts(ar1)
# yule-walker estimator - AR(p)
yw(ar1,p=1)
$phi
[1] 0.3499744
$theta
[1] 0
$sigma2
[1] 1.072415
$aicc
[1] 585.7495
$se.phi
[1] 0.06623888
$se.theta
[1] 0
ARMA(1,1) example
\[
X_t = 0.4X_{t-1} + 0.5 Z_{t-1} + Z_t; Z_t \sim WN(0,1)
\]
library(itsmr)
arma11 = arima.sim(model = list(ar=c(0.4),ma=c(0.5)),n=200)
plot.ts(arma11)
# HR estimator - ARMA(p,q)
hannan(arma11,p = 1,q = 1)
$phi
[1] 0.3442114
$theta
[1] 0.509029
$sigma2
[1] 0.9163617
$aicc
[1] 556.978
$se.phi
[1] 0.08566999
$se.theta
[1] 0.1191102
MLE estimator
library(itsmr)
arma11 = specify(ar=c(0.4),ma=c(0.5),sigma2=1)
check(arma11)
data.sim = sim(arma11,n=200)
plot.ts(data.sim)
acf(data.sim,lag.max = 40)
# MLE estimator - ARMA(p,q)
arma(data.sim,p=1,q=1)
$phi
[1] 0.3877843
$theta
[1] 0.4967005
$sigma2
[1] 1.094681
$aicc
[1] 592.5891
$se.phi
[1] 0.0879537
$se.theta
[1] 0.08303409
Example Dow Jones Utilities Index 1972
Our data model:
\(X_t = m_t + Y_t\)
# differencing operator
r = diff(dowj)
mean(r)
Null hypothesis: Residuals are iid noise.
Test Distribution Statistic p-value
Ljung-Box Q Q ~ chisq(20) 46.43 7e-04 *
McLeod-Li Q Q ~ chisq(20) 18.77 0.5367
Turning points T (T-50)/3.7 ~ N(0,1) 44 0.1008
Diff signs S (S-38)/2.5 ~ N(0,1) 37 0.6949
Rank P (P-1463)/113.7 ~ N(0,1) 1512 0.6664
Some possible models are: - AR(1) or MA(1) or MA(2) - ARMA(1,1) or ARMA(1,2)
# AR(1)
ar1 = arma(x = R,p = 1,q=0)
ma1 = arma(x = R,p = 0,q=1)
ma2 = arma(x = R,p = 0,q=2)
arma11 = arma(x = R,p = 1,q=1)
arma12 = arma(x = R,p = 1,q=2)
# autofit
automodel = autofit(R)
cbind(ar1=ar1$aicc,
ma1=ma1$aicc,
ma2=ma2$aicc,
arma11=arma11$aicc,
arma12=arma12$aicc,
automodel=automodel$aicc)
ar1 ma1 ma2 arma11 arma12 automodel
[1,] 74.48316 79.15577 77.0124 75.13406 76.53209 74.48316
The best model in terms of AICC is AR(1) given by:
\[
(1-0.45B)(\nabla X_t - 0.134) = Z_t, \qquad Z_t \sim WN(0,0.145)
\]
Or we can rewrite it: \(X_t - X_{t-1} - 0.134= Y_t\)
\[
Y_t = 0.45Y_{t-1} + Z_t
\]
We need to see if the the residuals of the model are iid?
model = specify(ar=arma11$phi , ma=arma11$theta , sigma2 = arma11$sigma2)
M= c("diff",1)
rfit = Resid(x = dowj,M = M,a = model) # residual from the model Z_t
test(rfit)
Null hypothesis: Residuals are iid noise.
Test Distribution Statistic p-value
Ljung-Box Q Q ~ chisq(20) 28.23 0.1041
McLeod-Li Q Q ~ chisq(20) 19.16 0.5116
Turning points T (T-50)/3.7 ~ N(0,1) 47 0.4119
Diff signs S (S-38)/2.5 ~ N(0,1) 36 0.4328
Rank P (P-1463)/113.7 ~ N(0,1) 1445 0.8742
The AR(1) model is not adequate since the iid hypothesis was rejected.
We fit ARMA(1,1) with AICC=75.13 and the model is adequate:
The equation is:
\[
(1-0.77B)(\nabla X_t - 0.134) = Z_t -0.42Z_{t-1}, \quad Z_t \sim WN(0,0.142)
\] The 95% CI for the parameters are:
c(arma11$phi + 1.95 *arma11$se.phi,arma11$phi - 1.95* arma11$se.phi)
c(arma11$theta + 1.95 *arma11$se.theta,arma11$theta - 1.95 *arma11$se.theta)
Wine data
We do have: trend (line) and seasonality (d=12)
data.model = c("season",12,"trend",1)
r = Resid(wine,M=data.model) # yt = xt - mt - st - 1.347
test(r)
Null hypothesis: Residuals are iid noise.
Test Distribution Statistic p-value
Ljung-Box Q Q ~ chisq(20) 37.27 0.0108 *
McLeod-Li Q Q ~ chisq(20) 8.23 0.9902
Turning points T (T-93.3)/5 ~ N(0,1) 92 0.7894
Diff signs S (S-70.5)/3.5 ~ N(0,1) 69 0.6639
Rank P (P-5005.5)/283.5 ~ N(0,1) 5038 0.9087
# autofit
bestmodel = autofit(r,p=0:2,q=0:2)
# test iid for the fit
model = specify(ar=bestmodel$phi , ma=bestmodel$theta , sigma2 = bestmodel$sigma2)
rfit = Resid(x = wine,M = data.model,a = model)
test(rfit)
Null hypothesis: Residuals are iid noise.
Test Distribution Statistic p-value
Ljung-Box Q Q ~ chisq(20) 19.03 0.5198
McLeod-Li Q Q ~ chisq(20) 7.29 0.9956
Turning points T (T-93.3)/5 ~ N(0,1) 88 0.2854
Diff signs S (S-70.5)/3.5 ~ N(0,1) 71 0.8848
Rank P (P-5005.5)/283.5 ~ N(0,1) 5021 0.9564
We write the model:
\[
X_t = 1.347 + m_t + s_t + 0.18X_{t-1} + 0.19X_{t-2} + Z_t, \quad Z_t \sim WN(0,30746.27)
\]