ARMA Modeling

Author

Dr. Cohen

Time Series analysis using ARMA models can be done as follows:

  1. Plot the data / ACF and examine the characteristics

  2. Data model (variance change, trend, seasonality)

  3. Transform and Eliminate / Estimate the components

  4. Test the residuals if IID

  5. Select p and q for the ARMA(p,q) using PCAF and ACF

  6. Select the best

Estimation of the parameters

ARMA(p,q) are stationary time series models given by:

\[ \Phi(B) X_t = \Theta(B) Z_t, \qquad Z_t \sim WN(0,\sigma^2) \]

where \(\Phi(B) = 1-\phi_1X_{t-1} - \ldots - \phi_pX_{t-p}\) and \(\Theta(B) = 1+\theta_1Z_{t-1} + \ldots + \theta_qZ_{t-q}\)

Simulate ARMA models

MA(1) example

\[ X_t = 0.8Z_{t-1} + Z_t; Z_t \sim WN(0,1) \]

library(itsmr)
ma1 = arima.sim(model = list(ar=c(),ma=c(0.8)),n=200)
plot.ts(ma1)

plota(ma1)

# Innovations estimator - MA(q)
ia(ma1,q=1)
$phi
[1] 0

$theta
[1] 0.7553325

$sigma2
[1] 0.9501531

$aicc
[1] 562.2551

$se.phi
[1] 0

$se.theta
[1] 0.07071068

AR(1) example

\[ X_t = 0.4 X_{t-1} + Z_t; Z_t \sim WN(0,1) \]

library(itsmr)
ar1 = arima.sim(model = list(ar=c(0.4),ma=c()),n=200)
plot.ts(ar1)

plota(ar1)

# yule-walker estimator - AR(p)
yw(ar1,p=1)
$phi
[1] 0.3499744

$theta
[1] 0

$sigma2
[1] 1.072415

$aicc
[1] 585.7495

$se.phi
[1] 0.06623888

$se.theta
[1] 0

ARMA(1,1) example

\[ X_t = 0.4X_{t-1} + 0.5 Z_{t-1} + Z_t; Z_t \sim WN(0,1) \]

library(itsmr)
arma11 = arima.sim(model = list(ar=c(0.4),ma=c(0.5)),n=200)
plot.ts(arma11)

plota(arma11)

# HR estimator - ARMA(p,q)
hannan(arma11,p = 1,q = 1)
$phi
[1] 0.3442114

$theta
[1] 0.509029

$sigma2
[1] 0.9163617

$aicc
[1] 556.978

$se.phi
[1] 0.08566999

$se.theta
[1] 0.1191102

MLE estimator

library(itsmr)
arma11 = specify(ar=c(0.4),ma=c(0.5),sigma2=1)
check(arma11)
Causal
Invertible
data.sim = sim(arma11,n=200)
plot.ts(data.sim)

acf(data.sim,lag.max = 40)

# MLE estimator - ARMA(p,q)
arma(data.sim,p=1,q=1)
$phi
[1] 0.3877843

$theta
[1] 0.4967005

$sigma2
[1] 1.094681

$aicc
[1] 592.5891

$se.phi
[1] 0.0879537

$se.theta
[1] 0.08303409

Example Dow Jones Utilities Index 1972

plot.ts(dowj)

acf(dowj,lag.max = 50)

Our data model:

\(X_t = m_t + Y_t\)

# differencing operator
r = diff(dowj)
mean(r)
[1] 0.1336364
R = r - mean(r)
test(R)
Null hypothesis: Residuals are iid noise.
Test                        Distribution Statistic   p-value
Ljung-Box Q                Q ~ chisq(20)     46.43     7e-04 *
McLeod-Li Q                Q ~ chisq(20)     18.77    0.5367
Turning points T     (T-50)/3.7 ~ N(0,1)        44    0.1008
Diff signs S         (S-38)/2.5 ~ N(0,1)        37    0.6949
Rank P           (P-1463)/113.7 ~ N(0,1)      1512    0.6664

Some possible models are: - AR(1) or MA(1) or MA(2) - ARMA(1,1) or ARMA(1,2)

# AR(1)
ar1 = arma(x = R,p = 1,q=0)
ma1 = arma(x = R,p = 0,q=1)
ma2 = arma(x = R,p = 0,q=2)
arma11 = arma(x = R,p = 1,q=1)
arma12 = arma(x = R,p = 1,q=2)
# autofit
automodel = autofit(R)

cbind(ar1=ar1$aicc,
      ma1=ma1$aicc,
      ma2=ma2$aicc,
      arma11=arma11$aicc,
      arma12=arma12$aicc,
      automodel=automodel$aicc)
          ar1      ma1     ma2   arma11   arma12 automodel
[1,] 74.48316 79.15577 77.0124 75.13406 76.53209  74.48316

The best model in terms of AICC is AR(1) given by:

\[ (1-0.45B)(\nabla X_t - 0.134) = Z_t, \qquad Z_t \sim WN(0,0.145) \]

Or we can rewrite it: \(X_t - X_{t-1} - 0.134= Y_t\)

\[ Y_t = 0.45Y_{t-1} + Z_t \]

We need to see if the the residuals of the model are iid?

model = specify(ar=arma11$phi , ma=arma11$theta , sigma2 = arma11$sigma2)
M= c("diff",1)
rfit = Resid(x = dowj,M = M,a = model) # residual from the model Z_t
test(rfit)
Null hypothesis: Residuals are iid noise.
Test                        Distribution Statistic   p-value
Ljung-Box Q                Q ~ chisq(20)     28.23    0.1041
McLeod-Li Q                Q ~ chisq(20)     19.16    0.5116
Turning points T     (T-50)/3.7 ~ N(0,1)        47    0.4119
Diff signs S         (S-38)/2.5 ~ N(0,1)        36    0.4328
Rank P           (P-1463)/113.7 ~ N(0,1)      1445    0.8742

The AR(1) model is not adequate since the iid hypothesis was rejected.

We fit ARMA(1,1) with AICC=75.13 and the model is adequate:

The equation is:

\[ (1-0.77B)(\nabla X_t - 0.134) = Z_t -0.42Z_{t-1}, \quad Z_t \sim WN(0,0.142) \] The 95% CI for the parameters are:

c(arma11$phi + 1.95 *arma11$se.phi,arma11$phi - 1.95* arma11$se.phi)
[1] 1.1886235 0.3439741
c(arma11$theta + 1.95 *arma11$se.theta,arma11$theta - 1.95 *arma11$se.theta)
[1]  0.2099895 -1.0495410

Wine data

plot.ts(wine)

acf(wine)

We do have: trend (line) and seasonality (d=12)

data.model = c("season",12,"trend",1)
r = Resid(wine,M=data.model) # yt = xt - mt - st - 1.347
test(r)
Null hypothesis: Residuals are iid noise.
Test                        Distribution Statistic   p-value
Ljung-Box Q                Q ~ chisq(20)     37.27    0.0108 *
McLeod-Li Q                Q ~ chisq(20)      8.23    0.9902
Turning points T     (T-93.3)/5 ~ N(0,1)        92    0.7894
Diff signs S       (S-70.5)/3.5 ~ N(0,1)        69    0.6639
Rank P         (P-5005.5)/283.5 ~ N(0,1)      5038    0.9087

# autofit
bestmodel = autofit(r,p=0:2,q=0:2)
# test iid for the fit
model = specify(ar=bestmodel$phi , ma=bestmodel$theta , sigma2 = bestmodel$sigma2)
rfit = Resid(x = wine,M = data.model,a = model)
test(rfit)
Null hypothesis: Residuals are iid noise.
Test                        Distribution Statistic   p-value
Ljung-Box Q                Q ~ chisq(20)     19.03    0.5198
McLeod-Li Q                Q ~ chisq(20)      7.29    0.9956
Turning points T     (T-93.3)/5 ~ N(0,1)        88    0.2854
Diff signs S       (S-70.5)/3.5 ~ N(0,1)        71    0.8848
Rank P         (P-5005.5)/283.5 ~ N(0,1)      5021    0.9564

We write the model:

\[ X_t = 1.347 + m_t + s_t + 0.18X_{t-1} + 0.19X_{t-2} + Z_t, \quad Z_t \sim WN(0,30746.27) \]