- Moving Average Probability Model
- Properties
- Relation to Autoregression

- Extensions
- ARIMA model
- Seasonal ARIMA Model

- Multivariate Models
- Application: Macro Forecasting

- Russian economist Eugen Slutsky investigated how regular economic patterns could arise from pure randomness
- He took a long string of random numbers from the lottery, and took the average of the first 10
- Then he shifted the window over by 1, taking an average of the 2nd through 11th numbers
- He repeated this through the full string, producing a sequence of overlapping averages, a **moving average**

- In this process, he produced a sequence that looked remarkably like the fluctuations observed in economic series
- He theorized that such a mechanism, where sequences of purely random shocks are combined to produce a single number, might describe the origins of economic cycles
- Today, *moving average* refers to a process generated by overlapping weighted combinations of random variables
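Slutsky's experiment is easy to replicate; a minimal sketch in base R, with uniform random draws standing in for the lottery numbers:

```r
# Pure noise stands in for the lottery draws; averaging over a sliding
# 10-number window yields a much smoother, cycle-like series
set.seed(123)
e <- runif(200)                                        # raw random numbers
slutsky <- stats::filter(e, rep(1/10, 10), sides = 1)  # 10-period moving average
# The averaged series varies far less step to step than the raw noise
sd(diff(slutsky), na.rm = TRUE) < sd(diff(e))          # TRUE
# plot(slutsky, type = "l") # uncomment to see the cycle-like fluctuations
```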

- Consider a sequence of mean 0, uncorrelated **shocks** \(e_t\), \(t=1,\ldots,T\), which are *not* observed
- \(E[e_t]=0\), \(E[e_te_{t-h}]=0\) for all \(t\), for all \(h\neq0\)

- The observed data is still \(\{y_t\}_{t=1}^T\)
- The **Moving Average** model (of order q) says that \(y_t\) is a weighted sum of present and past shocks
- For all \(t\), \(y_t=e_t+\sum_{j=1}^{q}\theta_{j}e_{t-j}\)

- Coefficients \(\theta=\{\theta_j\}_{j=1}^{q}\) determine relationship between observations over time

- In lag polynomial notation, let \(\theta(L)=1+\sum_{j=1}^{q}\theta_jL^j\), then \(y_t=\theta(L)e_t\)
- To produce a full likelihood, strengthen assumption to \(e_t\overset{iid}{\sim}f()\), usually \(N(0,\sigma_e^2)\)
- Compare to the **Autoregression** model: \(y_t\) is a weighted sum of a present shock and past *observations*
- For all \(t\), \(y_t=e_t+\sum_{j=1}^{p}b_{j}y_{t-j}\)
- Lag polynomial representation \(b(L)y_t=e_t\) has lags on the left side instead of the right

- The appeal of the MA model comes from the fact that although \(e_t\) is never directly seen, the properties of the data are characterized explicitly by the model
- In particular, the Autocovariance function is determined by parameters \(\theta\)
- With \(\theta_0:=1\), \(\gamma(h):=Cov(y_{t},y_{t-h})=Cov(\sum_{j=0}^{q}\theta_{j}e_{t-j},\sum_{k=0}^{q}\theta_{k}e_{t-h-k})=\sigma_e^2\sum_{j=0}^{q-h}\theta_{j}\theta_{j+h}\) for \(0\leq h\leq q\)
- In words, covariances are determined by the moving average components shared between times

- Consider, e.g. MA(1), \(y_t=e_t+\theta_1e_{t-1}\)
- \(\gamma(0)=Var(y_t)=\sigma_e^2(1+\theta_1^2)\), \(\gamma(1)=Cov(y_t,y_{t-1})=\sigma_e^2\theta_1\), \(\gamma(j)=0\), for all \(j\geq 2\)
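These formulas can be checked against base R's `ARMAacf`, which computes the model-implied autocorrelations \(\rho(h)=\gamma(h)/\gamma(0)\); for \(\theta_1=1/2\), \(\rho(1)=\theta_1/(1+\theta_1^2)=0.4\) and \(\rho(h)=0\) for \(h\geq2\):

```r
# Model-implied autocorrelations for MA(1) with theta_1 = 0.5:
# rho(1) = theta_1/(1 + theta_1^2) = 0.4, rho(h) = 0 for h >= 2
rho <- ARMAacf(ma = 0.5, lag.max = 3)
rho # lags 0 through 3: 1.0, 0.4, 0.0, 0.0
```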

- Last property, that the autocovariance function drops to 0 at lag \(q+1\), is true for any \(MA(q)\)
- Beyond the window of length q, the moving averages do not overlap, and observations are uncorrelated

- The MA(q) model captures short term correlations up to horizon \(q\): the predicted mean reverts to 0 in finite time
- By allowing long enough \(q\), can, with many coefficients, describe very general patterns of relationships
- As the MA model is linear, like the AR model, it does not allow for general patterns of higher-order properties like conditional variance, skewness, etc.

- Difficulty of models with latent variables is that one cannot learn about the \(e_t\) directly, but only their properties
- With two or more random components determining one observation, might not be able to distinguish which one was the source of any change
- This is called an **identification** problem in econometrics

- Consider MA(1) processes \(y_t=e_t+\frac{1}{2}e_{t-1}\), \(e_t\overset{iid}{\sim}N(0,\sigma_{e}^2)\) and \(y_t=u_t+2u_{t-1}\), \(u_t\overset{iid}{\sim}N(0,\frac{\sigma_{e}^2}{4})\)
- We see ACF of first is \(\gamma(0)=\frac{5}{4}\sigma_e^2\), \(\gamma(1)=\frac{\sigma_e^2}{2}\)
- ACF of second process is exactly the same, so by normality, distribution is also exactly the same
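The equivalence can be verified numerically: `ARMAacf` returns identical autocorrelation functions for the two parameterizations (the shock variances differ, but they scale out of the ACF):

```r
# theta_1 = 1/2 and its flipped root theta_1 = 2 imply identical
# autocorrelations: rho(1) = 0.5/1.25 = 2/5 = 0.4 in both cases
rho_half <- ARMAacf(ma = 0.5, lag.max = 2)
rho_two  <- ARMAacf(ma = 2,   lag.max = 2)
isTRUE(all.equal(rho_half, rho_two)) # TRUE
```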

- This is an example of a general phenomenon: factorizing \(\theta(L)=\Pi_{j=1}^{q}(1+t_jL)\) with inverse roots \(t_j\), there is an equivalent representation \(\widetilde{\theta}(L)=\Pi_{j=1}^{q}(1+\frac{1}{t_j}L)\) with flipped roots and a different \(\sigma^2_e\)
- Does this make a difference? Not for forecasting: properties of series the same either way
- Can simply restrict interest to the representation with all inverse roots \(t_j\) inside the unit circle (the **invertible** representation), for Bayesian or statistical approach

- General lesson is that when building model out of unobserved parts, might not be able to learn all about them

- Properties of MA model can be seen by repeated substitution
- Consider MA(1) \(y_t=e_t+\theta_1e_{t-1}\) rearrange as \(e_t=y_{t}-\theta_1e_{t-1}\)

- Can substitute in past values \(e_{t-1}\) into this formula repeatedly
- \(e_t=y_{t}-\theta_1(y_{t-1}-\theta_1e_{t-2})=y_{t}-\theta_1y_{t-1}+\theta^2_1e_{t-2}\)
- \(=y_{t}-\theta_1y_{t-1}+\theta^2_1(y_{t-2}-\theta_1e_{t-3})=y_{t}-\theta_1y_{t-1}+\theta^2_1y_{t-2}-\theta^3_1e_{t-3}=\ldots\)

- Continuing indefinitely, have, in lag polynomial notation, \((1+\sum_{j=1}^{\infty}(-\theta_1)^jL^j)y_t=e_t\)
- This is exactly an (infinite order) autoregression model

- This equivalence holds in general: a finite order invertible MA model is an infinite order AR model
- Intuition: because \(e_t\) never seen exactly, must use all past information in \(\mathcal{Y}_t\) to predict it
- Observable implication: PACF will decay to 0 smoothly, rather than dropping off like AR
- Equivalence also holds in reverse: a finite order AR model is an infinite order MA model
- Repeatedly substituting, AR(1) \(y_t=b_1y_{t-1}+e_t\) becomes \(y_t=e_t+\sum_{j=1}^{\infty}b^j_1e_{t-j}\)

- Can use either, but one representation may be much less complex
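Base R's `ARMAtoMA` carries out this conversion; for an AR(1) with \(b_1=1/2\), the implied MA coefficients are \(b_1^j\), matching the substitution above:

```r
# Infinite MA representation of an AR(1) with b_1 = 0.5:
# the implied MA coefficients are psi_j = b_1^j
psi <- ARMAtoMA(ar = 0.5, lag.max = 5)
isTRUE(all.equal(psi, 0.5^(1:5))) # TRUE
```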

- Because the shocks are not observed, likelihood formula for MA model surprisingly complicated
- In general, no closed form formula exists for conditional likelihood

- Reason is that \(y_t\) and \(y_{t-1}\) both depend on \(e_{t-1}\), but to know what part of \(y_{t-1}\) comes from \(e_{t-1}\) need to know what comes from \(e_{t-2}\), which affects \(y_{t-2}\) which requires… etc
- A variety of solutions exist which nevertheless allow valid estimates
- Likelihood can be constructed by a *recursive* algorithm, step by step
- Requires conditioning each period, which requires using Bayes rule, which requires integration
- But if \(e_t\sim N(0,\sigma_e^2)\), there is a fast and exact recursive algorithm

- Typical approach: use likelihood from normal case even if you don’t think distribution is normal
- This is called a **quasi-** or **pseudo-**likelihood, and \(\theta\) can be estimated by maximizing it: this is the default method of R's `arima` command
- Or use penalized estimation, or do Bayesian inference with it

- Alternative 1: Choose parameters \(\widehat{\theta}\) to match estimated ACF to model-implied ACF
- Utilizes fact that only covariances are modeled by process, doesn’t need normality

- Alternative 2: Convert to infinite AR form and estimate by least squares, truncating at some finite order
- Not exact, but since decay usually exponential, truncated part is close to negligible and prediction becomes very easy: just use AR formulas
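A minimal sketch of the (quasi-)likelihood approach on simulated data, with an assumed true \(\theta_1=0.5\); base R's `arima` maximizes the Gaussian likelihood via an exact state-space recursion:

```r
# Simulate an MA(1) with assumed true theta_1 = 0.5, then estimate by
# maximizing the Gaussian likelihood, as arima() does by default
set.seed(42)
y <- arima.sim(model = list(ma = 0.5), n = 2000)
fit <- arima(y, order = c(0, 0, 1), include.mean = FALSE)
coef(fit)["ma1"] # close to the true value 0.5
```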

- To efficiently match all features of correlation patterns, can also use *both* AR and MA terms
- An Autoregressive Moving Average model of orders p and q, or **ARMA(p,q)** model, has form, for all \(t\), \[y_{t}=\sum_{j=1}^{p}b_jy_{t-j}+e_{t}+\sum_{k=1}^{q}\theta_{k}e_{t-k}\]
- Where \(e_t\) is still a mean 0 white noise sequence

- In lag polynomial notation \((1-\sum_{j=1}^{p}b_jL^j)y_t=(1+\sum_{k=1}^{q}\theta_kL^k)e_t\)
- By same equivalency results, an ARMA model is equivalent to an infinite AR model or an infinite MA model
- Can denote infinite MA representation as \(y_t=\psi(L)e_t:=\frac{\theta(L)}{b(L)}e_t\)
- ARMA model allows finite representation for very general patterns
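`ARMAtoMA` also handles the ARMA case, computing \(\psi(L)=\theta(L)/b(L)\); for illustrative values \(b_1=0.5\), \(\theta_1=0.3\), matching coefficients in \(\psi(L)(1-b_1L)=(1+\theta_1L)\) gives \(\psi_1=b_1+\theta_1\) and \(\psi_j=b_1\psi_{j-1}\) thereafter:

```r
# psi(L) = theta(L)/b(L) for ARMA(1,1) with b_1 = 0.5, theta_1 = 0.3:
# psi_1 = 0.5 + 0.3 = 0.8, then each psi_j is half the previous one
psi <- ARMAtoMA(ar = 0.5, ma = 0.3, lag.max = 3)
isTRUE(all.equal(psi, c(0.8, 0.4, 0.2))) # TRUE
```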

- Estimation again usually by normal (quasi-)likelihood, with recursive algorithm

- In addition to invertibility condition for MA roots, have one more equivalency
- If factorizations of \(b(L)\) and \(\theta(L)\) share a root, can factor out from both sides and get model with exact same predictions
- Not a problem for prediction: just present in factorized form
- Estimation can behave weirdly when roots are “close” (worse approximation, etc)

- Condition for stationarity of an ARMA model is that the AR part satisfy the conditions for stationarity of an AR model
- All roots of the lag polynomial \(b(z)=1-\sum_{j=1}^{p}b_jz^j\) are outside the unit circle

- In the case of d unit roots, differencing \(y_t\) d times can restore stationarity
- If \(\Delta^d y_t\) is an ARMA(p,q) process, \(y_t\) is called an **ARIMA(p,d,q)** process
- ARIMA model allows long run random trend, plus very general short run patterns
- Exactly the same unit root tests as in the AR case apply: run Phillips-Perron, KPSS, or ADF to determine d

- `auto.arima` executes the following steps
- Test \(y_t\) for unit roots by KPSS test, differencing if found to be nonstationary, repeating until stationarity
- Represent ARMA (quasi)likelihood by recursive algorithm at different orders \((p,q)\)
- Use AICc to choose orders, then estimate \(b,\theta\) by maximizing (quasi)likelihood
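The selection loop can be sketched in base R; this is a simplified stand-in for `auto.arima`, with `PP.test` from `stats` replacing the KPSS test and plain AIC replacing AICc, since neither needs extra packages:

```r
set.seed(7)
y <- cumsum(arima.sim(model = list(ar = 0.5), n = 500)) # simulated integrated series
# Step 1: difference until a unit root test no longer indicates nonstationarity
d <- 0
while (PP.test(y)$p.value > 0.05) {
  y <- diff(y)
  d <- d + 1
}
# Steps 2-3: fit candidate (p,q) orders, pick the information-criterion minimizer
fits <- expand.grid(p = 0:2, q = 0:2)
fits$aic <- apply(fits, 1, function(o)
  tryCatch(AIC(arima(y, order = c(o["p"], 0, o["q"]))),
           error = function(e) Inf))
best <- fits[which.min(fits$aic), ]
c(d = d, p = best$p, q = best$q) # chosen specification
```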

- This takes statistical approach to prediction: looks for best model of data
- Performs well for mean forecasting over models close to ARIMA class

- Bayesian approach requires priors over all AR and MA coefficients
- Often use infinite AR representation for simplicity: use priors to discipline large number of coefficients

- Use ARIMA around a trend model to account for deterministic growth, seasonality, etc.
- Can also extend the model to add *non-deterministic* seasonal patterns
- For a series of frequency \(m\), may have particular relationships across intervals of length \(m\) or regular subsets
- Can create seasonal versions of ARIMA models by allowing relationships across \(m\) lags
- Useful for modeling commonly seen spikes in ACF at seasonal intervals
- Eg. strong December sales one year may be followed by strong December sales next year, on average

- Seasonal AR(1) is \(y_t=B_1y_{t-m}+e_t\), seasonal MA(1) is \(y_t=e_t+\Theta_1e_{t-m}\), seasonal first difference is \(y_t-y_{t-m}\)
- Can combine and extend to higher orders to create a Seasonal ARIMA(P,D,Q)
- \((1-\sum_{j=1}^{P}B_jL^{mj})(1-L^m)^Dy_t=(1+\sum_{k=1}^{Q}\Theta_kL^{mk})e_t\)

- Can add on top of standard ARIMA to match seasonal and cyclical patterns by multiplying lag polynomials
- Seasonal ARIMA(p,d,q)(P,D,Q) takes form \((1-\sum_{n=1}^{p}b_nL^{n})(1-\sum_{j=1}^{P}B_jL^{mj})(1-L)^d(1-L^m)^Dy_t=(1+\sum_{k=1}^{q}\theta_kL^{k})(1+\sum_{k=1}^{Q}\Theta_kL^{mk})e_t\)
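Base R's `arima` fits this multiplicative form directly through its `seasonal` argument; a sketch on simulated quarterly data (\(m=4\)) with illustrative true coefficients \(b_1=0.5\), \(B_1=0.3\):

```r
# Simulate (1 - 0.5L)(1 - 0.3L^4)y_t = e_t: expanding the product gives
# recursive filter coefficients (0.5, 0, 0, 0.3, -0.15)
set.seed(11)
e <- rnorm(400)
y <- ts(stats::filter(e, filter = c(0.5, 0, 0, 0.3, -0.15),
                      method = "recursive"), frequency = 4)
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(1, 0, 0), period = 4),
             include.mean = FALSE)
round(coef(fit), 2) # near the true b_1 = 0.5 and B_1 = 0.3
```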

- Estimation similar to standard ARIMA: test for integration order, use quasi-maximum likelihood to fit coefficients
- Seasonal components are permitted by default in `auto.arima`: can exclude if series known to be deseasonalized

- Let’s take the 7 series from last class forecasting exercise, following Litterman (1986)
- GNP Growth, Inflation, Unemployment, M1 Money Stock, Private Fixed Investment, Commercial Paper Interest Rates, and Inventory Growth quarterly from 1971-2020Q1

- Choose level of differencing using tests, then compare (by AICc) AR only, MA only, and ARMA choices for each series
- Implement by `auto.arima` with restrictions to AR order p or MA order q
- Series restricted to AR or MA only appear to need more parameters than if allowed to use both
- Unrestricted model for inflation is ARIMA(0,1,1), but if MA terms are excluded, AR(4) is selected: many AR coefficients needed to approximate the MA property
- Unemployment shows the reverse: ARIMA(1,1,0) is chosen, but MA(3) is needed to match if the AR order is restricted to 0

- Resulting forecasts of most series similar across specifications due to rapid mean reversion
- MA reverts to mean (or trend if series integrated or trending) in finite time, AR reverts over infinite time, but distance decays exponentially fast
- Difference can be more substantial if mean reversion slow, with AR near but not at unit root
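The different reversion speeds show up directly in forecast paths; a sketch on simulated series (true coefficients 0.5 and 0.9, assumed for illustration): an MA(1) forecast sits exactly at the mean from horizon 2 on, while a persistent AR(1) forecast only shrinks toward it geometrically.

```r
# Fit an MA(1) and a persistent AR(1) to simulated data, then compare
# how their multi-step forecasts revert to the (zero) mean
set.seed(3)
ma_fit <- arima(arima.sim(model = list(ma = 0.5), n = 500),
                order = c(0, 0, 1), include.mean = FALSE)
ar_fit <- arima(arima.sim(model = list(ar = 0.9), n = 500),
                order = c(1, 0, 0), include.mean = FALSE)
ma_fc <- predict(ma_fit, n.ahead = 5)$pred # at the mean (0) from horizon 2 on
ar_fc <- predict(ar_fit, n.ahead = 5)$pred # shrinks geometrically toward 0
```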

```
#Libraries
library(fredr) # Data from FRED API
library(fpp2) #Forecasting and Plotting tools
library(vars) #Vector Autoregressions
library(knitr) #Use knitr to make tables
library(kableExtra) #Extra options for tables
library(dplyr) #Data Manipulation
library(tseries) #Time series functions including stationarity tests
library(gridExtra) #Graph Display
# Package "BMR" for BVAR estimation is not on CRAN, but is instead maintained by an individual
# It must be installed directly from the Github repo: uncomment the following code to do so
# library(devtools) #Library to allow downloading packages from Github
# install_github("kthohr/BMR")
# library(BMR) #Bayesian Macroeconometrics in R
# Note that if running this code on Kaggle, internet access must be enabled to download and install the package
# If installed locally, there may be difficulties due to differences in your local environment (in particular, versions of C++)
# For this reason, relying on local installation is not recommended unless you have a spare afternoon to dig through help files
#An alternative, similar library, is BVAR: it is on CRAN
library(BVAR) #Bayesian Vector Autoregressions
##Obtain and transform NIPA Data (cf Lecture 08)
fredr_set_key("8782f247febb41f291821950cf9118b6") #Key I obtained for this class
## Load Series: Series choices and names as in Litterman (1986)
RGNP<-fredr(series_id = "GNPC96",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29"),
units="cch") #Real Gross National Product, log change
INFLA<-fredr(series_id = "GNPDEF",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29"),
units="cch") #GNP Deflator, log change
UNEMP<-fredr(series_id = "UNRATE",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29"),
frequency="q") #Unemployment Rate, quarterly
M1<-fredr(series_id = "M1SL",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29"),
frequency="q",
units="log") #Log M1 Money Stock, quarterly
INVEST<-fredr(series_id = "GPDI",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29"),
units="log") #Log Gross Domestic Private Investment
# The 4-6 month commercial paper rate series used in Litterman (1986) has been discontinued:
# For sample continuity, we merge the series for 3 month commercial paper rates from 1971-1997 with the 3 month non-financial commercial paper rate series
# This series also has last start date, so it dictates start date for series
CPRATE1<-fredr(series_id = "WCP3M",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("1996-10-01"),
vintage_dates = as.Date("2021-03-29"),
frequency="q") #3 Month commercial paper rate, quarterly, 1971-1997
CPRATE2<-fredr(series_id = "CPN3M",
observation_start = as.Date("1997-01-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29"),
frequency="q") #3 Month AA nonfinancial commercial paper rate, quarterly, 1997-2018
CPRATE<-full_join(CPRATE1,CPRATE2) #Merge 2 series to create continuous 3 month commercial paper rate series from 1971-2018
CBI<-fredr(series_id = "CBI",
observation_start = as.Date("1971-04-01"),
observation_end = as.Date("2020-01-01"),
vintage_dates = as.Date("2021-03-29")) #Change in Private Inventories
#Format the series as quarterly time series objects, starting at the first date
rgnp<-ts(RGNP$value,frequency = 4,start=c(1971,2),names="Real Gross National Product")
infla<-ts(INFLA$value,frequency = 4,start=c(1971,2),names="Inflation")
unemp<-ts(UNEMP$value,frequency = 4,start=c(1971,2),names="Unemployment")
m1<-ts(M1$value,frequency = 4,start=c(1971,2),names="Money Stock")
invest<-ts(INVEST$value,frequency = 4,start=c(1971,2),names="Private Investment")
cprate<-ts(CPRATE$value,frequency = 4,start=c(1971,2),names="Commercial Paper Rate")
cbi<-ts(CBI$value,frequency = 4,start=c(1971,2),names="Change in Inventories")
#Express as a data frame
macrodata<-data.frame(rgnp,infla,unemp,m1,invest,cprate,cbi)
nlags<-6 # Number of lags to use
nseries<-length(macrodata[1,]) #Number of series used
Series<-c("Real GNP Growth","Inflation","Unemployment","Money Stock","Private Investment","Commercial Paper Rate","Change in Inventories")
```

```
#Use auto.arima to choose AR order after KPSS test without trend
#Do this also for MA, and for ARMA
ARIstatmodels<-list()
IMAstatmodels<-list()
ARIMAstatmodels<-list()
Integrationorder<-list()
ARorder<-list()
MAorder<-list()
ARorder2<-list()
MAorder2<-list()
for (i in 1:nseries){
  ARIstatmodels[[i]]<-auto.arima(macrodata[,i],max.q=0,seasonal=FALSE) #Apply auto.arima set to (nonseasonal) ARI only
  IMAstatmodels[[i]]<-auto.arima(macrodata[,i],max.p=0,seasonal=FALSE) #Apply auto.arima set to (nonseasonal) IMA only
  ARIMAstatmodels[[i]]<-auto.arima(macrodata[,i],seasonal=FALSE) #Apply auto.arima set to (nonseasonal) ARIMA
  Integrationorder[i]<-ARIMAstatmodels[[i]]$arma[6] #Integration order chosen (uses KPSS Test)
  ARorder[i]<-ARIstatmodels[[i]]$arma[1] #AR order chosen in AR only (uses AICc)
  MAorder[i]<-IMAstatmodels[[i]]$arma[2] #MA order chosen in MA only (uses AICc)
  ARorder2[i]<-ARIMAstatmodels[[i]]$arma[1] #AR order chosen in ARMA (uses AICc)
  MAorder2[i]<-ARIMAstatmodels[[i]]$arma[2] #MA order chosen in ARMA (uses AICc)
}
```

```
armamodels<-data.frame(as.numeric(Integrationorder),as.numeric(ARorder),
as.numeric(MAorder),as.numeric(ARorder2),as.numeric(MAorder2))
rownames(armamodels)<-Series
colnames(armamodels)<-c("d","p (AR only)","q (MA only)","p (ARMA)","q (ARMA)")
armamodels %>%
kable(caption="Autoregression, Moving Average, and ARMA Models") %>%
kable_styling(bootstrap_options = "striped")
```

| | d | p (AR only) | q (MA only) | p (ARMA) | q (ARMA) |
|---|---|---|---|---|---|
| Real GNP Growth | 0 | 2 | 3 | 1 | 1 |
| Inflation | 1 | 4 | 1 | 0 | 1 |
| Unemployment | 1 | 1 | 3 | 1 | 0 |
| Money Stock | 1 | 2 | 3 | 1 | 1 |
| Private Investment | 1 | 1 | 1 | 1 | 0 |
| Commercial Paper Rate | 1 | 3 | 2 | 0 | 2 |
| Change in Inventories | 1 | 0 | 0 | 0 | 0 |

```
#Construct Forecasts of Each Series by Univariate ARI, IMA, ARIMA models, with 95% confidence intervals
ARIfcsts<-list()
ARIMAfcsts<-list()
IMAfcsts<-list()
for (i in 1:nseries) {
  ARIfcsts[[i]]<-forecast::forecast(ARIstatmodels[[i]],h=20,level=95)
  ARIMAfcsts[[i]]<-forecast::forecast(ARIMAstatmodels[[i]],h=20,level=95)
  IMAfcsts[[i]]<-forecast::forecast(IMAstatmodels[[i]],h=20,level=95)
}
forecastplots<-list()
for (i in 1:nseries){
  pastwindow<-window(macrodata[,i],start=c(2000,1))
  #Plot all forecasts
  forecastplots[[i]]<-autoplot(pastwindow)+
    autolayer(ARIMAfcsts[[i]],alpha=0.4,series="ARIMA")+
    autolayer(ARIfcsts[[i]],alpha=0.4,series="ARI")+
    autolayer(IMAfcsts[[i]],alpha=0.4,series="IMA")+
    labs(x="Date",y=colnames(macrodata)[i],title=Series[i])
}
grid.arrange(grobs=forecastplots,nrow=4,ncol=2)
```