MAFS6010R - Portfolio Optimization with R
MSc in Financial Mathematics
Fall 2018-19, HKUST, Hong Kong

This R session will illustrate the different models for multivariate financial time series, in particular, for the conditional mean and conditional covariance matrix or volatility.

(Useful R links: Cookbook R, Quick-R, R documentation, CRAN, METACRAN.)

The following R packages are widely used by the R community:

**rugarch**: [see intro to rugarch] probably the best and most comprehensive package with a neverending list of different ARMA and GARCH models, including different distributions for the residual, namely, Gaussian/normal distribution, Student distribution, generalized error distributions, skewed distributions, generalized hyperbolic distributions, generalized hyperbolic skewed Student distribution, Johnson’s reparameterized SU distribution, as well as utilities for simulating, forecasting, and plotting. It includes the following GARCH models:- standard GARCH model (’sGARCH’)
- integrated GARCH model (’iGARCH’)
- exponential GARCH model
- GJR-GARCH model (’gjrGARCH’)
- asymmetric power ARCH model (’apARCH’)
- Exponential GARCH model
- family GARCH model (’fGARCH’)
- Absolute Value GARCH (AVGARCH) model (submodel = ’AVGARCH’)
- GJR GARCH (GJRGARCH) model (submodel = ’GJRGARCH’)
- Threshold GARCH (TGARCH) model (submodel = ’TGARCH’)
- Nonlinear ARCH model (submodel = ’NGARCH’)
- Nonlinear Asymmetric GARCH model (submodel = ’NAGARCH’)
- Asymmetric Power ARCH model (submodel = ’APARCH’)
- Full fGARCH model (submodel = ’ALLGARCH’)

- Component sGARCH model (’csGARCH’)
- Multiplicative Component sGARCH model (’mcsGARCH’)
- realized GARCH model (’realGARCH’)
- fractionally integrated GARCH model (’fiGARCH’)

**fArma**: this is a popular package for ARMA modeling**forecast**: also a popular package for ARMA modeling**fGarch**: this is a popular package for GARCH models**rmgarch**: extension of**rugarch**to the multivariate case. It includes the models: CCC, DCC, GARCH-Copula, GO-GARCH. [Vignette]**MTS**: multivariate time series modeling based on Tsay’s book.

There are many other packages worth exploring such as **stochvol** (stochastic volatility is an alternative modeling to GARCH), **prophet**, **fpp2** (includes Holt-Winter modeling and much more), **tsDyn**, **vars**, **rucm**, etc. For a full list of packages for time series modeling, check the CRAN Task View: Time Series Analysis.

The book Forecasting: principles and practice is available online and also comes with the R package **fpp2**.

We will start with the simple i.i.d. model as a warm-up. The i.i.d. model assumes an \(N\)-dimensional Gaussian time series for the log-returns \(\mathbf{x}_t\): \[ \mathbf{x}_t \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}). \] The sample estimators for the mean and covariance matrix are, respectively, the sample mean \[ \hat{\boldsymbol{\mu}} = \frac{1}{T}\sum_{t=1}^T \mathbf{x}_t \] and the sample covariance matrix \[ \hat{\boldsymbol{\Sigma}} = \frac{1}{T-1}\sum_{t=1}^T (\mathbf{x}_t - \hat{\boldsymbol{\mu}})^T(\mathbf{x}_t - \hat{\boldsymbol{\mu}}). \]

We will always start by generating synthetic data to become familiar with the procedure and to make sure the estimation process gives the correct result (i.e., sanity check). Only once everything is under control we will move to real market data and fit the different models.

Let’s generate synthetic i.i.d. data and estimate the mean and covariance matrix:

```
# generate Gaussian synthetic return data
library(mvtnorm)
set.seed(357)
N <- 100
T <- 120
mu <- runif(N)
U <- t(rmvnorm(n = round(0.7*N), sigma = 0.1*diag(N)))
Sigma <- U %*% t(U) + diag(N)
X <- rmvnorm(n = T, mean = mu, sigma = Sigma)
# sample estimates (sample mean and sample covariance matrix)
mu_sm <- colMeans(X)
Sigma_scm <- cov(X)
# error
norm(mu_sm - mu, "2")
#> [1] 2.443633
norm(Sigma_scm - Sigma, "F")
#> [1] 70.79889
```

Now, let’s do it again for different number of observations \(T\):

```
# first generate all the data
set.seed(357)
N <- 100
T_max <- 1000
mu <- runif(N)
U <- t(rmvnorm(n = round(0.7*N), sigma = 0.1*diag(N)))
Sigma <- U %*% t(U) + diag(N)
X <- rmvnorm(n = T_max, mean = mu, sigma = Sigma)
# now loop over subsets of the samples
error_mu_vs_T <- NULL
error_Sigma_vs_T <- NULL
T_sweep <- ceiling(seq(1.01*N, T_max, length.out = 20))
for (T in T_sweep) {
X_ <- X[1:T, ]
# sample estimates
mu_sm <- colMeans(X_)
Sigma_scm <- cov(X_)
# compute errors
error_mu_vs_T <- c(error_mu_vs_T, norm(mu_sm - mu, "2"))
error_Sigma_vs_T <- c(error_Sigma_vs_T, norm(Sigma_scm -
```