This R session will illustrate the incorporation of prior information in the estimation of $$\boldsymbol{\mu}$$ and $$\boldsymbol{\Sigma}$$.

(Useful R links: Cookbook R, Quick-R, R documentation, CRAN, METACRAN.)

# Shrinkage

Suppose we have an $$N$$-dimensional i.i.d. Gaussian time series $\mathbf{x}_t \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}).$ The sample estimators for the mean and covariance matrix are, respectively, the sample mean $\hat{\boldsymbol{\mu}} = \frac{1}{T}\sum_{t=1}^T \mathbf{x}_t$ and the sample covariance matrix $\hat{\boldsymbol{\Sigma}} = \frac{1}{T-1}\sum_{t=1}^T (\mathbf{x}_t - \hat{\boldsymbol{\mu}})(\mathbf{x}_t - \hat{\boldsymbol{\mu}})^T.$ We will next explore the idea of shrinkage to improve those sample estimators when $$T$$ is not large enough.

## Shrinkage in $$\boldsymbol{\mu}$$

The James-Stein estimator is $\hat{\boldsymbol{\mu}}^\textsf{JS} = (1-\rho) \hat{\boldsymbol{\mu}} + \rho \mathbf{t}$ where $$\mathbf{t}$$ is the shrinkage target and $$0\le\rho\le1$$ is the amount of shrinkage. Common choices for the target $$\mathbf{t}$$ are:

- $$\mathbf{t} = \mathbf{0}$$
- $$\mathbf{t} = 0.1 \times \mathbf{1}$$
- $$\mathbf{t} = \frac{\mathbf{1}^T \hat{\boldsymbol{\mu}}}{N} \times \mathbf{1}$$ (grand mean target)
- $$\mathbf{t} = \frac{\mathbf{1}^T\hat{\boldsymbol{\Sigma}}^{-1}\hat{\boldsymbol{\mu}}}{\mathbf{1}^T\hat{\boldsymbol{\Sigma}}^{-1}\mathbf{1}} \times \mathbf{1}$$ (volatility-weighted mean)

A choice of $$\rho$$ that improves the estimation error is $\rho = \frac{1}{T}\frac{N\bar{\lambda}-2\lambda_\textsf{max}}{\|\hat{\boldsymbol{\mu}} - \mathbf{t}\|^2},$ where $$\bar{\lambda}$$ and $$\lambda_\textsf{max}$$ are the average and maximum, respectively, of the eigenvalues of $$\boldsymbol{\Sigma}$$ (or of $$\hat{\boldsymbol{\Sigma}}$$ in practice, since $$\boldsymbol{\Sigma}$$ is unknown).

We will first explore this estimator using synthetic data. Let’s start by generating the data:

```r
# generate Gaussian synthetic return data
library(mvtnorm)
set.seed(357)
N <- 100
T <- 120
mu <- runif(N)
U <- t(rmvnorm(n = round(0.7*N), sigma = 0.1*diag(N)))
Sigma <- U %*% t(U) + diag(N)
X <- rmvnorm(n = T, mean = mu, sigma = Sigma)
str(X)
#>  num [1:120, 1:100] 3.57 0.19 -0.807 5.042 0.281 ...
```

```r
# sample estimates (sample mean and sample covariance matrix)
mu_sm <- colMeans(X)
Sigma_scm <- cov(X)
```
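The expected squared error of the sample mean is $$\textsf{tr}(\boldsymbol{\Sigma})/T$$, i.e., it grows with the dimension $$N$$ and decays only as $$1/T$$, which is exactly why shrinkage pays off when $$T$$ is not much larger than $$N$$. A quick self-contained check of this decay (toy data and illustrative names, not part of the session above):

```r
# sanity check: squared error of the sample mean decreases as T grows
set.seed(42)
N <- 50
mu_true <- rep(1, N)
se_for_T <- function(T) {
  X <- matrix(rnorm(N * T, mean = 1), nrow = T)  # T i.i.d. samples of N(1, I)
  sum((colMeans(X) - mu_true)^2)                 # squared error vs the true mean
}
se_small <- se_for_T(20)    # expected error N/T = 2.5
se_large <- se_for_T(2000)  # expected error N/T = 0.025
c(se_small, se_large)
```

With identity covariance the expected errors are $$N/T$$, so the second value should be roughly 100 times smaller than the first.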

Now let’s try the James-Stein estimator with the four targets:

```r
# define the four targets
t_1 <- rep(0, N)
t_2 <- rep(0.1, N)
t_3 <- rep(mean(mu_sm), N)
t_4 <- rep(sum(solve(Sigma_scm, mu_sm))/sum(solve(Sigma_scm, rep(1, N))), N)
```

```r
# compute the corresponding four rho's
lambdas <- eigen(Sigma_scm)$values
lmd_mean <- mean(lambdas)
lmd_max <- max(lambdas)
rho_1 <- (1/T)*(N*lmd_mean - 2*lmd_max)/norm(mu_sm - t_1, "2")^2
rho_2 <- (1/T)*(N*lmd_mean - 2*lmd_max)/norm(mu_sm - t_2, "2")^2
rho_3 <- (1/T)*(N*lmd_mean - 2*lmd_max)/norm(mu_sm - t_3, "2")^2
rho_4 <- (1/T)*(N*lmd_mean - 2*lmd_max)/norm(mu_sm - t_4, "2")^2
```
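Note that this formula does not guarantee $$0\le\rho\le1$$: the numerator $$N\bar{\lambda}-2\lambda_\textsf{max}$$ can be negative when one eigenvalue dominates, and $$\rho$$ can exceed 1 when $$\hat{\boldsymbol{\mu}}$$ is very close to the target. In practice it is common to clip $$\rho$$ to $$[0,1]$$ (in the spirit of the positive-part James-Stein estimator); a minimal sketch with an illustrative helper:

```r
# clip a shrinkage factor to the valid interval [0, 1]
clip01 <- function(rho) max(0, min(1, rho))
clip01(-0.3)  # gives 0
clip01(0.4)   # gives 0.4
clip01(1.7)   # gives 1
```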

```r
# finally the James-Stein estimators
mu_JS_1 <- (1-rho_1)*mu_sm + rho_1*t_1
mu_JS_2 <- (1-rho_2)*mu_sm + rho_2*t_2
mu_JS_3 <- (1-rho_3)*mu_sm + rho_3*t_3
mu_JS_4 <- (1-rho_4)*mu_sm + rho_4*t_4
```

Let’s now compare the estimation error measured in terms of squared error and PRIAL (percentage relative improvement in average loss): \begin{align} \textsf{SE} &= \|\hat{\boldsymbol{\mu}}^\textsf{JS} - \boldsymbol{\mu}\|^2\\ \textsf{PRIAL} &= \frac{\|\hat{\boldsymbol{\mu}} - \boldsymbol{\mu}\|^2 - \|\hat{\boldsymbol{\mu}}^\textsf{JS} - \boldsymbol{\mu}\|^2}{\|\hat{\boldsymbol{\mu}} - \boldsymbol{\mu}\|^2} \end{align}

```r
# compute the squared error of the sample mean and the four James-Stein estimators
SE_mu <- c(norm(mu_sm - mu, "2")^2,
           norm(mu_JS_1 - mu, "2")^2,
           norm(mu_JS_2 - mu, "2")^2,
           norm(mu_JS_3 - mu, "2")^2,
           norm(mu_JS_4 - mu, "2")^2)
```
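The PRIAL is then just the relative reduction in squared error with respect to the sample mean. A self-contained sketch of the computation on toy numbers (the `prial()` helper and the values are illustrative, not from the session above):

```r
# PRIAL: relative improvement of a shrunk estimate over the raw sample mean
prial <- function(mu_true, mu_raw, mu_shrunk) {
  se_raw    <- sum((mu_raw    - mu_true)^2)
  se_shrunk <- sum((mu_shrunk - mu_true)^2)
  (se_raw - se_shrunk) / se_raw
}
mu_true   <- c(2, 2, 2)
mu_raw    <- c(1, 2, 3)                         # noisy estimate of mu_true
mu_shrunk <- 0.5*mu_raw + 0.5*mean(mu_raw)      # shrink halfway to the grand mean
p <- prial(mu_true, mu_raw, mu_shrunk)
p  # 0.75: shrinkage removed 75% of the squared error in this toy case
```

A PRIAL of 0 means no improvement over the sample mean, and negative values mean the shrinkage target actually hurt.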