MAFS5310 - Portfolio Optimization with R
MSc in Financial Mathematics
The Hong Kong University of Science and Technology (HKUST)
Fall 2020-21

Outline

  • Introduction

  • Warm-Up: Markowitz’s Portfolio

    • Signal model
    • Markowitz’s formulation
    • Drawbacks of Markowitz portfolio
  • Alternative Measures of Risk: DR, VaR, CVaR, and DD

  • Mean-DR portfolio

  • Mean-CVaR portfolio

  • Mean-DD portfolio

  • Conclusions

Introduction

Motivation

Markowitz’s portfolio has never been fully embraced by practitioners, among other reasons because

  1. variance is not a good measure of risk in practice since it penalizes both the unwanted high losses and the desired low losses: the solution is to use alternative measures for risk, e.g., VaR and CVaR,

  2. it is highly sensitive to parameter estimation errors (i.e., to errors in the covariance matrix \(\boldsymbol{\Sigma}\) and especially in the mean vector \(\boldsymbol{\mu}\)): the solution is robust optimization and improved parameter estimation,

  3. it only considers the risk of the portfolio as a whole and ignores risk diversification (i.e., it concentrates risk too much in a few assets, as was observed in the 2008 financial crisis): the solution is the risk parity portfolio.



👉 We will here consider more meaningful measures for risk than the variance, like the downside risk (DR), Value-at-Risk (VaR), Conditional VaR (CVaR) or Expected Shortfall (ES), and drawdown (DD).

Warm-Up: Markowitz Portfolio

Signal model

Returns

  • Let us denote the log-returns of \(N\) assets at time \(t\) with the vector \(\mathbf{r}_{t}\in\mathbb{R}^{N}\) (i.e., \(r_{it}=\log{p_{i,t}}-\log{p_{i,t-1}}\)).

  • Note that the log-returns are almost the same as the linear returns \(R_{it}=\frac{p_{i,t}-p_{i,t-1}}{p_{i,t-1}}\), i.e., \(r_{it}\approx R_{it}\).

  • The time index \(t\) can denote any arbitrary period such as days, weeks, months, 5-min intervals, etc.

  • \(\mathcal{F}_{t-1}\) denotes the previous historical data.

  • Econometrics aims at modeling \(\mathbf{r}_{t}\) conditional on \(\mathcal{F}_{t-1}\).

  • \(\mathbf{r}_{t}\) is a multivariate stochastic process with conditional mean and covariance matrix denoted as (Feng and Palomar 2016) \[\begin{aligned} \boldsymbol{\mu}_{t} &\triangleq\textsf{E}\left[\mathbf{r}_{t}\mid\mathcal{F}_{t-1}\right]\\ \boldsymbol{\Sigma}_{t} &\triangleq\textsf{Cov}\left[\mathbf{r}_{t}\mid\mathcal{F}_{t-1}\right]=\textsf{E}\left[(\mathbf{r}_{t}-\boldsymbol{\mu}_{t})(\mathbf{r}_{t}-\boldsymbol{\mu}_{t})^{T}\mid\mathcal{F}_{t-1}\right]. \end{aligned}\]
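
As a small illustration of the two return definitions above, the following sketch computes log-returns and linear returns from a made-up price series (the prices are hypothetical, chosen only for illustration) and checks that the two are close when price changes are small:

```r
# hypothetical prices of one asset over five periods (illustrative values only)
p <- c(100, 101, 99.5, 100.2, 102)

# log-returns: r_t = log(p_t) - log(p_{t-1})
r_log <- diff(log(p))

# linear returns: R_t = (p_t - p_{t-1}) / p_{t-1}
R_lin <- diff(p) / head(p, -1)

# exact relation between the two: r_t = log(1 + R_t)
all.equal(r_log, log(1 + R_lin))

# for small returns the gap is of second order, roughly R_t^2 / 2
max(abs(r_log - R_lin))
```

The approximation degrades as the returns get larger, e.g., over longer periods or for very volatile assets.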

i.i.d. model

  • For simplicity we will assume that \(\mathbf{r}_{t}\) follows an i.i.d. model (an assumption that is not very inaccurate in general).


  • That is, both the conditional mean and conditional covariance are constant: \[\begin{aligned} \boldsymbol{\mu}_{t} &= \boldsymbol{\mu},\\ \boldsymbol{\Sigma}_{t} &= \boldsymbol{\Sigma}. \end{aligned}\]


  • This is a very simple model; however, it is one of the most fundamental assumptions behind many important works, e.g., the Nobel prize-winning Markowitz portfolio theory (Markowitz 1952).

Parameter estimation

  • Consider the i.i.d. model: \[\mathbf{r}_{t}=\boldsymbol{\mu}+\mathbf{w}_{t},\] where \(\boldsymbol{\mu}\in\mathbb{R}^{N}\) is the mean and \(\mathbf{w}_{t}\in\mathbb{R}^{N}\) is an i.i.d. process with zero mean and constant covariance matrix \(\boldsymbol{\Sigma}\).

  • The mean vector \(\boldsymbol{\mu}\) and covariance matrix \(\boldsymbol{\Sigma}\) have to be estimated in practice based on \(T\) observations.

  • The simplest estimators are the sample estimators:

    • sample mean: \(\quad\hat{\boldsymbol{\mu}} =\frac{1}{T}\sum_{t=1}^{T}\mathbf{r}_{t}\)
    • sample covariance matrix: \(\quad\hat{\boldsymbol{\Sigma}} =\frac{1}{T-1}\sum_{t=1}^{T}(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}})(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}})^{T}.\)
  • Many more sophisticated estimators exist, namely: shrinkage estimators, Black-Litterman estimators, etc.
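
The sample estimators above can be sketched in a few lines of base R. The returns here are synthetic (simulated Gaussian data with made-up parameters), and the variable names are only illustrative:

```r
set.seed(42)
T_obs <- 500   # number of observations (hypothetical)
N <- 3         # number of assets (hypothetical)
# synthetic i.i.d. returns, one row per period and one column per asset
X <- matrix(rnorm(T_obs * N, mean = 0.001, sd = 0.02), nrow = T_obs, ncol = N)

# sample mean: average over the T observations
mu_hat <- colSums(X) / T_obs

# sample covariance matrix with the 1/(T-1) normalization
X_centered <- sweep(X, 2, mu_hat)               # subtract the mean from each column
Sigma_hat <- crossprod(X_centered) / (T_obs - 1)

# both agree with the built-in estimators
all.equal(mu_hat, colMeans(X))
all.equal(Sigma_hat, cov(X))
```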

Parameter estimation

  • The parameter estimates \(\hat{\boldsymbol{\mu}}\) and \(\hat{\boldsymbol{\Sigma}}\) are only good for large \(T\), otherwise the estimation error is unacceptable.

  • In particular, the sample mean is a very inefficient estimator, with very noisy estimates (Meucci 2005).

  • In practice, \(T\) cannot be large enough due to either:

    • unavailability of data or
    • lack of stationarity of data.
  • As a consequence, the estimates contain too much estimation error and a portfolio design (e.g., Markowitz mean-variance) based on those estimates can be severely affected (Chopra and Ziemba 1993).

  • Indeed, this is why Markowitz portfolio and other extensions are rarely used by practitioners.
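
To get a feeling for how noisy the sample mean is, recall that under the i.i.d. model each entry of the sample mean has standard deviation \(\sigma/\sqrt{T}\). The sketch below evaluates this for assumed daily values of the mean and volatility (both numbers are hypothetical, chosen only to be of a plausible order of magnitude):

```r
mu_daily    <- 0.0004   # assumed daily mean return (hypothetical)
sigma_daily <- 0.01     # assumed daily volatility (hypothetical)

# standard error of the sample mean for different sample sizes
T_obs   <- c(252, 1260, 2520)   # about 1, 5, and 10 years of daily data
se_mean <- sigma_daily / sqrt(T_obs)
data.frame(T_obs, se_mean, relative_error = se_mean / mu_daily)

# number of observations needed for the standard error to drop to half the mean
T_needed <- (sigma_daily / (0.5 * mu_daily))^2
T_needed
```

With these assumed numbers, one year of daily data gives a standard error larger than the mean itself, and roughly ten years are needed to bring it down to half of the mean, by which time stationarity is doubtful.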

Markowitz formulation

Portfolio return

  • Suppose the capital budget is \(B\) dollars.

  • The portfolio \(\mathbf{w}\in\mathbb{R}^{N}\) denotes the normalized dollar weights of the \(N\) assets such that \(\mathbf{1}^{T}\mathbf{w}=1\) (so \(B\mathbf{w}\) denotes dollars invested in the assets).

  • For each asset \(i\), the initial wealth is \(Bw_{i}\) and the end wealth is \[Bw_{i}\left(p_{i,t}/p_{i,t-1}\right)=Bw_{i}\left(R_{it}+1\right).\]

  • Then the portfolio return is \[R_{t}^{p}= \frac{\sum_{i=1}^{N}Bw_{i}\left(R_{it}+1\right)-B}{B}=\sum_{i=1}^{N}w_{i}R_{it}\approx\sum_{i=1}^{N}w_{i}r_{it}=\mathbf{w}^{T}\mathbf{r}_{t}\]

  • The portfolio expected return and variance are \(\mathbf{w}^{T}\boldsymbol{\mu}\) and \(\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\), respectively.
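
The derivation of the portfolio return can be checked numerically. The budget, weights, and asset returns below are hypothetical values used only for illustration:

```r
B <- 1000                     # capital budget in dollars (hypothetical)
w <- c(0.5, 0.3, 0.2)         # normalized weights, summing to one
R <- c(0.02, -0.01, 0.005)    # linear returns of the three assets over one period

# end wealth of each position and the resulting portfolio return
wealth_end <- B * w * (1 + R)
R_p_wealth <- (sum(wealth_end) - B) / B

# same result directly from the weights: sum_i w_i R_i
R_p_weights <- sum(w * R)

all.equal(R_p_wealth, R_p_weights)
```

Note that the budget \(B\) cancels out, which is why only the normalized weights \(\mathbf{w}\) appear in the formulations.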

Performance measures

  • Expected return: \(\mathbf{w}^{T}\boldsymbol{\mu}\)

  • Volatility: \(\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}\)

  • Sharpe Ratio (SR): expected excess return per unit of risk \[\mathsf{SR} =\frac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\] where \(r_{f}\) is the risk-free rate (e.g., interest rate on a three-month U.S. Treasury bill).

  • Information Ratio (IR): SR with respect to a benchmark (e.g., the market index): \(\mathsf{IR} =\frac{\textsf{E}\left[\mathbf{w}^T\mathbf{r}_t - r_{b,t}\right]}{\sqrt{\textsf{Var}\left[\mathbf{w}^T\mathbf{r}_t - r_{b,t}\right]}}\).

  • Drawdown: decline from a historical peak of the cumulative profit \(X(t)\): \[D(T)=\max_{t\in[0,T]}X(t)-X(T)\]

  • VaR (Value at Risk): quantile of the loss.

  • ES (Expected Shortfall) or CVaR (Conditional Value at Risk): expected value of the loss above some quantile.
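
As a minimal sketch of the first measures above, the Sharpe ratio and information ratio can be estimated from a sample of returns by replacing expectations with sample averages. The portfolio returns, benchmark returns, and risk-free rate below are simulated or assumed values, not market data:

```r
set.seed(1)
T_obs <- 250
# hypothetical per-period returns of a portfolio and of a benchmark
ret_p <- rnorm(T_obs, mean = 0.0006, sd = 0.010)
ret_b <- rnorm(T_obs, mean = 0.0004, sd = 0.009)
r_f   <- 0.0001   # assumed per-period risk-free rate

# Sharpe ratio: mean excess return over the risk-free rate per unit of volatility
SR <- (mean(ret_p) - r_f) / sd(ret_p)

# information ratio: same idea applied to the return in excess of the benchmark
active <- ret_p - ret_b
IR <- mean(active) / sd(active)

c(SR = SR, IR = IR)
```

These are per-period values; they are not annualized.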

Practical constraints

  • Capital budget constraint: \[\mathbf{1}^T\mathbf{w} = 1.\]

  • Long-only constraint: \[\mathbf{w} \geq 0.\]

  • Dollar-neutral or self-financing constraint: \[\mathbf{1}^T\mathbf{w} = 0.\]

  • Holding constraint: \[\mathbf{l}\leq\mathbf{w}\leq \mathbf{u}\] where \(\mathbf{l}\in\mathbb{R}^{N}\) and \(\mathbf{u}\in\mathbb{R}^{N}\) are lower and upper bounds of the asset positions, respectively.

Practical constraints

  • Leverage constraint: \[\left\Vert \mathbf{w}\right\Vert _{1}\leq L.\]

  • Cardinality constraint: \[\left\Vert \mathbf{w}\right\Vert _{0} \leq K.\]

  • Turnover constraint: \[\left\Vert \mathbf{w}-\mathbf{w}_{0}\right\Vert _{1} \leq u\] where \(\mathbf{w}_{0}\) is the currently held portfolio.


  • Market-neutral constraint: \[\boldsymbol{\beta}^T\mathbf{w} = 0.\]
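
Several of these constraints are easy to verify for a given portfolio. The helper below is a hypothetical function written for illustration (it is not part of any package), and the portfolios and bounds are made-up values; the holding and market-neutral constraints are omitted for brevity:

```r
# check a candidate portfolio w against some of the constraints listed above
check_constraints <- function(w, w0, L = 1, K = length(w), u = Inf, tol = 1e-8) {
  c(budget      = abs(sum(w) - 1) <= tol,        # 1^T w = 1
    long_only   = all(w >= -tol),                # w >= 0
    leverage    = sum(abs(w)) <= L + tol,        # ||w||_1 <= L
    cardinality = sum(abs(w) > tol) <= K,        # ||w||_0 <= K
    turnover    = sum(abs(w - w0)) <= u + tol)   # ||w - w0||_1 <= u
}

w0 <- c(0.25, 0.25, 0.25, 0.25)    # currently held portfolio (hypothetical)
w  <- c(0.60, 0.50, -0.10, 0.00)   # candidate portfolio with one short position

res <- check_constraints(w, w0, L = 1.2, K = 3, u = 0.5)
res
```

In this example the candidate satisfies the budget, leverage, and cardinality constraints, but violates the long-only constraint (one negative weight) and the turnover constraint (the total change with respect to \(\mathbf{w}_0\) is 1.2).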

Risk control

  • In finance, the expected return \(\mathbf{w}^{T}\boldsymbol{\mu}\) is very relevant as it quantifies the average benefit.


  • However, in practice, the average performance is not enough to characterize an investment and one needs to control the probability of going bankrupt.


  • Risk measures control how risky an investment strategy is.


  • The most basic measure of risk is given by the variance (Markowitz 1952): a higher variance means that the returns are more spread out around the mean, so large deviations (and hence big losses) are more likely.


  • There are more sophisticated risk measures such as downside risk, VaR, ES, etc.

Mean-variance tradeoff

  • The mean return \(\mathbf{w}^{T}\boldsymbol{\mu}\) and the variance (risk) \(\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\) (equivalently, the standard deviation or volatility \(\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}\)) constitute two important performance measures.


  • Usually, the higher the mean return the higher the variance and vice-versa.


  • Thus, we are faced with two objectives to be optimized: it is a multi-objective optimization problem.


  • They define a fundamental mean-variance tradeoff curve (Pareto curve).


  • The choice of a specific point in this tradeoff curve depends on how aggressive or risk-averse the investor is.

Markowitz mean-variance portfolio (1952)

  • The idea of the Markowitz mean-variance portfolio (MVP) (Markowitz 1952) is to find a trade-off between the expected return \(\mathbf{w}^{T}\boldsymbol{\mu}\) and the risk of the portfolio measured by the variance \(\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu}-\lambda\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1 \end{array}\] where \(\mathbf{w}^{T}\mathbf{1}=1\) is the capital budget constraint and \(\lambda\) is a parameter that controls how risk-averse the investor is.
  • This is a convex quadratic problem (QP) with only one linear constraint which admits a closed-form solution: \[\mathbf{w}_{\sf MVP} = \frac{1}{2\lambda}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{\mu}+\nu\mathbf{1}\right),\] where \(\nu\) is the optimal dual variable \(\nu=\frac{2\lambda-\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}}{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}\).
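
The closed-form solution can be verified numerically. The sketch below uses a made-up mean vector and covariance matrix for three assets and an arbitrary value of \(\lambda\); it checks the budget constraint and the stationarity condition \(\boldsymbol{\mu}-2\lambda\boldsymbol{\Sigma}\mathbf{w}+\nu\mathbf{1}=\mathbf{0}\) of the Lagrangian:

```r
# hypothetical mean vector and covariance matrix for three assets
mu    <- c(0.0010, 0.0005, 0.0008)
Sigma <- matrix(c(4.0, 1.0, 0.5,
                  1.0, 2.0, 0.3,
                  0.5, 0.3, 3.0), nrow = 3, byrow = TRUE) * 1e-4
lmd  <- 2                       # risk-aversion parameter lambda (arbitrary)
ones <- rep(1, length(mu))

# optimal dual variable nu and closed-form portfolio
Sigma_inv_mu   <- solve(Sigma, mu)
Sigma_inv_ones <- solve(Sigma, ones)
nu    <- (2 * lmd - sum(Sigma_inv_mu)) / sum(Sigma_inv_ones)
w_MVP <- (Sigma_inv_mu + nu * Sigma_inv_ones) / (2 * lmd)

sum(w_MVP)                                  # budget constraint: equals 1
mu - 2 * lmd * drop(Sigma %*% w_MVP) + nu   # stationarity: numerically zero
```

Since there is no long-only constraint in this formulation, the resulting weights may be negative (short positions).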

Global Minimum Variance Portfolio (GMVP)

  • The global minimum variance portfolio (GMVP) ignores the expected return and focuses on the risk only: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1. \end{array}\]

  • It is a simple convex QP with solution \[\mathbf{w}_{\sf GMVP}=\frac{1}{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}\boldsymbol{\Sigma}^{-1}\mathbf{1}.\]

  • It is widely used in academic papers for simplicity of evaluation and comparison of different estimators of the covariance matrix \(\boldsymbol{\Sigma}\) (while ignoring the estimation of \(\boldsymbol{\mu}\)).
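
A minimal sketch of the GMVP closed form, using a made-up covariance matrix and comparing the resulting variance against that of the equally weighted portfolio (the helper `portfolio_var` is defined here only for illustration):

```r
# hypothetical covariance matrix for three assets
Sigma <- matrix(c(4.0, 1.0, 0.5,
                  1.0, 2.0, 0.3,
                  0.5, 0.3, 3.0), nrow = 3, byrow = TRUE) * 1e-4
ones <- rep(1, nrow(Sigma))

# closed-form GMVP: Sigma^{-1} 1 normalized to sum to one
Sigma_inv_ones <- solve(Sigma, ones)
w_GMVP <- Sigma_inv_ones / sum(Sigma_inv_ones)

# compare with the equally weighted portfolio
portfolio_var <- function(w) drop(t(w) %*% Sigma %*% w)
w_EW <- ones / length(ones)
c(GMVP = portfolio_var(w_GMVP), EW = portfolio_var(w_EW))

# the minimum variance equals 1 / (1^T Sigma^{-1} 1)
all.equal(portfolio_var(w_GMVP), 1 / sum(Sigma_inv_ones))
```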

R session: Loading market data

We will load some stock market data and divide it into a training part (for the estimation of the expected return \(\boldsymbol{\mu}\) and covariance matrix \(\boldsymbol{\Sigma}\), and subsequent portfolio design) and a test part (for the out-of-sample performance evaluation).

In particular, we will start by loading some stock data from three different sectors:

library(xts)  # to manipulate time series of stock data
library(quantmod)  # to download stock data
library(PerformanceAnalytics)  # to compute performance measures

# download data from Yahoo Finance
stock_namelist <- c("AAPL", "AMD", "ADI",  "ABBV", "AEZS", "A",  "APD", "AA","CF")
prices <- xts()
for (i in seq_along(stock_namelist)) {
  # end date chosen to match the 2013-2016 data shown in the output below
  tmp <- Ad(getSymbols(stock_namelist[i], from = "2013-01-01", to = "2016-12-31", auto.assign = FALSE))
  tmp <- na.approx(tmp, na.rm = FALSE)  # interpolate NAs
  prices <- cbind(prices, tmp)
}
colnames(prices) <- stock_namelist
tclass(prices) <- "Date"  # set the index class (indexClass() is deprecated in recent xts versions)
str(prices)
R>> An 'xts' object on 2013-01-02/2016-12-30 containing:
R>>   Data: num [1:1008, 1:9] 55.5 54.8 53.2 52.9 53.1 ...
R>>  - attr(*, "dimnames")=List of 2
R>>   ..$ : NULL
R>>   ..$ : chr [1:9] "AAPL" "AMD" "ADI" "ABBV" ...
R>>   Indexed by objects of class: [Date] TZ: UTC
R>>   xts Attributes:  
R>>  NULL
head(prices)
R>>                AAPL  AMD      ADI     ABBV AEZS        A      APD       AA       CF
R>> 2013-01-02 55.47129 2.53 37.50844 27.60817  253 27.97831 66.43760 20.62187 29.44429
R>> 2013-01-03 54.77112 2.49 36.90318 27.38020  254 28.07852 66.20545 20.80537 29.30506
R>> 2013-01-04 53.24548 2.59 36.24679 27.03431  257 28.63300 67.09529 21.24121 29.96146
R>> 2013-01-07 52.93227 2.67 36.35761 27.08934  259 28.42591 67.03341 20.87419 29.84922
R>> 2013-01-08 53.07474 2.67 35.98254 26.49976  255 28.19877 67.15720 20.87419 29.41162
R>> 2013-01-09 52.24524 2.63 35.88875 26.64913  258 28.96036 68.06254 20.82831 30.44025
tail(prices)
R>>                AAPL   AMD      ADI     ABBV AEZS        A      APD    AA       CF
R>> 2016-12-22 111.8445 11.60 70.07727 55.82264 4.10 44.91877 136.0360 29.75 27.82244
R>> 2016-12-23 112.0657 11.58 70.44936 56.43826 4.10 45.14350 136.3992 29.71 28.37686
R>> 2016-12-27 112.7774 12.07 70.89778 56.58311 4.05 45.44642 137.2373 29.65 29.52265
R>> 2016-12-28 112.2965 11.55 70.18222 56.37488 3.55 44.67448 135.1328 29.43 29.22696
R>> 2016-12-29 112.2676 11.59 70.20129 56.79134 3.60 44.72544 135.2359 28.89 29.47645
R>> 2016-12-30 111.3924 11.34 69.28539 56.69175 3.60 44.64704 134.7206 28.08 29.08836
# compute log-returns and linear returns
X_log <- diff(log(prices))[-1]
X_lin <- (prices/lag(prices) - 1)[-1]

# or alternatively...
X_log <- CalculateReturns(prices, "log")[-1]
X_lin <- CalculateReturns(prices)[-1]

N <- ncol(X_log)  # number of stocks
T <- nrow(X_log)  # number of days

We can take a look at the prices of the stocks:

plot(prices/rep(prices[1, ], each = nrow(prices)), col = rainbow10equal, legend.loc = "topleft",
     main = "Normalized prices")

We now divide the data into a training set and test set:

T_trn <- round(0.7*T)  # 70% of data
X_log_trn <- X_log[1:T_trn, ]
X_log_tst <- X_log[(T_trn+1):T, ]
X_lin_trn <- X_lin[1:T_trn, ]
X_lin_tst <- X_lin[(T_trn+1):T, ]

We can now use the training set to obtain the sample estimates from the returns \(\mathbf{x}_t\) (i.e., sample means and sample covariance matrix) as \[ \begin{aligned} \hat{\boldsymbol{\mu}} & = \frac{1}{T}\sum_{t=1}^T \mathbf{x}_t\\ \hat{\boldsymbol{\Sigma}} & = \frac{1}{T-1}\sum_{t=1}^T (\mathbf{x}_t - \hat{\boldsymbol{\mu}})(\mathbf{x}_t - \hat{\boldsymbol{\mu}})^T \end{aligned} \]

mu <- colMeans(X_log_trn)
Sigma <- cov(X_log_trn)

R session: Warm-up - Markowitz’s portfolio

Markowitz’s mean-variance portfolio (MVP) with no shorting is formulated as \[ \begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \boldsymbol{\mu}^T\mathbf{w} -\lambda\mathbf{w}^T\mathbf{\Sigma}\mathbf{w}\\ {\textsf{subject to}} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0}. \end{array} \]

For completeness, we can also consider the Global Minimum Variance Portfolio (GMVP), which doesn’t make use of \(\boldsymbol{\mu}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^T\mathbf{\Sigma}\mathbf{w}\\ {\textsf{subject to}} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0}. \end{array}\]

Since a closed-form solution does not exist with the constraint \(\mathbf{w}\ge\mathbf{0}\), we need to resort to a solver. We can conveniently use the package CVXR, although its computational cost is high and the solution is not totally robust; if necessary, use a dedicated QP solver such as quadprog:

library(CVXR)

# create function for GMVP
portolioGMVP <- function(Sigma) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Minimize(quad_form(w, Sigma)), 
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  w <- as.vector(result$getValue(w))
  names(w) <- colnames(Sigma)
  return(w)
}

portolioMarkowitz <- function(mu, Sigma, lmd = 0.5) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(t(mu) %*% w - lmd*quad_form(w, Sigma)),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  w <- as.vector(result$getValue(w))
  names(w) <- colnames(Sigma)
  return(w)
}

# these functions can now be used as
w_Markowitz <- portolioMarkowitz(mu, Sigma)
w_GMVP <- portolioGMVP(Sigma)
w_all <- cbind("GMVP"      = w_GMVP, 
               "Markowitz" = w_Markowitz)

We can now compare the allocations of the portfolios:

barplot(t(w_all), col = rainbow10equal[1:2], legend = colnames(w_all), beside = TRUE,
        main = "Portfolio allocation", xlab = "stocks", ylab = "dollars")

Then we can assess the performance (in-sample vs out-of-sample):

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_trn))
R>>           Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> GMVP                 0.1873             0.1576                    1.1886
R>> Markowitz            0.2636             0.2167                    1.2165
t(table.AnnualizedReturns(ret_all_tst))
R>>           Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> GMVP                 0.1430             0.1779                    0.8038
R>> Markowitz            0.0732             0.2020                    0.3624

We can see that the mean-variance Markowitz portfolio performs much worse than the GMVP out-of-sample (Sharpe ratio of 0.36 vs. 0.80), even though the in-sample Sharpe ratios are approximately the same (1.22 vs. 1.19).

Let’s plot the wealth evolution (cumulative PnL) over the whole time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)",
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)

To get a clearer picture, it is useful to plot the drawdown:

{ chart.Drawdown(ret_all, main = "Drawdown of portfolios", 
                 legend.loc = "bottomleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

We can see that the drawdown of Markowitz’s mean-variance portfolio is indeed much worse than that of the GMVP.

Drawbacks of Markowitz’s portfolio

Drawbacks of Markowitz’s formulation

Markowitz’s portfolio has never been fully embraced by practitioners, among other reasons because

  1. variance is not a good measure of risk in practice since it penalizes both the unwanted high losses and the desired low losses: the solution is to use alternative measures for risk, e.g., VaR and CVaR,

  2. it is highly sensitive to parameter estimation errors (i.e., to errors in the covariance matrix \(\boldsymbol{\Sigma}\) and especially in the mean vector \(\boldsymbol{\mu}\)): the solution is robust optimization and improved parameter estimation,

  3. it only considers the risk of the portfolio as a whole and ignores risk diversification (i.e., it concentrates risk too much in a few assets, as was observed in the 2008 financial crisis): the solution is the risk parity portfolio.



👉 We will here consider more meaningful measures for risk than the variance, like the downside risk (DR), Value-at-Risk (VaR), Conditional VaR (CVaR) or Expected Shortfall (ES), and drawdown (DD).

Alternative Measures of Risk: DR, VaR, CVaR, and DD

Variance as risk measure

  • In finance, the mean return is very relevant as it quantifies the average benefit of the investment.

  • However, in practice, the average performance is not good enough and one needs to control the probability of going bankrupt.

  • Risk measures control how risky an investment strategy is.

  • The most basic measure of risk is the variance as considered by Markowitz (1952): a higher variance means that the return distribution is more spread out, so large deviations (and hence big losses) are more likely.

  • However, Markowitz himself already recognized and stressed the limitations of the mean-variance analysis (Markowitz 1959).

Alternatives to variance as risk measure

  • Variance is not a good measure of risk in practice since it penalizes both the unwanted high losses and the desired low losses (or gains) (McNeil et al. 2005).

  • Indeed, the mean-variance portfolio framework penalizes up-side and down-side risk equally, whereas most investors don’t mind up-side risk.

  • To overcome the limitations of the variance as risk measure, a number of alternative risk measures have been proposed, for example:

    • Downside Risk (DR)
    • Value-at-Risk (VaR)
    • Conditional Value-at-Risk (CVaR)
    • Drawdown (DD):
      • maximum DD
      • average DD
      • Conditional Drawdown at Risk (CDaR)

Downside risk (DR)

  • Let \(R\) be a random variable representing the return of an asset or portfolio (e.g., \(R=\mathbf{w}^T\mathbf{r}\) where \(\mathbf{r}\) denotes the vector of random returns of the assets).

  • We are familiar with the mean return \(\mu=\mathsf{E}[R]\) and with the variance \(\sigma^2=\mathsf{E}[(R-\mu)^2]\).

  • The idea of downside risk is that the left-hand side of the return distribution involves risk while the right-hand side contains the better investment opportunities.

  • Interest in downside risk arose in the early 1950s.

  • One example is the semi-variance, already considered by Markowitz (1959).

  • The semi-variance measures the variability of the returns below the mean.

LPM and semi-variance

  • The semi-variance is a special case of the more general lower partial moments (LPM): \[\textsf{LPM} = \mathsf{E}\left[\left((\tau - R)^+\right)^\alpha\right],\] where \((\cdot)^+=\max(0, \cdot)\).
  • The parameter \(\tau\) is termed the disaster level.
  • The parameter \(\alpha\) reflects the investor’s feeling about the relative consequences of falling short of \(\tau\) by various amounts:
    • the value \(\alpha=1\) (which suits a neutral investor) separates risk-seeking (\(0<\alpha<1\)) from risk-averse (\(\alpha>1\)) behavior with regard to returns below the target \(\tau\).
  • By changing the parameters \(\alpha\) and \(\tau\) most downside measures used in practice can be formed.
  • In particular, setting \(\alpha=2\) and \(\tau=\mathsf{E}[R]\) yields the semi-variance (or lower partial variance): \[\textsf{SV} = \mathsf{E}\left[\left((E[R] - R)^+\right)^2\right].\]
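
The LPM is straightforward to estimate from a sample by replacing the expectation with an average. The function `lpm` below is a hypothetical helper written for illustration, and the returns are made-up values:

```r
# sample lower partial moment of order alpha with disaster level tau
lpm <- function(R, tau = mean(R), alpha = 2) {
  mean(pmax(tau - R, 0)^alpha)
}

# hypothetical returns with zero sample mean (illustrative values only)
R <- c(0.03, -0.02, 0.01, -0.04, 0.02)

lpm(R, alpha = 1)            # first-order LPM below the mean
lpm(R, alpha = 2)            # semi-variance
lpm(R, tau = 0, alpha = 3)   # third-order LPM with disaster level 0

# the semi-variance never exceeds the (1/T-normalized) variance
lpm(R, alpha = 2) <= mean((R - mean(R))^2)
```

Only the two negative returns contribute to the LPM here, since the remaining returns lie above the disaster level.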

Value-at-Risk (VaR)

  • To overcome the drawbacks of the variance, another popular one-sided risk measure is the Value-at-Risk (VaR), initially proposed by J.P. Morgan.

  • VaR denotes the maximum loss with a specified confidence level (e.g., confidence level = 95%, period = 1 day).

  • Let \(\xi\) be a random variable representing the loss from a portfolio over some period of time (e.g., \(\xi=-\mathbf{w}^T\mathbf{r}\) where \(\mathbf{r}\) denotes the vector of random returns of the assets).

  • The VaR is defined as \[\mathsf{VaR}_{\alpha} = \inf\left\{\xi_0:\mathsf{Pr}\left(\xi\leq\xi_0\right)\geq\alpha\right\}\] with \(\alpha\) the confidence level, say, \(\alpha=0.95\).

  • However, this measure does not take into account losses exceeding VaR, is nonconvex, and is not subadditive.

Conditional Value-at-Risk (CVaR)

  • The Conditional Value-at-Risk (CVaR) is also called Expected Shortfall (ES).
  • The CVaR takes into account the shape of the losses exceeding the VaR through the average: \[\mathsf{CVaR}_{\alpha} = \mathsf{E}\left[\xi\mid\xi\geq\mathsf{VaR}_{\alpha}\right].\]
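
The VaR and CVaR definitions can be applied directly to a sample of losses by using the empirical distribution. The following is a minimal sketch on a made-up loss sample (the integers 1 to 100, so that the result is easy to check by hand); it is not how a production risk system would estimate these quantities:

```r
# hypothetical sample of 100 equally likely losses (illustrative values only)
loss  <- 1:100
alpha <- 0.95

# empirical distribution function evaluated at the sorted losses
xs  <- sort(loss)
cdf <- seq_along(xs) / length(xs)

# empirical VaR: smallest loss level xi0 with Pr(loss <= xi0) >= alpha
VaR <- min(xs[cdf >= alpha])

# empirical CVaR: average of the losses at or above the VaR
CVaR <- mean(loss[loss >= VaR])

c(VaR = VaR, CVaR = CVaR)
```

Here the VaR is 95 and the CVaR is 97.5: the VaR ignores how large the losses beyond it are, whereas the CVaR averages them.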

Drawdown

  • The drawdown (DD) at time \(t\) is defined as the decline from a historical peak of the cumulative profit \(X(t)\).

  • The unnormalized version is \[D(t)^{\sf unnorm} = \max_{1\le\tau\le t}X(\tau) - X(t).\]

  • But in practice, the normalized version is used: \[D(t) = \frac{{\sf HWM}(t) - X(t)}{{\sf HWM}(t)}\] where \({\sf HWM}(t)\) is the high water mark of \(X(t)\) defined as \[{\sf HWM}(t) = \max_{1\le\tau\le t}X(\tau).\]

Drawdown

  • Then one can define the maximum DD (Max-DD) over a period \(t=1,\ldots,T\) as \[M(T)=\max_{1\le t\le T}D(t)\]

  • Also the average DD (Ave-DD) over a period \(t=1,\ldots,T\) as \[A(T)=\frac{1}{T}\sum_{1\le t\le T}D(t)\]

  • Similarly to the CVaR, we can define the Conditional Drawdown at Risk (CDaR) as the mean of the worst \(100(1-\alpha)\%\) drawdowns.
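
The drawdown measures above can be sketched in base R for a given wealth path. The path below is made up for illustration, and the CDaR is computed in a simple way as the mean of the largest drawdowns in the sample (with the number of terms rounded up), which is one possible sample version:

```r
# hypothetical cumulative wealth path (illustrative values only)
X <- c(100, 110, 99, 105, 120, 90)

HWM <- cummax(X)          # high water mark
D   <- (HWM - X) / HWM    # normalized drawdown at each time

max_DD <- max(D)          # maximum drawdown
ave_DD <- mean(D)         # average drawdown

# CDaR as the mean of the worst 100(1 - alpha)% drawdowns
alpha <- 0.5
k     <- ceiling((1 - alpha) * length(D))
CDaR  <- mean(sort(D, decreasing = TRUE)[1:k])

c(max_DD = max_DD, ave_DD = ave_DD, CDaR = CDaR)
```

In this example the maximum drawdown is 25% (the fall from 120 to 90), and the CDaR lies between the average and the maximum drawdown.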

Mean-DR portfolio

Mean-downside risk portfolio

  • Recall Markowitz mean-variance portfolio formulation: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • Instead of using the variance we can use a downside risk measure, obtaining the mean-downside risk formulation (introduced in 1977).

  • For example, the LPM can be approximated as (\(R_t=\mathbf{w}^T\mathbf{r}_t\)) \[E\left[\left((\tau - R)^+\right)^\alpha\right] \approx \frac{1}{T}\sum_{t=1}^T\left((\tau - R_t)^+\right)^\alpha.\]

  • The mean-LPM portfolio formulation is the problem (convex for \(\alpha\ge1\)) \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\left(\tau - \mathbf{w}^T\mathbf{r}_t\right)^+\right)^\alpha\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

Mean - semi-variance portfolio

  • In particular, we can approximate the semi-variance as \[\begin{aligned} E\left[\left((E[R] - R)^+\right)^2\right] & \approx \frac{1}{T}\sum_{t=1}^T\left(\left(E[R] - R_t\right)^+\right)^2\\ & \approx \frac{1}{T}\sum_{t=1}^T\left(\left(\frac{1}{T}\sum_{\tau=1}^TR_\tau - R_t\right)^+\right)^2 \end{aligned}\]

  • The mean - semi-variance portfolio formulation is the convex QP problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^+\right)^2\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • As a curiosity, Markowitz’s mean-variance formulation can be similarly rewritten: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^2\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]
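
The rewriting of the variance as a sample sum can be checked numerically. The sketch below uses synthetic returns and an equally weighted portfolio (both are assumptions made only for illustration); when \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}\) are replaced by the sample estimates, the two expressions agree up to the factor \((T-1)/T\) coming from the normalization of the sample covariance matrix:

```r
set.seed(7)
T_obs <- 200
N     <- 4
X <- matrix(rnorm(T_obs * N, mean = 0.0005, sd = 0.01), T_obs, N)  # synthetic returns
w <- rep(1 / N, N)                                                  # equal weights (hypothetical)

mu_hat    <- colMeans(X)
Sigma_hat <- cov(X)
R_p       <- drop(X %*% w)   # portfolio returns w^T r_t

# sample version of the variance term in the objective
var_sum <- mean((sum(w * mu_hat) - R_p)^2)

# equals w^T Sigma_hat w up to the factor (T-1)/T
var_quad <- drop(t(w) %*% Sigma_hat %*% w) * (T_obs - 1) / T_obs
all.equal(var_sum, var_quad)

# the semi-variance keeps only the returns below the mean
semi_var <- mean(pmax(sum(w * mu_hat) - R_p, 0)^2)
semi_var <= var_sum
```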

Mean-LPM portfolio with different \(\alpha\)

  • For less risk-averse investors, we can consider the mean-LPM portfolio formulation with \(\alpha=1\), which is an LP: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^+\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]


  • For more risk-averse investors, we can consider the mean-LPM convex portfolio formulation with \(\alpha=3\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^+\right)^3\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

R session: Mean-downside risk portfolio

In Markowitz’s portfolio, the variance \(\mathbf{w}^T\mathbf{\Sigma}\mathbf{w}\) is used as the measure of risk, but it is not a good measure of risk in practice since it penalizes both the unwanted high losses and the desired low losses (or gains).

Indeed, the mean-variance portfolio framework penalizes up-side and down-side risk equally, whereas most investors don’t mind up-side risk.

To overcome the limitations of the variance as risk measure, a number of alternative risk measures have been proposed. We will now consider the Downside Risk (DR):

Let \(R\) be a random variable representing the return. The lower partial moments (LPM) is defined as \[\textsf{LPM} = \mathsf{E}\left[\left((\tau - R)^+\right)^\alpha\right],\] where \((\cdot)^+=\max(0, \cdot)\) and we will use \(\tau = \mathsf{E}\left[R\right]\) as disaster level. The parameter \(\alpha\) allows different levels of risk-aversion:

  • \(\alpha=1\) suits a neutral investor: \[\mathsf{E}\left[(E[R] - R)^+\right]\]
  • \(\alpha=2\) is more risk-averse and yields the semi-variance: \[\textsf{SV} = \mathsf{E}\left[\left((E[R] - R)^+\right)^2\right]\]
  • \(\alpha=3\) is even more risk-averse: \[\mathsf{E}\left[\left((E[R] - R)^+\right)^3\right]\]

Downside Risk portfolio:
We will use a sample approximation of the downside risk: \[\begin{aligned} E\left[\left((E[R] - R)^+\right)^\alpha\right] & \approx \frac{1}{T}\sum_{t=1}^T\left(\left(E[R] - R_t\right)^+\right)^\alpha\\ & \approx \frac{1}{T}\sum_{t=1}^T\left(\left(\frac{1}{T}\sum_{\tau=1}^TR_\tau - R_t\right)^+\right)^\alpha. \end{aligned}\]

The mean-downside risk portfolio formulation is the problem (convex for \(\alpha\ge1\)) \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^+\right)^\alpha\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0}. \end{array}\]

The mean - semi-variance (\(\alpha=2\)) portfolio formulation is the convex QP problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^+\right)^2\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0}. \end{array}\]

Interestingly, for \(\alpha=1\) the problem is an LP: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}-\lambda \frac{1}{T}\sum_{t=1}^T\left(\mathbf{w}^T\boldsymbol{\mu} - \mathbf{w}^T\mathbf{r}_t\right)^+\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0}. \end{array}\]

For \(\alpha=3\) the problem is not an LP or QP, but still convex.

We can now compute the different portfolios \(\mathbf{w}\) for different values of \(\alpha\) with the CVXR package (see https://cvxr.rbind.io/cvxr_functions/ for list of functions):

library(CVXR)

portfolioDR <- function(X, lmd = 0.5, alpha = 2) {
  T <- nrow(X)
  N <- ncol(X)
  X <- as.matrix(X)
  mu <- colMeans(X)
  w <- Variable(N)
  
  # the power alpha is applied to each term before summing, as in the formulation above
  prob <- Problem(Maximize(t(w) %*% mu - (lmd/T) * sum(pos(t(mu) %*% w - X %*% w)^alpha)),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}

w_DR_alpha1 <- portfolioDR(X_log_trn, alpha = 1)
w_DR_alpha2 <- portfolioDR(X_log_trn, alpha = 2)
w_DR_alpha3 <- portfolioDR(X_log_trn, alpha = 3)

Let us compare the performance (in-sample vs out-of-sample):

# combine portfolios
w_all <- cbind(w_all, 
               "DR-alpha-1" = w_DR_alpha1, 
               "DR-alpha-2" = w_DR_alpha2,
               "DR-alpha-3" = w_DR_alpha3)

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_trn))
R>>            Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> GMVP                  0.1873             0.1576                    1.1886
R>> Markowitz             0.2636             0.2167                    1.2165
R>> DR-alpha-1            0.2349             0.1621                    1.4490
R>> DR-alpha-2            0.2105             0.1590                    1.3235
R>> DR-alpha-3            0.1947             0.1581                    1.2316
t(table.AnnualizedReturns(ret_all_tst))
R>>            Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> GMVP                  0.1430             0.1779                    0.8038
R>> Markowitz             0.0732             0.2020                    0.3624
R>> DR-alpha-1            0.0837             0.1702                    0.4918
R>> DR-alpha-2            0.1068             0.1741                    0.6133
R>> DR-alpha-3            0.1174             0.1748                    0.6715
table.DownsideRisk(ret_all_tst)
R>>                                  GMVP Markowitz DR-alpha-1 DR-alpha-2 DR-alpha-3
R>> Semi Deviation                 0.0082    0.0090     0.0077     0.0080     0.0081
R>> Gain Deviation                 0.0066    0.0088     0.0066     0.0067     0.0066
R>> Loss Deviation                 0.0078    0.0090     0.0072     0.0075     0.0077
R>> Downside Deviation (MAR=210%)  0.0130    0.0139     0.0128     0.0129     0.0129
R>> Downside Deviation (Rf=0%)     0.0079    0.0089     0.0075     0.0077     0.0078
R>> Downside Deviation (0%)        0.0079    0.0089     0.0075     0.0077     0.0078
R>> Maximum Drawdown               0.1743    0.2038     0.1906     0.1770     0.1728
R>> Historical VaR (95%)          -0.0166   -0.0204    -0.0166    -0.0173    -0.0169
R>> Historical ES (95%)           -0.0254   -0.0281    -0.0238    -0.0249    -0.0251
R>> Modified VaR (95%)            -0.0188   -0.0201    -0.0178    -0.0183    -0.0185
R>> Modified ES (95%)             -0.0271   -0.0299    -0.0247    -0.0263    -0.0268

Let us plot the cumulative PnL over time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich6equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich6equal)

Let’s plot the drawdown:

{ chart.Drawdown(ret_all, main = "Drawdown of portfolios", 
                 legend.loc = "bottomleft", colorset = rich6equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

Indeed, the mean-variance Markowitz portfolio has the worst drawdown. As for the downside risk portfolio, we will keep the one with \(\alpha=3\).

w_all <- w_all[, ! colnames(w_all) %in% c("DR-alpha-1", "DR-alpha-2")]

Mean-CVaR portfolio

  • A portfolio formulation dealing directly with VaR and CVaR quantities is not tractable.

  • Let \(f\left(\mathbf{w},\mathbf{r}\right)\) be an arbitrary cost function, where \(\mathbf{w}\) is the optimization variable (portfolio) and \(\mathbf{r}\) denotes the random asset returns.

    • Example: \(f\left(\mathbf{w},\mathbf{r}\right)=-\mathbf{w}^{T}\mathbf{r}\).
  • Consider, for example, the maximization of the mean return subject to a CVaR risk constraint on the loss: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu}\\ \textsf{subject to} & \mathsf{CVaR}_{\alpha}\left(f\left(\mathbf{w},\mathbf{r}\right)\right)\leq c\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0} \end{array}\] where \[\mathsf{CVaR}_{\alpha}\left(f\left(\mathbf{w},\mathbf{r}\right)\right) = \textsf{E}\left[f\left(\mathbf{w},\mathbf{r}\right)\mid f\left(\mathbf{w},\mathbf{r}\right)\geq\mathsf{VaR}_{\alpha}\left(f\left(\mathbf{w},\mathbf{r}\right)\right)\right].\]

CVaR portfolio

  • Rockafellar and Uryasev (2000) first proposed to minimize the CVaR of the portfolio loss as follows: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathsf{CVaR}_{\alpha}\left(-\mathbf{w}^T\mathbf{r}\right)\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0} \end{array}\] where \[\mathsf{CVaR}_{\alpha}\left(-\mathbf{w}^T\mathbf{r}\right) = \textsf{E}\left[-\mathbf{w}^T\mathbf{r}\mid -\mathbf{w}^T\mathbf{r}\geq\mathsf{VaR}_{\alpha}\left(-\mathbf{w}^T\mathbf{r}\right)\right].\]

CVaR in convex form

  • Define the auxiliary convex function \[F_{\alpha}(\mathbf{w},\zeta)=\zeta+\frac{1}{1-\alpha}\mathsf{E}\left[-\mathbf{w}^{T}\mathbf{r}-\zeta\right]^{+},\] where \(\left[x\right]^{+}=\max\left(x,0\right)\).

  • Rockafellar and Uryasev show that

    1. \(\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})\) is a minimizer of \(F_{\alpha}(\mathbf{w},\zeta)\) w.r.t. \(\zeta\): \[\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})\in \arg\min_{\zeta}F_{\alpha}(\mathbf{w},\zeta).\]
    2. \(\mathsf{CVaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})\) equals the minimum of \(F_{\alpha}(\mathbf{w},\zeta)\) w.r.t. \(\zeta\): \[\mathsf{CVaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})= \min_{\zeta}F_{\alpha}(\mathbf{w},\zeta).\]
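
Both properties can be checked numerically on a sample. The sketch below is a minimal base-R illustration under stated assumptions: the losses are simulated, `F_alpha` is a hypothetical helper that replaces the expectation by a sample mean, and \(\alpha=0.9\) with \(T=20\) is chosen so that the tail contains exactly two observations.

```r
set.seed(1)
loss <- rnorm(20)   # simulated portfolio losses (stand-in for -w'r_t)
alpha <- 0.9
k <- 2              # number of tail samples, (1 - alpha) * T

# Sample version of the auxiliary function F_alpha(w, zeta) for a fixed w
F_alpha <- function(zeta, loss, alpha) {
  zeta + mean(pmax(loss - zeta, 0)) / (1 - alpha)
}

# F_alpha is piecewise linear and convex in zeta, so a minimum is attained
# at one of the sample points
F_vals <- sapply(loss, F_alpha, loss = loss, alpha = alpha)
cvar_min  <- min(F_vals)              # minimum of F_alpha over zeta
zeta_star <- loss[which.min(F_vals)]  # a minimizer, i.e., a sample VaR

# Direct computation: mean of the k largest losses
loss_sorted <- sort(loss, decreasing = TRUE)
cvar_tail <- mean(loss_sorted[1:k])
```

Here `cvar_min` and `cvar_tail` agree, and `zeta_star` lies between the third and second largest losses, which is consistent with the VaR being a minimizer (in the sample case the set of minimizers is an interval rather than a single point).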

Proof CVaR in convex form\(^*\)

  • The minimizer of \(F_{\alpha}(\mathbf{w},\zeta)\) w.r.t. \(\zeta\) satisfies: \(0\in\partial_{\zeta}F_{\alpha}(\mathbf{w},\zeta^{\star})\). For example, we choose the following subgradient: \[\begin{aligned} 0=s_{\zeta}F_{\alpha}(\mathbf{w},\zeta^{\star}) & =1-\frac{1}{1-\alpha}\int\mathbf{1}_{\left\{-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}\right\}}p(\mathbf{r})d\mathbf{r}\\ & =1-\frac{1}{1-\alpha}P\left(-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}\right), \end{aligned}\] where \(\mathbf{1}_{\{\cdot\}}\) is the indicator function. Solving the above equation, we have \[P\left(-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}\right)=1-\alpha \Longrightarrow \zeta^{\star}=\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r}).\]

  • First, we have \[\min_{\zeta}F_{\alpha}(\mathbf{w},\zeta)=F_{\alpha}(\mathbf{w},\zeta^{\star}) =\zeta^{\star}+\frac{1}{1-\alpha}\mathsf{E}[-\mathbf{w}^{T}\mathbf{r}-\zeta^{\star}]^{+}.\] Then, recall that
    \[\begin{aligned} \mathsf{CVaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r}) & = \mathsf{E}\left[-\mathbf{w}^{T}\mathbf{r}\big|-\mathbf{w}^{T}\mathbf{r}>\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})\right]\\ & = \frac{1}{1-\alpha}\int_{-\mathbf{w}^{T}\mathbf{r}>\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})}\left(-\mathbf{w}^{T}\mathbf{r}\right)p(\mathbf{r})d\mathbf{r}\\ & = \frac{1}{1-\alpha}\int\left[-\mathbf{w}^{T}\mathbf{r}-\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})\right]^{+}p(\mathbf{r})d\mathbf{r}\\ & \qquad +\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r}). \end{aligned}\]
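
For completeness, the step from the second to the third line can be spelled out as follows, using the shorthand \(\zeta^{\star}=\mathsf{VaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})\) and the fact that \(P\left(-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}\right)=1-\alpha\): \[\begin{aligned} \int_{-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}}\left(-\mathbf{w}^{T}\mathbf{r}\right)p(\mathbf{r})d\mathbf{r} & = \int_{-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}}\left(-\mathbf{w}^{T}\mathbf{r}-\zeta^{\star}\right)p(\mathbf{r})d\mathbf{r} + \zeta^{\star}P\left(-\mathbf{w}^{T}\mathbf{r}>\zeta^{\star}\right)\\ & = \int\left[-\mathbf{w}^{T}\mathbf{r}-\zeta^{\star}\right]^{+}p(\mathbf{r})d\mathbf{r} + (1-\alpha)\zeta^{\star}. \end{aligned}\] Dividing by \(1-\alpha\) gives \(\mathsf{CVaR}_{\alpha}(-\mathbf{w}^{T}\mathbf{r})=\zeta^{\star}+\frac{1}{1-\alpha}\mathsf{E}[-\mathbf{w}^{T}\mathbf{r}-\zeta^{\star}]^{+}=F_{\alpha}(\mathbf{w},\zeta^{\star})\), which is the claimed identity.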

CVaR in Convex Form

Corollary 1:

\[ \min_{\mathbf{w}}\mathsf{CVaR}_{\alpha}\left(-\mathbf{w}^{T}\mathbf{r}\right)=\min_{\mathbf{w},\zeta}F_{\alpha}\left(\mathbf{w},\zeta\right). \]

  • In words, minimizing \(F_{\alpha}\left(\mathbf{w},\zeta\right)\) simultaneously calculates the optimal CVaR and VaR.

Corollary 2:

Because \(-\mathbf{w}^{T}\mathbf{r}\) is convex (in fact, linear) in \(\mathbf{w}\) for each \(\mathbf{r}\), then \(F_{\alpha}\left(\mathbf{w},\zeta\right)\) is convex!

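Proof: \[F_{\alpha}\left(\mathbf{w},\zeta\right)=\zeta+\frac{1}{1-\alpha}\int\left[-\mathbf{w}^{T}\mathbf{r}-\zeta\right]^{+}p\left(\mathbf{r}\right)d\mathbf{r}.\] For each fixed \(\mathbf{r}\), the integrand \(\left[-\mathbf{w}^{T}\mathbf{r}-\zeta\right]^{+}\) is the pointwise maximum of zero and a function that is affine in \((\mathbf{w},\zeta)\), hence jointly convex; integrating against the nonnegative density \(p\left(\mathbf{r}\right)\) and adding the linear term \(\zeta\) preserves convexity.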

Sample Average Approximation of CVaR

  • Sample average approximation of \(F_{\alpha}(\mathbf{w},\zeta)\): \[\begin{aligned} F_{\alpha}(\mathbf{w},\zeta) & =\zeta+\frac{1}{1-\alpha}\mathsf{E}\left[-\mathbf{w}^T\mathbf{r}-\zeta\right]^{+}\\ & \approx\zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}\left[-\mathbf{w}^T\mathbf{r}_t-\zeta\right]^{+}. \end{aligned}\]

CVaR portfolio as an LP

  • We first include the dummy variables \(z_t\): \[z_t\geq\left[-\mathbf{w}^T\mathbf{r}_t-\zeta\right]^{+} \Longrightarrow z_t\geq -\mathbf{w}^T\mathbf{r}_t-\zeta,\,z_t\geq0\]


  • CVaR portfolio problem can be approximated by an LP: \[\begin{array}{ll} \underset{\mathbf{w}, \{z_t\}, \zeta}{\textsf{minimize}} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t}\\ \textsf{subject to} & 0\leq z_{t}\geq-\mathbf{w}^{T}\mathbf{r}_{t}-\zeta,\quad t=1,\dots,T\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

Mean-CVaR portfolio as an LP

  • We can also consider the maximization of the mean return subject to a CVaR constraint: \[\begin{array}{ll} \underset{\mathbf{w}, \{z_t\}, \zeta}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t} \le c\\ & 0\leq z_{t}\geq-\mathbf{w}^{T}\mathbf{r}_{t}-\zeta,\quad t=1,\dots,T\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • Or a mean-CVaR objective: \[\begin{array}{ll} \underset{\mathbf{w}, \{z_t\}, \zeta}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu} - \lambda\left(\zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t}\right)\\ \textsf{subject to} & 0\leq z_{t}\geq-\mathbf{w}^{T}\mathbf{r}_{t}-\zeta,\quad t=1,\dots,T\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

R session: Mean-CVaR portfolio

The CVaR of the portfolio loss can be conveniently obtained as the minimum of an auxiliary convex function (Rockafellar and Uryasev, 2000): \[\mathsf{CVaR}_{\alpha}\left(-\mathbf{w}^{T}\mathbf{r}\right)=\min_{\zeta}F_{\alpha}\left(\mathbf{w},\zeta\right)\] where \[F_{\alpha}(\mathbf{w},\zeta)=\zeta+\frac{1}{1-\alpha}\mathsf{E}\left[-\mathbf{w}^{T}\mathbf{r}-\zeta\right]^{+}.\]

We can use a sample average approximation of \(F_{\alpha}(\mathbf{w},\zeta)\): \[F_{\alpha}(\mathbf{w},\zeta) \approx \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}\left[-\mathbf{w}^T\mathbf{r}_t-\zeta\right]^{+}.\] We define the dummy variables \(z_t\): \[z_t\geq\left[-\mathbf{w}^T\mathbf{r}_t-\zeta\right]^{+} \Longrightarrow z_t\geq -\mathbf{w}^T\mathbf{r}_t-\zeta,\,z_t\geq0.\]

The mean-CVaR portfolio formulation can be finally written as the (convex) LP: \[\begin{array}{ll} \underset{\mathbf{w}, \mathbf{z}, \zeta}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu} - \lambda\left(\zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t}\right)\\ \textsf{subject to} & 0\leq z_{t}\geq-\mathbf{w}^{T}\mathbf{r}_{t}-\zeta,\quad t=1,\dots,T\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

We are now ready to compute the mean-CVaR portfolio with the CVXR package:

portfolioCVaR <- function(X, lmd = 0.5, alpha = 0.95) {
  T <- nrow(X)
  N <- ncol(X)
  X <- as.matrix(X)
  mu <- colMeans(X)
  # variables: portfolio w, dummy variables z_t, and zeta
  w <- Variable(N)
  z <- Variable(T)
  zeta <- Variable(1)
  # problem: mean return minus lmd times the sample CVaR
  prob <- Problem(Maximize(t(w) %*% mu - lmd*zeta - (lmd/(T*(1-alpha))) * sum(z)),
                  constraints = list(z >= 0, z >= -X %*% w - zeta,
                                     w >= 0, sum(w) == 1))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}

w_CVaR095 <- portfolioCVaR(X_log_trn, alpha = 0.95)
w_CVaR099 <- portfolioCVaR(X_log_trn, alpha = 0.99)

Let us compare the performance (in-sample vs out-of-sample):

# combine portfolios
w_all <- cbind(w_all, 
               "CVaR-alpha-0.95" = w_CVaR095,
               "CVaR-alpha-0.99" = w_CVaR099)

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_trn))
R>>                 Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> GMVP                       0.1873             0.1576                    1.1886
R>> Markowitz                  0.2636             0.2167                    1.2165
R>> DR-alpha-3                 0.1947             0.1581                    1.2316
R>> CVaR-alpha-0.95            0.2091             0.1706                    1.2257
R>> CVaR-alpha-0.99            0.2193             0.1652                    1.3273
t(table.AnnualizedReturns(ret_all_tst))
R>>                 Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> GMVP                       0.1430             0.1779                    0.8038
R>> Markowitz                  0.0732             0.2020                    0.3624
R>> DR-alpha-3                 0.1174             0.1748                    0.6715
R>> CVaR-alpha-0.95            0.1526             0.1689                    0.9039
R>> CVaR-alpha-0.99            0.1675             0.1667                    1.0050

The CVaR portfolio with a confidence level of 99% has the highest Sharpe ratio of the portfolios compared, both in-sample (1.3273) and out-of-sample (1.0050). However, one cannot conclude too much from these results, as more exhaustive backtests should be conducted.

We can also observe its good performance from the cumulative PnL over time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich6equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich6equal)

Let’s plot the drawdown:

{ chart.Drawdown(ret_all, main = "Drawdown of portfolios", 
                 legend.loc = "bottomleft", colorset = rich6equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

We will then keep the CVaR portfolio with \(\alpha=0.99\).

w_all <- w_all[, ! colnames(w_all) %in% c("CVaR-alpha-0.95")]

Mean-DD portfolio

Drawdown (DD)

  • Let \(\mathbf{r}(t)\) be the return vector of the \(N\) stocks at time \(t\).
  • Define the cumulative (uncompounded) return vector as \[\mathbf{r}^{\sf cum}(t) = \sum_{\tau=1}^t\mathbf{r}(\tau)\] (Note: the compounded return is \(\prod_{\tau=1}^t(\mathbf{1} + \mathbf{r}(\tau)) - \mathbf{1}\).)
  • The portfolio return is \(r_p(t) = \mathbf{w}^T\mathbf{r}(t)\) and cumulative return \[r_p^{\sf cum}(t) = \mathbf{w}^T\mathbf{r}^{\sf cum}(t)\]
  • The drawdown (DD) at time \(t\) can be written as \[D(t)=\max_{1\le\tau\le t}r_p^{\sf cum}(\tau) - r_p^{\sf cum}(t)\]

Max-DD, Ave-DD, and CDaR

  • The maximum DD (Max-DD) over a period \(t=1,\ldots,T\) is \[M(T)=\max_{1\le t\le T}D(t)\]
  • The average DD (Ave-DD) over a period \(t=1,\ldots,T\) is \[A(T)=\frac{1}{T}\sum_{1\le t\le T}D(t)\]
  • Similarly to the CVaR, we can define the Conditional Drawdown at Risk (CDaR) as the mean of the worst \(100(1-\alpha)\%\) drawdowns: \[\Delta_\alpha(T) = \frac{1}{(1-\alpha)T}\sum_{t\in\Omega_\alpha}D(t),\] where \(\Omega_\alpha = \{1\le t\le T \mid D(t)\ge \xi_\alpha\}\) with \(\xi_\alpha\) being the threshold such that exactly \(100(1-\alpha)\%\) of the drawdowns exceed it.
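
As a concrete illustration of these definitions, the base-R sketch below computes the drawdown path, Max-DD, Ave-DD, and CDaR for a toy sequence of portfolio returns. The return values and \(\alpha=0.6\) are made up so that the tail contains exactly two drawdowns; all variable names are illustrative.

```r
r_p <- c(0.01, -0.02, 0.03, -0.01, -0.02)  # toy portfolio returns r_p(t)
r_cum <- cumsum(r_p)                       # uncompounded cumulative return
dd <- cummax(r_cum) - r_cum                # drawdown D(t)

max_dd <- max(dd)    # M(T)
ave_dd <- mean(dd)   # A(T)

alpha <- 0.6
k <- round((1 - alpha) * length(dd))            # size of the tail, here 2
cdar <- mean(sort(dd, decreasing = TRUE)[1:k])  # mean of the k worst drawdowns
```

In this example the three measures are ordered as Ave-DD, CDaR, Max-DD from smallest to largest, since the CDaR averages only over the worst part of the drawdown path.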

CDaR in Convex Form

  • The CDaR can be conveniently expressed as (Chekhlov et al. 2000) \[\Delta_\alpha(\mathbf{w}) = \min_\zeta \left\{ \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}[D_t(\mathbf{w})-\zeta]^{+} \right\}\]
  • When \(\alpha\) tends to 1, the CDaR tends to the maximum drawdown, i.e., \[\Delta_1(T) = M(T)\]
  • When \(\alpha\) tends to 0, the CDaR tends to the average drawdown, i.e., \[\Delta_0(T) = A(T)\]
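
The convex representation and its two limiting cases can be checked on a small example. In this base-R sketch, `cdar_convex` is a hypothetical helper that evaluates the minimization over \(\zeta\) by scanning the sample drawdowns (the objective is piecewise linear and convex in \(\zeta\), so a minimum is attained at one of them); the drawdown values are made up.

```r
dd <- c(0, 0.02, 0, 0.01, 0.03)  # toy drawdown path D(t), T = 5

# CDaR via the convex form: minimize over zeta by scanning candidate values
cdar_convex <- function(dd, alpha) {
  obj <- function(zeta) zeta + mean(pmax(dd - zeta, 0)) / (1 - alpha)
  min(sapply(dd, obj))
}

cdar_06 <- cdar_convex(dd, alpha = 0.6)  # mean of the 2 worst drawdowns
cdar_08 <- cdar_convex(dd, alpha = 0.8)  # tail of one sample: the maximum
cdar_00 <- cdar_convex(dd, alpha = 0)    # whole sample: the average
```

With only five samples, \(\alpha=0.8\) already leaves a single drawdown in the tail, so the CDaR coincides with the Max-DD; with \(\alpha=0\) it coincides with the Ave-DD.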

Mean - Max-DD portfolio as an LP

  • We can consider the maximization of the mean return subject to a Max-DD constraint: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \max_{1\le t\le T} \{\max_{1\le\tau\le t}\mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} - \mathbf{w}^T\mathbf{r}_t^{\sf cum}\} \le c\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]


  • Removing one maximum operator is trivial: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \max_{1\le\tau\le t}\mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} - \mathbf{w}^T\mathbf{r}_t^{\sf cum} \le c, \quad\forall 1\le t\le T\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • To remove the other max operator, we need to introduce some additional variables: \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum} \le c, \quad\forall 1\le t\le T\\ & u_t \ge \mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} \qquad\quad \forall 1\le t\le T, 1\le\tau\le t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • We can reduce the large number of constraints by rewriting it as \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum} \le c, \quad\forall 1\le t\le T\\ & u_t \ge \mathbf{w}^T\mathbf{r}_{t}^{\sf cum}\\ & u_t \ge u_{t-1}\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • We can finally write the maximization of the mean return subject to the Max-DD constraint as \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \mathbf{w}^T\mathbf{r}_{t}^{\sf cum} \le u_t \le \mathbf{w}^T\mathbf{r}_t^{\sf cum} + c, \quad\forall 1\le t\le T\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]
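
The role of the auxiliary variables can be seen numerically: the smallest \(u_t\) satisfying \(u_t \ge \mathbf{w}^T\mathbf{r}_t^{\sf cum}\) and \(u_t \ge u_{t-1}\) is the running maximum of the cumulative portfolio return, so \(u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum}\) is never smaller than the drawdown. A minimal base-R sketch with made-up cumulative returns (variable names are illustrative):

```r
x_cum <- c(0.01, -0.01, 0.02, 0.01, -0.01)  # toy values of w' r_t^cum

# Smallest u with u[t] >= x_cum[t] and u[t] >= u[t-1]
u <- numeric(length(x_cum))
u[1] <- x_cum[1]
for (t in 2:length(x_cum)) {
  u[t] <- max(u[t - 1], x_cum[t])
}

dd <- u - x_cum               # drawdown path implied by the tightest u
running_max <- cummax(x_cum)  # direct computation of the running maximum
```

Any other feasible \(u_t\) is larger, so bounding \(u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum}\) by \(c\) for all \(t\) bounds the maximum drawdown by \(c\) while using a number of constraints that grows only linearly in \(T\).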

R session: Mean - Max-DD portfolio

We can formulate the maximization of the expected return subject to a Max-DD constraint as \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \max_{1\le t\le T} \{\max_{1\le\tau\le t}\mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} - \mathbf{w}^T\mathbf{r}_t^{\sf cum}\} \le c\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}, \end{array}\] which can be more conveniently rewritten as the following LP: \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \mathbf{w}^T\mathbf{r}_{t}^{\sf cum} \le u_t \le \mathbf{w}^T\mathbf{r}_t^{\sf cum} + c, \quad\forall 1\le t\le T\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

portfolioMaxDD <- function(X, c = 0.2) {
  T <- nrow(X)
  N <- ncol(X)
  X <- as.matrix(X)
  X_cum <- apply(X, MARGIN = 2, FUN = cumsum)
  mu <- colMeans(X)
  # variables
  w <- Variable(N)
  u <- Variable(T)
  # problem
  prob <- Problem(Maximize(t(w) %*% mu),
                  constraints = list(w >= 0, sum(w) == 1,
                                     u <= X_cum %*% w + c,
                                     u >= X_cum %*% w,
                                     u[-1] >= u[-T]))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}

w_MaxDD_c018 <- portfolioMaxDD(X_log_trn, c = 0.18)
w_MaxDD_c021 <- portfolioMaxDD(X_log_trn, c = 0.21)
w_MaxDD_c024 <- portfolioMaxDD(X_log_trn, c = 0.24)

Let us compare the performance (in-sample vs out-of-sample):

# combine portfolios
w_all <- cbind(w_all, 
               "Max-DD-c-018" = w_MaxDD_c018, 
               "Max-DD-c-021" = w_MaxDD_c021, 
               "Max-DD-c-024" = w_MaxDD_c024)

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_tst)[3, ])
R>>                 Annualized Sharpe (Rf=0%)
R>> GMVP                               0.8038
R>> Markowitz                          0.3624
R>> DR-alpha-3                         0.6715
R>> CVaR-alpha-0.99                    1.0050
R>> Max-DD-c-018                       0.4986
R>> Max-DD-c-021                       0.3158
R>> Max-DD-c-024                       0.2197
t(maxDrawdown(ret_all_trn))
R>>                 Worst Drawdown
R>> GMVP                 0.1896967
R>> Markowitz            0.1822648
R>> DR-alpha-3           0.1872492
R>> CVaR-alpha-0.99      0.1698014
R>> Max-DD-c-018         0.1653578
R>> Max-DD-c-021         0.1949750
R>> Max-DD-c-024         0.2287675
t(maxDrawdown(ret_all_tst))
R>>                 Worst Drawdown
R>> GMVP                 0.1743464
R>> Markowitz            0.2038456
R>> DR-alpha-3           0.1727884
R>> CVaR-alpha-0.99      0.1613238
R>> Max-DD-c-018         0.1897650
R>> Max-DD-c-021         0.2103964
R>> Max-DD-c-024         0.2324404

We can see that the Max-DD designs indeed keep the maximum drawdown under control in-sample; out-of-sample, however, this is not maintained (e.g., the design with \(c=0.18\) reaches a drawdown of about 0.19). This is probably due to not having enough training samples. Also, their out-of-sample Sharpe ratios (between 0.22 and 0.50) are lower than that of the GMVP (0.80).

Let us plot the cumulative PnL over time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)

Let’s plot the drawdown:

{ chart.Drawdown(ret_all, main = "Drawdown of portfolios", 
                 legend.loc = "bottomleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

Indeed, the Max-DD portfolio with \(c=0.18\) has the lowest drawdown (in the in-sample period). In terms of Sharpe ratio, however, it is not that good. We will then keep the Max-DD portfolio with \(c=0.18\):

w_all <- w_all[, ! colnames(w_all) %in% c("Max-DD-c-021", "Max-DD-c-024")]

Mean - Ave-DD portfolio as an LP

  • Similarly, we can consider the maximization of the mean return subject to an Ave-DD constraint: \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \frac{1}{T}\sum_{t=1}^T (u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum}) \le c\\ & u_t \ge \mathbf{w}^T\mathbf{r}_{t}^{\sf cum}\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0} \end{array}\] or, equivalently, \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \frac{1}{T}\sum_{t=1}^T u_t \le \frac{1}{T}\sum_{t=1}^T\mathbf{w}^T\mathbf{r}_t^{\sf cum} + c\\ & \mathbf{w}^T\mathbf{r}_{t}^{\sf cum} \le u_t\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

R session: Mean - Ave-DD portfolio

We can formulate the maximization of the expected return subject to an Ave-DD constraint as the following LP: \[\begin{array}{ll} \underset{\mathbf{w}, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \frac{1}{T}\sum_{t=1}^T u_t \le \frac{1}{T}\sum_{t=1}^T\mathbf{w}^T\mathbf{r}_t^{\sf cum} + c\\ & \mathbf{w}^T\mathbf{r}_{t}^{\sf cum} \le u_t\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

portfolioAveDD <- function(X, c = 0.2) {
  T <- nrow(X)
  N <- ncol(X)
  X <- as.matrix(X)
  X_cum <- apply(X, MARGIN = 2, FUN = cumsum)
  mu <- colMeans(X)
  # variables
  w <- Variable(N)
  u <- Variable(T)
  # problem
  prob <- Problem(Maximize(t(w) %*% mu),
                  constraints = list(w >= 0, sum(w) == 1,
                                     mean(u) <= mean(X_cum %*% w) + c,
                                     u >= X_cum %*% w,
                                     u[-1] >= u[-T]))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}

w_AveDD_c004 <- portfolioAveDD(X_log_trn, c = 0.04)
w_AveDD_c006 <- portfolioAveDD(X_log_trn, c = 0.06)
w_AveDD_c008 <- portfolioAveDD(X_log_trn, c = 0.08)

Let us compare the performance (in-sample vs out-of-sample):

# combine portfolios
w_all <- cbind(w_all, 
               "Ave-DD-c-004" = w_AveDD_c004, 
               "Ave-DD-c-006" = w_AveDD_c006, 
               "Ave-DD-c-008" = w_AveDD_c008)

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_tst)[3, ])
R>>                 Annualized Sharpe (Rf=0%)
R>> GMVP                               0.8038
R>> Markowitz                          0.3624
R>> DR-alpha-3                         0.6715
R>> CVaR-alpha-0.99                    1.0050
R>> Max-DD-c-018                       0.4986
R>> Ave-DD-c-004                       0.4205
R>> Ave-DD-c-006                       0.2010
R>> Ave-DD-c-008                       0.1572
t(maxDrawdown(ret_all_tst))
R>>                 Worst Drawdown
R>> GMVP                 0.1743464
R>> Markowitz            0.2038456
R>> DR-alpha-3           0.1727884
R>> CVaR-alpha-0.99      0.1613238
R>> Max-DD-c-018         0.1897650
R>> Ave-DD-c-004         0.1964124
R>> Ave-DD-c-006         0.2380185
R>> Ave-DD-c-008         0.2512568
t(AverageDrawdown(ret_all_trn))
R>>                 Average Drawdown
R>> GMVP                  0.02540231
R>> Markowitz             0.03040071
R>> DR-alpha-3            0.02511744
R>> CVaR-alpha-0.99       0.02495849
R>> Max-DD-c-018          0.02762046
R>> Ave-DD-c-004          0.02754424
R>> Ave-DD-c-006          0.03593749
R>> Ave-DD-c-008          0.03581146
t(AverageDrawdown(ret_all_tst))
R>>                 Average Drawdown
R>> GMVP                  0.02318766
R>> Markowitz             0.04120036
R>> DR-alpha-3            0.02306279
R>> CVaR-alpha-0.99       0.02331892
R>> Max-DD-c-018          0.02556956
R>> Ave-DD-c-004          0.03283453
R>> Ave-DD-c-006          0.06894089
R>> Ave-DD-c-008          0.07448326

The Ave-DD designs have a controlled average drawdown in-sample (below the respective values of \(c\)). In terms of Sharpe ratio, however, they are not especially outstanding.

Let us plot the cumulative PnL over time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)

Now the drawdown:

{ chart.Drawdown(ret_all, main = "Drawdown of portfolios", 
                 legend.loc = "bottomleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

Among the Ave-DD designs, the one with \(c=0.04\) seems to have the best performance and we keep it for future comparisons:

w_all <- w_all[, ! colnames(w_all) %in% c("Ave-DD-c-006", "Ave-DD-c-008")]

Mean-CDaR portfolio as an LP

  • Finally, we can consider the maximization of the mean return subject to a CDaR constraint: \[\begin{array}{ll} \underset{\mathbf{w}, \zeta}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}\left[ \max_{1\le\tau\le t}\mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} - \mathbf{w}^T\mathbf{r}_t^{\sf cum} - \zeta \right]^+ \le c\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • Similarly to the CVaR case, we can get rid of the \([\cdot]^+\) operator by introducing some additional variables: \[\begin{array}{ll} \underset{\mathbf{w}, \{z_t\}, \zeta}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t} \le c\\ & 0\leq z_{t}\geq \max_{1\le\tau\le t}\mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} - \mathbf{w}^T\mathbf{r}_t^{\sf cum} - \zeta, \quad t=1,\dots,T\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

  • Similarly to the Max-DD and Ave-DD cases, we can get rid of the max operators by introducing additional variables: \[\begin{array}{ll} \underset{\mathbf{w}, \{z_t\}, \zeta, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t} \le c\\ & 0\leq z_{t}\geq u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum} - \zeta, \quad t=1,\dots,T\\ & \mathbf{w}^T\mathbf{r}_{t}^{\sf cum} \le u_t\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

R session: Mean-CDaR portfolio

We now consider the maximization of the mean return subject to a CDaR constraint: \[\begin{array}{ll} \underset{\mathbf{w}, \zeta}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}\left[ \max_{1\le\tau\le t}\mathbf{w}^T\mathbf{r}_{\tau}^{\sf cum} - \mathbf{w}^T\mathbf{r}_t^{\sf cum} - \zeta \right]^+ \le c\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}, \end{array}\] which can be conveniently reformulated as the following LP: \[\begin{array}{ll} \underset{\mathbf{w}, \{z_t\}, \zeta, \{u_t\}}{\textsf{maximize}} & \mathbf{w}^T\boldsymbol{\mu}\\ \textsf{subject to} & \zeta+\frac{1}{1-\alpha}\frac{1}{T}\sum_{t=1}^{T}z_{t} \le c\\ & 0\leq z_{t}\geq u_t - \mathbf{w}^T\mathbf{r}_t^{\sf cum} - \zeta, \quad t=1,\dots,T\\ & \mathbf{w}^T\mathbf{r}_{t}^{\sf cum} \le u_t\\ & u_{t-1} \le u_t\\ & \mathbf{1}^T\mathbf{w}=1,\quad \mathbf{w}\ge\mathbf{0}. \end{array}\]

portfolioCDaR <- function(X, c = 0.1, alpha = 0.95) {
  T <- nrow(X)
  N <- ncol(X)
  X <- as.matrix(X)
  X_cum <- apply(X, MARGIN = 2, FUN = cumsum)
  mu <- colMeans(X)
  # variables
  w <- Variable(N)
  z <- Variable(T)
  zeta <- Variable(1)
  u <- Variable(T)
  # problem
  prob <- Problem(Maximize(t(w) %*% mu),
                  constraints = list(w >= 0, sum(w) == 1,
                                     zeta + (1/(T*(1-alpha))) * sum(z) <= c,
                                     z >= 0, z >= u - X_cum %*% w - zeta,
                                     u >= X_cum %*% w,
                                     u[-1] >= u[-T]))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}

w_CDaR095_c014 <- portfolioCDaR(X_log_trn, c = 0.14, alpha = 0.95)
w_CDaR095_c016 <- portfolioCDaR(X_log_trn, c = 0.16, alpha = 0.95)
w_CDaR099_c017 <- portfolioCDaR(X_log_trn, c = 0.17, alpha = 0.99)
w_CDaR099_c019 <- portfolioCDaR(X_log_trn, c = 0.19, alpha = 0.99)

Let us compare the performance (in-sample vs out-of-sample):

# combine portfolios
w_all <- cbind(w_all, 
               "CDaR095-c-014" = w_CDaR095_c014, 
               "CDaR095-c-016" = w_CDaR095_c016, 
               "CDaR099-c-017" = w_CDaR099_c017, 
               "CDaR099-c-019" = w_CDaR099_c019)

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_tst)[3, ])
R>>                 Annualized Sharpe (Rf=0%)
R>> GMVP                               0.8038
R>> Markowitz                          0.3624
R>> DR-alpha-3                         0.6715
R>> CVaR-alpha-0.99                    1.0050
R>> Max-DD-c-018                       0.4986
R>> Ave-DD-c-004                       0.4205
R>> CDaR095-c-014                      0.4081
R>> CDaR095-c-016                      0.3602
R>> CDaR099-c-017                      0.4388
R>> CDaR099-c-019                      0.3474
t(maxDrawdown(ret_all_tst))
R>>                 Worst Drawdown
R>> GMVP                 0.1743464
R>> Markowitz            0.2038456
R>> DR-alpha-3           0.1727884
R>> CVaR-alpha-0.99      0.1613238
R>> Max-DD-c-018         0.1897650
R>> Ave-DD-c-004         0.1964124
R>> CDaR095-c-014        0.1990953
R>> CDaR095-c-016        0.2049023
R>> CDaR099-c-017        0.1949388
R>> CDaR099-c-019        0.2059716
t(AverageDrawdown(ret_all_tst))
R>>                 Average Drawdown
R>> GMVP                  0.02318766
R>> Markowitz             0.04120036
R>> DR-alpha-3            0.02306279
R>> CVaR-alpha-0.99       0.02331892
R>> Max-DD-c-018          0.02556956
R>> Ave-DD-c-004          0.03283453
R>> CDaR095-c-014         0.02938946
R>> CDaR095-c-016         0.03329481
R>> CDaR099-c-017         0.02674348
R>> CDaR099-c-019         0.04738348
t(CDD(ret_all_trn))
R>>                 Conditional Drawdown 5%
R>> GMVP                         0.06945021
R>> Markowitz                    0.07781441
R>> DR-alpha-3                   0.06997487
R>> CVaR-alpha-0.99              0.05764723
R>> Max-DD-c-018                 0.07902538
R>> Ave-DD-c-004                 0.07229477
R>> CDaR095-c-014                0.07213897
R>> CDaR095-c-016                0.07727614
R>> CDaR099-c-017                0.06947492
R>> CDaR099-c-019                0.07887087
t(CDD(ret_all_tst))
R>>                 Conditional Drawdown 5%
R>> GMVP                         0.05096778
R>> Markowitz                    0.11096592
R>> DR-alpha-3                   0.03752729
R>> CVaR-alpha-0.99              0.05553781
R>> Max-DD-c-018                 0.05078352
R>> Ave-DD-c-004                 0.08229586
R>> CDaR095-c-014                0.06492434
R>> CDaR095-c-016                0.08457282
R>> CDaR099-c-017                0.05544522
R>> CDaR099-c-019                0.12630616
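
To see what these drawdown statistics measure, here is a minimal base-R sketch that builds the drawdown path from a vector of linear returns and reads off the worst drawdown. It is not the PerformanceAnalytics implementation; the helper names `drawdown_path` and `max_drawdown` and the toy return vector are hypothetical and introduced only for illustration.

```r
# Drawdown path of a series of linear returns (hypothetical helper):
# relative drop of the wealth curve from its running maximum.
drawdown_path <- function(r) {
  wealth <- cumprod(1 + r)             # compounded wealth, starting from 1
  peak   <- cummax(c(1, wealth))[-1]   # running maximum, including the initial wealth of 1
  1 - wealth / peak                    # 0 at a new high, positive while under water
}

# Worst (maximum) drawdown over the whole period
max_drawdown <- function(r) max(drawdown_path(r))

# Toy returns (made up for illustration)
r_toy <- c(0.10, -0.05, -0.10, 0.02, 0.20)
dd <- drawdown_path(r_toy)
max_drawdown(r_toy)
```

In this toy series the wealth peaks after the first period and bottoms out after the third, so the worst drawdown is the relative drop between those two points.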

Let us plot the cumulative PnL over time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)

The drawdown:

{ chart.Drawdown(ret_all, main = "Drawdown of portfolios", 
                 legend.loc = "bottomleft", colorset = rich10equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

Among the CDaR portfolios, the one with \(\alpha=0.95\) and \(c=0.14\) seems to have the best performance, so we keep it for future comparisons:

w_all <- w_all[, ! colnames(w_all) %in% c("CDaR095-c-016", "CDaR099-c-017", "CDaR099-c-019")]

Comparison of portfolios: dollar allocation

Comparison of portfolios: cumulative P&L

Comparison of portfolios: drawdown

R session: Final comparison of DR, CVaR, and DD portfolios

We now perform a final comparison of the selected portfolios:

  • GMVP
  • Markowitz’s mean variance portfolio
  • DR (\(\alpha=3\))
  • CVaR (\(\alpha=0.99\))
  • DD:
    • Max-DD (\(c=0.18\))
    • Ave-DD (\(c=0.04\))
    • CDaR095 (\(c=0.14\))

However, this is a single backtest and more exhaustive backtesting should be done (the R package portfolioBacktest is very convenient for this).
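
As a rough illustration of what more exhaustive backtesting could look like, the sketch below repeats a train/test split over several randomly chosen windows and collects the out-of-sample Sharpe ratio each time. Everything in it is an assumption made for illustration: the synthetic Gaussian returns, the helper names `gmvp` and `one_backtest`, the window lengths, and the closed-form GMVP that allows short positions. It does not use the portfolioBacktest package.

```r
set.seed(42)

# Synthetic linear returns: T_all days, N assets (illustrative data only)
T_all <- 1000; N <- 5
X <- matrix(rnorm(T_all * N, mean = 5e-4, sd = 0.01), T_all, N)

# Closed-form GMVP (shorting allowed): w proportional to Sigma^{-1} 1
gmvp <- function(X_trn) {
  w <- solve(cov(X_trn), rep(1, ncol(X_trn)))
  w / sum(w)
}

# One backtest: train on T_trn days starting at t0, test on the next T_tst days
one_backtest <- function(t0, T_trn = 250, T_tst = 125) {
  trn <- X[t0:(t0 + T_trn - 1), ]
  tst <- X[(t0 + T_trn):(t0 + T_trn + T_tst - 1), ]
  ret <- as.vector(tst %*% gmvp(trn))
  sqrt(252) * mean(ret) / sd(ret)   # simple annualized Sharpe (arithmetic mean)
}

# Repeat over 50 random starting points and summarize the distribution
starts  <- sample(1:(T_all - 375 + 1), 50)
sharpes <- sapply(starts, one_backtest)
summary(sharpes)
```

Looking at the spread of `sharpes` rather than a single number gives a better sense of how much a single-backtest ranking can be trusted.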

# recompute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# final performance
t(table.AnnualizedReturns(ret_all_tst)[3, ])
R>>                 Annualized Sharpe (Rf=0%)
R>> GMVP                               0.8038
R>> Markowitz                          0.3624
R>> DR-alpha-3                         0.6715
R>> CVaR-alpha-0.99                    1.0050
R>> Max-DD-c-018                       0.4986
R>> Ave-DD-c-004                       0.4205
R>> CDaR095-c-014                      0.4081
t(maxDrawdown(ret_all_tst))
R>>                 Worst Drawdown
R>> GMVP                 0.1743464
R>> Markowitz            0.2038456
R>> DR-alpha-3           0.1727884
R>> CVaR-alpha-0.99      0.1613238
R>> Max-DD-c-018         0.1897650
R>> Ave-DD-c-004         0.1964124
R>> CDaR095-c-014        0.1990953
t(AverageDrawdown(ret_all_tst))
R>>                 Average Drawdown
R>> GMVP                  0.02318766
R>> Markowitz             0.04120036
R>> DR-alpha-3            0.02306279
R>> CVaR-alpha-0.99       0.02331892
R>> Max-DD-c-018          0.02556956
R>> Ave-DD-c-004          0.03283453
R>> CDaR095-c-014         0.02938946
t(CDD(ret_all_tst))
R>>                 Conditional Drawdown 5%
R>> GMVP                         0.05096778
R>> Markowitz                    0.11096592
R>> DR-alpha-3                   0.03752729
R>> CVaR-alpha-0.99              0.05553781
R>> Max-DD-c-018                 0.05078352
R>> Ave-DD-c-004                 0.08229586
R>> CDaR095-c-014                0.06492434

We can now compare the allocations of the portfolios:

barplot(t(w_all), col = rainbow8equal[1:7], legend = colnames(w_all), beside = TRUE,
        main = "Portfolio allocation", xlab = "stocks", ylab = "dollars")

Let us plot the cumulative PnL over time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)

The drawdown:

chart.Drawdown(ret_all_tst, main = "Drawdown of portfolios (out-of-sample)", 
               legend.loc = "bottomleft", colorset = rich8equal)

Word of caution on DD

  • The maximum drawdown is extremely sensitive to minute changes in the portfolio weights and to the specific time period examined.

  • If the returns are close to normally distributed, the distribution of drawdowns is just a function of the variance, so there’s no need to include drawdowns explicitly in your portfolio construction objective. Minimizing variance is the same as minimizing expected drawdowns.

  • On the other hand, if returns are very non-normal and you want to find a portfolio that minimizes the expected drawdowns, you still wouldn’t choose weights that minimize historical drawdown. Why?

  • Because minimizing historical drawdown is effectively the same as taking all your returns that weren’t part of a drawdown, and hiding them from your optimizer, which will lead to portfolio weights that are a lot less accurately estimated than if you let your optimizer see all the data you have.

  • Instead, you might just include terms in your optimization objective that penalize negative skew and penalize positive kurtosis (Boudt et al. 2019).
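
The point about normally distributed returns can be checked with a small Monte Carlo experiment. The sketch below simulates i.i.d. zero-mean Gaussian returns at two volatility levels and compares the average maximum drawdown. The sample sizes, the volatility values, and the helpers `max_dd` and `avg_max_dd` are assumptions chosen for illustration.

```r
set.seed(1)

# Maximum drawdown of a vector of linear returns (hypothetical helper)
max_dd <- function(r) {
  wealth <- cumprod(1 + r)
  max(1 - wealth / cummax(c(1, wealth))[-1])
}

# Average maximum drawdown over n_sim Gaussian paths of length T_obs
avg_max_dd <- function(sigma, n_sim = 500, T_obs = 250) {
  mean(replicate(n_sim, max_dd(rnorm(T_obs, mean = 0, sd = sigma))))
}

dd_low  <- avg_max_dd(sigma = 0.01)
dd_high <- avg_max_dd(sigma = 0.02)
c(low_vol = dd_low, high_vol = dd_high)
```

Under these assumptions the higher-volatility paths show larger drawdowns on average, which is consistent with the drawdown being driven by the variance when returns are Gaussian.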

Conclusions

  • Markowitz’s mean-variance portfolio, although it started the field of modern portfolio theory in 1952, has not been embraced by practitioners, among other reasons because

    • variance (or volatility) is not a good measure of risk.
  • Alternative measures of risk exist such as the downside risk (e.g., semi-variance), VaR, CVaR, and drawdown.

  • We have formulated alternative convex portfolio formulations:

    • mean - downside risk portfolio with \(\alpha=1\), which is an LP;
    • mean - downside risk portfolio with \(\alpha=2\), which is a QP;
    • mean - downside risk portfolio with \(\alpha=3\), which is a convex problem;
    • mean - CVaR portfolio, which is an LP;
    • mean - drawdown portfolios (Max-DD, Ave-DD, CDaR), which are LPs.
  • The mean - CVaR portfolio seems to perform well, but more exhaustive backtests need to be conducted.

  • As a note of caution, one has to be careful with CVaR and CDaR portfolios because they are sensitive to the data used (with too little data, they will not be reliable).
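
To make the data-sensitivity point concrete, the sketch below computes a plain historical VaR and CVaR and counts how many observations actually enter the CVaR average. The synthetic Student-t returns, the sample size, and the helper `hist_cvar` are assumptions for illustration; this is a simple empirical estimator, not the LP formulation used for the portfolio design.

```r
set.seed(7)

# Historical VaR/CVaR of the losses at confidence level alpha (hypothetical helper)
hist_cvar <- function(r, alpha = 0.99) {
  loss <- -r                                        # losses are negative returns
  VaR  <- unname(quantile(loss, probs = alpha))     # empirical alpha-quantile of the loss
  tail_losses <- loss[loss >= VaR]                  # observations at or beyond the VaR
  list(VaR = VaR, CVaR = mean(tail_losses), n_tail = length(tail_losses))
}

# About two years of synthetic daily returns with heavy tails (illustrative only)
r_syn <- 0.01 * rt(500, df = 4)
res <- hist_cvar(r_syn, alpha = 0.99)
res$n_tail    # number of observations that determine the CVaR estimate
```

With 500 observations and \(\alpha=0.99\), only a handful of samples (about 1% of the data) determine the estimate, so adding or removing a single extreme day can move it noticeably.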

References

Boudt, K., Cornilly, D., Van-Holle, F., & Willems, J. (2019). Algorithmic portfolio tilting to harvest higher moment gains. SSRN: https://ssrn.com/abstract=3378491.

Chekhlov, A., Uryasev, S., & Zabarankin, M. (2000). Portfolio optimization with drawdown constraints. SSRN: https://ssrn.com/abstract=223323.

Chopra, V., & Ziemba, W. (1993). The effect of errors in means, variances and covariances on optimal portfolio choice. Journal of Portfolio Management.

Feng, Y., & Palomar, D. P. (2016). A Signal Processing Perspective on Financial Engineering. Foundations and Trends in Signal Processing, Now Publishers.

Markowitz, H. (1952). Portfolio selection. J. Financ., 7(1), 77–91.

Markowitz, H. (1959). Portfolio selection: Efficient diversification of investments. Wiley.

McNeil, A. J., Frey, R., & Embrechts, P. (2005). Quantitative risk management: Concepts, techniques and tools. Princeton University Press.

Meucci, A. (2005). Risk and asset allocation. Springer.

Rockafellar, R. T., & Uryasev, S. (2000). Optimization of conditional value-at-risk. J. Risk, 2, 21–42.