Portfolio Optimization

MAFS5310 - Portfolio Optimization with R
MSc in Financial Mathematics
The Hong Kong University of Science and Technology (HKUST)
Fall 2020-21

Outline

Primer on Financial Data
Modeling the Returns
Portfolio Basics
Heuristic Portfolios
Markowitz’s Modern Portfolio Theory (MPT)
- Mean-variance portfolio (MVP)
- Global minimum variance portfolio (GMVP)
- Maximum Sharpe ratio portfolio (MSRP)
Risk-Based Portfolios (GMVP, IVP, RPP, MDP, MDCP)
Comparison of Portfolios
Conclusions

Primer on Financial Data

Asset log-prices

Let \(p_{t}\) be the price of an asset at (discrete) time index \(t\).
The fundamental model is based on modeling the log-prices \(y_{t}\triangleq\log p_{t}\) as a random walk: \[y_{t}=\mu+y_{t-1}+\epsilon_{t}\]
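
As a minimal sketch of the random-walk model above, the following R snippet simulates log-prices under the additional assumption that the innovations \(\epsilon_t\) are i.i.d. Gaussian. The drift, volatility, initial price, and the name `T_obs` are hypothetical choices for illustration only.

```r
set.seed(42)
T_obs <- 1000       # number of periods (named T_obs to avoid masking T, i.e., TRUE)
mu    <- 0.0005     # drift per period (hypothetical value)
sigma <- 0.01       # standard deviation of the innovations (hypothetical value)

eps <- rnorm(T_obs, mean = 0, sd = sigma)   # Gaussian innovations (assumption)
y0  <- log(100)                             # initial log-price (hypothetical value)
y   <- y0 + cumsum(mu + eps)                # y_t = mu + y_{t-1} + eps_t
p   <- exp(y)                               # prices recovered from the log-prices
```

Plotting the result with `plot(p, type = "l")` gives one possible price path; a different seed gives a different path.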

Asset returns

For stocks, returns are used for the modeling since they are “stationary” (as opposed to the previous random walk).
Simple return (a.k.a. linear or net return) is \[R_{t} \triangleq\frac{p_{t}-p_{t-1}}{p_{t-1}}=\frac{p_{t}}{p_{t-1}}-1.\]
Log-return (a.k.a. continuously compounded return) is \[r_{t} \triangleq y_{t}-y_{t-1}=\log\frac{p_{t}}{p_{t-1}}=\log\left(1+R_{t}\right).\]
Observe that the log-return is “stationary”: \[r_{t}=y_{t}-y_{t-1}=\mu+\epsilon_{t}\]
Note that \(r_{t}\approx R_{t}\) when \(R_{t}\) is small (i.e., the changes in \(p_t\) are small) (Ruppert 2010).
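
The two return definitions can be compared on a toy example. The sketch below uses a made-up price vector (the numbers are hypothetical) and checks the identity \(r_t=\log(1+R_t)\) as well as the closeness of both returns when the price changes are small.

```r
# toy price series (hypothetical numbers, for illustration only)
p <- c(100, 101, 99.5, 102, 101.2)

R_lin <- diff(p) / head(p, -1)   # simple returns: R_t = p_t / p_{t-1} - 1
r_log <- diff(log(p))            # log-returns:    r_t = log(p_t / p_{t-1})

# the two definitions are linked by r_t = log(1 + R_t)
identity_holds <- isTRUE(all.equal(r_log, log(1 + R_lin)))

# for small price changes the two returns are close
max_gap <- max(abs(r_log - R_lin))
```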

S&P 500 index - Log-returns

Autocorrelation

ACF of S&P 500 log-returns:

Autocorrelation

ACF of S&P 500 absolute value of log-returns:

Non-Gaussianity and asymmetry

Histograms of S&P 500 log-returns:

Heavy-tailedness

QQ plots of S&P 500 log-returns:

Volatility clustering

S&P 500 log-returns:

Volatility clustering removed

Standardized S&P 500 log-returns:

Conditional heavy-tailedness

QQ plots of standardized S&P 500 log-returns (conditional heavy-tailedness aka aggregational Gaussianity):

Frequency of data

Low frequency (weekly, monthly): Gaussian distributions seem to fit reality after correcting for volatility clustering (except for the asymmetry), but the nonstationarity is a big issue

Medium frequency (daily): definitely heavy tails even after correcting for volatility clustering, as well as asymmetry

High frequency (intraday, 30min, 5min, tick-data): below 5min the microstructure noise starts to reveal itself

Modeling the Returns

Returns of the universe

In practice, we don’t just deal with one asset but with a whole universe of \(N\) assets.
We denote the log-returns of the \(N\) assets at time \(t\) with the vector \(\mathbf{r}_{t}\in\mathbb{R}^{N}\).
The time index \(t\) can denote any arbitrary period such as days, weeks, months, 5-min intervals, etc.
\(\mathcal{F}_{t-1}\) denotes the previous historical data.
Econometrics aims at modeling \(\mathbf{r}_{t}\) conditional on \(\mathcal{F}_{t-1}\).
\(\mathbf{r}_{t}\) is a multivariate stochastic process with conditional mean and covariance matrix denoted as (Feng and Palomar 2016) \[\begin{aligned} \boldsymbol{\mu}_{t} &\triangleq\textsf{E}\left[\mathbf{r}_{t}\mid\mathcal{F}_{t-1}\right]\\ \boldsymbol{\Sigma}_{t} &\triangleq\textsf{Cov}\left[\mathbf{r}_{t}\mid\mathcal{F}_{t-1}\right]=\textsf{E}\left[(\mathbf{r}_{t}-\boldsymbol{\mu}_{t})(\mathbf{r}_{t}-\boldsymbol{\mu}_{t})^{T}\mid\mathcal{F}_{t-1}\right]. \end{aligned}\]

i.i.d. model

For simplicity we will assume that \(\mathbf{r}_{t}\) follows an i.i.d. distribution (which is not very inaccurate in general).

That is, both the conditional mean and conditional covariance are constant: \[\begin{aligned} \boldsymbol{\mu}_{t} &= \boldsymbol{\mu},\\ \boldsymbol{\Sigma}_{t} &= \boldsymbol{\Sigma}. \end{aligned}\]

This is a very simple model; however, it is one of the most fundamental assumptions behind many important works, e.g., Markowitz’s Nobel prize-winning portfolio theory (Markowitz 1952).

Factor models

Factor models are special cases of the i.i.d. model with the covariance matrix being decomposed into two parts: low dimensional factors and marginal noise.
The factor model is \[\mathbf{r}_{t}=\boldsymbol{\alpha}+\mathbf{B}\mathbf{f}_{t}+\mathbf{w}_{t},\] where
- \(\boldsymbol{\alpha}\) denotes a constant vector
- \(\mathbf{f}_{t}\in\mathbb{R}^{K}\) with \(K\ll N\) is a vector of a few factors that are responsible for most of the randomness in the market
- \(\mathbf{B}\in\mathbb{R}^{N\times K}\) denotes how the low dimensional factors affect the higher dimensional market assets
- \(\mathbf{w}_{t}\) is a white noise residual vector that has only a marginal effect.
The factors can be explicit or implicit.
Widely used by practitioners (they buy factors at a high premium).
Connections with Principal Component Analysis (PCA) (Jolliffe 2002).
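
To make the decomposition of the covariance matrix concrete, here is a minimal simulation sketch of the factor model. It assumes (these assumptions are not stated above) that the factors have identity covariance, that the residuals are uncorrelated with the factors and among themselves, and that everything is Gaussian; under those assumptions the implied covariance is \(\mathbf{B}\mathbf{B}^{T}+\boldsymbol{\Psi}\) with \(\boldsymbol{\Psi}\) diagonal. All sizes and parameter values are hypothetical.

```r
set.seed(123)
N <- 6; K <- 2; T_obs <- 5000      # hypothetical dimensions and sample size

alpha <- rep(0.001, N)                       # constant vector
B     <- matrix(rnorm(N * K), N, K)          # factor loadings (N x K)
psi   <- runif(N, min = 0.1, max = 0.3)      # residual variances (diagonal of Psi)

f <- matrix(rnorm(T_obs * K), T_obs, K)      # factors with identity covariance (assumption)
w <- sweep(matrix(rnorm(T_obs * N), T_obs, N), 2, sqrt(psi), "*")  # residuals, covariance diag(psi)
r <- sweep(f %*% t(B) + w, 2, alpha, "+")    # row t holds r_t = alpha + B f_t + w_t

Sigma_model  <- tcrossprod(B) + diag(psi)    # covariance implied by the assumptions above
Sigma_sample <- cov(r)                       # sample covariance of the simulated returns

# relative Frobenius error between sample and model covariance
rel_err <- norm(Sigma_sample - Sigma_model, "F") / norm(Sigma_model, "F")

# number of free covariance parameters: unstructured vs. factor structure
n_par <- c(full = N * (N + 1) / 2, factor = N * K + N)
```

The saving in parameters is modest for this toy size but grows quickly when \(K\ll N\).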

Time-series models

The previous models are i.i.d., but there are hundreds of other models attempting to capture the time correlation or time structure of the returns, as well as the volatility clustering or heteroskedasticity.
To capture the time correlation we have mean models: VAR, VMA, VARMA, VARIMA, VECM, etc.
To capture the volatility clustering we have covariance models: ARCH, GARCH, VEC, DVEC, BEKK, CCC, DCC, etc.
Standard textbook references (Lütkepohl 2007; Tsay 2010, 2013):

📘 H. Lütkepohl. New Introduction to Multiple Time Series Analysis. Springer, 2007.

📘 R. S. Tsay. Multivariate Time Series Analysis: With R and Financial Applications. John Wiley & Sons, 2013.

Simple introductory reference (Feng and Palomar 2016):

📘 Y. Feng and D. P. Palomar. A Signal Processing Perspective on Financial Engineering. Foundations and Trends in Signal Processing, Now Publishers, 2016.
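
As an illustration of a covariance model that captures volatility clustering, the following base-R sketch simulates a univariate GARCH(1,1) process. The parameter values (`omega`, `a`, `b`) are hypothetical and the Gaussian innovations are an assumption; this is not a fitted model, and in practice one would rely on a dedicated package.

```r
set.seed(7)
T_obs <- 2000
omega <- 1e-6; a <- 0.10; b <- 0.85   # hypothetical GARCH(1,1) parameters with a + b < 1

sigma2 <- numeric(T_obs)              # conditional variances
r      <- numeric(T_obs)              # simulated returns

sigma2[1] <- omega / (1 - a - b)      # start at the unconditional variance
r[1]      <- sqrt(sigma2[1]) * rnorm(1)
for (i in 2:T_obs) {
  # the variance reacts to the previous squared return and the previous variance
  sigma2[i] <- omega + a * r[i - 1]^2 + b * sigma2[i - 1]
  r[i]      <- sqrt(sigma2[i]) * rnorm(1)
}

# lag-1 autocorrelation of the returns and of their absolute values
acf_r   <- acf(r, plot = FALSE)$acf[2]
acf_abs <- acf(abs(r), plot = FALSE)$acf[2]
```

For such a process one would typically expect `acf_r` to be close to zero and `acf_abs` to be clearly positive, in line with the S&P 500 autocorrelation slides, although the exact values depend on the seed.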

Fitting process

Before we can use a model, we need to estimate the model parameters (for example, in the i.i.d. model: \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}\)) using a training set.
Then use cross-validation to select the best fit (assuming we have different possible models each with a different fit).
Finally, we can use the best fitted model in the test data (aka out-of-sample data) for performance evaluation.

Be careful: if you evaluate on the test data many times while choosing the model, it is effectively no longer out-of-sample data but part of the cross-validation!

Sample estimates

Consider the i.i.d. model: \[\mathbf{r}_{t}=\boldsymbol{\mu}+\mathbf{w}_{t},\] where \(\boldsymbol{\mu}\in\mathbb{R}^{N}\) is the mean and \(\mathbf{w}_{t}\in\mathbb{R}^{N}\) is an i.i.d. process with zero mean and constant covariance matrix \(\boldsymbol{\Sigma}\).
The mean vector \(\boldsymbol{\mu}\) and covariance matrix \(\boldsymbol{\Sigma}\) have to be estimated in practice based on \(T\) observations.
The simplest estimators are the sample estimators:
- sample mean: \(\quad\hat{\boldsymbol{\mu}} =\frac{1}{T}\sum_{t=1}^{T}\mathbf{r}_{t}\)
- sample covariance matrix: \(\quad\hat{\boldsymbol{\Sigma}} =\frac{1}{T-1}\sum_{t=1}^{T}(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}})(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}})^{T}.\)
  Note that the factor \(1/\left(T-1\right)\) is used instead of \(1/T\) to get an unbiased estimator (asymptotically for \(T\rightarrow\infty\) they coincide).
Many more sophisticated estimators exist, namely: shrinkage estimators, Black-Litterman estimators, etc.
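
The sample estimators can be written out directly from their definitions. The sketch below does so on synthetic Gaussian returns (the data and sizes are hypothetical) and compares the result with the built-in `colMeans()` and `cov()`.

```r
set.seed(1)
T_obs <- 250; N <- 3
X <- matrix(rnorm(T_obs * N, mean = 0.0005, sd = 0.01), T_obs, N)  # synthetic returns (T x N)

mu_hat    <- colSums(X) / T_obs             # sample mean
Xc        <- sweep(X, 2, mu_hat)            # subtract the sample mean from each column
Sigma_hat <- crossprod(Xc) / (T_obs - 1)    # sample covariance with the 1/(T-1) factor

# both agree with the built-in estimators
same_mean <- isTRUE(all.equal(mu_hat, colMeans(X)))
same_cov  <- isTRUE(all.equal(Sigma_hat, cov(X), check.attributes = FALSE))
```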

Least-Squares (LS) estimator

Minimize the least-squares error in the \(T\) observed i.i.d. samples: \[\underset{\boldsymbol{\mu}}{\textsf{minimize}} \quad\frac{1}{T}\sum_{t=1}^{T}\left\Vert \mathbf{r}_{t}-\boldsymbol{\mu}\right\Vert _{2}^{2}.\]
The optimal solution is the sample mean: \[\hat{\boldsymbol{\mu}}=\frac{1}{T}\sum_{t=1}^{T}\mathbf{r}_{t}.\]
The sample covariance of the residuals \(\hat{\mathbf{w}}_{t}=\mathbf{r}_{t}-\hat{\boldsymbol{\mu}}\) is the sample covariance matrix: \[\hat{\boldsymbol{\Sigma}}=\frac{1}{T-1}\sum_{t=1}^{T}\left(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}}\right)\left(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}}\right)^{T}.\]

Maximum Likelihood Estimator (MLE)

Assume \(\mathbf{r}_{t}\) are i.i.d. and follow a Gaussian distribution: \[f(\mathbf{r}) =\frac{1}{\sqrt{\left(2\pi\right)^{N}\left|\boldsymbol{\Sigma}\right|}}e^{-\frac{1}{2}(\mathbf{r}-\boldsymbol{\mu})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{r}-\boldsymbol{\mu})}\] where
- \(\boldsymbol{\mu}\in\mathbb{R}^{N}\) is a mean vector that gives the location
- \(\boldsymbol{\Sigma}\in\mathbb{R}^{N\times N}\) is a positive definite covariance matrix that defines the shape.

MLE

Given the \(T\) i.i.d. samples \(\mathbf{r}_{t}, \;t=1,\ldots,T,\) the negative log-likelihood function is \[\begin{aligned} \ell(\boldsymbol{\mu},\boldsymbol{\Sigma}) &= -\log\prod_{t=1}^{T}f(\mathbf{r}_{t})\\ &=\frac{T}{2}\log\left|\boldsymbol{\Sigma}\right| + \sum_{t=1}^{T}\frac{1}{2}(\mathbf{r}_{t}-\boldsymbol{\mu})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{r}_{t}-\boldsymbol{\mu})+\text{const}. \end{aligned}\]
Setting the derivatives of \(\ell(\boldsymbol{\mu},\boldsymbol{\Sigma})\) w.r.t. \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}^{-1}\) to zero and solving the equations yields: \[\begin{aligned} \hat{\boldsymbol{\mu}} &= \frac{1}{T}\sum_{t=1}^{T}\mathbf{r}_{t}\\ \hat{\boldsymbol{\Sigma}} &= \frac{1}{T}\sum_{t=1}^{T}\left(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}}\right)\left(\mathbf{r}_{t}-\hat{\boldsymbol{\mu}}\right)^{T}. \end{aligned}\]
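
The closed-form Gaussian MLE can be checked numerically. The following sketch (synthetic data, hypothetical sizes) evaluates the negative log-likelihood, up to the additive constant, at the MLE and at two nearby alternatives: the unbiased sample covariance and a slightly shifted mean.

```r
set.seed(2)
T_obs <- 500; N <- 2
X <- matrix(rnorm(T_obs * N, sd = 0.02), T_obs, N)   # synthetic i.i.d. Gaussian returns

mu_mle    <- colMeans(X)
Sigma_mle <- cov(X) * (T_obs - 1) / T_obs            # MLE uses 1/T instead of 1/(T-1)

# negative log-likelihood without the constant term
negloglik <- function(mu, Sigma) {
  Xc     <- sweep(X, 2, mu)
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  0.5 * T_obs * logdet + 0.5 * sum((Xc %*% solve(Sigma)) * Xc)
}

nll_mle      <- negloglik(mu_mle, Sigma_mle)
nll_unbiased <- negloglik(mu_mle, cov(X))               # same mean, 1/(T-1) covariance
nll_shifted  <- negloglik(mu_mle + 0.001, Sigma_mle)    # perturbed mean
```

Both alternatives give a value no smaller than `nll_mle`, as expected from the optimality of the MLE; the gap to the unbiased covariance is tiny because the two covariance estimators differ only by the factor \((T-1)/T\).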

Parameter estimation

The parameter estimates \(\hat{\boldsymbol{\mu}}\) and \(\hat{\boldsymbol{\Sigma}}\) are only good for large \(T\), otherwise the estimation error is unacceptable.
For instance, the sample mean is a particularly inefficient estimator, with very noisy estimates (Meucci 2005).
In practice, \(T\) cannot be large enough due to either:
- unavailability of data or
- lack of stationarity of data.
As a consequence, the estimates contain too much estimation error and a portfolio design (e.g., Markowitz mean-variance) based on those estimates can be severely affected (Chopra and Ziemba 1993).
Indeed, this is why Markowitz’s portfolio and other extensions are rarely used by practitioners.
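
A back-of-the-envelope calculation shows why the sample mean is so noisy. Under the i.i.d. model, the standard error of the sample mean of one asset is \(\sigma/\sqrt{T}\). The sketch below compares it with the mean itself for hypothetical daily values of \(\mu\) and \(\sigma\) and for 1, 5, and 20 years of daily data (assuming 252 trading days per year).

```r
mu_daily    <- 0.0004    # hypothetical daily expected return
sigma_daily <- 0.01      # hypothetical daily volatility
T_obs       <- 252 * c(1, 5, 20)   # number of daily observations in 1, 5, and 20 years

se_mu <- sigma_daily / sqrt(T_obs)   # standard error of the sample mean
ratio <- se_mu / mu_daily            # estimation noise relative to the quantity estimated
names(ratio) <- c("1y", "5y", "20y")
ratio
```

With these illustrative numbers, one year of daily data gives a standard error larger than the mean being estimated, and the ratio shrinks only at the rate \(1/\sqrt{T}\).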

Portfolio Basics

Portfolio return

Suppose the capital budget is \(B\) dollars.
The portfolio \(\mathbf{w}\in\mathbb{R}^{N}\) denotes the normalized dollar weights of the \(N\) assets such that \(\mathbf{1}^{T}\mathbf{w}=1\) (so \(B\mathbf{w}\) denotes dollars invested in the assets).
For each asset \(i\), the initial wealth is \(Bw_{i}\) and the end wealth is \[Bw_{i}\left(p_{i,t}/p_{i,t-1}\right)=Bw_{i}\left(R_{it}+1\right).\]
Then the portfolio return is \[R_{t}^{p}= \frac{\sum_{i=1}^{N}Bw_{i}\left(R_{it}+1\right)-B}{B}=\sum_{i=1}^{N}w_{i}R_{it}\approx\sum_{i=1}^{N}w_{i}r_{it}=\mathbf{w}^{T}\mathbf{r}_{t}\]
The portfolio expected return and variance are \(\mathbf{w}^{T}\boldsymbol{\mu}\) and \(\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\), respectively.
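
The portfolio return formula can be verified on a small numerical example. The budget, weights, and asset returns below are hypothetical.

```r
B <- 1000                      # capital budget in dollars (hypothetical)
w <- c(0.5, 0.3, 0.2)          # normalized dollar weights, summing to one
R <- c(0.02, -0.01, 0.005)     # simple returns of the three assets over one period

end_wealth <- sum(B * w * (1 + R))     # end wealth summed over the assets
R_p        <- (end_wealth - B) / B     # portfolio return from its definition

# the budget cancels out and the return is the weighted sum of the asset returns
matches <- isTRUE(all.equal(R_p, sum(w * R)))
```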

Performance measures

Expected return: \(\mathbf{w}^{T}\boldsymbol{\mu}\)
Volatility: \(\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}\)
Sharpe Ratio (SR): expected excess return per unit of risk \[\mathsf{SR} =\frac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\] where \(r_{f}\) is the risk-free rate (e.g., interest rate on a three-month U.S. Treasury bill).
Information Ratio (IR): SR with respect to a benchmark (e.g., the market index): \(\mathsf{IR} =\frac{\textsf{E}\left[\mathbf{w}^T\mathbf{r}_t - r_{b,t}\right]}{\sqrt{\textsf{Var}\left[\mathbf{w}^T\mathbf{r}_t - r_{b,t}\right]}}\).
Drawdown: decline from a historical peak of the cumulative profit \(X(t)\): \[D(T)=\max_{t\in[0,T]}X(t)-X(T)\]
VaR (Value at Risk): quantile of the loss.
ES (Expected Shortfall) or CVaR (Conditional Value at Risk): expected value of the loss above some quantile.
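
As a rough sketch of how these measures can be computed from a return series, the following base-R snippet uses synthetic returns. It assumes \(r_f=0\), works per period (no annualization), uses the non-compounded cumulative profit for the drawdown, and estimates VaR and ES empirically at the 95% level; all of these are illustrative choices.

```r
set.seed(3)
ret <- rnorm(500, mean = 0.0005, sd = 0.01)   # synthetic per-period portfolio returns

# Sharpe ratio per period with r_f = 0 (assumption)
r_f <- 0
SR  <- (mean(ret) - r_f) / sd(ret)

# drawdown D(T) = max_{t in [0,T]} X(t) - X(T), with X(0) = 0
X_cum  <- c(0, cumsum(ret))
D      <- cummax(X_cum) - X_cum
max_dd <- max(D)

# empirical VaR and ES of the loss at the 95% level
loss   <- -ret
VaR_95 <- quantile(loss, probs = 0.95, names = FALSE)
ES_95  <- mean(loss[loss >= VaR_95])
```

The package PerformanceAnalytics provides these measures with more options.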

Practical constraints

Capital budget constraint: \[\mathbf{1}^T\mathbf{w} = 1.\]
Long-only constraint: \[\mathbf{w} \geq 0.\]
Dollar-neutral or self-financing constraint: \[\mathbf{1}^T\mathbf{w} = 0.\]
Holding constraint: \[\mathbf{l}\leq\mathbf{w}\leq \mathbf{u}\] where \(\mathbf{l}\in\mathbb{R}^{N}\) and \(\mathbf{u}\in\mathbb{R}^{N}\) are lower and upper bounds of the asset positions, respectively.

Practical constraints

Leverage constraint: \[\left\Vert \mathbf{w}\right\Vert _{1}\leq L.\]
Cardinality constraint: \[\left\Vert \mathbf{w}\right\Vert _{0} \leq K.\]
Turnover constraint: \[\left\Vert \mathbf{w}-\mathbf{w}_{0}\right\Vert _{1} \leq u\] where \(\mathbf{w}_{0}\) is the currently held portfolio.

Market-neutral constraint: \[\boldsymbol{\beta}^T\mathbf{w} = 0\] where \(\boldsymbol{\beta}\in\mathbb{R}^{N}\) contains the market betas of the \(N\) assets.
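
Most of these constraints are simple to check numerically for a candidate portfolio. The helper below is a hypothetical illustration (the function name, argument names, and tolerance are not from any package); the dollar-neutral and market-neutral constraints are equalities that could be checked in the same way as the budget constraint.

```r
# check a candidate portfolio w against several practical constraints
check_constraints <- function(w, w0, l = 0, u_hold = 1, L = 1,
                              K = length(w), u_turn = 2, tol = 1e-8) {
  c(budget      = abs(sum(w) - 1) <= tol,                      # 1^T w = 1
    long_only   = all(w >= -tol),                              # w >= 0
    holding     = all(w >= l - tol) && all(w <= u_hold + tol), # l <= w <= u
    leverage    = sum(abs(w)) <= L + tol,                      # ||w||_1 <= L
    cardinality = sum(abs(w) > tol) <= K,                      # ||w||_0 <= K
    turnover    = sum(abs(w - w0)) <= u_turn + tol)            # ||w - w0||_1 <= u
}

w   <- c(0.5, 0.5, 0, 0)       # candidate portfolio (hypothetical)
w0  <- rep(0.25, 4)            # currently held portfolio (hypothetical)
res <- check_constraints(w, w0, L = 1, K = 2, u_turn = 0.5)
res
```

In this example only the turnover constraint is violated, since moving from `w0` to `w` requires a turnover of 1.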

Wealth, NAV, or cumulative return

The Net Asset Value (NAV) is the value of the portfolio (aka wealth, cumulative return, or cumulative PnL).
The portfolio return in terms of NAV is \(R_{t}^{p} = \frac{\textsf{NAV}_t - \textsf{NAV}_{t-1}}{\textsf{NAV}_{t-1}}\) (note that \(1 + R_{t}^{p} = \frac{\textsf{NAV}_t}{\textsf{NAV}_{t-1}}\)).
Also recall that the portfolio return is calculated from the assets’ returns as \(R_{t}^{p} = \mathbf{w}^{T}\mathbf{r}_{t}\).
Thus, if we fully reinvest, it follows that we can compute the wealth of a portfolio as \[ \textsf{NAV}_t = \textsf{NAV}_0 \times (1+R_{1}^{p}) \times (1+R_{2}^{p}) \times \dots \times (1+R_{t}^{p}), \] where \(\textsf{NAV}_0=B_0\) is the initial budget.
If, instead, we keep investing the same initial budget \(B_0\) at each period, then \[\begin{aligned} \textsf{NAV}_t &= \textsf{NAV}_0 + \textsf{NAV}_0 \times R_{1}^{p} + \textsf{NAV}_0 \times R_{2}^{p} + \dots + \textsf{NAV}_0 \times R_{t}^{p}\\ &= \textsf{NAV}_0 \times (1 + R_{1}^{p} + R_{2}^{p} + \dots + R_{t}^{p}) \end{aligned}\]

R session: Loading market data

We will load some stock market data and divide it into a training part (for the estimation of the expected return \(\boldsymbol{\mu}\) and covariance matrix \(\boldsymbol{\Sigma}\), and subsequent portfolio design) and a test part (for the out-of-sample performance evaluation).

In particular, we will start by loading some stock data from three different sectors:

sector technology: AAPL, AMD, ADI.
sector healthcare: ABBV, AEZS, A.
sector basic materials: APD, AA, CF.

library(xts)  # to manipulate time series of stock data
library(quantmod)  # to download stock data
library(PerformanceAnalytics)  # to compute performance measures

# download data from Yahoo Finance
stock_namelist <- c("AAPL", "AMD", "ADI",  "ABBV", "AEZS", "A",  "APD", "AA","CF")
prices <- xts()
for (i in seq_along(stock_namelist)) {
  tmp <- Ad(getSymbols(stock_namelist[i], from = "2013-01-01", to = "2016-12-31", auto.assign = FALSE))
  tmp <- na.approx(tmp, na.rm = FALSE)  # interpolate NAs
  prices <- cbind(prices, tmp)
}
colnames(prices) <- stock_namelist
tclass(prices) <- "Date"

str(prices)

R>> An 'xts' object on 2013-01-02/2016-12-30 containing:
R>>   Data: num [1:1008, 1:9] 17.1 16.9 16.4 16.3 16.4 ...
R>>  - attr(*, "dimnames")=List of 2
R>>   ..$ : NULL
R>>   ..$ : chr [1:9] "AAPL" "AMD" "ADI" "ABBV" ...
R>>   Indexed by objects of class: [Date] TZ: UTC
R>>   xts Attributes:  
R>>  NULL

head(prices)

R>>                AAPL  AMD      ADI     ABBV AEZS        A      APD       AA       CF
R>> 2013-01-02 17.09469 2.53 36.35426 25.36849  253 27.86081 64.63237 20.62187 28.26927
R>> 2013-01-03 16.87892 2.49 35.76762 25.15901  254 27.96059 64.40654 20.80537 28.13558
R>> 2013-01-04 16.40876 2.59 35.13143 24.84119  257 28.51276 65.27223 21.24121 28.76579
R>> 2013-01-07 16.31224 2.67 35.23884 24.89175  259 28.30653 65.21201 20.87419 28.65803
R>> 2013-01-08 16.35615 2.67 34.87529 24.35000  255 28.08034 65.33247 20.87419 28.23788
R>> 2013-01-09 16.10052 2.63 34.78441 24.48724  258 28.83873 66.21318 20.82831 29.22548

tail(prices)

R>>                AAPL   AMD      ADI     ABBV AEZS        A      APD    AA       CF
R>> 2016-12-22 27.58162 11.60 67.92088 51.29409 4.10 44.42307 132.3398 29.75 26.71214
R>> 2016-12-23 27.63617 11.58 68.28153 51.85978 4.10 44.64533 132.6931 29.71 27.24443
R>> 2016-12-27 27.81169 12.07 68.71616 51.99287 4.05 44.94490 133.5083 29.65 28.34450
R>> 2016-12-28 27.69310 11.55 68.02262 51.80154 3.55 44.18148 131.4610 29.43 28.06061
R>> 2016-12-29 27.68598 11.59 68.04111 52.18421 3.60 44.23188 131.5613 28.89 28.30014
R>> 2016-12-30 27.47014 11.34 67.15337 52.09269 3.60 44.15434 131.0601 28.08 27.92754

# compute log-returns and linear returns
X_log <- diff(log(prices))[-1]
X_lin <- (prices/lag(prices) - 1)[-1]

# or alternatively...
X_log <- CalculateReturns(prices, "log")[-1]
X_lin <- CalculateReturns(prices)[-1]

N <- ncol(X_log)  # number of stocks
T <- nrow(X_log)  # number of days

We can take a look at the prices of the stocks:

plot(prices/rep(prices[1, ], each = nrow(prices)), col = rainbow10equal, legend.loc = "topleft",
     main = "Normalized prices")

We now divide the data into a training set and test set:

T_trn <- round(0.7*T)  # 70% of data
X_log_trn <- X_log[1:T_trn, ]
X_log_tst <- X_log[(T_trn+1):T, ]
X_lin_trn <- X_lin[1:T_trn, ]
X_lin_tst <- X_lin[(T_trn+1):T, ]

Heuristic Portfolios

Heuristic portfolios

Heuristic portfolios are not formally derived from a sound mathematical foundation. Instead, they are intuitive and based on common sense.

We will explore the following simple and heuristic portfolios:
- Buy & Hold (B&H)
- Buy & Rebalance
- equally weighted portfolio (EWP) or \(1/N\) portfolio
- quintile portfolio
- global maximum return portfolio (GMRP).

Buy & Hold (B&H)

The simplest investment strategy consists of selecting just one asset, allocating the whole budget \(B\) to it:
- Buy & Hold (B&H): chooses one asset and sticks to it forever.
- Buy & Rebalance: chooses one asset but reevaluates that choice regularly.
The belief behind such an investment is that the asset will increase gradually in value over the investment period.
There is no diversification in this strategy.
One can use different methods (like fundamental analysis or technical analysis) to make the choice.
Mathematically, it can be expressed as \[\mathbf{w} = \mathbf{e}_i\] where \(\mathbf{e}_i\) denotes the canonical vector with a 1 on the \(i\)th position and 0 elsewhere.

Buy & Hold (B&H)

Cumulative PnL of 9 possible B&H for 9 assets:

R session: Buy & Hold (B&H)

We will estimate \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}\) from the in-sample log-returns:

mu <- colMeans(X_log_trn)
Sigma <- cov(X_log_trn)

Buy & Hold simply means that we allocate the whole budget to one stock and we stick to it. Since we have \(N=9\) stocks in our universe, we can define \(N=9\) different B&H portfolios, which we will store as column vectors.

# a B&H portfolio is trivially the zero vector with a one on the stock held
w_BnH <- diag(N)
rownames(w_BnH) <- colnames(X_lin)
colnames(w_BnH) <- paste0("B&H - ", colnames(X_lin))
w_BnH

R>>      B&H - AAPL B&H - AMD B&H - ADI B&H - ABBV B&H - AEZS B&H - A B&H - APD B&H - AA B&H - CF
R>> AAPL          1         0         0          0          0       0         0        0        0
R>> AMD           0         1         0          0          0       0         0        0        0
R>> ADI           0         0         1          0          0       0         0        0        0
R>> ABBV          0         0         0          1          0       0         0        0        0
R>> AEZS          0         0         0          0          1       0         0        0        0
R>> A             0         0         0          0          0       1         0        0        0
R>> APD           0         0         0          0          0       0         1        0        0
R>> AA            0         0         0          0          0       0         0        1        0
R>> CF            0         0         0          0          0       0         0        0        1

We can now compute the performance of those \(N=9\) portfolios in the training data with the package PerformanceAnalytics:

# compute returns of all B&H portfolios
ret_BnH <- xts(X_lin %*% w_BnH, index(X_lin))
ret_BnH_trn <- ret_BnH[1:T_trn, ]
ret_BnH_tst <- ret_BnH[-c(1:T_trn), ]
head(ret_BnH)

R>>              B&H - AAPL    B&H - AMD    B&H - ADI   B&H - ABBV   B&H - AEZS      B&H - A
R>> 2013-01-03 -0.012622279 -0.015810277 -0.016136543 -0.008257408  0.003952569  0.003581555
R>> 2013-01-04 -0.027854626  0.040160643 -0.017786924 -0.012632689  0.011811024  0.019748080
R>> 2013-01-07 -0.005882527  0.030888031  0.003057490  0.002035410  0.007782101 -0.007232902
R>> 2013-01-08  0.002691905  0.000000000 -0.010316798 -0.021764159 -0.015444015 -0.007990806
R>> 2013-01-09 -0.015628922 -0.014981273 -0.002605799  0.005636221  0.011764706  0.027007866
R>> 2013-01-10  0.012395810 -0.003802281  0.012113702  0.002950067 -0.003875969  0.007382052
R>>                B&H - APD     B&H - AA     B&H - CF
R>> 2013-01-03 -0.0034940696  0.008898612 -0.004728952
R>> 2013-01-04  0.0134410574  0.020948193  0.022398861
R>> 2013-01-07 -0.0009225822 -0.017278772 -0.003746013
R>> 2013-01-08  0.0018470983  0.000000000 -0.014660708
R>> 2013-01-09  0.0134805106 -0.002197690  0.034974364
R>> 2013-01-10 -0.0009093658 -0.012114520  0.015029246

# performance measures
library(PerformanceAnalytics)
t(table.AnnualizedReturns(ret_BnH_trn))

R>>            Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> B&H - AAPL            0.1662             0.2571                    0.6466
R>> B&H - AMD            -0.0773             0.4930                   -0.1568
R>> B&H - ADI             0.1517             0.2418                    0.6276
R>> B&H - ABBV            0.2167             0.2567                    0.8442
R>> B&H - AEZS           -0.7375             1.1854                   -0.6221
R>> B&H - A               0.0819             0.2221                    0.3688
R>> B&H - APD             0.2150             0.2057                    1.0448
R>> B&H - AA              0.0271             0.3045                    0.0891
R>> B&H - CF              0.1674             0.2838                    0.5901

t(table.AnnualizedReturns(ret_BnH_tst))

R>>            Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> B&H - AAPL            0.0375             0.2386                    0.1572
R>> B&H - AMD             3.2190             0.8017                    4.0153
R>> B&H - ADI             0.2000             0.2453                    0.8151
R>> B&H - ABBV            0.1532             0.3080                    0.4973
R>> B&H - AEZS           -0.3470             1.3677                   -0.2538
R>> B&H - A               0.2220             0.2329                    0.9532
R>> B&H - APD             0.1450             0.1917                    0.7561
R>> B&H - AA              0.2154             0.4655                    0.4628
R>> B&H - CF             -0.3103             0.5015                   -0.6188

Note how the in-sample (ex ante) performance is not maintained out-of-sample (ex post).

We can compute many other performance measures:

table.DownsideRisk(ret_BnH_trn)

R>> VaR calculation produces unreliable result (inverse risk) for column: 1 : -0.434364986114259

R>>                               B&H - AAPL B&H - AMD B&H - ADI B&H - ABBV B&H - AEZS B&H - A
R>> Semi Deviation                    0.0119    0.0221    0.0104     0.0120     0.0423  0.0099
R>> Gain Deviation                    0.0104    0.0231    0.0118     0.0095     0.0936  0.0091
R>> Loss Deviation                    0.0123    0.0237    0.0104     0.0117     0.0528  0.0093
R>> Downside Deviation (MAR=210%)     0.0162    0.0263    0.0149     0.0163     0.0471  0.0148
R>> Downside Deviation (Rf=0%)        0.0115    0.0220    0.0100     0.0115     0.0434  0.0097
R>> Downside Deviation (0%)           0.0115    0.0220    0.0100     0.0115     0.0434  0.0097
R>> Maximum Drawdown                  0.2845    0.6524    0.2639     0.2630     0.9845  0.2296
R>> Historical VaR (95%)             -0.0246   -0.0441   -0.0231    -0.0272    -0.0667 -0.0227
R>> Historical ES (95%)              -0.0370   -0.0744   -0.0324    -0.0377    -0.1512 -0.0304
R>> Modified VaR (95%)               -0.0265   -0.0474   -0.0192    -0.0273         NA -0.0217
R>> Modified ES (95%)                -0.0559   -0.0756   -0.0192    -0.0404    -0.0038 -0.0332
R>>                               B&H - APD B&H - AA B&H - CF
R>> Semi Deviation                   0.0087   0.0130   0.0120
R>> Gain Deviation                   0.0099   0.0145   0.0140
R>> Loss Deviation                   0.0081   0.0122   0.0117
R>> Downside Deviation (MAR=210%)    0.0134   0.0178   0.0165
R>> Downside Deviation (Rf=0%)       0.0083   0.0128   0.0116
R>> Downside Deviation (0%)          0.0083   0.0128   0.0116
R>> Maximum Drawdown                 0.1984   0.5359   0.3573
R>> Historical VaR (95%)            -0.0196  -0.0267  -0.0263
R>> Historical ES (95%)             -0.0268  -0.0417  -0.0386
R>> Modified VaR (95%)              -0.0170  -0.0276  -0.0222
R>> Modified ES (95%)               -0.0191  -0.0347  -0.0222

To compute the wealth or cumulative P&L, we have two options: one assumes the same quantity is repeatedly invested, whereas the other assumes reinvesting (compounding):

# compute cumulative wealth
wealth_arith_BnH_trn <- 1 + cumsum(ret_BnH_trn)  # initial budget of 1$
wealth_geom_BnH_trn <- cumprod(1 + ret_BnH_trn)  # initial budget of 1$

# plots
# same as: 
#   plot(wealth_arith_BnH_trn[, 1], main = "Buy & Hold performance (not compounded)", ylab = "wealth")
chart.CumReturns(ret_BnH_trn[, 1], main = "Buy & Hold performance (not compounded)", 
                 geometric = FALSE, wealth.index = TRUE)

# same as: 
#   plot(wealth_geom_BnH_trn[, 1], main = "Buy & Hold performance (compounded)", ylab = "wealth")
chart.CumReturns(ret_BnH_trn[, 1], main = "Buy & Hold performance (compounded)", 
                 geometric = TRUE, wealth.index = TRUE)

We can produce many more plots with the package PerformanceAnalytics:

# more plots
chart.CumReturns(ret_BnH, main = "Buy & Hold performance", 
                 wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)

charts.PerformanceSummary(ret_BnH_trn, main = "Buy & Hold performance", 
                          wealth.index = TRUE, colorset = rich10equal)

chart.Boxplot(ret_BnH_trn)

chart.RiskReturnScatter(ret_BnH_trn, symbolset = 21, bg = "red")

Equally weighted portfolio (EWP) or \(1/N\) portfolio

One of the most important goals of quantitative portfolio management is diversification across the different assets in a portfolio.
A simple way to achieve diversification is by allocating the capital equally across all the assets.
This strategy is called equally weighted portfolio (EWP), \(1/N\) portfolio, uniform portfolio, or maximum deconcentration portfolio: \[\mathbf{w} = \frac{1}{N}\mathbf{1}.\]
It has been called “Talmudic rule” (Duchin and Levy 2009) since the Babylonian Talmud recommended this strategy approximately 1,500 years ago: “A man should always place his money, one third in land, a third in merchandise, and keep a third in hand.”
It has gained much interest due to superior historical performance and the emergence of several equally weighted ETFs (DeMiguel et al. 2009). For example, Standard & Poor’s has developed many S&P 500 equal weighted indices.

Quintile Portfolio

The quintile portfolio is widely used by practitioners.
Two types: long-only quintile portfolio and long-short quintile portfolio.
Basic idea: 1) rank the \(N\) stocks according to some criterion, 2) divide them into five parts, and 3) long the top part (and possibly short the bottom part).
One can rank the stocks in a multitude of ways (typically based on expensive factors that investment funds buy at a premium price).
If we restrict ourselves to price data, three common possible rankings are according to:
1. \(\quad\boldsymbol{\mu}\)
2. \(\dfrac{\boldsymbol{\mu}}{\textsf{diag}(\boldsymbol{\Sigma})}\)
3. \(\dfrac{\boldsymbol{\mu}}{\sqrt{\textsf{diag}(\boldsymbol{\Sigma})}}\)
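
The long-only and long-short variants can be sketched as follows for the first ranking (by \(\boldsymbol{\mu}\)). The ten expected returns and asset names are hypothetical, and the long-short construction shown (equal weights on the top and bottom parts, summing to zero) is one common convention rather than the only one.

```r
# hypothetical expected returns for ten assets
mu <- c(A = 0.10, B = 0.02, C = -0.05, D = 0.07, E = 0.01,
        F = -0.02, G = 0.12, H = 0.04, I = 0.03, J = -0.08)
N   <- length(mu)
n_q <- round(N / 5)                       # number of assets in one quintile

idx <- order(mu, decreasing = TRUE)       # 1) rank the stocks

# long-only quintile portfolio: equal weights on the top part
w_long <- setNames(rep(0, N), names(mu))
w_long[idx[1:n_q]] <- 1 / n_q

# long-short quintile portfolio: additionally short the bottom part
w_ls <- w_long
w_ls[idx[(N - n_q + 1):N]] <- -1 / n_q
```

Here the long-only portfolio satisfies the budget constraint \(\mathbf{1}^T\mathbf{w}=1\), whereas this long-short version is dollar-neutral, \(\mathbf{1}^T\mathbf{w}=0\).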

Global maximum return portfolio (GMRP)

Another simple way to invest in the \(N\) assets is to invest only in the one with the highest return.
Mathematically, the global maximum return portfolio (GMRP) is formulated as \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1, \quad \mathbf{w}\ge\mathbf{0}. \end{array}\]
This problem is convex and can be optimally solved, but of course the solution is trivial: to allocate all the budget to the asset with maximum return.
However, this seemingly good portfolio lacks diversification and performs poorly because past performance is not a guarantee of future performance.
In addition, the estimation of \(\boldsymbol{\mu}\) is extremely noisy in practice (Chopra and Ziemba 1993).
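
The triviality of the solution can be illustrated numerically: any long-only portfolio with unit budget is a convex combination of the assets, so its expected return cannot exceed the largest entry of \(\boldsymbol{\mu}\). The sketch below compares the GMRP with randomly generated feasible portfolios; the values of `mu` are hypothetical.

```r
set.seed(4)
mu <- c(0.08, 0.12, 0.05, 0.10)      # hypothetical expected returns
N  <- length(mu)

# GMRP: all the budget on the asset with the maximum expected return
w_GMRP <- as.numeric(seq_len(N) == which.max(mu))

# random feasible portfolios (w >= 0 and 1^T w = 1)
W <- matrix(rexp(1000 * N), 1000, N)
W <- W / rowSums(W)

best_random <- max(W %*% mu)         # best expected return among the random portfolios
gmrp_return <- sum(w_GMRP * mu)      # expected return of the GMRP
best_random <= gmrp_return
```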

Example

Dollar allocation for EWP, QuintP, and GMRP:

R session: Comparison of \(1/N\) portfolio, quintile portfolio, and GMRP

We are now ready to consider several portfolio designs and compare their performance (the underlying constraints are simply \(\mathbf{w}\ge\mathbf{0}\) and \(\mathbf{1}^T\mathbf{w}=1\)).

The EWP or \(1/N\) portfolio allocates equal dollar weight to each stock: \(\mathbf{w} = \frac{1}{N}\mathbf{1}\).

w_EWP <- rep(1/N, N)
names(w_EWP) <- colnames(X_lin)
w_EWP

R>>      AAPL       AMD       ADI      ABBV      AEZS         A       APD        AA        CF 
R>> 0.1111111 0.1111111 0.1111111 0.1111111 0.1111111 0.1111111 0.1111111 0.1111111 0.1111111

Quintile portfolios are widely used by practitioners. The idea is to 1) rank the \(N\) stocks, 2) divide them into five parts, and 3) long the top part (and possibly short the bottom part). One can rank the stocks in a multitude of ways (typically based on expensive factors that investment funds buy at a premium price). For our experiments, we will consider three possible rankings according to:

\(\boldsymbol{\mu}\)
\(\boldsymbol{\mu}/\textsf{diag}(\boldsymbol{\Sigma})\)
\(\boldsymbol{\mu}/\sqrt{\textsf{diag}(\boldsymbol{\Sigma})}\)

# find indices of sorted stocks
i1 <- sort(mu, decreasing = TRUE, index.return = TRUE)$ix
i2 <- sort(mu/diag(Sigma), decreasing = TRUE, index.return = TRUE)$ix
i3 <- sort(mu/sqrt(diag(Sigma)), decreasing = TRUE, index.return = TRUE)$ix

# create portfolios
w_QuintP_1 <- w_QuintP_2 <- w_QuintP_3 <- rep(0, N)
w_QuintP_1[i1[1:round(N/5)]] <- 1/round(N/5)
w_QuintP_2[i2[1:round(N/5)]] <- 1/round(N/5)
w_QuintP_3[i3[1:round(N/5)]] <- 1/round(N/5)
w_QuintP <- cbind("QuintP (mu)"        = w_QuintP_1, 
                  "QuintP (mu/sigma2)" = w_QuintP_2, 
                  "QuintP (mu/sigma)"  = w_QuintP_3)
rownames(w_QuintP) <- colnames(X_lin)
w_QuintP

R>>      QuintP (mu) QuintP (mu/sigma2) QuintP (mu/sigma)
R>> AAPL         0.0                0.0               0.0
R>> AMD          0.0                0.0               0.0
R>> ADI          0.0                0.0               0.0
R>> ABBV         0.5                0.5               0.5
R>> AEZS         0.0                0.0               0.0
R>> A            0.0                0.0               0.0
R>> APD          0.5                0.5               0.5
R>> AA           0.0                0.0               0.0
R>> CF           0.0                0.0               0.0

The global maximum return portfolio (GMRP) chooses the stock with the highest return during the in-sample period:

i_max <- which.max(mu)
w_GMRP <- rep(0, N)
w_GMRP[i_max] <- 1
names(w_GMRP) <- colnames(X_lin)
w_GMRP

R>> AAPL  AMD  ADI ABBV AEZS    A  APD   AA   CF 
R>>    0    0    0    1    0    0    0    0    0

We can now compare the allocations of the portfolios:

# put together all portfolios
w_heuristic <- cbind("EWP" = w_EWP, w_QuintP, "GMRP" = w_GMRP)
round(w_heuristic, digits = 2)

R>>       EWP QuintP (mu) QuintP (mu/sigma2) QuintP (mu/sigma) GMRP
R>> AAPL 0.11         0.0                0.0               0.0    0
R>> AMD  0.11         0.0                0.0               0.0    0
R>> ADI  0.11         0.0                0.0               0.0    0
R>> ABBV 0.11         0.5                0.5               0.5    1
R>> AEZS 0.11         0.0                0.0               0.0    0
R>> A    0.11         0.0                0.0               0.0    0
R>> APD  0.11         0.5                0.5               0.5    0
R>> AA   0.11         0.0                0.0               0.0    0
R>> CF   0.11         0.0                0.0               0.0    0

barplot(t(w_heuristic), col = rainbow8equal[1:5], legend = colnames(w_heuristic), beside = TRUE,
        main = "Portfolio allocation of heuristic portfolios", xlab = "stocks", ylab = "dollars")

Then we can compare the performance (in-sample vs out-of-sample):

# compute returns of all portfolios
ret_heuristic <- xts(X_lin %*% w_heuristic, index(X_lin))
ret_heuristic$`QuintP (mu/sigma2)` <- NULL  # remove since it coincides with "QuintP (mu/sigma)"
ret_heuristic_trn <- ret_heuristic[1:T_trn, ]
ret_heuristic_tst <- ret_heuristic[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_heuristic_trn))

R>>                   Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> EWP                          0.0309             0.2057                    0.1503
R>> QuintP (mu)                  0.2268             0.1899                    1.1944
R>> QuintP (mu/sigma)            0.2268             0.1899                    1.1944
R>> GMRP                         0.2167             0.2567                    0.8442

t(table.AnnualizedReturns(ret_heuristic_tst))

R>>                   Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> EWP                          0.3393             0.2764                    1.2273
R>> QuintP (mu)                  0.1644             0.1978                    0.8314
R>> QuintP (mu/sigma)            0.1644             0.1978                    0.8314
R>> GMRP                         0.1532             0.3080                    0.4973

Let’s plot the wealth evolution (cumulative PnL) over time:

{ chart.CumReturns(ret_heuristic, main = "Cumulative return of heuristic portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich8equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

charts.PerformanceSummary(ret_heuristic, main = "Performance of heuristic portfolios", 
                          wealth.index = TRUE, colorset = rich8equal)

Finally, we can plot the risk-return scatter plot:

chart.RiskReturnScatter(ret_heuristic_trn, symbolset = 21, bg = "red",
                        main = "Annualized Return and Risk (in-sample)")

Markowitz’s Modern Portfolio Theory (MPT)

Risk control

In finance, the expected return \(\mathbf{w}^{T}\boldsymbol{\mu}\) is very relevant as it quantifies the average benefit.

However, in practice, the average performance is not enough to characterize an investment and one needs to control the probability of going bankrupt.

Risk measures quantify how risky an investment strategy is.

The most basic measure of risk is given by the variance (Markowitz 1952): a higher variance means that the returns are more widely dispersed around their mean, so that large losses become more likely.

There are more sophisticated risk measures such as downside risk, VaR, ES, etc.

Mean-variance portfolio (MVP)

Mean-variance tradeoff

The mean return \(\mathbf{w}^{T}\boldsymbol{\mu}\) and the variance (risk) \(\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\) (equivalently, the standard deviation or volatility \(\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}\)) constitute two important performance measures.

Usually, the higher the mean return the higher the variance and vice-versa.

Thus, we are faced with two objectives to be optimized: it is a multi-objective optimization problem.

They define a fundamental mean-variance tradeoff curve (Pareto curve).

The choice of a specific point in this tradeoff curve depends on how aggressive or risk-averse the investor is.

Mean-variance tradeoff

Markowitz’s mean-variance portfolio (1952)

The idea of Markowitz’s mean-variance portfolio (MVP) (Markowitz 1952) is to find a trade-off between the expected return \(\mathbf{w}^{T}\boldsymbol{\mu}\) and the risk of the portfolio measured by the variance \(\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu}-\lambda\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1 \end{array}\] where \(\mathbf{w}^{T}\mathbf{1}=1\) is the capital budget constraint and \(\lambda\) is a parameter that controls how risk-averse the investor is.
This is also referred to as Modern Portfolio Theory (MPT).

This is a convex quadratic problem (QP) with only one linear constraint which admits a closed-form solution: \[\mathbf{w}_{\sf MVP} = \frac{1}{2\lambda}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{\mu}+\nu\mathbf{1}\right),\] where \(\nu\) is the optimal dual variable \(\nu=\frac{2\lambda-\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}}{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}\).
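As a quick numerical check of the closed-form expression, the following sketch evaluates it on a small made-up example. The values of mu_toy and Sigma_toy and the choice of lmd = 2 are illustrative assumptions, not the data used in the R sessions.

```r
# toy data (hypothetical values, for illustration only)
mu_toy    <- c(0.05, 0.10, 0.02)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.003,
                      0.002, 0.003, 0.010), nrow = 3, byrow = TRUE)
lmd  <- 2
ones <- rep(1, length(mu_toy))

# Sigma^{-1} mu and Sigma^{-1} 1 (solve() avoids forming the inverse explicitly)
Sigma_inv_mu   <- solve(Sigma_toy, mu_toy)
Sigma_inv_ones <- solve(Sigma_toy, ones)

# optimal dual variable nu and closed-form MVP weights
nu <- (2*lmd - sum(Sigma_inv_mu)) / sum(Sigma_inv_ones)
w_MVP_closed <- as.vector((Sigma_inv_mu + nu*Sigma_inv_ones) / (2*lmd))

sum(w_MVP_closed)  # budget constraint: equals 1 up to rounding error
```

Since only the budget constraint is imposed, the resulting weights are allowed to be negative (short positions).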

Markowitz’s mean-variance portfolio (1952)

There are two alternative obvious reformulations for Markowitz’s portfolio.
Maximization of mean return: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu}\\ \textsf{subject to} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\leq\alpha\\ & \mathbf{1}^{T}\mathbf{w}=1. \end{array}\]
Minimization of risk: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{w}^{T}\boldsymbol{\mu}\geq\beta\\ & \mathbf{1}^{T}\mathbf{w}=1. \end{array}\]
For given values of their parameters, the three formulations generally give different points on the Pareto-optimal curve.
They all require choosing one parameter (\(\alpha\), \(\beta\), or \(\lambda\)).
By sweeping over this parameter, one can recover the whole Pareto optimal curve.
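The sweep can be sketched for the scalarized formulation, where the closed-form solution (budget constraint only) makes each point cheap to compute. The toy data, the grid of \(\lambda\) values, and the helper name MVP_closed are assumptions made for illustration.

```r
# toy data (hypothetical values, for illustration only)
mu_toy    <- c(0.05, 0.10, 0.02)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.003,
                      0.002, 0.003, 0.010), nrow = 3, byrow = TRUE)
Sigma_inv_mu   <- solve(Sigma_toy, mu_toy)
Sigma_inv_ones <- solve(Sigma_toy, rep(1, 3))

# closed-form MVP (budget constraint only) for a given lambda
MVP_closed <- function(lmd) {
  nu <- (2*lmd - sum(Sigma_inv_mu)) / sum(Sigma_inv_ones)
  as.vector((Sigma_inv_mu + nu*Sigma_inv_ones) / (2*lmd))
}

# sweep lambda on a log-scale grid and record (volatility, mean return)
lmd_sweep <- exp(seq(log(0.5), log(100), length.out = 20))
frontier <- t(sapply(lmd_sweep, function(lmd) {
  w <- MVP_closed(lmd)
  c(lambda      = lmd,
    volatility  = sqrt(as.numeric(t(w) %*% Sigma_toy %*% w)),
    mean_return = sum(w * mu_toy))
}))
round(frontier, 4)
```

Each row is one point of the trade-off curve: as \(\lambda\) grows, both the volatility and the mean return decrease towards those of the minimum-variance end of the curve.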

Efficient frontier

The previous three problems result in the same mean-variance trade-off curve (Pareto curve):

Markowitz’s portfolio with practical constraints

A general Markowitz’s portfolio with practical constraints could be: \[\begin{array}{lll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu}-\lambda\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w} & \\ \textsf{subject to} & \mathbf{w}^{T}\mathbf{1}=1 & \text{budget}\\ & \mathbf{w}\ge\mathbf{0} & \text{no shorting}\\ & \left\Vert \mathbf{w}\right\Vert _{1}\leq\gamma & \text{leverage}\\ & \left\Vert \mathbf{w}-\mathbf{w}_{0}\right\Vert _{1}\leq\tau & \text{turnover}\\ & \left\Vert \mathbf{w}\right\Vert _{\infty}\leq u & \text{max position}\\ & \left\Vert \mathbf{w}\right\Vert _{0}\leq K & \text{sparsity} \end{array}\] where:
- \(\gamma\geq1\) controls the amount of shorting and leveraging
- \(\tau>0\) controls the turnover (to control the transaction costs in the rebalancing)
- \(u\) limits the position in each stock
- \(K\) controls the cardinality of the portfolio (to select a small set of stocks from the universe).
Without the sparsity constraint, the problem can be rewritten as a QP.

R session: Markowitz’s MVP

Markowitz’s mean-variance portfolio (MVP) with no shorting is formulated as \[ \begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \boldsymbol{\mu}^T\mathbf{w} -\lambda\mathbf{w}^T\mathbf{\Sigma}\mathbf{w}\\ {\textsf{subject to}} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0}. \end{array} \]

This problem does not have a closed-form solution, so we need to resort to a solver. It is very convenient to use the package CVXR, although its computational cost is high and its solution is not totally robust; if necessary, use a dedicated QP solver like quadprog:

library(CVXR)

# create function for MVP
MVP <- function(mu, Sigma, lmd = 0.5) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(t(mu) %*% w - lmd*quad_form(w, Sigma)),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- CVXR::solve(prob)
  w <- as.vector(result$getValue(w))
  names(w) <- colnames(Sigma)
  return(w)
}

# this function can now be used as
w_MVP <- MVP(mu, Sigma, lmd = 2)
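For comparison, the same no-shorting MVP can be sketched with the QP solver quadprog. The function name MVP_quadprog and the toy data are hypothetical; the mapping to solve.QP relies on the fact that it minimizes \(\frac{1}{2}\mathbf{w}^T\mathbf{D}\mathbf{w}-\mathbf{d}^T\mathbf{w}\) subject to \(\mathbf{A}^T\mathbf{w}\ge\mathbf{b}_0\), with the first meq constraints treated as equalities.

```r
library(quadprog)

# MVP with no shorting via quadprog (sketch; the function name is hypothetical)
MVP_quadprog <- function(mu, Sigma, lmd = 0.5) {
  N <- nrow(Sigma)
  res <- solve.QP(Dmat = 2*lmd*Sigma,                # (1/2) w' (2 lmd Sigma) w = lmd w' Sigma w
                  dvec = as.vector(mu),              # the term -mu' w, i.e., maximize mu' w
                  Amat = cbind(rep(1, N), diag(N)),  # columns: budget, then one per w_i >= 0
                  bvec = c(1, rep(0, N)),
                  meq  = 1)                          # the budget constraint is an equality
  w <- res$solution
  names(w) <- colnames(Sigma)
  return(w)
}

# toy data (hypothetical values, for illustration only)
mu_toy    <- c(0.05, 0.10, 0.02)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.003,
                      0.002, 0.003, 0.010), nrow = 3, byrow = TRUE)
w_toy <- MVP_quadprog(mu_toy, Sigma_toy, lmd = 2)
round(w_toy, 3)
```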

Global minimum variance portfolio (GMVP)

Recall the risk minimization formulation: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{w}^{T}\boldsymbol{\mu}\geq\beta\\ & \mathbf{1}^{T}\mathbf{w}=1. \end{array}\]
The global minimum variance portfolio (GMVP) ignores the expected return and focuses on the risk only: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1. \end{array}\]
It is a simple convex QP with solution \[\mathbf{w}_{\sf GMVP}=\frac{1}{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}\boldsymbol{\Sigma}^{-1}\mathbf{1}.\]
It is widely used in academic papers for simplicity of evaluation and comparison of different estimators of the covariance matrix \(\boldsymbol{\Sigma}\) (while ignoring the estimation of \(\boldsymbol{\mu}\)).
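The closed-form GMVP takes only a couple of lines in R. The sketch below uses a made-up covariance matrix (Sigma_toy is an assumption, not the data of the R sessions):

```r
# toy covariance matrix (hypothetical values, for illustration only)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.003,
                      0.002, 0.003, 0.010), nrow = 3, byrow = TRUE)

# w_GMVP = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
w_GMVP_closed <- solve(Sigma_toy, rep(1, nrow(Sigma_toy)))
w_GMVP_closed <- w_GMVP_closed / sum(w_GMVP_closed)
round(w_GMVP_closed, 3)
```

At the solution, all entries of \(\boldsymbol{\Sigma}\mathbf{w}\) are equal, which is the optimality condition of the problem with the budget constraint only.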

GMVP with leverage constraints

The GMVP is typically considered with no-short constraints \(\mathbf{w}\geq\mathbf{0}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1, \;\mathbf{w}\geq\mathbf{0}. \end{array}\]
However, if short-selling is allowed, one needs to limit the amount of leverage to avoid impractical solutions with very large positive and negative weights that cancel out.
A sensible GMVP formulation with leverage is \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1, \;\left\Vert \mathbf{w}\right\Vert _{1}\leq\gamma \end{array}\] where \(\gamma\geq1\) is a parameter that controls the amount of leverage:
- \(\gamma=1\) means no shorting (so equivalent to \(\mathbf{w}\geq\mathbf{0}\))
- \(\gamma>1\) allows some shorting as well as leverage in the longs, e.g., \(\gamma=1.5\) would allow the portfolio \(\mathbf{w}=\left(1.25,-0.25\right)\).
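The example portfolio can be checked directly. Under the budget constraint, the long and short sides satisfy long - short = 1 and long + short = \(\left\Vert \mathbf{w}\right\Vert _{1}\le\gamma\), so the total short position is at most \((\gamma-1)/2\). A small sketch:

```r
w <- c(1.25, -0.25)

sum(w)          # budget: 1
sum(abs(w))     # l1-norm (leverage): 1.5
sum(w[w > 0])   # total long position: 1.25
-sum(w[w < 0])  # total short position: 0.25 = (1.5 - 1)/2
```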

R session: GMVP

The Global Minimum Variance Portfolio (GMVP) with no shorting is formulated as \[ \begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^T\mathbf{\Sigma}\mathbf{w}\\ {\textsf{subject to}} & \mathbf{1}^T\mathbf{w} = 1\\ & \mathbf{w}\ge\mathbf{0} \end{array} \]

Since a closed-form solution does not exist with the constraint \(\mathbf{w}\ge\mathbf{0}\), we need to resort to a solver. We can conveniently use the package CVXR, although its computational cost is high and its solution is not totally robust; if necessary, use a dedicated QP solver like quadprog:

library(CVXR)

# create function for GMVP
GMVP <- function(Sigma) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Minimize(quad_form(w, Sigma)), 
                  constraints = list(w >= 0, sum(w) == 1))
  result <- CVXR::solve(prob)
  w <- as.vector(result$getValue(w))
  names(w) <- colnames(Sigma)
  return(w)
}

# this function can now be used as
w_GMVP <- GMVP(Sigma)

Maximum Sharpe ratio portfolio (MSRP)

Markowitz’s mean-variance framework provides portfolios along the Pareto-optimal frontier and the choice depends on the risk-aversion of the investor.
But typically one measures an investment with the Sharpe ratio: only one portfolio on the Pareto-optimal frontier achieves the maximum Sharpe ratio.
Precisely, Sharpe (1966) first proposed the maximization of the Sharpe ratio: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right) \end{array}\] where \(r_{f}\) is the return of a risk-free asset.
However, this problem is not convex!
This problem belongs to the class of fractional programming (FP), for which many solution methods are available.

Interlude: Fractional Programming (FP)

Fractional programming (FP) is a family of optimization problems involving ratios.
Its history can be traced back to a paper on economic expansion by von Neumann (1937).
It has since inspired studies in economics, management science, optics, information theory, communication systems, graph theory, computer science, etc.
Given functions \(f(\mathbf{x})\ge0\) and \(g(\mathbf{x})>0\), a single-ratio FP is \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & \dfrac{f(\mathbf{x})}{g(\mathbf{x})}\\ \textsf{subject to} & \mathbf{x}\in\cal{X}. \end{array}\]
FP has been widely studied and extended to deal with multiple ratios such as \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & \min_i\dfrac{f_i(\mathbf{x})}{g_i(\mathbf{x})}\\ \textsf{subject to} & \mathbf{x}\in\cal{X}. \end{array}\]

Interlude: How to solve FP

FPs are nonconvex problems, so in principle they are tough to solve (Stancu-Minasian 1992).
However, the so-called concave-convex FP can be easily solved in different ways.
We will focus on the concave-convex single-ratio FP: \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & \dfrac{f(\mathbf{x})}{g(\mathbf{x})}\\ \textsf{subject to} & \mathbf{x}\in\cal{X}. \end{array}\] where \(f(\mathbf{x})\ge0\) is concave and \(g(\mathbf{x})>0\) is convex.
Main approaches:
- via bisection method (aka sandwich technique)
- via Dinkelbach transform
- via Schaible transform (Charnes-Cooper transform for the linear FP case).

Interlude: Solving FP via bisection

The idea is to realize that a concave-convex FP, while not convex, is quasi-convex.
This can be easily seen by rewriting the problem in epigraph form: \[\begin{array}{ll} \underset{\mathbf{x},t}{\textsf{maximize}} & t\\ \textsf{subject to} & t \le \dfrac{f(\mathbf{x})}{g(\mathbf{x})}\\ & \mathbf{x}\in\cal{X}. \end{array}\]
If now we fix the variable \(t\) to some value \(t\ge0\) (so it is not a variable anymore), we can multiply the ratio constraint by \(g(\mathbf{x})>0\) and rewrite the problem as a convex one, since \(t g(\mathbf{x}) - f(\mathbf{x})\) is then convex: \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & t\\ \textsf{subject to} & t g(\mathbf{x}) \le f(\mathbf{x})\\ & \mathbf{x}\in\cal{X}. \end{array}\]
At this point, one can easily solve this convex problem optimally with a solver and then solve for \(t\) via the bisection algorithm (aka sandwich technique). It converges to the global optimal solution.

Interlude: Solving FP via bisection

Recall the quasi-convex problem we want to solve: \[\begin{array}{ll} \underset{\mathbf{x},t}{\textsf{maximize}} & t\\ \textsf{subject to} & t g(\mathbf{x}) \le f(\mathbf{x})\\ & \mathbf{x}\in\cal{X}. \end{array}\]

Algorithm 1: Bisection method (aka sandwich technique)

Given upper- and lower-bounds on \(t\): \(t^{\textsf{ub}}\) and \(t^{\textsf{lb}}.\)
1. compute mid-point: \(t=\left(t^{\textsf{ub}}+t^{\textsf{lb}}\right)/2\)
2. solve the following feasibility problem: \[\begin{array}{ll} \underset{\;}{\textsf{find}} & \mathbf{x}\\ \textsf{subject to} & t g(\mathbf{x}) \le f(\mathbf{x})\\ & \mathbf{x}\in\cal{X}. \end{array}\]
3. if feasible, then set \(t^{\textsf{lb}}=t\); otherwise set \(t^{\textsf{ub}}=t\)
4. if \(t^{\textsf{ub}}-t^{\textsf{lb}}>\epsilon\) go to step 1; otherwise finish.
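The steps of Algorithm 1 can be sketched on a one-dimensional toy problem that needs no external solver. The example (maximize \(x/(x^2+1)\) over \(x\in[0,4]\)), the initial bounds, and the helper name is_feasible are assumptions chosen for illustration; the feasibility check uses the fact that \(t g(x)-f(x)\) is convex in \(x\), so a one-dimensional minimization with base R's optimize() decides it.

```r
# toy concave-convex FP (hypothetical example): maximize x/(x^2 + 1) over [0, 4]
f <- function(x) x          # concave and nonnegative on [0, 4]
g <- function(x) x^2 + 1    # convex and positive

# feasibility check for a fixed t >= 0: is there x in [0, 4] with t*g(x) <= f(x)?
is_feasible <- function(t) {
  opt <- optimize(function(x) t*g(x) - f(x), interval = c(0, 4))
  opt$objective <= 0
}

# bisection on t
t_lb <- 0    # feasible: x = 0 satisfies 0*g(0) <= f(0)
t_ub <- 10   # infeasible: 10*(x^2 + 1) > x for all x in [0, 4]
while (t_ub - t_lb > 1e-6) {
  t <- (t_ub + t_lb)/2   # step 1: midpoint
  if (is_feasible(t))    # step 2: feasibility problem
    t_lb <- t            # step 3: update the bounds
  else
    t_ub <- t
}
t_lb
```

For this toy problem the maximum ratio is 1/2, attained at \(x=1\) (set the derivative of \(x/(x^2+1)\) to zero), so t_lb should end up close to 0.5.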

Interlude: Solving FP via Dinkelbach transform\(^*\)

The Dinkelbach transform was proposed in (Dinkelbach 1967).
It reformulates the original concave-convex FP problem \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & \dfrac{f(\mathbf{x})}{g(\mathbf{x})}\\ \textsf{subject to} & \mathbf{x}\in\cal{X}. \end{array}\] as the convex problem \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & f(\mathbf{x}) - y g(\mathbf{x})\\ \textsf{subject to} & \mathbf{x}\in\cal{X} \end{array}\] with a new auxiliary variable \(y\), which is iteratively updated by \[y^{(k)} = \frac{f(\mathbf{x}^{(k)})}{g(\mathbf{x}^{(k)})}\] where \(k\) is the iteration index.

Interlude: Solving FP via Dinkelbach transform\(^*\)

Algorithm 2: Dinkelbach method

Set \(k=0\) and initialize \(\mathbf{x}^{(0)}\)
repeat

Set \(y^{(k)} = \frac{f(\mathbf{x}^{(k)})}{g(\mathbf{x}^{(k)})}\)
Obtain next point \(\mathbf{x}^{(k+1)}\) by solving \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & f(\mathbf{x}) - y^{(k)} g(\mathbf{x})\\ \textsf{subject to} & \mathbf{x}\in\cal{X} \end{array}\]
\(k \gets k+1\)

until convergence
return \(\mathbf{x}^{(k)}\)

The Dinkelbach method can be shown to converge to the global optimum of the original concave-convex FP by carefully analyzing the monotonic increase of \(y^{(k)}\) and the function \(F(y) = \max_\mathbf{x} f(\mathbf{x}) - y g(\mathbf{x})\).
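A minimal sketch of Algorithm 2 on a one-dimensional toy problem (maximize \(x/(x^2+1)\) over \(x\in[0,4]\)), with the inner concave maximization solved by base R's optimize(). The example, the initial point, the iteration cap, and the stopping tolerance are assumptions chosen for illustration.

```r
# toy concave-convex FP (hypothetical example): maximize x/(x^2 + 1) over [0, 4]
f <- function(x) x          # concave and nonnegative on [0, 4]
g <- function(x) x^2 + 1    # convex and positive

x_k <- 2   # initial point in [0, 4]
for (k in 1:50) {
  y_k <- f(x_k)/g(x_k)                           # update of the auxiliary variable
  x_new <- optimize(function(x) f(x) - y_k*g(x), # inner problem for fixed y_k
                    interval = c(0, 4), maximum = TRUE)$maximum
  done <- abs(f(x_new)/g(x_new) - y_k) < 1e-8    # the ratio has stopped improving
  x_k <- x_new
  if (done) break
}
c(x = x_k, ratio = f(x_k)/g(x_k))
```

For this toy problem the maximum ratio is 1/2 at \(x=1\); the iterates \(y^{(k)}\) increase towards that value and only a handful of iterations are needed.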

Interlude: Solving Linear FP via Charnes-Cooper transform\(^*\)

The Charnes-Cooper transform was proposed in (Charnes and Cooper 1962) to solve the linear FP (LFP) case (Bajalinov 2003): \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & \dfrac{\mathbf{c}^T\mathbf{x}+\alpha}{\mathbf{d}^T\mathbf{x}+\beta}\\ \textsf{subject to} & \mathbf{A}\mathbf{x}\le\mathbf{b} \end{array}\] over the set \(\{\mathbf{x} \mid \mathbf{d}^T\mathbf{x}+\beta > 0\}\).
The Charnes-Cooper transform introduces two new variables \[ \mathbf{y} = \frac{1}{\mathbf{d}^T\mathbf{x}+\beta}\mathbf{x}, \quad t = \frac{1}{\mathbf{d}^T\mathbf{x}+\beta}. \]

Interlude: Solving Linear FP via Charnes-Cooper transform\(^*\)

The LFP becomes then a linear program (LP): \[\begin{array}{ll} \underset{\mathbf{y},t}{\textsf{maximize}} & \mathbf{c}^T\mathbf{y}+\alpha t\\ \textsf{subject to} & \mathbf{A}\mathbf{y}\le\mathbf{b}t\\ & \mathbf{d}^T\mathbf{y}+\beta t = 1\\ & t\ge0\\ \end{array}\]
The solution for \(\mathbf{y}\) and \(t\) yields the solution of the original problem as \[\mathbf{x} = \frac{1}{t}\mathbf{y}.\]
Note that the number of constraints in the LP formulation has increased.
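The transform can be tried on a small LFP, written with the same CVXR calls used in the R sessions of these slides. The problem data below are made up for illustration, and the variable s plays the role of \(t\) (renamed only to keep it visually distinct from R's transpose function t()).

```r
library(CVXR)

# toy LFP (hypothetical data): maximize (x1 + 2*x2 + 1)/(x1 + x2 + 2)
# subject to x1 + x2 <= 4, x1 >= 0, x2 >= 0, written as A x <= b
c_vec <- c(1, 2); alpha <- 1
d_vec <- c(1, 1); beta  <- 2
A <- rbind(c(1, 1), c(-1, 0), c(0, -1))
b <- matrix(c(4, 0, 0), ncol = 1)

# LP in the Charnes-Cooper variables (y, s)
y <- Variable(2)
s <- Variable(1)
prob <- Problem(Maximize(t(c_vec) %*% y + alpha*s),
                constraints = list(A %*% y <= b %*% s,
                                   t(d_vec) %*% y + beta*s == 1,
                                   s >= 0))
result <- CVXR::solve(prob)

# recover the solution of the original LFP as x = y/s
x <- as.vector(result$getValue(y)) / as.numeric(result$getValue(s))
ratio <- (sum(c_vec * x) + alpha) / (sum(d_vec * x) + beta)
round(c(x, ratio), 3)
```

Since a linear-fractional objective attains its maximum over a polytope at a vertex, the answer can be checked by hand: the ratio equals 1/2 at (0,0), 5/6 at (4,0), and 3/2 at (0,4), so the recovered x should be close to (0,4).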

Interlude: Solving Linear FP via Charnes-Cooper transform\(^*\)

Proof:

Any feasible point \(\mathbf{x}\) in the original LFP leads to a feasible point \((\mathbf{y},t)\) in the LP (via the equations in the Charnes-Cooper transform) with the same objective value. The objective is \[ \frac{\mathbf{c}^T\mathbf{x}+\alpha}{\mathbf{d}^T\mathbf{x}+\beta} = \frac{\mathbf{c}^T\mathbf{y}/t+\alpha}{\mathbf{d}^T\mathbf{y}/t+\beta} = \frac{\mathbf{c}^T\mathbf{y}+\alpha t}{\mathbf{d}^T\mathbf{y}+\beta t} \] and since it is scale invariant, we can choose to set the denominator equal to 1.
Conversely, any feasible point \((\mathbf{y},t)\) in the LP leads to a feasible point \(\mathbf{x}\) in the original LFP (via \(\mathbf{x} = \frac{1}{t}\mathbf{y}\)). From the denominator constraint \[ \mathbf{d}^T\mathbf{y}+\beta t = 1 \Longleftrightarrow \mathbf{d}^T\mathbf{y}/t+\beta = 1/t \Longleftrightarrow 1/(\mathbf{d}^T\mathbf{x}+\beta) = t \] which leads to the objective \[ \mathbf{c}^T\mathbf{y}+\alpha t = t (\mathbf{c}^T\mathbf{y}/t+\alpha) = t (\mathbf{c}^T\mathbf{x}+\alpha) = (\mathbf{c}^T\mathbf{x}+\alpha)/(\mathbf{d}^T\mathbf{x}+\beta). \]

Interlude: Solving FP via Schaible transform\(^*\)

The Schaible transform is a generalization of the Charnes-Cooper transform proposed in (Schaible 1974) to solve the concave-convex FP: \[\begin{array}{ll} \underset{\mathbf{x}}{\textsf{maximize}} & \dfrac{f(\mathbf{x})}{g(\mathbf{x})}\\ \textsf{subject to} & \mathbf{x}\in\cal{X}. \end{array}\]
The Schaible transform introduces two new variables (note that \(\mathbf{x} = \mathbf{y}/t\)): \[ \mathbf{y} = \frac{\mathbf{x}}{g(\mathbf{x})}, \quad t = \frac{1}{g(\mathbf{x})}. \]
The original concave-convex FP is equivalent to the convex problem: \[\begin{array}{ll} \underset{\mathbf{y},t}{\textsf{maximize}} & t f\left(\frac{\mathbf{y}}{t}\right)\\ \textsf{subject to} & t g\left(\frac{\mathbf{y}}{t}\right) \le 1\\ & t \ge 0\\ & \mathbf{y}/t\in\cal{X}. \end{array}\]

Solving the Maximum Sharpe ratio portfolio (MSRP)

Recall the maximization of the Sharpe ratio proposed in (Sharpe 1966): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right). \end{array}\]

This problem is nonconvex, but upon recognition as a fractional program (FP), we can consider its resolution via:
- bisection method (aka sandwich technique),
- Dinkelbach transform, and
- Schaible transform (aka Charnes-Cooper transform for the linear FP case).

R session: Maximum Sharpe ratio portfolio via a general-purpose nonlinear solver

The maximum Sharpe ratio portfolio (MSRP) is given by the solution to the nonconvex problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad\mathbf{w}\geq\mathbf{0}. \end{array}\]

We will solve this problem with the general-purpose nonlinear solver nloptr in R:

library(nloptr)

# define the nonconvex objective function
fn_SR <- function(w) {
  return(as.numeric(t(w) %*% mu / sqrt(t(w) %*% Sigma %*% w)))
}
  
# initial point
w0 <- rep(1/N, N)

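# NOTE: slsqp() minimizes its objective, so passing fn_SR as is minimizes the Sharpe ratio;
# to maximize it, pass function(w) -fn_SR(w) (the output below is from the call as written)
res <- nloptr::slsqp(w0, fn_SR,
                     lower = rep(0, N),  # w >= 0
                     heq = function(w) return(sum(w) - 1))    # sum(w) = 1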
w_nonlinear_solver <- res$par
res

R>> $par
R>> [1] 0.000000e+00 1.611956e-01 0.000000e+00 0.000000e+00 8.388044e-01 0.000000e+00 9.750283e-18
R>> [8] 0.000000e+00 0.000000e+00
R>> 
R>> $value
R>> [1] -0.07987582
R>> 
R>> $iter
R>> [1] 16
R>> 
R>> $convergence
R>> [1] 4
R>> 
R>> $message
R>> [1] "NLOPT_XTOL_REACHED: Optimization stopped because xtol_rel or xtol_abs (above) was reached."

Maximum Sharpe ratio portfolio via bisection

The idea is to realize that the problem, while not convex, is quasi-convex.
This can be easily seen by rewriting the problem in epigraph form: \[\begin{array}{ll} \underset{\mathbf{w},t}{\textsf{maximize}} & t\\ \textsf{subject to} & t \leq \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right). \end{array}\]
If now we fix the variable \(t\) to some value (so it is not a variable anymore), the problem is easily recognized as a (convex) second order cone program (SOCP): \[\begin{array}{ll} \underset{\;}{\textsf{find}} & \mathbf{w}\\ \textsf{subject to} & t \left\Vert \boldsymbol{\Sigma}^{1/2}\mathbf{w}\right\Vert_{2}\leq\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}\\ & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right). \end{array}\]
At this point, one can easily solve the convex problem with an SOCP solver and then solve for \(t\) via the bisection algorithm (aka sandwich technique).

R session: Maximum Sharpe ratio portfolio via bisection

We are going to solve the nonconvex problem \[\begin{array}{ll} \underset{\mathbf{w},t}{\textsf{maximize}} & t\\ \textsf{subject to} & t \leq \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right). \end{array}\] via bisection on \(t\) with the following (convex) SOCP problem for a given \(t\): \[\begin{array}{ll} \underset{\;}{\textsf{find}} & \mathbf{w}\\ \textsf{subject to} & t \left\Vert \boldsymbol{\Sigma}^{1/2}\mathbf{w}\right\Vert_{2}\leq\mathbf{w}^{T}\boldsymbol{\mu}\\ & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right). \end{array}\]

# define the inner solver based on an SOCP solver 
# (we will simply use CVXR for convenience, see: https://cvxr.rbind.io/cvxr_functions/)
library(CVXR)

# square-root of matrix Sigma
Sigma_12 <- chol(Sigma)
max(abs(t(Sigma_12) %*% Sigma_12 - Sigma))  # sanity check

R>> [1] 8.673617e-19

# create function for the SOCP feasibility problem with a given t
SOCP_bisection <- function(t) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(0),
                  constraints = list(t*cvxr_norm(Sigma_12 %*% w, 2) <= t(mu) %*% w,
                                     sum(w) == 1,
                                     w >= 0))
  result <- CVXR::solve(prob)
  return(list("status" = result$status, "w" = as.vector(result$getValue(w))))
}

# now run the bisection algorithm
t_lb <- 0   # for sure the problem is feasible in this case
t_ub <- 10  # a tighter upper bound could be chosen, but a Sharpe ratio of 10 surely cannot be achieved
while(t_ub - t_lb > 1e-6) {
  t <- (t_ub + t_lb)/2  # midpoint
  if(SOCP_bisection(t)$status == "infeasible")
    t_ub <- t
  else
    t_lb <- t
}
w_bisection <- SOCP_bisection(t_lb)$w

# comparison between two solutions
round(cbind(w_nonlinear_solver, w_bisection), digits = 3)

R>>       w_nonlinear_solver w_bisection
R>>  [1,]              0.000       0.183
R>>  [2,]              0.161       0.000
R>>  [3,]              0.000       0.002
R>>  [4,]              0.000       0.257
R>>  [5,]              0.839       0.000
R>>  [6,]              0.000       0.000
R>>  [7,]              0.000       0.482
R>>  [8,]              0.000       0.000
R>>  [9,]              0.000       0.076

# Sharpe ratio of two solutions
c("nonlinear_solver" = fn_SR(w_nonlinear_solver), 
  "bisection"        = fn_SR(w_bisection))

R>> nonlinear_solver        bisection 
R>>      -0.07987582       0.07006683

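We can see that the call to the nonlinear solver did not return the maximum Sharpe ratio portfolio (its Sharpe ratio is even negative): nloptr::slsqp minimizes the objective it receives, so passing fn_SR directly minimizes the Sharpe ratio. Using bisection, however, we can solve the problem optimally.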
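For reference, a maximization with nloptr::slsqp (which minimizes the objective it receives) would pass the negated Sharpe ratio. The following is a minimal sketch on made-up toy data with a diagonal covariance matrix, chosen so that the answer is known in closed form; the names ending in _toy and fn_neg_SR are hypothetical, and the numbers are not comparable with the outputs shown above.

```r
library(nloptr)

# toy data (hypothetical values; diagonal covariance so the answer is known)
mu_toy    <- c(0.05, 0.10, 0.02)
Sigma_toy <- diag(c(0.04, 0.09, 0.01))
N_toy     <- length(mu_toy)

# negated Sharpe ratio, because slsqp() minimizes its objective
fn_neg_SR <- function(w) {
  return(-as.numeric(t(w) %*% mu_toy / sqrt(t(w) %*% Sigma_toy %*% w)))
}

res_toy <- nloptr::slsqp(rep(1/N_toy, N_toy), fn_neg_SR,
                         lower = rep(0, N_toy),                  # w >= 0
                         heq = function(w) return(sum(w) - 1))   # sum(w) = 1
w_toy <- res_toy$par

# reference: w proportional to Sigma^{-1} mu (valid here since all its entries are positive)
w_ref <- solve(Sigma_toy, mu_toy)
w_ref <- w_ref / sum(w_ref)
round(cbind(w_toy, w_ref), 3)
```

A general-purpose solver still offers no optimality guarantee on a nonconvex problem, which is the motivation for the convex reformulations.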

Maximum Sharpe ratio portfolio via Dinkelbach

The Dinkelbach transform proposed in (Dinkelbach 1967) reformulates the original concave-convex FP problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right) \end{array}\] as the convex problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu} - y\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right) \end{array}\] with a new auxiliary variable \(y\), which is iteratively updated by \[y^{(k)} = \frac{\mathbf{w}^{(k)T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{(k)T}\boldsymbol{\Sigma}\mathbf{w}^{(k)}}}\] where \(k\) is the iteration index.

Maximum Sharpe ratio portfolio via Dinkelbach

By noting that \(\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}} = \left\Vert \boldsymbol{\Sigma}^{1/2}\mathbf{w}\right\Vert_{2}\), where \(\boldsymbol{\Sigma}^{1/2}\) satisfies \(\boldsymbol{\Sigma}^{T/2}\boldsymbol{\Sigma}^{1/2}=\boldsymbol{\Sigma}\), we can finally rewrite the problem as an SOCP.
The iterative Dinkelbach-based method obtains \(\mathbf{w}^{(k+1)}\) by solving the following problem at the \(k\)th iteration: \[\begin{array}{ll} \underset{\mathbf{w},t}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu} - y^{(k)} t\\ \textsf{subject to} & t \ge \left\Vert \boldsymbol{\Sigma}^{1/2}\mathbf{w}\right\Vert_{2}\\ & \mathbf{1}^{T}\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right) \end{array}\] where \[y^{(k)} = \frac{\mathbf{w}^{(k)T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{(k)T}\boldsymbol{\Sigma}\mathbf{w}^{(k)}}}.\]

R session: Maximum Sharpe ratio portfolio via Dinkelbach

We are going to solve the nonconvex problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad \mathbf{w}\geq\mathbf{0} \end{array}\] by iteratively solving the (convex) SOCP problem for a given \(y^{(k)}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \mathbf{w}^{T}\boldsymbol{\mu} - y^{(k)} \left\Vert \boldsymbol{\Sigma}^{1/2}\mathbf{w}\right\Vert_{2}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad \mathbf{w}\geq\mathbf{0}. \end{array}\] where \[y^{(k)} = \frac{\mathbf{w}^{(k)T}\boldsymbol{\mu}}{\sqrt{\mathbf{w}^{(k)T}\boldsymbol{\Sigma}\mathbf{w}^{(k)}}}.\]

# define the inner solver based on an SOCP solver 
# (we will simply use CVXR for convenience, see: https://cvxr.rbind.io/cvxr_functions/)
library(CVXR)

# square-root of matrix Sigma
Sigma_12 <- chol(Sigma)
max(abs(t(Sigma_12) %*% Sigma_12 - Sigma))  # sanity check

R>> [1] 8.673617e-19

# create function for the inner SOCP with a given y
SOCP_Dinkelbach <- function(y) {
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(t(mu) %*% w - y*cvxr_norm(Sigma_12 %*% w, 2)),
                  constraints = list(sum(w) == 1,
                                     w >= 0))
  result <- CVXR::solve(prob)
  return(as.vector(result$getValue(w)))
}

# initial point (has to satisfy t(w_k) %*% mu>=0)
i_max <- which.max(mu)
w_k <- rep(0, N)  
w_k[i_max] <- 1

# now the iterative Dinkelbach algorithm
k <- 1
while(k == 1 || max(abs(w_k - w_prev)) > 1e-6) {
  w_prev <- w_k
  y_k <- as.numeric(t(w_k) %*% mu / sqrt(t(w_k) %*% Sigma %*% w_k))
  w_k <- SOCP_Dinkelbach(y_k)
  k <- k + 1
}
w_Dinkelbach <- w_k
cat("Number of iterations:", k-1)

R>> Number of iterations: 4

# comparison among three solutions
round(cbind(w_nonlinear_solver, w_bisection, w_Dinkelbach), digits = 3)

R>>       w_nonlinear_solver w_bisection w_Dinkelbach
R>>  [1,]              0.000       0.183        0.183
R>>  [2,]              0.161       0.000        0.000
R>>  [3,]              0.000       0.002        0.000
R>>  [4,]              0.000       0.257        0.258
R>>  [5,]              0.839       0.000        0.000
R>>  [6,]              0.000       0.000        0.000
R>>  [7,]              0.000       0.482        0.483
R>>  [8,]              0.000       0.000        0.000
R>>  [9,]              0.000       0.076        0.076

# Sharpe ratio of three solutions
c("nonlinear_solver" = fn_SR(w_nonlinear_solver), 
  "bisection"        = fn_SR(w_bisection),
  "Dinkelbach"       = fn_SR(w_Dinkelbach))

R>> nonlinear_solver        bisection       Dinkelbach 
R>>      -0.07987582       0.07006683       0.07006775

As expected, the Dinkelbach method converges to the optimal solution, like the bisection method (but unlike the general-purpose nonlinear solver).

Maximum Sharpe ratio portfolio via Schaible

The Schaible transform is a generalization of the Charnes-Cooper transform proposed in (Schaible 1974) to solve a concave-convex FP: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}-r_{f}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1, \quad\left(\mathbf{w}\geq\mathbf{0}\right). \end{array}\]
The Schaible transform introduces two new variables (note that \(\mathbf{w} = \tilde{\mathbf{w}}/t\)): \[ \tilde{\mathbf{w}} = \frac{\mathbf{w}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}, \quad t = \frac{1}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}. \]
The original maximum Sharpe ratio portfolio is equivalent to the convex quadratic problem (QP): \[\begin{array}{ll} \underset{\tilde{\mathbf{w}},t}{\textsf{maximize}} & \tilde{\mathbf{w}}^{T}\boldsymbol{\mu} - r_{f} t\\ \textsf{subject to} & \tilde{\mathbf{w}}^{T}\boldsymbol{\Sigma}\tilde{\mathbf{w}} \le 1\\ & t \ge 0\\ & \mathbf{1}^T\tilde{\mathbf{w}}=t, \quad\left(\tilde{\mathbf{w}}\geq\mathbf{0}\right). \end{array}\]

Maximum Sharpe ratio portfolio via Schaible

The previous problem can be simplified by eliminating the variable \(t=\mathbf{1}^T\tilde{\mathbf{w}}\): \[\begin{array}{ll} \underset{\tilde{\mathbf{w}}}{\textsf{maximize}} & \tilde{\mathbf{w}}^T\left(\boldsymbol{\mu} - r_{f}\mathbf{1}\right)\\ \textsf{subject to} & \tilde{\mathbf{w}}^T\boldsymbol{\Sigma}\tilde{\mathbf{w}} \le 1\\ & \mathbf{1}^T\tilde{\mathbf{w}} \ge 0, \quad\left(\tilde{\mathbf{w}}\geq\mathbf{0}\right) \end{array}\] from which we can recover the original solution as \(\mathbf{w} = \tilde{\mathbf{w}}/t = \tilde{\mathbf{w}}/\left(\mathbf{1}^T\tilde{\mathbf{w}}\right)\).
Interestingly, we can get the following slightly different formulation by starting with a minimization of a ratio: \[\begin{array}{ll} \underset{\tilde{\mathbf{w}}}{\textsf{minimize}} & \tilde{\mathbf{w}}^{T}\boldsymbol{\Sigma}\tilde{\mathbf{w}}\\ \textsf{subject to} & \tilde{\mathbf{w}}^{T}(\boldsymbol{\mu}-r_{f}\mathbf{1})=1\\ & \mathbf{1}^{T}\tilde{\mathbf{w}}\geq0, \quad\left(\tilde{\mathbf{w}}\geq\mathbf{0}\right). \end{array}\]
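The last formulation (with \(r_f=0\) and no shorting) can be checked numerically with the QP solver quadprog. The sketch below uses made-up toy data with a diagonal covariance matrix, for which the maximum Sharpe ratio portfolio is proportional to \(\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}\); all names ending in _toy are hypothetical.

```r
library(quadprog)

# toy data (hypothetical values; diagonal covariance so the answer is known)
mu_toy    <- c(0.05, 0.10, 0.02)
Sigma_toy <- diag(c(0.04, 0.09, 0.01))
N_toy     <- length(mu_toy)

# minimize w~' Sigma w~  subject to  w~' mu = 1 and w~ >= 0
# (solve.QP minimizes (1/2) w' D w - d' w subject to A' w >= b0, first meq as equalities;
#  the constraint 1' w~ >= 0 is implied by w~ >= 0 and therefore omitted)
res <- solve.QP(Dmat = 2*Sigma_toy,
                dvec = rep(0, N_toy),
                Amat = cbind(mu_toy, diag(N_toy)),
                bvec = c(1, rep(0, N_toy)),
                meq  = 1)
w_tilde    <- res$solution
w_MSRP_toy <- w_tilde / sum(w_tilde)   # w = w~ / (1' w~)

# reference: w proportional to Sigma^{-1} mu (valid here since all its entries are positive)
w_ref <- solve(Sigma_toy, mu_toy)
w_ref <- w_ref / sum(w_ref)
round(cbind(w_MSRP_toy, w_ref), 3)
```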

R session: MSRP

The maximum Sharpe ratio portfolio (MSRP) is the nonconvex problem \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\mu}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1, \quad \mathbf{w}\geq\mathbf{0} \end{array}\] that can be rewritten in convex form as \[ \begin{array}{ll} \underset{\tilde{\mathbf{w}}}{\textsf{minimize}} & \tilde{\mathbf{w}}^T\mathbf{\Sigma}\tilde{\mathbf{w}}\\ {\textsf{subject to}} & \tilde{\mathbf{w}}^T\boldsymbol{\mu} = 1\\ & \tilde{\mathbf{w}}\ge\mathbf{0} \end{array} \] and then \(\mathbf{w} = \tilde{\mathbf{w}}/(\mathbf{1}^T\tilde{\mathbf{w}})\).

This is a quadratic problem (QP) and we can conveniently use CVXR (although one is advised to use a specific QP solver like quadprog for speed and stability):

# create function for MSRP
MSRP <- function(mu, Sigma) {
  w_ <- Variable(nrow(Sigma))
  prob <- Problem(Minimize(quad_form(w_, Sigma)),
                  constraints = list(w_ >= 0, t(mu) %*% w_ == 1))
  result <- CVXR::solve(prob)
  w <- as.vector(result$getValue(w_)/sum(result$getValue(w_)))
  names(w) <- colnames(Sigma)
  return(w)
}

# this function can now be used as
w_MSRP <- MSRP(mu, Sigma)

# comparison among solutions
round(cbind(w_nonlinear_solver, w_bisection, w_Dinkelbach, w_MSRP), digits = 3)

R>>      w_nonlinear_solver w_bisection w_Dinkelbach w_MSRP
R>> AAPL              0.000       0.183        0.183  0.183
R>> AMD               0.161       0.000        0.000  0.000
R>> ADI               0.000       0.002        0.000  0.000
R>> ABBV              0.000       0.257        0.258  0.258
R>> AEZS              0.839       0.000        0.000  0.000
R>> A                 0.000       0.000        0.000  0.000
R>> APD               0.000       0.482        0.483  0.483
R>> AA                0.000       0.000        0.000  0.000
R>> CF                0.000       0.076        0.076  0.076

# Sharpe ratio of different solutions
c("nonlinear_solver" = fn_SR(w_nonlinear_solver), 
  "bisection"        = fn_SR(w_bisection),
  "Dinkelbach"       = fn_SR(w_Dinkelbach),
  "Schaible"         = fn_SR(w_MSRP))

R>> nonlinear_solver        bisection       Dinkelbach         Schaible 
R>>      -0.07987582       0.07006683       0.07006775       0.07006775

As expected, the bisection, Dinkelbach, and Schaible methods all reach (numerically) the same optimal Sharpe ratio, unlike the general-purpose nonlinear solver, which tackles the nonconvex formulation directly and returns a poor solution.
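The advice about using a dedicated QP solver can be sketched with the quadprog package as follows. This is a minimal sketch rather than the course's implementation: the function name MSRP_quadprog and the toy inputs mu_toy and Sigma_toy are hypothetical, the risk-free rate is taken as zero, and at least one asset is assumed to have a positive expected return so that the problem is feasible.

```r
library(quadprog)

# toy inputs (hypothetical; replace with the estimated mu and Sigma)
mu_toy <- c(0.05, 0.08, 0.03)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)

MSRP_quadprog <- function(mu, Sigma) {
  N <- length(mu)
  # solve.QP minimizes (1/2) b' D b - d' b  subject to  A' b >= b0,
  # where the first meq constraints are treated as equalities
  Dmat <- 2 * Sigma
  dvec <- rep(0, N)
  Amat <- cbind(mu, diag(N))   # first column: mu' w_ = 1; remaining columns: w_ >= 0
  bvec <- c(1, rep(0, N))
  w_ <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)$solution
  w_ / sum(w_)                 # rescale to satisfy the budget constraint
}

w_toy <- MSRP_quadprog(mu_toy, Sigma_toy)
round(w_toy, 3)
```

Because the solver works on the convex reformulation, the result does not depend on a starting point, in contrast to the general-purpose nonlinear solver.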

Sharpe ratio portfolio in the efficient frontier

Risk-Based Portfolios (GMVP, IVP, RPP, MDP, MDCP)

Risk-based portfolios

Risk-based portfolios try to bypass the high sensitivity of Markowitz’s mean-variance portfolio to estimation errors in the expected returns by not using the expected returns at all. They are based only on the covariance matrix (Ardia et al. 2017).

We will explore the following risk-based portfolios:
- global minimum variance portfolio (GMVP)
- inverse volatility portfolio (IVP)
- risk parity portfolio (RPP) or equal risk portfolio (ERP)
- most diversified portfolio (MDP)
- maximum decorrelation portfolio (MDCP).

Global minimum variance portfolio (GMVP)

As previously seen, the global minimum variance portfolio (GMVP) can be seen as a particular case of Markowitz’s mean-variance portfolio when the expected return is totally ignored: \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^T\mathbf{w}=1. \end{array}\]
It is a simple convex QP with solution \[\mathbf{w}_{\sf GMVP}=\frac{1}{\mathbf{1}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{1}}\boldsymbol{\Sigma}^{-1}\mathbf{1}.\]
It is widely used in academic papers for simplicity of evaluation and comparison of different estimators of the covariance matrix \(\boldsymbol{\Sigma}\) (while ignoring the estimation of \(\boldsymbol{\mu}\)).
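The closed-form solution can be evaluated in a couple of lines. The following is a minimal sketch on a hypothetical toy covariance matrix; the name GMVP_closed_form is chosen here for illustration. Note that the closed form only enforces the budget constraint, so it may return negative weights when short selling is not ruled out.

```r
# toy covariance matrix (hypothetical)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)

GMVP_closed_form <- function(Sigma) {
  ones <- rep(1, nrow(Sigma))
  w <- solve(Sigma, ones)   # Sigma^{-1} 1 without forming the inverse explicitly
  w / sum(w)                # sum(w) equals 1' Sigma^{-1} 1
}

w_gmvp <- GMVP_closed_form(Sigma_toy)
round(w_gmvp, 3)
```

Using solve(Sigma, ones) instead of solve(Sigma) %*% ones avoids computing the full inverse, which is usually preferable numerically.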

Inverse volatility portfolio (IVP)

The aim of the inverse volatility portfolio (IVP) is to control the portfolio risk (the risk parity portfolio being a refined version (Qian 2005)).
The IVP is defined as \[\mathbf{w} = \frac{\boldsymbol{\sigma}^{-1}}{\mathbf{1}^T\boldsymbol{\sigma}^{-1}}\] where \(\boldsymbol{\sigma}^2 = \mathsf{Diag}(\boldsymbol{\Sigma})\).
Lower weights are given to high volatility assets and higher weights to low volatility assets.
IVP is also called “equal volatility” portfolio since the weighted constituent assets have equal volatility: \[{\sf sd}(w_ir_i) = w_i\sigma_i = \frac{1}{\mathbf{1}^T\boldsymbol{\sigma}^{-1}},\] which is the same for every asset \(i\).
Note that the GMVP when the covariance matrix is diagonal leads to an inverse-variance solution: \[\mathbf{w} = \frac{\boldsymbol{\sigma}^{-2}}{\mathbf{1}^T\boldsymbol{\sigma}^{-2}}.\]

R session: IVP

The inverse volatility portfolio (IVP) has the simple closed-form solution: \[\mathbf{w} = \frac{\boldsymbol{\sigma}^{-1}}{\mathbf{1}^T\boldsymbol{\sigma}^{-1}}.\]

Its implementation in R is trivial:

# create function for IVP
IVP <- function(Sigma) {
  sigma <- sqrt(diag(Sigma))
  w <- 1/sigma
  w <- w/sum(w)
  return(w)
}

# this function can now be used as
w_IVP <- IVP(Sigma)
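As a quick sanity check of the two IVP properties stated earlier (equal weighted volatilities, and the inverse-variance form of the GMVP for a diagonal covariance matrix), one can run the following sketch on a hypothetical toy covariance matrix; all variable names are illustrative.

```r
# toy covariance matrix (hypothetical)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)
sigma_toy <- sqrt(diag(Sigma_toy))

# IVP weights
w_ivp <- (1 / sigma_toy) / sum(1 / sigma_toy)

# volatility of each weighted constituent w_i * r_i: a common value for all assets
vol_weighted <- w_ivp * sigma_toy

# GMVP for the diagonal part of the covariance matrix vs inverse-variance weights
Sigma_diag <- diag(diag(Sigma_toy))
w_gmvp_diag <- solve(Sigma_diag, rep(1, 3))
w_gmvp_diag <- w_gmvp_diag / sum(w_gmvp_diag)
w_inv_var <- (1 / sigma_toy^2) / sum(1 / sigma_toy^2)

rbind(vol_weighted, w_gmvp_diag, w_inv_var)
```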

Risk parity portfolio (RPP)

The risk parity portfolio (RPP) or equal risk portfolio (ERP) aims at equalizing the risk contribution from the invested assets in the global portfolio risk (Qian 2005).
It is a sounder formulation than the inverse volatility portfolio (IVP), which ignores the asset correlations, because it properly takes into account the whole covariance matrix.
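To make the notion of risk contribution concrete, the sketch below uses the standard definition (not spelled out on these slides) in which asset \(i\) contributes \(w_i(\boldsymbol{\Sigma}\mathbf{w})_i/\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}\) to the portfolio volatility. The function name risk_contributions and the toy data are hypothetical. It illustrates that the IVP equalizes the contributions only when the correlations are zero.

```r
# toy covariance matrix (hypothetical)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)

# relative risk contributions (they sum to 1)
risk_contributions <- function(w, Sigma) {
  Sigma_w <- as.vector(Sigma %*% w)
  w * Sigma_w / sum(w * Sigma_w)
}

# IVP weights
sigma_toy <- sqrt(diag(Sigma_toy))
w_ivp <- (1 / sigma_toy) / sum(1 / sigma_toy)

rc_full <- risk_contributions(w_ivp, Sigma_toy)              # with correlations
rc_diag <- risk_contributions(w_ivp, diag(diag(Sigma_toy)))  # correlations set to zero
round(rbind(rc_full, rc_diag), 4)
```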
From “dollar” to risk diversification:

Most diversified portfolio (MDP)

In (Choueifaty and Coignard 2008), it was postulated that markets are risk-efficient, so that investments will produce returns in proportion to their total risk (measured by volatility).
The diversification ratio (DR) was defined analogously to the Sharpe ratio (SR) but replacing the weighted return with the weighted volatility: \[\mathsf{DR} = \frac{\mathbf{w}^{T}\boldsymbol{\sigma}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}.\]
For long-only portfolios, it can be shown that \(\mathsf{DR}\ge1\). For a single stock, \(\mathsf{DR}=1\).
The most diversified portfolio (MDP) is obtained as the maximization of DR (akin to the maximization of the Sharpe ratio): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\sigma}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1. \end{array}\]
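The diversification ratio is straightforward to evaluate for any given portfolio. The following is a minimal sketch with a hypothetical function name and toy covariance matrix; it also checks numerically that a single-stock portfolio has \(\mathsf{DR}=1\).

```r
# toy covariance matrix (hypothetical)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)

# weighted average volatility divided by portfolio volatility
diversification_ratio <- function(w, Sigma) {
  sigma <- sqrt(diag(Sigma))
  sum(w * sigma) / sqrt(as.numeric(t(w) %*% Sigma %*% w))
}

DR_single <- diversification_ratio(c(1, 0, 0), Sigma_toy)   # single stock
DR_ewp <- diversification_ratio(rep(1/3, 3), Sigma_toy)     # equally weighted portfolio
c(single = DR_single, EWP = DR_ewp)
```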

Most diversified portfolio (MDP)

The MDP has some interesting properties:
- the correlation of some portfolio \(\mathbf{w}\) with the MDP \(\mathbf{w}_{\sf MDP}\) is proportional to the DR of the portfolio: \[\rho = \frac{\mathsf{DR}(\mathbf{w})}{\mathsf{DR}(\mathbf{w}_{\sf MDP})},\]
- as a consequence, all the assets in the MDP have the same positive correlation to the MDP,
- also, any stock not held by the MDP is more correlated to the MDP than any of the stocks that belong to it (this illustrates that all assets in the universe considered are effectively represented in the MDP, even if the portfolio does not physically hold them);
- if all the stocks in the universe have the same volatility, then the MDP is equivalent to the GMVP;
- the squared \(\mathsf{DR}\) can be interpreted as the effective number of independent risk factors in the portfolio (Choueifaty et al. 2013),
- as a consequence, the MDP has a DR equal to the square root of the effective number of independent risk factors available in the entire market (which is typically larger than that of the market index).
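Two of these properties can be checked numerically in the budget-only case, where the MDP has the closed form \(\mathbf{w}\propto\boldsymbol{\Sigma}^{-1}\boldsymbol{\sigma}\) (by analogy with the unconstrained MSRP). The sketch below is illustrative only: all names and toy data are hypothetical, and with a long-only constraint the properties hold in the modified form described above.

```r
# toy covariance matrix (hypothetical)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)

# budget-only closed forms
mdp_unconstrained <- function(Sigma) {
  w <- solve(Sigma, sqrt(diag(Sigma)))
  w / sum(w)
}
gmvp_unconstrained <- function(Sigma) {
  w <- solve(Sigma, rep(1, nrow(Sigma)))
  w / sum(w)
}
DR <- function(w, Sigma)
  sum(w * sqrt(diag(Sigma))) / sqrt(as.numeric(t(w) %*% Sigma %*% w))
port_cor <- function(w1, w2, Sigma)
  as.numeric(t(w1) %*% Sigma %*% w2) /
    sqrt(as.numeric(t(w1) %*% Sigma %*% w1) * as.numeric(t(w2) %*% Sigma %*% w2))

# property 1: correlation with the MDP equals the ratio of DRs
w_any <- c(0.5, 0.3, 0.2)
w_mdp <- mdp_unconstrained(Sigma_toy)
rho <- port_cor(w_any, w_mdp, Sigma_toy)
ratio <- DR(w_any, Sigma_toy) / DR(w_mdp, Sigma_toy)

# property 2: with equal volatilities, the MDP coincides with the GMVP
Sigma_eq <- 0.2^2 * cov2cor(Sigma_toy)
w_mdp_eq <- mdp_unconstrained(Sigma_eq)
w_gmvp_eq <- gmvp_unconstrained(Sigma_eq)
```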

R session: MDP

The most diversified portfolio (MDP) is formulated exactly as the maximum Sharpe ratio portfolio (MSRP) but using \(\boldsymbol{\sigma}\) in lieu of \(\boldsymbol{\mu}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{maximize}} & \dfrac{\mathbf{w}^{T}\boldsymbol{\sigma}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1\\ & \mathbf{w}\ge\mathbf{0}. \end{array}\]

Therefore, we can use the same R function created to solve the MSRP:

w_MDP <- MSRP(mu = sqrt(diag(Sigma)), Sigma)

Maximum decorrelation portfolio (MDCP)

The maximum decorrelation portfolio (MDCP) (Christoffersen et al. 2012) is closely related to GMVP and MDP, but applies to the case where an investor believes all assets have similar returns and volatility, but heterogeneous correlations.
The MDCP is formulated as \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\mathbf{C}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1 \end{array}\] where \(\mathbf{C} \triangleq \mathsf{Diag}(\boldsymbol{\Sigma})^{-1/2}\boldsymbol{\Sigma}\mathsf{Diag}(\boldsymbol{\Sigma})^{-1/2}\) is the correlation matrix.
Interestingly, when the weights derived from the MDCP are divided by their respective volatilities and re-standardized so that they sum to 1, we retrieve the MDP weights.
The MDCP happens to
- maximize the DR when all assets have equal volatility and
- maximize the SR when all assets have equal risks and returns.
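The relation between the MDCP and MDP weights can be verified numerically in the budget-only case, where both have closed forms (\(\mathbf{C}^{-1}\mathbf{1}\) and \(\boldsymbol{\Sigma}^{-1}\boldsymbol{\sigma}\), each normalized to sum to 1). This is a minimal sketch on hypothetical toy data, with illustrative variable names.

```r
# toy covariance matrix (hypothetical)
Sigma_toy <- matrix(c(0.040, 0.006, 0.002,
                      0.006, 0.090, 0.004,
                      0.002, 0.004, 0.010), nrow = 3, byrow = TRUE)
sigma_toy <- sqrt(diag(Sigma_toy))
C_toy <- cov2cor(Sigma_toy)   # correlation matrix

# budget-only MDCP: GMVP closed form applied to the correlation matrix
w_mdcp <- solve(C_toy, rep(1, 3))
w_mdcp <- w_mdcp / sum(w_mdcp)

# divide by the volatilities and renormalize
w_rescaled <- w_mdcp / sigma_toy
w_rescaled <- w_rescaled / sum(w_rescaled)

# budget-only MDP for comparison
w_mdp <- solve(Sigma_toy, sigma_toy)
w_mdp <- w_mdp / sum(w_mdp)

round(rbind(w_mdcp, w_rescaled, w_mdp), 4)
```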

R session: MDCP

The maximum decorrelation portfolio (MDCP) is formulated exactly as the GMVP but using \(\mathbf{C}\) in lieu of \(\boldsymbol{\Sigma}\): \[\begin{array}{ll} \underset{\mathbf{w}}{\textsf{minimize}} & \mathbf{w}^{T}\mathbf{C}\mathbf{w}\\ \textsf{subject to} & \mathbf{1}^{T}\mathbf{w}=1 \end{array}\] where \(\mathbf{C} \triangleq \mathsf{Diag}(\boldsymbol{\Sigma})^{-1/2}\boldsymbol{\Sigma}\mathsf{Diag}(\boldsymbol{\Sigma})^{-1/2}\) is the correlation matrix.

Therefore, we can use the same R function created to solve the GMVP:

# create function for MDCP based on GMVP()
MDCP <- function(Sigma) {
  C <- cov2cor(Sigma)  # correlation matrix (keeps the row/column names of Sigma)
  return(GMVP(Sigma = C))
}

# this function can now be used as
w_MDCP <- MDCP(Sigma)

Comparison of Portfolios

Comparison of portfolios: dollar allocation

Comparison of portfolios: cumulative P&L

Comparison of portfolios: drawdown

Comparison of portfolios over multiple datasets

We have backtested a number of different portfolio designs.
However, we have only used one set of market data and we should not draw any conclusion from that.
Also, we have used a training data window to design the portfolio and evaluated it in a subsequent test data window.
A serious backtesting requires:
- multiple datasets
- evaluation on a rolling-window basis.
In R, the package portfolioBacktest makes this very simple.

Comparison of portfolios over multiple datasets

The following performance table is obtained with the R package portfolioBacktest:

Comparison of portfolios over multiple datasets

The following barplot is obtained with the R package portfolioBacktest:

Comparison of portfolios over multiple datasets

The following boxplot is obtained with the R package portfolioBacktest:

R session: Comparison of MVP, GMVP, IVP, MSRP, MDP, MDCP, as well as the previous heuristic portfolios (EWP, quintile portfolio, and GMRP)

We are now ready to consider several portfolio designs and compare their performance (the underlying constraint is simply \(\mathbf{w}\ge\mathbf{0}\) and \(\mathbf{1}^T\mathbf{w}=1\)).

Recall the three heuristic portfolios we want to compare:

w_heuristic <- cbind("EWP"    = w_EWP, 
                     "QuintP" = w_QuintP[, "QuintP (mu/sigma)"], 
                     "GMRP"   = w_GMRP)

Now we stack the Markowitz-based portfolios we want to compare:

w_Markowitz <- cbind("MVP"  = w_MVP, 
                     "GMVP" = w_GMVP, 
                     "IVP"  = w_IVP, 
                     "MSRP" = w_MSRP, 
                     "MDP"  = w_MDP, 
                     "MDCP" = w_MDCP)

We can now compare the allocations of the portfolios:

w_all <- cbind(w_heuristic, w_Markowitz)
barplot(t(w_all), col = rainbow10equal[1:9], legend = colnames(w_all), beside = TRUE,
        main = "Portfolio allocation", xlab = "stocks", ylab = "dollars")

The performance is (in-sample vs out-of-sample):

# compute returns of all portfolios
ret_all <- xts(X_lin %*% w_all, index(X_lin))
ret_all_trn <- ret_all[1:T_trn, ]
ret_all_tst <- ret_all[-c(1:T_trn), ]

# performance
t(table.AnnualizedReturns(ret_all_trn))

R>>        Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> EWP               0.0309             0.2057                    0.1503
R>> QuintP            0.2268             0.1899                    1.1944
R>> GMRP              0.2167             0.2567                    0.8442
R>> MVP               0.2220             0.1704                    1.3032
R>> GMVP              0.1692             0.1569                    1.0779
R>> IVP               0.1272             0.1639                    0.7761
R>> MSRP              0.2197             0.1658                    1.3249
R>> MDP               0.0750             0.1878                    0.3993
R>> MDCP             -0.1152             0.3294                   -0.3498

t(table.AnnualizedReturns(ret_all_tst))

R>>        Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> EWP               0.3393             0.2764                    1.2273
R>> QuintP            0.1644             0.1978                    0.8314
R>> GMRP              0.1532             0.3080                    0.4973
R>> MVP               0.1335             0.1697                    0.7863
R>> GMVP              0.1411             0.1774                    0.7953
R>> IVP               0.2484             0.2102                    1.1817
R>> MSRP              0.1076             0.1696                    0.6341
R>> MDP               0.3122             0.2564                    1.2175
R>> MDCP              0.4037             0.4254                    0.9490

We can observe that:

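- as expected, the MSRP achieves the maximum ex ante (in-sample) Sharpe ratio (1.3249); however, this is not maintained ex post (out-of-sample), where its Sharpe ratio drops to 0.6341;
- as expected, the EWP achieves the best ex post performance (Sharpe ratio of 1.2273);
- the MVP performs poorly out-of-sample (Sharpe ratio of 0.7863, down from 1.3032 in-sample);
- other top performers include the IVP and MDP.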

Let’s plot the wealth evolution (cumulative PnL) over the whole time:

{ chart.CumReturns(ret_all, main = "Cumulative return of portfolios", 
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)
  addEventLines(xts("training", index(X_lin[T_trn])), srt=90, pos=2, lwd = 2, col = "darkblue") }

and let’s zoom in on the out-of-sample period:

chart.CumReturns(ret_all_tst, main = "Cumulative return of portfolios (out-of-sample)",
                   wealth.index = TRUE, legend.loc = "topleft", colorset = rich10equal)

Let’s look at the drawdown:

chart.Drawdown(ret_all_tst, main = "Drawdown of portfolios (out-of-sample)", 
               legend.loc = "bottomleft", colorset = rich10equal)

Clearly the MDCP and the GMRP have the worst drawdown and are unacceptable.

Finally, we can plot the (in-sample) return-risk scatter plot along with the efficient frontier:

# first, compute the efficient frontier
w_frontier_trn <- NULL
lmd_sweep <- exp(seq(-6, 6, by = 0.5))
for (lmd in lmd_sweep)
  w_frontier_trn <- cbind(w_frontier_trn, MVP(mu, Sigma, lmd))
ret_frontier_trn <- xts(X_lin_trn %*% w_frontier_trn, index(X_lin_trn))
mu_frontier_trn <- table.AnnualizedReturns(ret_frontier_trn)[1, ]
sd_frontier_trn <- table.AnnualizedReturns(ret_frontier_trn)[2, ]

# plot in-sample sd-mu scatter plot
maxSR <- table.AnnualizedReturns(ret_all_trn[, "MSRP"])[3, ]
chart.RiskReturnScatter(ret_all_trn,
                        main = "Annualized Return and Risk (in-sample)",
                        symbolset = c(rep(21, 3), rep(22, 6)), 
                        colorset = c(rep("darkred", 3), rep("darkblue", 6)),
                        bg = "black",
                        add.sharpe = maxSR)
lines(sd_frontier_trn, mu_frontier_trn)

Observe that this nice return-risk scatter plot deforms in an unpredictable way when it is evaluated on the out-of-sample set:

# compute the efficient frontier again but based on the test data
mu_tst <- colMeans(X_log_tst)
Sigma_tst <- cov(X_log_tst)

w_frontier_tst <- NULL
lmd_sweep <- exp(seq(-6, 6, by = 0.5))
for (lmd in lmd_sweep)
  w_frontier_tst <- cbind(w_frontier_tst, MVP(mu_tst, Sigma_tst, lmd))
ret_frontier_tst <- xts(X_lin_tst %*% w_frontier_tst, index(X_lin_tst))
mu_frontier_tst <- table.AnnualizedReturns(ret_frontier_tst)[1, ]
sd_frontier_tst <- table.AnnualizedReturns(ret_frontier_tst)[2, ]

# plot out-of-sample sd-mu scatter plot
chart.RiskReturnScatter(ret_all_tst,
                        main = "Annualized Return and Risk (out-of-sample)",
                        symbolset = c(rep(21, 3), rep(22, 6)),  
                        colorset = c(rep("darkred", 3), rep("darkblue", 6)),
                        bg = "black",
                        add.sharpe = NA,
                        ylim = c(0, 1))
lines(sd_frontier_tst, mu_frontier_tst)

Again this ex post return-risk scatter plot shows that the winners are EWP, IVP, and MDP.

R session: Slower rebalancing frequencies

In the previous performance analysis of the different portfolios, we were implicitly assuming daily rebalancing, which is why computing the portfolio return was as easy as ret_portf <- X_lin %*% w. Now we will consider lower rebalancing frequencies (weekly, monthly, quarterly, and yearly) for a given portfolio (in particular, we will just use the EWP portfolio w_EWP). Observe that on the rebalancing day the portfolio held is the designed one, w_EWP, but on the subsequent days without rebalancing it slowly deviates as the stock prices change (recall that the portfolio denotes the dollar or capital allocation).

The package PerformanceAnalytics has the convenient function Return.portfolio() that allows us to compute the portfolio return with different rebalancing schemes. Let’s start with a sanity check for the daily rebalancing:

# choose portfolio for the comparison
w <- w_EWP

# recall the computation for the daily rebalancing (with daily returns)
ret_daily_rebal <- X_lin %*% w

# we can alternatively use this function:
ret_daily_rebal_ <- Return.portfolio(X_lin, weights = w, rebalance_on = "days")
norm(ret_daily_rebal - ret_daily_rebal_)  # sanity check

R>> [1] 8.785889e-14

Now, let’s use a yearly rebalancing frequency to observe the slow deviation as the prices change:

# let's observe how the portfolio slowly deviates from its original design
tmp <- Return.portfolio(X_lin, weights = w, rebalance_on = "years", verbose = TRUE)
round(head(tmp$BOP.Weight, 15), digits = 4)

R>>              AAPL    AMD    ADI   ABBV   AEZS      A    APD     AA     CF
R>> 2013-01-03 0.1111 0.1111 0.1111 0.1111 0.1111 0.1111 0.1111 0.1111 0.1111
R>> 2013-01-04 0.1103 0.1099 0.1099 0.1107 0.1121 0.1121 0.1113 0.1127 0.1111
R>> 2013-01-07 0.1063 0.1134 0.1071 0.1085 0.1125 0.1134 0.1119 0.1141 0.1127
R>> 2013-01-08 0.1056 0.1168 0.1073 0.1086 0.1133 0.1125 0.1117 0.1120 0.1122
R>> 2013-01-09 0.1067 0.1177 0.1070 0.1070 0.1124 0.1124 0.1127 0.1129 0.1114
R>> 2013-01-10 0.1043 0.1152 0.1060 0.1069 0.1130 0.1147 0.1135 0.1119 0.1145
R>> 2013-01-11 0.1053 0.1144 0.1069 0.1069 0.1122 0.1152 0.1130 0.1102 0.1159
R>> 2013-01-14 0.1037 0.1155 0.1051 0.1068 0.1186 0.1135 0.1127 0.1089 0.1152
R>> 2013-01-15 0.1008 0.1151 0.1059 0.1083 0.1164 0.1147 0.1130 0.1094 0.1164
R>> 2013-01-16 0.0978 0.1170 0.1055 0.1101 0.1174 0.1141 0.1128 0.1096 0.1157
R>> 2013-01-17 0.1015 0.1183 0.1053 0.1127 0.1127 0.1132 0.1119 0.1087 0.1157
R>> 2013-01-18 0.0998 0.1180 0.1067 0.1144 0.1120 0.1142 0.1122 0.1084 0.1143
R>> 2013-01-22 0.1000 0.1068 0.1075 0.1181 0.1124 0.1158 0.1134 0.1099 0.1161
R>> 2013-01-23 0.1004 0.1058 0.1071 0.1146 0.1122 0.1164 0.1137 0.1118 0.1179
R>> 2013-01-24 0.1007 0.1161 0.1060 0.1172 0.1097 0.1138 0.1108 0.1095 0.1162

chart.StackedBar(tmp$BOP.Weight, main = "Evolution of uniform portfolio with yearly rebalancing",
                 ylab = "w", space = 0, border = NA)
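The drift of the weights between rebalancing dates follows directly from the fact that the weights denote capital allocation. The following sketch shows the one-period update under a buy-and-hold assumption with no transaction costs; the function name drift_weights and the toy returns are hypothetical, and this is an illustration of the mechanism rather than the internals of Return.portfolio().

```r
# one-period drift of the capital allocation without rebalancing
drift_weights <- function(w, R) {
  # R: vector of simple (linear) returns of the assets over the period
  w_new <- w * (1 + R)   # capital in each asset after the period
  w_new / sum(w_new)     # sum(w_new) equals 1 + portfolio return
}

w_start <- rep(1/3, 3)          # uniform portfolio on the rebalancing day
R_day <- c(0.03, -0.01, 0.00)   # toy one-day returns (hypothetical)
w_next <- drift_weights(w_start, R_day)
round(w_next, 4)
```

Assets that outperform the portfolio gain weight and the others lose weight, which is the behavior visible in the table of beginning-of-period weights.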

We can now compare the different rebalancing frequencies:

# now different rebalancing frequencies
ret_weekly_rebal <- Return.portfolio(X_lin, weights = w, rebalance_on = "weeks")
ret_monthly_rebal <- Return.portfolio(X_lin, weights = w, rebalance_on = "months")
ret_quarterly_rebal <- Return.portfolio(X_lin, weights = w, rebalance_on = "quarters")
ret_yearly_rebal <- Return.portfolio(X_lin, weights = w, rebalance_on = "years")
ret_allfreqs <- cbind(ret_daily_rebal, ret_weekly_rebal, ret_monthly_rebal, ret_quarterly_rebal, ret_yearly_rebal)
colnames(ret_allfreqs) <- c("daily", "weekly", "monthly", "quarterly", "yearly")
round(head(ret_allfreqs, 25), digits = 4)

R>>              daily  weekly monthly quarterly  yearly
R>> 2013-01-03 -0.0050 -0.0050 -0.0050   -0.0050 -0.0050
R>> 2013-01-04  0.0078  0.0079  0.0079    0.0079  0.0079
R>> 2013-01-07  0.0010  0.0010  0.0010    0.0010  0.0010
R>> 2013-01-08 -0.0073 -0.0073 -0.0073   -0.0073 -0.0073
R>> 2013-01-09  0.0064  0.0062  0.0064    0.0064  0.0064
R>> 2013-01-10  0.0032  0.0033  0.0031    0.0031  0.0031
R>> 2013-01-11  0.0087  0.0087  0.0089    0.0089  0.0089
R>> 2013-01-14 -0.0075 -0.0075 -0.0075   -0.0075 -0.0075
R>> 2013-01-15 -0.0019 -0.0018 -0.0016   -0.0016 -0.0016
R>> 2013-01-16  0.0042  0.0041  0.0035    0.0035  0.0035
R>> 2013-01-17  0.0099  0.0101  0.0100    0.0100  0.0100
R>> 2013-01-18 -0.0070 -0.0070 -0.0075   -0.0075 -0.0075
R>> 2013-01-22  0.0055  0.0055  0.0054    0.0054  0.0054
R>> 2013-01-23  0.0160  0.0158  0.0152    0.0152  0.0152
R>> 2013-01-24 -0.0119 -0.0117 -0.0104   -0.0104 -0.0104
R>> 2013-01-25  0.0055  0.0061  0.0064    0.0064  0.0064
R>> 2013-01-28  0.0034  0.0034  0.0025    0.0025  0.0025
R>> 2013-01-29  0.0198  0.0210  0.0202    0.0202  0.0202
R>> 2013-01-30 -0.0140 -0.0152 -0.0154   -0.0154 -0.0154
R>> 2013-01-31 -0.0106 -0.0114 -0.0116   -0.0116 -0.0116
R>> 2013-02-01  0.0101  0.0101  0.0101    0.0103  0.0103
R>> 2013-02-04 -0.0196 -0.0196 -0.0195   -0.0196 -0.0196
R>> 2013-02-05  0.0102  0.0102  0.0101    0.0092  0.0092
R>> 2013-02-06  0.0032  0.0030  0.0030    0.0033  0.0033
R>> 2013-02-07  0.0071  0.0069  0.0068    0.0071  0.0071

# performance
t(table.AnnualizedReturns(ret_allfreqs))

R>>           Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> daily                0.1151             0.2292                    0.5021
R>> weekly               0.1111             0.2295                    0.4838
R>> monthly              0.0971             0.2265                    0.4287
R>> quarterly            0.0789             0.2063                    0.3825
R>> yearly               0.1235             0.2019                    0.6118

chart.CumReturns(ret_allfreqs, main = "Daily wealth for different rebalancing frequencies",
                 wealth.index = TRUE, legend.loc = "topleft", colorset = rich6equal)

R session: Rolling window portfolios

Until now, we have considered static portfolios in the sense that they are first designed based on a training set and then they remain fixed and are used in the test set. In a more realistic setting, however, one wants to implement this procedure on a rolling-window basis. That is, with some frequency the portfolio is reoptimized (since the dynamics of the asset returns may change with time) and rebalanced.
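The bookkeeping behind a rolling window can be sketched with plain indices. This is a simplified illustration: the function name rolling_windows and its arguments are hypothetical, and it assumes a fixed step measured in trading days rather than calendar month ends.

```r
# indices of the estimation window used at each reoptimization date
rolling_windows <- function(T_total, T_first, lookback, step) {
  rebal <- seq(from = T_first, to = T_total, by = step)   # reoptimization dates
  lapply(rebal, function(t) (t - lookback + 1):t)         # last `lookback` samples up to t
}

# toy setting: 500 observations, 120-day lookback, reoptimize every 20 days
windows <- rolling_windows(T_total = 500, T_first = 120, lookback = 120, step = 20)
length(windows)        # number of reoptimizations
range(windows[[2]])    # the second window is the first one shifted by `step`
```

Each window only uses data up to the reoptimization date, so the resulting portfolio never looks ahead.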

Recall the procedure for the static portfolio where one estimates \(\boldsymbol\mu\) and \(\boldsymbol\Sigma\) from the training set, designs some portfolio (say, Markowitz mean-variance), and then applies it to the test set with some rebalancing frequency (say, monthly):

# recall the procedure for the static portfolio (Markowitz, for example):
T_trn <- round(0.5*T)
X_log_trn <- X_log[1:T_trn, ]
X_log_tst <- X_log[(T_trn+1):T, ]
X_lin_trn <- X_lin[1:T_trn, ]
X_lin_tst <- X_lin[(T_trn+1):T, ]
mu <- colMeans(X_log_trn)
Sigma <- cov(X_log_trn)
w_MVP_static <- MVP(mu, Sigma, lmd = 2)
ret_static <- Return.portfolio(X_lin_tst, weights = w_MVP_static, rebalance_on = "months")

Now we want to reoptimize the portfolio on a rolling-window basis. R provides some options:

- The package PerformanceAnalytics provides two functions to handle the two main rolling-window strategies: apply.rolling() is for a rolling window of fixed length, whereas apply.fromstart() is for an expanding rolling window. However, those only work independently on each column of the multivariate time series, which is not what we want.
- The package xts has the function period.apply() but it only works with nonoverlapping periods, which is not what we want. What we want is to choose a window of, say, 6 months (about 120 days) and shift it, say, every month.

We will therefore implement it ourselves with a simple loop:

# create empty portfolio matrix
w_MVP_rolling <- X_log
w_MVP_rolling[] <- NA
head(w_MVP_rolling)

R>>            AAPL AMD ADI ABBV AEZS  A APD AA CF
R>> 2013-01-03   NA  NA  NA   NA   NA NA  NA NA NA
R>> 2013-01-04   NA  NA  NA   NA   NA NA  NA NA NA
R>> 2013-01-07   NA  NA  NA   NA   NA NA  NA NA NA
R>> 2013-01-08   NA  NA  NA   NA   NA NA  NA NA NA
R>> 2013-01-09   NA  NA  NA   NA   NA NA  NA NA NA
R>> 2013-01-10   NA  NA  NA   NA   NA NA  NA NA NA

# find rebalancing indices
rebal_indices <- T_trn + endpoints(X_log_tst, on = "months")
rebal_indices

R>>  [1]  504  523  542  564  585  605  627  649  670  691  713  733  755  774  794  816  837  858  880
R>> [20]  900  923  944  965  986 1007

index(X_log)[rebal_indices]

R>>  [1] "2015-01-02" "2015-01-30" "2015-02-27" "2015-03-31" "2015-04-30" "2015-05-29" "2015-06-30"
R>>  [8] "2015-07-31" "2015-08-31" "2015-09-30" "2015-10-30" "2015-11-30" "2015-12-31" "2016-01-29"
R>> [15] "2016-02-29" "2016-03-31" "2016-04-29" "2016-05-31" "2016-06-30" "2016-07-29" "2016-08-31"
R>> [22] "2016-09-30" "2016-10-31" "2016-11-30" "2016-12-30"

# run the rolling window loop
lookback <- 10*20  # maximum value is: floor(T_trn/20)*20
for (i in 1:length(rebal_indices)) {
  # estimate moments
  X_ <- X_log[(rebal_indices[i] - lookback + 1):rebal_indices[i], ]
  mu <- colMeans(X_)
  Sigma <- cov(X_)
  # design portfolio
  w_MVP_rolling[rebal_indices[i], ] <- MVP(mu, Sigma, lmd = 2)
}
w_MVP_rolling <- na.omit(w_MVP_rolling)
w_MVP_rolling

R>>                     AAPL           AMD           ADI          ABBV          AEZS             A
R>> 2015-01-02  9.902672e-01 -8.809655e-23 -2.317270e-22 -6.915187e-23 -1.642895e-23 -2.385155e-22
R>> 2015-01-30  1.000000e+00  8.749120e-23  1.103227e-22  1.656960e-22  1.623278e-23  2.213092e-22
R>> 2015-02-27  1.000000e+00  1.800556e-22 -5.619622e-23 -5.652326e-23 -4.055263e-23 -6.928859e-25
R>> 2015-03-31  7.476162e-01 -1.287939e-22  9.706476e-24  6.628585e-23 -9.502182e-23  6.793997e-23
R>> 2015-04-30  6.529535e-01  2.172538e-22  2.422831e-23  1.514993e-02 -3.489780e-23  1.299749e-23
R>> 2015-05-29  2.315185e-01  1.751692e-22  2.774514e-01 -1.049608e-22 -4.531772e-24 -1.043353e-22
R>> 2015-06-30  7.501847e-23  2.611801e-22  3.356834e-01 -7.260174e-23  9.279138e-23  6.624630e-23
R>> 2015-07-31 -3.748943e-23 -1.276624e-22  5.745478e-01  2.863951e-01 -8.490323e-23  1.120628e-23
R>> 2015-08-31  6.461645e-24  1.779758e-23  4.926232e-01  1.574022e-23 -1.195098e-22  3.154916e-23
R>> 2015-09-30  5.013905e-01  8.527619e-23  3.395220e-01 -8.172900e-23  3.293372e-23 -2.683616e-23
R>> 2015-10-30  5.959105e-01  3.762220e-23  4.040895e-01  6.571435e-24 -1.218587e-22  1.635751e-23
R>> 2015-11-30  1.018939e-23  1.743025e-23  2.973865e-01  6.789015e-02 -2.143016e-23  6.347233e-01
R>> 2015-12-31  1.760066e-23  7.192453e-02  5.226009e-23  2.178686e-01 -8.130601e-23  7.102069e-01
R>> 2016-01-29 -9.138653e-23 -1.787395e-23  4.935877e-02  2.961526e-01 -7.252258e-23  4.289260e-01
R>> 2016-02-29  3.908607e-23  5.540416e-23 -1.998324e-23  2.066015e-24  2.901166e-23  2.192674e-01
R>> 2016-03-31 -5.427663e-23  1.267904e-01  5.250097e-23  2.927760e-23 -6.096683e-25  2.777073e-01
R>> 2016-04-29 -1.003870e-23  2.687192e-01  3.528642e-23  8.283462e-23 -1.705115e-23  4.798278e-23
R>> 2016-05-31  9.819346e-24  4.300033e-01  5.991553e-23  6.532455e-23 -8.046687e-25  5.699967e-01
R>> 2016-06-30  3.801515e-23  4.381132e-01 -3.979588e-23  2.418519e-23  5.917248e-25  5.618868e-01
R>> 2016-07-29 -2.157959e-23  4.987737e-01  9.284684e-23  4.131992e-02  8.675710e-23  4.599064e-01
R>> 2016-08-31 -3.294276e-23  5.430352e-01 -1.125308e-24 -3.310669e-23  2.047605e-23  4.569648e-01
R>> 2016-09-30  5.512836e-23  4.377919e-01 -5.842564e-23  1.597170e-01 -5.323684e-23  2.965723e-23
R>> 2016-10-31 -2.279900e-23  5.312671e-01  2.115085e-01 -5.208054e-23 -6.482255e-23  2.655544e-23
R>> 2016-11-30 -2.830227e-24  6.461871e-01  3.538129e-01  2.302591e-23  3.948049e-23  2.681669e-23
R>> 2016-12-30 -1.521177e-23  6.751519e-01  2.283323e-01  9.651584e-02  1.521067e-23 -2.849766e-23
R>>                      APD            AA            CF
R>> 2015-01-02 -1.522847e-22  9.732753e-03 -1.845559e-22
R>> 2015-01-30  2.719964e-23  1.090248e-22  8.252438e-23
R>> 2015-02-27 -4.215465e-23 -1.130809e-22 -1.340150e-22
R>> 2015-03-31  1.235549e-22  5.940091e-23  2.523838e-01
R>> 2015-04-30 -1.737530e-23  1.513392e-23  3.318965e-01
R>> 2015-05-29 -4.029524e-23  6.927618e-23  4.910302e-01
R>> 2015-06-30  7.872969e-23  6.565273e-23  6.643166e-01
R>> 2015-07-31 -3.034649e-23  9.156408e-23  1.390571e-01
R>> 2015-08-31  6.387070e-02  1.353959e-23  4.435061e-01
R>> 2015-09-30  1.020361e-01  1.118152e-22  5.705137e-02
R>> 2015-10-30 -8.698029e-24  1.729010e-23 -1.154839e-23
R>> 2015-11-30  7.152586e-23 -3.991975e-22 -1.841869e-22
R>> 2015-12-31  9.769882e-23  7.053615e-23  1.361473e-23
R>> 2016-01-29  2.255626e-01 -1.695643e-22  1.759559e-22
R>> 2016-02-29  7.807326e-01  7.858535e-23 -1.462194e-22
R>> 2016-03-31  5.955022e-01  9.282245e-23  2.050547e-22
R>> 2016-04-29  7.312808e-01  3.957575e-23  2.281804e-22
R>> 2016-05-31  1.285200e-23  4.976627e-23  2.281117e-22
R>> 2016-06-30 -1.839021e-23  3.174632e-25  2.118120e-22
R>> 2016-07-29  6.409784e-23  1.001065e-22 -1.769616e-22
R>> 2016-08-31 -1.908623e-23 -2.097934e-23  1.912610e-22
R>> 2016-09-30  4.024911e-01  3.115110e-23  2.038661e-22
R>> 2016-10-31  2.572244e-01 -5.575573e-23 -3.614350e-23
R>> 2016-11-30  2.415627e-23  6.268017e-23 -2.675248e-23
R>> 2016-12-30 -1.651222e-24 -1.445507e-23 -2.602768e-23

We can now plot the evolution over time of the portfolio:

# compute portfolio returns
tmp <- Return.portfolio(X_lin_tst, weights = w_MVP_rolling, verbose = TRUE)
ret_rolling <- tmp$returns
chart.StackedBar(tmp$BOP.Weight, main = "Evolution of rolling-window Markowitz's mean-variance portfolio",
                 ylab = "w", space = 0, border = NA)

We can now compare the static and rolling window versions:

# performance
ret_MVP <- cbind(ret_static, ret_rolling)
colnames(ret_MVP) <- c("Markowitz MVP - static", "Markowitz MVP - rolling window")

t(table.AnnualizedReturns(ret_MVP))

R>>                                Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%)
R>> Markowitz MVP - static                    0.0212             0.2127                    0.0998
R>> Markowitz MVP - rolling window            0.3581             0.2956                    1.2112

chart.CumReturns(ret_MVP, main = "Daily wealth for static vs rolling window",
                 wealth.index = TRUE, legend.loc = "topleft", colorset = rich6equal)

chart.Drawdown(ret_MVP, main = "Drawdown for static vs rolling window", 
               legend.loc = "bottomleft", colorset = rich6equal)

R session: Assessment of portfolios with the package portfolioBacktest

We have backtested a number of different portfolio designs.
However, we have only used one set of market data and we should not draw any conclusion from that.
Also, we have used a training data window to design the portfolio and evaluated it in a subsequent test data window.
A serious backtesting requires:
- multiple datasets
- evaluation on a rolling-window basis.
We will use the R package portfolioBacktest which makes this very simple.

First, let’s load the market data using the package portfolioBacktest:

library(portfolioBacktest)

# load the SP500 assets
data("SP500_symbols")
SP500_YAHOO <- stockDataDownload(stock_symbols = SP500_symbols, from = "2010-12-01", to = "2018-12-01")

# generate 100 random samples each containing 50 random assets over a random window of two years
N_datasets <- 100
mydataset <- stockDataResample(SP500_YAHOO, N = 50, T = 252*2, num_datasets = N_datasets)

Second, let’s define the portfolios that we want to backtest:

library(CVXR)

QuintP <- function(dataset) {
  prices <- dataset$adjusted
  N <- ncol(prices)
  X <- diff(log(prices))[-1]  # returns
  mu <- colMeans(X)
  Sigma <- cov(X)
  
  # portfolio design
  idx <- sort(mu/diag(Sigma), decreasing = TRUE, index.return = TRUE)$ix
  w <- rep(0, N)
  w[idx[1:round(N/5)]] <- 1/round(N/5)
  return(w)
}

IVP <- function(dataset) {
  prices <- dataset$adjusted
  N <- ncol(prices)
  X <- diff(log(prices))[-1]  # returns
  Sigma <- cov(X)

  # portfolio design
  sigma <- sqrt(diag(Sigma))
  w <- 1/sigma
  w <- w/sum(w)
  return(w)
}

GMVP <- function(dataset) {
  prices <- dataset$adjusted
  N <- ncol(prices)
  X <- diff(log(prices))[-1]  # returns
  Sigma <- cov(X)
  
  # portfolio design
  w <- Variable(N)
  prob <- Problem(Minimize(quad_form(w, Sigma)),
                  constraints = list(sum(w) == 1, w >= 0))
  result <- CVXR::solve(prob)
  return(as.vector(result$getValue(w)))
}

MVP <- function(dataset) {
  prices <- dataset$adjusted
  N <- ncol(prices)
  X <- diff(log(prices))[-1]  # returns
  mu <- colMeans(X)
  Sigma <- cov(X)
  
  # portfolio design
  lmd <- 0.5  # hyperparameter: trade-off between expected return and variance
  w <- Variable(N)
  prob <- Problem(Maximize(t(mu) %*% w - lmd*quad_form(w, Sigma)),
                  constraints = list(sum(w) == 1, w >= 0))
  result <- CVXR::solve(prob)
  return(as.vector(result$getValue(w)))
}

portfolio_list <- list("QuintP" = QuintP,
                       "IVP"    = IVP,
                       "GMVP"   = GMVP,
                       "MVP"    = MVP)

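As a quick sanity check on a solver-based design such as the GMVP function above, one can compare against the closed-form GMVP that holds when the no-shorting constraint is dropped, \(\mathbf{w} = \boldsymbol{\Sigma}^{-1}\mathbf{1}/(\mathbf{1}^T\boldsymbol{\Sigma}^{-1}\mathbf{1})\). This is a sketch with a toy covariance matrix; the helper name `gmvp_closed_form` is an assumption. Whenever the closed-form weights happen to be nonnegative, they also solve the long-only problem:

```r
# Closed-form GMVP without the no-shorting constraint:
#   w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
gmvp_closed_form <- function(Sigma) {
  w <- solve(Sigma, rep(1, nrow(Sigma)))
  w / sum(w)
}

Sigma_toy <- diag(c(0.04, 0.09, 0.16))  # toy covariance: three uncorrelated assets
w_toy <- gmvp_closed_form(Sigma_toy)
print(round(w_toy, 4))

# portfolio variance of the GMVP versus the equally weighted portfolio
w_ewp <- rep(1/3, 3)
print(c(GMVP = drop(t(w_toy) %*% Sigma_toy %*% w_toy),
        EWP  = drop(t(w_ewp) %*% Sigma_toy %*% w_ewp)))
```

For uncorrelated assets the weights are proportional to the inverse variances, which makes the result easy to verify by hand.
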
Then, let’s proceed with the backtesting based on 100 datasets randomly chosen from market data:

bt_all_port <- portfolioBacktest(portfolio_funs = portfolio_list,
                                 dataset = mydataset,
                                 benchmark = c("uniform", "index"),
                                 T_rolling_window = 252*2/3, optimize_every = 20, rebalance_every = 1, 
                                 show_progress_bar = FALSE,
                                 paral_datasets = 5)
res_summary_median <- backtestSummary(bt_all_port)

Now we are ready to compare the portfolios via tables and plots.

A leaderboard is a table with a comparison:

summaryTable(res_summary_median, type = "DT", order_col = 2, order_dir = "desc")

A barplot provides the same information as the leaderboard but in a visual way:

summaryBarPlot(res_summary_median, measures = c("Sharpe ratio", "max drawdown"))

A boxplot is probably the best way to properly compare the performance of different portfolios with a single performance measure:

backtestBoxPlot(bt_all_port, "Sharpe ratio")

We can confirm the good performance of the EWP (the "uniform" benchmark) and the poor performance of the Markowitz MVP.

Conclusions

Drawbacks of Markowitz’s portfolio

Markowitz’s portfolio has never been fully embraced by practitioners (Zhao et al. 2019), among other reasons because:

- variance is not a good measure of risk in practice, since it penalizes both the unwanted large losses and the desired large gains: the solution is to use alternative measures of risk, e.g., VaR and CVaR (McNeil et al. 2005),
- it is highly sensitive to parameter estimation errors (i.e., to the covariance matrix \(\boldsymbol{\Sigma}\) and especially to the mean vector \(\boldsymbol{\mu}\)): the solution is robust optimization (Fabozzi 2007) and improved parameter estimation (Ledoit and Wolf 2004),
- it only considers the risk of the portfolio as a whole and ignores risk diversification (i.e., it concentrates risk too much in a few assets, as was observed in the 2008 financial crisis): the solution is the risk parity portfolio (Qian 2005).
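
The first drawback can be illustrated with a small sketch. It uses synthetic returns and a hypothetical helper `var_cvar` that computes the empirical VaR and CVaR of the losses (defined here as negative returns). Two return series that are mirror images of each other have the same variance, yet very different tail losses:

```r
# Empirical VaR and CVaR of the loss distribution at level alpha.
# Hypothetical helper; losses are defined as negative returns.
var_cvar <- function(ret, alpha = 0.95) {
  loss <- -ret
  VaR  <- unname(quantile(loss, alpha))
  CVaR <- mean(loss[loss >= VaR])   # average loss in the worst (1 - alpha) tail
  c(VaR = VaR, CVaR = CVaR)
}

set.seed(7)
ret_left  <- 0.01 * (1 - rexp(1e4))  # synthetic returns with a long loss tail
ret_right <- -ret_left               # mirror image: long gain tail, same variance

print(c(sd_left = sd(ret_left), sd_right = sd(ret_right)))
print(rbind(left = var_cvar(ret_left), right = var_cvar(ret_right)))
```

Variance cannot distinguish the two series, whereas VaR and CVaR only look at the loss side of the distribution.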

Markowitz’s portfolio vs \(1/N\) portfolio

Markowitz’s mean-variance portfolio based on sample estimates of the moments is notorious for producing extreme weights that fluctuate substantially over time and for performing poorly out of sample.
The \(1/N\) portfolio has been claimed to outperform Markowitz’s mean-variance portfolio (DeMiguel et al. 2009), although this may not hold if: i) the estimation window is long; ii) the ex-ante (true) Sharpe ratio of the mean-variance portfolio is much higher than that of the \(1/N\) portfolio; and iii) the number of assets is small (fewer parameters to be estimated).
Markowitz’s mean-variance portfolio is sometimes cynically referred to as an “error maximizer.”
However, other studies show that Markowitz’s mean-variance portfolio is superior to the \(1/N\) portfolio (Kritzman et al. 2010).
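
The “error maximizer” behavior can be seen in a toy example. The sketch below assumes the unconstrained mean-variance solution normalized to sum to one (weights proportional to \(\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}\)), which is simpler than the long-only MVP used in the R sessions; the helper name `mv_weights` and all numbers are illustrative assumptions:

```r
# Unconstrained mean-variance weights, normalized to sum to one:
# w is proportional to Sigma^{-1} mu (illustrative closed form, no long-only constraint).
mv_weights <- function(mu, Sigma) {
  w <- solve(Sigma, mu)
  w / sum(w)
}

N <- 3
Sigma_toy <- 0.2^2 * (0.1 * diag(N) + 0.9 * matrix(1, N, N))  # 20% vol, correlation 0.9
mu_true <- rep(0.10, N)                  # identical true expected returns
mu_est  <- mu_true + c(0.01, 0, -0.01)   # small estimation error in the mean

w_true <- mv_weights(mu_true, Sigma_toy)
w_est  <- mv_weights(mu_est, Sigma_toy)
print(round(rbind(w_true, w_est, w_EWP = rep(1/N, N)), 3))
```

With the true means the solution coincides with the \(1/N\) portfolio, but an error of one percentage point in two of the estimated means is enough to produce a large long position and a short position, while the \(1/N\) portfolio is unaffected because it does not use the estimates at all.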

Basic references

Textbook on financial data (Ruppert and Matteson 2015):

📘 D. Ruppert and D. Matteson. Statistics and Data Analysis for Financial Engineering: With R Examples. Springer, 2015.

Textbooks on portfolio optimization (Cornuejols and Tütüncü 2006; Fabozzi 2007; Feng and Palomar 2016):

📘 G. Cornuejols and R. Tütüncü. Optimization Methods in Finance. Cambridge University Press, 2006.

📘 F. J. Fabozzi. Robust Portfolio Optimization and Management. Wiley, 2007.

📘 Y. Feng and D. P. Palomar. A Signal Processing Perspective on Financial Engineering. Foundations and Trends in Signal Processing, Now Publishers, 2016.

Thanks

For more information visit: https://www.danielppalomar.com

References

Ardia, D., Bolliger, G., Boudt, K., & Gagnon-Fleury, J.-P. (2017). The impact of covariance misspecification in risk-based portfolios. Annals of Operations Research, 254(1-2), 2–16. https://doi.org/10.1007/s10479-017-2474-7

Bajalinov, E. B. (2003). Linear-fractional programming: Theory, methods, applications and software. Kluwer Academic Publishers.

Charnes, A., & Cooper, W. W. (1962). Programming with linear fractional functionals. Naval Research Logistics Quarterly, 9(3-4), 181–186.

Chopra, V., & Ziemba, W. (1993). The effect of errors in means, variances and covariances on optimal portfolio choice. Journal of Portfolio Management.

Choueifaty, Y., & Coignard, Y. (2008). Toward maximum diversification. Journal of Portfolio Management.

Choueifaty, Y., Froidure, T., & Reynier, J. (2013). Properties of the most diversified portfolio. Journal of Investment Strategies.

Christoffersen, P., Errunza, V., Jacobs, K., & Langlois, H. (2012). Is the potential for international diversification disappearing? A dynamic copula approach. The Review of Financial Studies.

Cornuejols, G., & Tütüncü, R. (2006). Optimization methods in finance. Cambridge University Press.

DeMiguel, V., Garlappi, L., & Uppal, R. (2009). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy? The Review of Financial Studies.

Dinkelbach, W. (1967). On nonlinear fractional programming. Management Science, 13(7), 492–498.

Duchin, R., & Levy, H. (2009). Markowitz versus the Talmudic portfolio diversification strategies. Journal of Portfolio Management, 35(2), 71.

Fabozzi, F. J. (2007). Robust portfolio optimization and management. Wiley.

Feng, Y., & Palomar, D. P. (2016). A signal processing perspective on financial engineering. Foundations and Trends in Signal Processing, Now Publishers.

Jolliffe, I. (2002). Principal component analysis. Springer-Verlag.

Kritzman, M., Page, S., & Turkington, D. (2010). In defense of optimization: The fallacy of 1/N. Financial Analysts Journal, 66(2).

Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2), 365–411.

Lütkepohl, H. (2007). New introduction to multiple time series analysis. Springer.

Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.

McNeil, A. J., Frey, R., & Embrechts, P. (2005). Quantitative risk management: Concepts, techniques and tools. Princeton University Press.

Meucci, A. (2005). Risk and asset allocation. Springer.

Qian, E. (2005). Risk parity portfolios: Efficient portfolios through true diversification. PanAgora Asset Management.

Ruppert, D. (2010). Statistics and data analysis for financial engineering. Springer.

Ruppert, D., & Matteson, D. (2015). Statistics and data analysis for financial engineering (2nd Ed.). Springer.

Schaible, S. (1974). Parameter-free convex equivalent and dual programs of fractional programming problems. Zeitschrift für Operations Research, 18(5), 187–196.

Sharpe, W. F. (1966). Mutual fund performance. The Journal of Business, 39(1), 119–138.

Stancu-Minasian, I. M. (1992). Fractional programming: Theory, methods and applications. Kluwer Academic Publishers.

Tsay, R. S. (2010). Analysis of financial time series (3rd ed.). John Wiley & Sons.

Tsay, R. S. (2013). Multivariate time series analysis: With R and financial applications. John Wiley & Sons.

von Neumann, J. (1937). Über ein ökonomisches Gleichgewichtssystem und eine Verallgemeinerung des Brouwerschen Fixpunktsatzes. Ergebnisse eines Mathematischen Kolloquiums, 8, 73–83.

Zhao, Z., Zhou, R., Palomar, D. P., & Feng, Y. (2019). Portfolio optimization. submitted.