# When is a Backtest Too Good to be True? Part Two.

In the previous post, I went through a simple exercise which, to me, clearly demonstrates that a 60% out-of-sample guess rate (on a daily basis) for the S&P 500 will generate ridiculous returns. From the feedback I got, it seemed that my example was somewhat unconvincing. Let’s dig a bit further then.

Let’s add the Sharpe ratio and the maximum drawdown to the CAGR and compute all three for each sample.

```r
return.mc = function(rets, samples=1000, size=252) {
   require(PerformanceAnalytics)
   # One row per sample: annualized return, Sharpe ratio and maximum drawdown
   result = data.frame(cagr=rep(NA, samples), sharpe.ratio=NA, max.dd=NA)
   for(ii in 1:samples) {
      # Sample the indexes
      aa = sample(1:NROW(rets), size=size)
      # All days we guessed wrong
      bb = -abs(rets)
      # On the days in the sample we guessed correctly
      bb[aa] = abs(bb[aa])
      cc = as.numeric(bb)
      # Compute the statistics of interest for this sample.
      result[ii,1] = Return.annualized(cc,scale=252)
      result[ii,2] = SharpeRatio.annualized(bb,scale=252)
      result[ii,3] = maxDrawdown(cc)
   }
   return(result)
}
```
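To make it concrete what the three columns measure, here is a minimal base-R sketch of the same statistics. It is not the PerformanceAnalytics implementation, only an approximation under stated assumptions: simple daily returns, a zero risk-free rate, and geometric annualization. The function names `cagr`, `ann_sharpe` and `max_dd` are made up for this illustration.

```r
# Base-R sketch of the three statistics (assumes simple daily returns, zero risk-free rate).
# The function names are hypothetical, chosen for this illustration only.

# Geometric annualized return
cagr = function(r, scale=252) {
   prod(1 + r)^(scale / length(r)) - 1
}

# Annualized return divided by annualized volatility
ann_sharpe = function(r, scale=252) {
   cagr(r, scale) / (sd(r) * sqrt(scale))
}

# Largest peak-to-trough loss of the equity curve, as a positive fraction
max_dd = function(r) {
   wealth = cumprod(1 + r)
   # Running high of the equity curve, counting the starting capital of 1
   peak = cummax(c(1, wealth))[-1]
   max(1 - wealth / peak)
}

# Up 10%, down 50%, up 20%: the worst loss from the peak is the 50% drop
max_dd(c(0.10, -0.50, 0.20))   # 0.5
```

The drawdown is reported as a positive number, so larger values in the `max.dd` column mean deeper losses.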

Let’s look at some summary statistics:

```r
require(quantmod)
gspc = getSymbols("^GSPC", from="1900-01-01", auto.assign=F)
# Daily close-to-close returns. This line is an assumption: the definition of rets
# (and any date range applied to it) is not shown in this post.
rets = ROC(Cl(gspc), type="discrete", na.pad=F)

df = return.mc(rets, size=as.integer(0.6*NROW(rets)))
summary(df,digits=2)

#       cagr       sharpe.ratio     max.dd
#  Min.   :0.34   Min.   :1.8   Min.   :0.13
#  1st Qu.:0.45   1st Qu.:2.3   1st Qu.:0.22
#  Median :0.48   Median :2.5   Median :0.26
#  Mean   :0.48   Mean   :2.5   Mean   :0.27
#  3rd Qu.:0.51   3rd Qu.:2.7   3rd Qu.:0.31
#  Max.   :0.67   Max.   :3.5   Max.   :0.63
```

The picture is clearer now. Lowest Sharpe ratio of 1.8 among all samples, and a mean at 2.5? Yeah, right.

The results were similar for other asset classes as well – bonds, oil, etc. All in all, in financial markets, like in a casino, a small edge translates into massive wealth, and most practitioners understand that intuitively.
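The arithmetic behind that intuition can be sketched in a few lines. This is a rough back-of-the-envelope calculation, not a result from the data: it assumes that being right or wrong is independent of the size of the move, it ignores volatility drag, and the mean absolute daily return of 0.8% is a hypothetical figure chosen for illustration. The name `edge_cagr` is also made up.

```r
# Rough compounding arithmetic for a given daily hit rate.
# mean_abs_ret is a hypothetical figure, not measured from data.
edge_cagr = function(p, mean_abs_ret=0.008, days=252) {
   # Win |r| with probability p, lose |r| with probability 1 - p
   daily_edge = (2 * p - 1) * mean_abs_ret
   # Compound the expected daily edge over one trading year
   (1 + daily_edge)^days - 1
}

# Annualized growth at a 50%, 54% and 60% hit rate
sapply(c(0.50, 0.54, 0.60), edge_cagr)
```

Under these assumptions a 60% hit rate earns only 0.16% per day on average, yet compounded over 252 trading days that comes to roughly 50% a year.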

1. Pat says:

Thanks for sharing. I think you meant to put bb under the performance arguments, not cc (as it does not exist).
One interesting thing: the actual realized bias was about 54% over the rets period, with a CAGR of about 7%. If you plug 54% into the size argument of the mc function, you find that 7% sits at the very low end of the range of results! Imagine some oracle had a crystal ball and could predict the direction of 54% of daily returns over all those years. He might have expected something like a 16% mean CAGR, but achieved only the lowest end of the mc CAGR results and the high end of the drawdown results. Talk about bad luck on top of edge. Maybe that tells us something cautionary about trying to optimize solely around hit rates.

    df = return.mc(rets, samples=1000, size=as.integer(0.54*NROW(rets)))

    > summary(df,digits=2)
          cagr         sharpe.ratio      max.dd
     Min.   :0.058   Min.   :0.30   Min.   :0.21
     1st Qu.:0.134   1st Qu.:0.70   1st Qu.:0.33
     Median :0.158   Median :0.82   Median :0.38
     Mean   :0.160   Mean   :0.83   Mean   :0.40
     3rd Qu.:0.185   3rd Qu.:0.96   3rd Qu.:0.46
     Max.   :0.307   Max.   :1.60   Max.   :0.74

    > Return.annualized(rets,scale=252)
                      GSPC.Close
    Annualized Return  0.0713289
    > maxDrawdown(rets)
    [1] 0.5677539

1. quintuitive says:

Thanks for finding the “cc” bug – it was meant to be “cc = as.numeric(bb)”.

About the buy and hold – although it seems “lucky” in terms of guess rate (54%), it is quite “unlucky”, as you have observed, in terms of returns. The asymmetry of the returns (returns in a bear vs returns in a bull) is a plausible explanation. Likewise, at a 50% guess rate, one loses money on average.

Last but not least, luck is a huge factor on the drawdown front as well. Imagine running into the 40% losing days right off the bat: the drawdown will be nearly 100%. The losses are also massive if the 60% of days one guessed correctly happen to be the days with the smallest returns.

2. David Zimmermann says:

Thank you for sharing.
I simulated different success rates over at my blog “Data Shenanigans”, mainly to introduce people to parallel computing, but also to show how a success rate of roughly 55% results in an exploding value.
If you want to find out more, please visit: https://datashenanigan.wordpress.com/2015/09/23/simulating-backtests-of-stock-returns-using-monte-carlo-and-snowfall-in-parallel/
David