Back-testing Rules

by ivannp on November 10, 2012

Nowadays many trading strategies are shared online with reproducible, decent results. Have you asked yourself: if the strategies are so profitable, why does the author even bother sharing them, when the path to riches is clear? Just implement the strategy and use it.

There are people, of course, who are fascinated and challenged by the market mechanics, and for them coming up with a theoretical result on how to beat the market is all they are looking for. A quick “fix” it is, but that’s not the entire story.

Another fact of life is that creating a working strategy that can be used in practice is a much, much more complicated and, quite often, very tedious job. Furthermore, using a strategy to trade hundreds of thousands (let’s not get into the million range) of dollars is a very demanding psychological experience.

When a strategy is back-tested by a “practitioner” (vs by a “theoretician”), every step should be performed with the practical application in mind. Let’s start with an example: when do we compute the signal and when do we perform the trades? Quite often both are done using the closing price. Although I have shown in another post that this is possible in practice, it has a lot of limitations. In my examples, I assume a single instrument and a signal with certain properties, and even in these cases it is sometimes not possible to follow the strategy precisely in real trading.

And what do we find on the internet? One finds a strategy handling 20 instruments (a basket of ETFs, for instance), computing the signals at the close and doing the trading at the close. No discussion whatsoever of how this works in practice!

So the first rule is to choose realistic points where the signal is computed and trading is performed. When dealing with a basket of ETFs, there are two feasible solutions:

  • Use closing prices to compute the signal, trade on the next day in the morning using the opening price.
  • Use only closing prices, but with a lag of one extra day.
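The two timing choices above can be sketched in base R. This is only an illustration under stated assumptions: the prices are simulated, the signal (long when the close is above its 20-day average) is a toy, and the helper `lag_vec` and all variable names are hypothetical.

```r
# Hypothetical helper: shift a vector k steps later in time, padding with NA
lag_vec <- function(x, k) c(rep(NA, k), head(x, -k))

set.seed(1)
n <- 250
close_px <- 100 * cumprod(1 + rnorm(n, sd = 0.01))                 # synthetic closes
open_px  <- c(NA, head(close_px, -1)) * (1 + rnorm(n, sd = 0.003)) # prior close plus a gap

# Toy signal, known only after the close of day t: 1 = long, 0 = flat
sma    <- as.numeric(stats::filter(close_px, rep(1 / 20, 20), sides = 1))
signal <- as.numeric(close_px > sma)

# Return from open t to open t+1, stored at index t
ret_oo <- c(tail(open_px, -1) / head(open_px, -1) - 1, NA)
# Return from close t-1 to close t, stored at index t
ret_cc <- c(NA, tail(close_px, -1) / head(close_px, -1) - 1)

# Option 1: signal from close t-1, position entered at open t
opt1 <- lag_vec(signal, 1) * ret_oo
# Option 2: signal from close t-2, position entered at close t-1
opt2 <- lag_vec(signal, 2) * ret_cc

print(sum(opt1, na.rm = TRUE))
print(sum(opt2, na.rm = TRUE))
```

In both variants the position never uses a price that was unknown at the moment the order would have to be placed, which is the point of the rule.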

My typical choice is the second option. I find it especially useful when I am dealing with something other than price. For instance, for some of my strategies, I use the closing price both for the signal and for the trading, using the approach explained in the “Trading at the Close – the Mechanics” post. The sizes of the positions, however, I determine based on volatility, and both in back-testing and in real life I use the volatility with a lag of two days rather than one.
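A minimal sketch of sizing from lagged volatility, again on simulated data. The 20-day window, the 1% volatility target, the cap at full exposure and the helper `lag_vec` are all assumptions made for the example, not the sizing rule used in my strategies.

```r
# Hypothetical helper: shift a vector k steps later in time, padding with NA
lag_vec <- function(x, k) c(rep(NA, k), head(x, -k))

set.seed(2)
close_px <- 100 * cumprod(1 + rnorm(250, sd = 0.01))   # synthetic closes
ret <- c(NA, diff(close_px) / head(close_px, -1))      # close-to-close returns

# Rolling 20-day standard deviation of daily returns (NA until enough history)
win <- 20
vol <- rep(NA_real_, length(ret))
for (i in (win + 1):length(ret)) {
  vol[i] <- sd(ret[(i - win + 1):i])
}

# Assumed sizing rule: scale exposure toward a 1% daily volatility target,
# capped at full exposure. The size for day t uses the volatility of day t-2.
target <- 0.01
size <- pmin(1, target / lag_vec(vol, 2))

print(tail(size, 3))
```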

The second rule is to pay close attention to which prices are used in the computations. Data series often come with an “Adjusted” column, which is the “Close” adjusted for splits and dividends. While it makes perfect sense to use this column to compute the returns in back-testing, it makes no sense to use it to compute the signals. Notice that this column changes as new dividends are paid; thus, the value for a particular day in the past is not the same at different points in time. For instance, the adjusted price for Dec 15, 2001 may be different if observed on Sep 01, 2002 than if observed on Sep 01, 2012. In other words, the back-testing results for Dec 15, 2001 may differ depending on the date on which the back-test is run.

Thus, my second rule is to use “Close” prices adjusted only for splits for the signals, while for computing the returns the “Close” prices may also be adjusted for dividends.
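To see why a dividend-adjusted price for a past day keeps moving, here is a small sketch with made-up numbers. It assumes the common proportional back-adjustment, in which every price before an ex-dividend day is multiplied by (1 - dividend / previous close); data vendors may use a different scheme, and `back_adjust` is a hypothetical helper written for this example.

```r
# Assumed proportional back-adjustment: prices before each ex-dividend day are
# scaled by (1 - dividend / close on the day before the ex-date).
back_adjust <- function(close_px, div) {
  adj <- close_px
  for (i in which(div > 0)) {
    if (i > 1) {
      idx <- seq_len(i - 1)
      adj[idx] <- adj[idx] * (1 - div[i] / close_px[i - 1])
    }
  }
  adj
}

close_px <- c(50, 51, 52, 51, 53, 54, 55, 56)  # made-up closes, no splits
div      <- c( 0,  0,  0,  1,  0,  0,  0,  2)  # made-up dividends on ex-dates

# Adjusted value of day 1 as seen after day 5 (one dividend so far) ...
seen_early <- back_adjust(close_px[1:5], div[1:5])[1]
# ... and as seen after day 8 (a second dividend has been paid since)
seen_late <- back_adjust(close_px, div)[1]

print(c(seen_early, seen_late))  # the same past day, two different values
print(close_px[1])               # the unadjusted close does not change
```

A signal threshold compared against the first value could give a different answer when compared against the second, even though nothing about day 1 itself has changed.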

The third rule is to know and understand the indicators used. Let’s take the 50-day Exponential Moving Average (EMA), for instance. Let’s assume that in the back-testing we tested the S&P 500 over the last 60 years. In trading, however, we are interested in computing the value for a single day; thus, to minimize the data loading, we load only the last 200 days of data, which seems plenty to compute the latest value of the 50-day EMA. Does that sound right to you?

The answer is that one is likely to get two different values for the latest day if the series are of different lengths. There is no surprise here; this is just a property of the way the EMA is computed. Can this lead to a discrepancy in the positions (back-testing vs real trading)? Absolutely, although probably rarely in the EMA case! Here is a code example for the doubters:

library( quantmod )

# Get S&P 500
getSymbols( "^GSPC", from="2000-01-01" )

# Limit the series to get reproducible results
gspc = GSPC["/2012-09-15"]

# Compute EMA using the entire series
print( last( EMA( Cl( gspc ), 20 ) ) )
# Result is: 2012-09-14 1423.622

# Compute EMA using the last 60 days
print( last( EMA( tail( Cl( gspc ), 60 ), 20 ) ) )
# Result is: 2012-09-14 1423.568
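The reason can be sketched with a hand-rolled EMA. This assumes the textbook recursion EMA[t] = alpha * x[t] + (1 - alpha) * EMA[t-1] with alpha = 2 / (n + 1), seeded with the simple average of the first n points. The function `ema_sketch` is written for illustration only and is not TTR's code; the data are simulated.

```r
# Illustrative EMA: seeded with the mean of the first n points, then recursive
ema_sketch <- function(x, n) {
  alpha <- 2 / (n + 1)
  out <- rep(NA_real_, length(x))
  out[n] <- mean(x[1:n])                 # the seed depends on where the series starts
  for (i in (n + 1):length(x)) {
    out[i] <- alpha * x[i] + (1 - alpha) * out[i - 1]
  }
  out
}

set.seed(42)
x <- 100 + cumsum(rnorm(1000))           # synthetic price-like series
n <- 20

full  <- tail(ema_sketch(x, n), 1)            # entire history
short <- tail(ema_sketch(tail(x, 60), n), 1)  # last 60 points only
print(full - short)

# Weight still carried by the seed after k recursion steps: (1 - alpha)^k
k <- c(40, 100, 200)
print((1 - 2 / (n + 1))^k)
```

The two results differ by the seed weight times the difference between the two starting values, so the gap shrinks geometrically with extra history but never becomes exactly zero.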

And for all of you who don’t use EMA, do you guys use the Relative Strength Index (RSI)? As in RSI2? ;)

{ 2 comments }

Christian November 18, 2012 at 17:43

Nice post. I never realized that the EMA was calculated using lags of itself. I was always assuming that it just means using a different weighting vector than the one for an SMA or WMA.
The help file of the TTR package actually includes a warning regarding the way of calculation. I was wondering how long the series should be in order to be sure that the EMA gets calculated with “sufficient” precision. After playing around a bit it looks like the series needs to be around 3 times “n” at least.

library(TTR)  # provides EMA and WMA

## Example of short-term instability of EMA
set.seed(12)
x <- rnorm(500)
tail( EMA(x[450:500],50), 1 )
tail( EMA(x[400:500],50), 1 )
tail( EMA(x[350:500],50), 1 ) #history = 3*n
tail( EMA(x[300:500],50), 1 )
tail( EMA(x[250:500],50), 1 )
tail( EMA(x[ 1:500],50), 1 )

## Example of short-term instability of EMA, shorter window (n = 10)
set.seed(3485)
x <- rnorm(100)
tail( EMA(x[90:100],10), 1 )
tail( EMA(x[75:100],10), 1 )
tail( EMA(x[60:100],10), 1 )
tail( EMA(x[45:100],10), 1 )
tail( EMA(x[30:100],10), 1 )
tail( EMA(x[ 1:100],10), 1 )

## No instability using WMA
set.seed(3485)
x <- rnorm(100)
tail( WMA(x[90:100],10), 1 )
tail( WMA(x[75:100],10), 1 )
tail( WMA(x[60:100],10), 1 )
tail( WMA(x[45:100],10), 1 )
tail( WMA(x[30:100],10), 1 )
tail( WMA(x[ 1:100],10), 1 )


ivannp November 19, 2012 at 22:13

Good to know. I also discovered it the hard way and have been trying to avoid using it since. Still, I like to keep an eye on RSI2 and have never spent enough time to figure out a WMA replacement for it (TTR allows a call like RSI(Cl(GSPC), n=2, maType=WMA, wts=c(20,80))).
