One approach to trading that has been puzzling me lately is to sit and wait for opportunities. 🙂 It sounds simplistic, but it is indeed different from, for instance, asset allocation strategies. To even attempt to take advantage of these opportunities, however, we must be able to identify them. Once the opportunities are identified, we can try to explain (forecast) them using historical data.

The first step is to define what an *opportunity* is. It could be anything. For instance, we can start with *a strong directional move*, but then we need to define what a strong directional move is. Let’s give it a try.

We can define a strong directional move as, for instance, a 5% move over 3 days. However, that is unlikely to work across different markets (think futures): a 5% move is a lot in treasuries, but not so much in oil. The solution is to normalize the returns by volatility, which I will estimate using an exponential moving average.
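To make the normalization concrete, here is a minimal base-R sketch. The helper `ema.simple` is a hypothetical stand-in for `TTR::EMA` (it is seeded with the first observation, so its early values will differ from TTR's), `normalized.move` is also a made-up name, and both price series are simulated.

```r
# Hypothetical stand-in for TTR::EMA: plain recursive exponential smoothing,
# seeded with the first observation.
ema.simple = function(x, n) {
   alpha = 2 / (n + 1)
   out = numeric(length(x))
   out[1] = x[1]
   for(ii in seq_along(x)[-1]) {
      out[ii] = alpha * x[ii] + (1 - alpha) * out[ii - 1]
   }
   out
}

# The move over the last `days` days, expressed in units of an EMA-based
# estimate of the daily volatility.
normalized.move = function(cl, days=3, vola.len=35) {
   nn = length(cl)
   rets = cl[-1] / cl[-nn] - 1                      # discrete daily returns
   vola = sqrt(ema.simple(rets * rets, n=vola.len)) # volatility estimate
   move = cl[nn] / cl[nn - days] - 1
   move / vola[length(vola)]
}

# Two simulated markets that both finish with a 5% move over 3 days:
# a quiet one (about 0.2% daily swings) and a noisy one (about 2%).
set.seed(1)
last.moves = rep(1.05^(1/3) - 1, 3)
quiet = 100 * cumprod(1 + c(rnorm(100, sd=0.002), last.moves))
noisy = 100 * cumprod(1 + c(rnorm(100, sd=0.02), last.moves))

normalized.move(quiet)   # many volatility units
normalized.move(noisy)   # far fewer volatility units for the same 5%
```

The same 5% move is a much bigger event in the quiet series, which is the point of measuring moves in volatility units rather than in percent.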

Here is a short code snippet:

    good.entries = function(ohlc, min.days=3, days.out=15, vola.len=35, days.pos=0.6, stop.loss = 1.5) {  # needs quantmod (loads xts and TTR)
       stopifnot(days.out > min.days)

       hi = Hi(ohlc)
       lo = Lo(ohlc)
       cl = Cl(ohlc)

       rets = ROC(cl, type='discrete')
       arets = sqrt(EMA(rets * rets, n=vola.len))   # volatility estimate
       erets = EMA(rets, n = vola.len)              # smoothed mean return

       res = rep(0, NROW(ohlc))
       days = rep(0, NROW(ohlc))
       longs = rep(0, NROW(ohlc))
       shorts = rep(0, NROW(ohlc))

       for(ii in min.days:days.out) {
          hh = lag.xts(runMax(hi, n=ii), -ii)   # highest high over the NEXT ii days
          ll = lag.xts(runMin(lo, n=ii), -ii)   # lowest low over the NEXT ii days
          # Normalize by the volatility estimate (arets), as the text describes
          hi.ratio = (hh/cl - 1)/arets
          lo.ratio = (ll/cl - 1)/arets

          # days required
          dd = ceiling(days.pos*ii)

          longs = hi.ratio > dd & -lo.ratio < stop.loss
          longs = ifelse(!is.na(longs) & longs == 1, 1, 0)
          shorts = -lo.ratio > dd & hi.ratio < stop.loss
          shorts = ifelse(!is.na(shorts) & shorts == 1, -1, 0)

          # Keep the first (shortest) window that qualifies for each day
          both = ifelse(longs == 1, 1, shorts)
          new.days = ii*as.numeric(res == 0 & both != 0)
          res = ifelse(res != 0, res, both)
          days = ifelse(days != 0, days, new.days)
       }

       full.df = data.frame(data = index(ohlc), entry = res, days = days, erets = erets, arets = arets)
       oppo.df = full.df[!is.na(full.df$entry) & full.df$entry != 0, ]
       return(list(full=full.df, oppo=oppo.df))
    }

The code defines an opportunity as a directional move over X days (X is between 3 and 15). The minimum move we accept is computed by taking a percentage of the days (60% by default), rounding that number up, and multiplying it by the volatility estimate (the exponentially weighted standard deviation of the daily returns). We also require that the move in the opposite direction stays small: less than `stop.loss` (1.5 by default) volatility units. When several window lengths qualify for the same day, the shortest one is recorded. It is hard to explain, but hopefully you get the idea.
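A worked example may help here. The base-R sketch below labels a single day for a single window length, dividing by the volatility estimate as the description above says. The helper `label.day` and all the prices are made up for illustration; it is not part of the function above.

```r
# Hypothetical helper: label one day for one window, given that day's close,
# its volatility estimate, and the highs and lows of the FOLLOWING days.
label.day = function(cl, vola, next.hi, next.lo, days.pos=0.6, stop.loss=1.5) {
   ii = length(next.hi)                  # window length in days
   dd = ceiling(days.pos * ii)           # required move, in volatility units

   up = (max(next.hi) / cl - 1) / vola   # best upside, in volatility units
   dn = (min(next.lo) / cl - 1) / vola   # worst downside (negative below the close)

   if(up > dd && -dn < stop.loss) return(1)    # long opportunity
   if(-dn > dd && up < stop.loss) return(-1)   # short opportunity
   0
}

# Close at 100, daily volatility of 1%, 5-day window: dd = ceiling(0.6 * 5) = 3,
# so a long needs a high above 103 while the low stays above 98.5.
label.day(100, 0.01, next.hi=c(101, 102, 103.5, 103, 102),
          next.lo=c(99.5, 100, 101, 102, 101))     # 1: target reached, shallow dip
label.day(100, 0.01, next.hi=c(101, 102, 103.5, 103, 102),
          next.lo=c(98, 100, 101, 102, 101))       # 0: the dip to 98 is too deep
label.day(100, 0.01, next.hi=c(100.5, 100, 99, 98, 97.5),
          next.lo=c(99, 98, 97, 96.5, 96))         # -1: mirror image on the short side
```

Note that the check only looks at the extremes of the window, not at the order in which they occur.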

The function returns a list of two data frames – one for all days, and one with the opportunities only. There is a label assigned to each day – 0 by default. The opportunities are the 1 and -1 days of course.

Is this tradeable? Absolutely. When an opportunity arises, we enter in the suggested direction. The exit is a profit target, and we can use a stop loss of the same magnitude.
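As a rough illustration of what the exits could look like in price terms, here is a small base-R sketch. The helper `exit.levels` and its arguments are hypothetical; it assumes the target and the stop are both placed the same number of volatility units away from the entry price.

```r
# Hypothetical helper: turn an entry into a profit target and a stop loss.
# direction is 1 for a long and -1 for a short; `units` is the size of the
# move in volatility units, and `vola` is the daily volatility estimate.
exit.levels = function(entry.price, direction, vola, units) {
   move = units * vola   # move size as a fraction of the entry price
   list(target = entry.price * (1 + direction * move),
        stop   = entry.price * (1 - direction * move))
}

exit.levels(100, 1, 0.01, 3)    # long: target about 103, stop about 97
exit.levels(100, -1, 0.01, 3)   # short: target about 97, stop about 103
```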

Why all this effort again? Simple – these labels can be used as the response, or dependent variable, in an attempt to explain (forecast) the opportunities using historical data. At this point what’s left is to generate what we think might constitute a good set of predictors, align the two sets appropriately (no data snooping) and the rest is standard machine learning.
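Just to illustrate the alignment step, here is a minimal base-R sketch. The function `align.data`, the single predictor `past.ret`, and the sample data are all hypothetical. The idea is only that predictors look backward from each day, labels look forward, and rows whose forward window runs past the end of the sample are dropped.

```r
# Hypothetical sketch: pair each day's label with a predictor that uses only
# data available on or before that day.
align.data = function(cl, labels, lookback=5, days.out=15) {
   nn = length(cl)
   stopifnot(length(labels) == nn, nn > lookback + days.out)

   # Predictor: the return over the past `lookback` days (backward looking)
   past.ret = rep(NA_real_, nn)
   idx = (lookback + 1):nn
   past.ret[idx] = cl[idx] / cl[idx - lookback] - 1

   df = data.frame(day = seq_len(nn), past.ret = past.ret, label = labels)

   # Drop the warm-up rows (no predictor yet) and the last `days.out` rows,
   # whose labels cannot be fully evaluated without prices beyond the sample.
   df[!is.na(df$past.ret) & df$day <= nn - days.out, ]
}

cl = 100 + 1:40          # made-up closes
labels = rep(0, 40)      # made-up labels (1, -1 or 0 in practice)
train = align.data(cl, labels)
range(train$day)         # days 6 through 25 remain
```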

That’s for another post though.

There may be an issue with the lags at lines 18 and 19 (the `lag.xts` calls that compute `hh` and `ll`). The negative values will shift the max/min vectors back ii days, implying that we knew those values ii days before they happened. It seems to me that the lag should be in the other direction, i.e. we only learn about the max/min value over the past ii days today.

That’s the idea – knowing the future, we “mark” interesting points in time. Once these points are marked, we need to find predictors to use. For the predictors we shouldn’t be snooping into the future.

Good post! I’m more interested in how to forecast movements in R. (I’m starting a series next week discussing Python for trading).