In this tutorial I am going to share my R&D and trading experience using the Autoregressive Moving Average (ARMA) model, well known from statistics. There is a lot written about these models; however, I strongly recommend Introductory Time Series with R, which I find to be a perfect combination of light theoretical background and practical implementations in R. Another good read is the online e-book Forecasting: principles and practice, written by Rob Hyndman, an expert in statistical forecasting and the author of the excellent forecast R package.

Getting Started

In R, I mostly use the fArma package, which is a nice wrapper with extended functionality around the arima function from the stats package (used in the above-mentioned book). Here is a simple session of fitting an ARMA model to the S&P 500 daily returns:

library( quantmod )
library( fArma )

# Get S&P 500
getSymbols( "^GSPC", from="2000-01-01" )

# Compute the daily returns
gspcRets = diff( log( Cl( GSPC ) ) )

# Use only the last two years of returns
gspcTail = as.ts( tail( gspcRets, 500 ) )

# Fit the model
gspcArma = armaFit( formula=~arma(2,2), data=gspcTail )

For more details, please refer to the literature and the packages. I just want to emphasize a couple of points:

• We model the daily returns instead of the prices. There are multiple reasons: this way financial series usually become stationary, we need some way to “normalize” a series, etc.
• We use the diff and log functions to compute the daily returns instead of percentages. Not only is this standard practice in statistics, but it also provides a damn good approximation to the discrete returns.
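To see how close the two kinds of returns are, here is a minimal sketch in base R. The prices are made up for illustration and are not taken from the S&P 500 series.

```r
# Hypothetical closing prices, made up for illustration
prices <- c(100, 101, 99.5, 100.2, 102.0)

# Log returns, as in the post: the diff of the log prices
logRets <- diff(log(prices))

# Discrete (percentage) returns: P_t / P_{t-1} - 1
simpleRets <- prices[-1] / prices[-length(prices)] - 1

# The two agree closely for small daily moves
print(round(cbind(logRets, simpleRets), 5))

# Log returns add up over time: their sum is the log of the total growth
totalGrowth <- prices[length(prices)] / prices[1]
print(all.equal(sum(logRets), log(totalGrowth)))
```

The additivity in the last step is one practical reason to prefer log returns when working with cumulative performance.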

The approach I will present here is a form of walk-forward backtesting. While walking the series day by day, we will use a history of a certain length to find the best model. Then we will use this model to predict the next day’s return. If the prediction is negative, we assume a short position; otherwise we assume a long position.

An example will make things clearer: After the close of June 11th, 2012, we compute the last 500 daily returns. Using these returns we search through the space of ARMA models and select the best-fitting model (with respect to some metric and some requirements). Finally, we use this model to compute the prediction for tomorrow’s return and use the sign of the predicted return to decide the appropriate position.

Choosing a Good Model

The first obstacle, before this method can be useful to us, is selecting the model parameters. In the case of ARMA, there are two parameters: the order of the AR component and the order of the MA component. In other words, there is an infinite number of choices: (0,1), (1,0), (1,1), (2,1), etc. How do we know what parameters to use?

A common approach in statistics to quantify the goodness of fit is the AIC (Akaike Information Criterion) statistic. Once the fitting is done, the value of the AIC statistic is accessible via:

xxArma = armaFit( xx ~ arma( 5, 1 ), data=xx )
xxArma@fit$aic

There are other statistics of course, however, typically the results are quite similar.
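For readers who want to see what the statistic measures, here is a small sketch that recomputes the AIC by hand. It uses arima from the stats package on a simulated series rather than fArma, so the series and the model order are assumptions made only for illustration.

```r
set.seed(42)
# Simulated ARMA(1,1) series standing in for 500 daily returns
xx <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)

fit <- arima(xx, order = c(1, 0, 1))

# AIC = 2k - 2 * log-likelihood, where k counts the estimated parameters:
# here ar1, ma1, the intercept and the innovation variance
k <- length(coef(fit)) + 1
aicByHand <- 2 * k - 2 * fit$loglik

print(c(reported = fit$aic, byHand = aicByHand))
```

The penalty term 2k is what keeps the search from always preferring the largest model: extra parameters must buy enough likelihood to pay for themselves.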

To summarize, all we need is a loop that goes through all parameter combinations we deem reasonable, for instance from (0,0) to (5,5) inclusive, fits the model for each parameter pair, and finally picks the model with the lowest AIC or some other statistic.

# https://gist.github.com/2913657

armaSearch = function(
   xx,
   minOrder=c(0,0),
   maxOrder=c(5,5),
   trace=FALSE )
{
   bestAic = 1e9
   len = NROW( xx )
   for( p in minOrder[1]:maxOrder[1] ) for( q in minOrder[2]:maxOrder[2] )
   {
      if( p == 0 && q == 0 )
      {
         next
      }

      formula = as.formula( paste( sep="", "xx ~ arma(", p, ",", q, ")" ) )

      # A failed fit (error or warning) is reported as FALSE
      fit = tryCatch( armaFit( formula, data=xx ),
                      error=function( err ) FALSE,
                      warning=function( warn ) FALSE )
      if( !is.logical( fit ) )
      {
         fitAic = fit@fit$aic
         if( fitAic < bestAic )
         {
            bestAic = fitAic
            bestFit = fit
            bestModel = c( p, q )
         }

         if( trace )
         {
            ss = paste( sep="", "(", p, ",", q, "): AIC = ", fitAic )
            print( ss )
         }
      }
      else
      {
         if( trace )
         {
            ss = paste( sep="", "(", p, ",", q, "): None" )
            print( ss )
         }
      }
   }

   if( bestAic < 1e9 )
   {
      return( list( aic=bestAic, fit=bestFit, model=bestModel ) )
   }

   return( FALSE )
}

Note that sometimes armaFit fails to find a fit and raises an error, which would quit the loop immediately. armaSearch handles this problem by using the tryCatch function to catch any error or warning and return a logical value (FALSE) instead of interrupting everything and exiting with an error. Thus we can distinguish an erroneous from a normal function return just by checking the type of the result. A bit messy probably, but it works.
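The same pattern can be tried in isolation. The sketch below uses arima from the stats package instead of armaFit, and safeFit is a hypothetical helper name introduced only for this example.

```r
# Hypothetical helper: returns the fit, or FALSE if arima() errors or warns
safeFit <- function(x, p, q) {
  tryCatch(arima(x, order = c(p, 0, q)),
           error = function(err) FALSE,
           warning = function(warn) FALSE)
}

set.seed(1)
x <- rnorm(200)

good <- safeFit(x, 1, 0)                   # a well-behaved AR(1) fit
bad  <- safeFit(rep(NA_real_, 50), 1, 0)   # a window with no data at all

# The type of the result tells the two cases apart
print(is.logical(good))
print(is.logical(bad))
```

Treating warnings as failures is the conservative choice here: a fit that did not converge cleanly is skipped rather than trusted.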

Some R packages provide similar functionality out of the box, for instance auto.arima in forecast and autoarfima in rugarch. So one can build one's infrastructure around one of these instead.

Forecasting

Once the parameters are selected, it’s time to determine the position at the close. One way to do that is with a one-day-ahead prediction: if the prediction comes out negative (remember, the series we are operating on is the daily returns), then the desired position is short; otherwise it’s long.

library( quantmod )
library( fArma )

getSymbols( "SPY", from="1900-01-01" )
spyRets = diff( log( Cl( SPY )["/2012-05-29"] ) )
spyArma = armaFit( ~arma(0, 2), data=as.ts( tail( spyRets, 500 ) ) )
as.numeric( predict( spyArma, n.ahead=1, doplot=FALSE )$pred )
# -0.0004558926

Now, to build an indicator for the backtesting, one can walk the daily return series and at each point perform the steps we covered so far. The main loop looks like this (shortened on purpose):

# currentIndex is the index of the day we are making a forecast for
# xx is the return series
# history is the look-back period to consider at each point
repeat
{
   nextIndex = currentIndex + 1

   # lags is how many days behind the data is; the default is 1,
   # meaning use data up to yesterday's close
   forecastLength = nextIndex - currentIndex + lags - 1

   # Get the series
   yy = xx[index(xx)[(currentIndex-history-lags+1):(currentIndex-lags)]]

   # Find the best fit (this calls the full version of armaSearch)
   bestFit = armaSearch(
                yy,
                minOrder,
                maxOrder,
                withForecast=TRUE,   # we want the model to have a valid forecast
                forecastLength=forecastLength,   # 1 for a daily forecast
                trace=trace,
                cores=cores )   # the number of cores to use

   # Default: without a usable forecast we stay out of the market
   forecasts[currentIndex] = 0

   if( !is.null( bestFit ) )
   {
      # Forecast
      fore = tryCatch( predict( bestFit, n.ahead=forecastLength, doplot=FALSE ),
                       error=function( err ) FALSE,
                       warning=function( warn ) FALSE )
      if( !is.logical( fore ) )
      {
         # Save the forecast
         forecasts[currentIndex] = tail( fore$pred, 1 )

         # Save the model order (order is assumed to be set from the
         # selected model in the code omitted here)
         ars[currentIndex] = order[1]
         mas[currentIndex] = order[2]
      }
   }

   # Advance even when no model was found, so the loop always terminates
   if( nextIndex > len ) break
   currentIndex = nextIndex
}

Here, history is the look-back period to consider at each point; I usually use 500, which is about two years of data. In other words, to determine the position for each individual day (the previous day's close to the current day's close determines the return) we use a history of 500 days, lagged by lags days. You will see later how lags comes into play in practice.

Notice that predict also has to be surrounded by a tryCatch block. armaSearch also has the nice feature of determining whether a model has a forecast or not (whether predict succeeds or not; this test is controlled via the withForecast parameter).
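To make the walk-forward idea concrete, here is a stripped-down sketch that runs on its own. It is not the code from the post: it uses arima from the stats package, a simulated return series, a short look-back window and a fixed ARMA(1,1) order instead of a search, all of which are assumptions made to keep the example small.

```r
set.seed(7)
# Simulated daily returns standing in for the real series
rets <- as.numeric(arima.sim(model = list(ar = 0.2), n = 130, sd = 0.01))

history <- 100                       # look-back window (the post uses 500)
positions <- rep(0, length(rets))    # 0 means out of the market

for (i in (history + 1):length(rets)) {
  window <- rets[(i - history):(i - 1)]   # data up to yesterday's close only

  # Fixed ARMA(1,1) for brevity; the post searches over many orders instead
  fit <- tryCatch(arima(window, order = c(1, 0, 1)),
                  error = function(err) FALSE,
                  warning = function(warn) FALSE)
  if (is.logical(fit)) next          # no usable model: stay flat

  fore <- as.numeric(predict(fit, n.ahead = 1)$pred)
  if (is.na(fore)) next              # no usable forecast: stay flat

  # Negative forecast means short, otherwise long
  positions[i] <- if (fore < 0) -1 else 1
}

print(table(positions[(history + 1):length(rets)]))
```

Each position is stored at the index of the day it applies to, so it can be multiplied directly with that day's return without any further shifting.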

Improving Performance

The number of computations we have to do adds up quickly. For example, for 10 years of historic data we need to compute forecasts for about 2,520 trading days. For each day we are going to fit and predict at least 35 models (35 = 6*6 - 1: 0 to 5 for both the AR and the MA component, excluding the (0,0) combination). Multiplying the number of models by the number of days, we are already looking at more than 88 thousand model fits (35 * 2,520 = 88,200). That’s a lot of computations.

One way to speed up these necessary computations is to exploit multi-core CPUs. My approach is to parallelize the model selection, the armaSearch function in the above code. Although this may not be the most efficient approach, it is certainly the most practical, since it will also boost the performance of armaSearch when used independently.
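As a rough illustration of parallelizing the model selection, the sketch below spreads a small grid of fits over two cores with mclapply from the parallel package. It is not the code from the Gist: the simulated series, the reduced (0,0) to (2,2) grid and the helper name fitAic are assumptions for this example, and arima from the stats package stands in for armaFit.

```r
library(parallel)

set.seed(3)
xx <- arima.sim(model = list(ar = 0.4, ma = 0.2), n = 300)

# All (p, q) pairs from (0,0) to (2,2), excluding (0,0): 3 * 3 - 1 = 8 models
orders <- expand.grid(p = 0:2, q = 0:2)
orders <- orders[!(orders$p == 0 & orders$q == 0), ]

# Fit one model and return its AIC, or NA if the fit fails or warns
fitAic <- function(i) {
  tryCatch(arima(xx, order = c(orders$p[i], 0, orders$q[i]))$aic,
           error = function(err) NA_real_,
           warning = function(warn) NA_real_)
}

# Forking is not available on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1L else 2L
aics <- unlist(mclapply(seq_len(nrow(orders)), fitAic, mc.cores = cores))

# which.min skips the NA entries left by failed fits
best <- orders[which.min(aics), ]
print(best)
```

Since the fits for one day are independent of each other, they parallelize cleanly; the walk over the days could be parallelized instead, but then a standalone model search would gain nothing.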

I won’t post the final version of the code here due to its length. I will give you the GIST link instead!

Modeling Volatility with GARCH

Financial time series are random in general. One of the few properties they exhibit is volatility clustering: volatile days tend to follow volatile days. This is typically modeled by extending the ARMA forecasting with a GARCH model. It sounds complex, and the theoretical details are complex indeed, but it turns out to be pretty straightforward in R:

library(quantmod)
library(fGarch)

getSymbols("SPY", from="1900-01-01")
spyRets = diff(log(Cl(SPY)["/2012-05-29"]))   # same returns as in the ARMA example
spyGarch = garchFit(~arma(0, 2) + garch(1, 1), data=as.ts(tail(spyRets, 500)))
# the actual forecasts are predict(spyGarch, n.ahead=1, doplot=F)[,1]

Of course, we also need to modify all relevant functions, like armaSearch. Calls to garchFit and predict also need to be handled via tryCatch. Notice also that for GARCH models predict returns a data frame (the mean forecast is in the first column) rather than a list with a pred element.
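To get a feel for what a GARCH(1,1) model captures, here is a small base-R simulation. The parameter values are illustrative assumptions, not estimates from the S&P 500, and the sketch does not use fGarch.

```r
set.seed(11)
n <- 5000
omega <- 1e-6; alpha <- 0.1; beta <- 0.85   # illustrative parameters

sigma2 <- numeric(n)   # conditional variances
r <- numeric(n)        # simulated returns
sigma2[1] <- omega / (1 - alpha - beta)     # start at the unconditional variance
r[1] <- sqrt(sigma2[1]) * rnorm(1)

for (t in 2:n) {
  # GARCH(1,1): today's variance depends on yesterday's squared return
  # and on yesterday's variance
  sigma2[t] <- omega + alpha * r[t - 1]^2 + beta * sigma2[t - 1]
  r[t] <- sqrt(sigma2[t]) * rnorm(1)
}

# Lag-1 autocorrelation of the returns and of the squared returns
acfRet <- acf(r, lag.max = 1, plot = FALSE)$acf[2]
acfSq  <- acf(r^2, lag.max = 1, plot = FALSE)$acf[2]
print(c(returns = acfRet, squared = acfSq))
```

In a series like this the returns themselves show little autocorrelation while the squared returns are positively autocorrelated, which is the signature of volatility clustering.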

The full source code is available from a GitHub Gist.

S&P 500 Performance

Let’s start with the equity curve of applying the ARMA+GARCH strategy over the full 60 years (since 1950) of S&P 500 historic data.

It looks fantastic! In fact, it impressed me so much that I looked for bugs in the code for quite some time. 🙂 Even on a logarithmic chart the performance of this method is stunning: a CAGR of 18.87%, and the ARMA+GARCH strategy achieves this performance with a maximum drawdown of 56%, comparable to that of buy and hold.

To compute the ARMA strategy growth, we first need the daily indicator (this indicator takes about two days to compute with all optimizations I covered in this post).

The first column is the date, the second the position for this day: 1 for long, -1 for short, 0 for none. Note that the position is already aligned with the day of the return (it is computed at the close of the previous day); in other words, the indicator is aligned properly with the returns, and there is no need to shift it right via lag. The indicator (the first data column once the dates are used as the index) needs to be multiplied with the S&P 500 daily returns. The rest of the columns are irrelevant and hopefully self-explanatory.

Let’s wrap up the post with the code that loads the indicator and plots the graphic:

library(quantmod)
library(lattice)
library(timeSeries)

getSymbols("^GSPC", from="1900-01-01")

# The maximum drawdown

# The largest drawdown is:
#         From     Trough         To      Depth Length ToTrough Recovery
# 1 2007-10-10 2009-03-09 2012-09-28 -0.5677539   1255      355       NA

# Filter out only the common indexes
mm = merge( gspcArmaInd[,1], gspcRets, all=F )
gspcArmaRets = mm[,1] * mm[,2]

# The maximum drawdown
# The largest drawdown is:
#          From     Trough         To      Depth Length ToTrough Recovery
# 1  1987-10-26 1992-10-09 1997-10-27 -0.5592633   2531     1255     1276

gspcArmaGrowth = log( cumprod( 1 + gspcArmaRets ) )

gspcBHGrowth = log( cumprod( 1 + mm[,2] ) )

gspcAllGrowth = merge( gspcArmaGrowth, gspcBHGrowth, all=F )

xyplot( gspcAllGrowth,
        superpose=TRUE,
        col=c("darkgreen", "darkblue"),
        lwd=2,
        key=list( x=.01,
                  y=0.95,
                  lines=list(lwd=2, col=c("darkgreen", "darkblue"))))
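The arithmetic behind the equity curves and the drawdown figures can be checked on a toy example. The positions and returns below are made up, and maxDrawdown is a hypothetical helper written for this sketch rather than the function used to produce the numbers above.

```r
# Hypothetical positions (1 long, -1 short, 0 flat) and daily simple returns
position <- c(1, 1, -1, -1, 0, 1)
rets     <- c(0.01, -0.02, -0.015, 0.03, 0.01, 0.005)

# Strategy return: the position held over the day times that day's return
stratRets <- position * rets

# Equity curves, starting from 1
stratEquity <- cumprod(1 + stratRets)
bhEquity    <- cumprod(1 + rets)

# Maximum drawdown: the largest relative drop from a running peak
maxDrawdown <- function(equity) min(equity / cummax(equity) - 1)

print(round(c(strategy = maxDrawdown(stratEquity),
              buyAndHold = maxDrawdown(bhEquity)), 4))
```

A short position turns a down day into a gain and an up day into a loss, which is why the strategy and buy-and-hold curves diverge whenever the indicator is -1.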

1. nikke says:

Hi!

I have been enjoying your very informative blog. If possible, I would be very interested in the full source. I would like to see if I could modify it to see how it would perform in a backtest using the quantstrat package.

Thanks!

1. ivannp says:

Hi,

For backtesting, I don’t use quantstrat for various reasons. From what I remember when I last considered using it, my plan was to generate the indicator (this step takes time) and then use the computed indicator as an input argument and simply copy out the position.

2. zach says:

Who do I email for the source code for the GARCH search?

1. ivannp says:

That’s the right place, the site is still being built and the feedback form is yet to come. 🙂

Re the code – I am not sure I want to publish it completely yet. It also requires some computational resources to perform a full simulation.

3. Vinod Devasia says:

Hi,

Recently started using R to do some stock analysis, and stumbled on your excellent blogs and got some very useful information. Is it possible to email me the full source? I want to study it and test it.

Regards

Vinod

Likewise, if you are open to it i’d love to stress-test the code on my end. Very interesting approach.

5. lkubota says:

Hello! Just out of curiosity here, the results you posted were produced by examining daily returns over a given lookback period and then trying to predict the next day return. Have you tried out your ARMA strategy on weekly returns? How do the results stack up against the strategy where daily returns are fed into your model instead? Also, it'd be interesting to see some other numbers such as winners % for example. Are you currently using this model to trade real money? Great post and do keep up the good work!

6. ivannp says:

Hi. I haven’t tried weekly returns, probably worth looking into it, although for weekly returns I’d prefer to use a model taking into account other features besides returns. More suitable for an SVM or a Neural Network.

Yes, I have been using the ARMA+GARCH strategy to trade a single financial instrument (not the SPY) for more than a year now. This is the main reason why I am reluctant to share the code.

Last, I am looking into updating the post with some more trading summaries and statistics, but haven’t done it so far, because I couldn’t come up with a satisfying (I am picky) format.:)

7. Prabin says:

Hi ivannp,
I am extremely thankful to you for putting up such useful r codes and info for quantitative analysis. I haven’t seen such organized procedures and codes for R for quant analysis anywhere else. I have been visiting your blog for a long time. I am trying to follow the codes here but i am afraid i am definitely missing some steps here. armasearch function gives me arma(5,2) for ‘SPY’ but you are using arma(0,2) for garchfit. May i know why ?. If am missing something please guide me and can you please mail me the full code to prabinseth@gmail.com. Thanks in advance

1. ivannp says:

Hi Prabin, always happy to hear from people who enjoy the blog, inspires me to not neglect it.:)

The code you are referring to is just an illustration of how to use garchFit. The (0,2) is completely random – I just chose some numbers. For real life use, one needs to create a garchSearch function, similar to the shown armaSearch. It is similar, but there are differences: the possible models consist of four elements, the first two are (AR,MA), but there are two GARCH components as well, garchFit replaces armaFit and also the results from garchFit are a bit more detailed (an array vs a number).

The code is not fully functional as it is. The reason I don’t want to post the full code is that I use it daily. The results of running it daily on the SPY are available on the S&P 500 page. It has both the daily position based on ARMA+GARCH, as well as, the action table for the end of day.

That’s the state of ARMA+GARCH, but I promise I won’t do the same for new stuff (SVMs are coming). I will publish a fully functional version of the code, although I won’t keep updating it with improvements.

8. Pete says:

Hi, Very interesting post. I have a question regarding the armaComputeForecasts function that produces rolling forecasts. When this produces a forecast does the date of the forecast (i.e. the index in the corresponding xts row) correspond to the date it was created or the date it is forecasting into, i.e. would I need to lag the forecast as usual with an indicator or is this already taken care of?

1. ivannp says:

It corresponds to the date it is forecasting into. No need to lag it further, just align with the return series.

9. Ronaldo says:

Hi ivannp,
I'm using the mixture of ARMA+GARCH but sometimes the garch fails to predict and returns NA (Bad Model). In that case, what do you do? Do you repeat the previous value or try to search again?

Just sharing: I'm comparing the functions garch and garchFit to compute GARCH(1,1) and the garch function is much faster than garchFit.

Best,

1. ivannp says:

Hi Ronaldo,
My approach is to cycle through all models between (0,0,1,1) and (5,5,1,1), ignore the ones that don’t converge and choose the one with the lowest AIC. If none of the models (36 in total) converges, then the prediction is 0, out of the market.

10. Miguel says:

Hi ivannp,

Maybe I’m wrong, but adding garch to an arma model only improves the confidence intervals, not the prediction. Do you use this info to size your position? Have you tried aparch instead of garch in order to tackle the asymmetry of volatility vs returns?

1. ivannp says:

Hi Miguel,
I can’t argue on the theoretical implications of adding garch to an arma model, but it definitely improves the predictions from my experiments. Notice that I don’t measure predictions as an absolute error, but more as a true/false value (correct guess for the direction).
The fGarch package supports using skewed distributions (sged,sstd) and they too seem to improve the predictions. Right now I am running out of resources to test anything new, but may give aparch a try sometime in the future. Thanks for suggesting it.

1. Miguel says:

That is interesting. It could be that adding garch increases the parameters and this affects the final model selected by the AIC in a way that improves the prediction.

11. Raman says:

Thanks. Very educational.

12. michaelv2 says:

Since the ARMA strategy outperformance looks quite time period-specific (the vast majority of excess returns appear to be generated between 1965-75), it would be much more useful to see a chart of rolling cumulative returns for each strategy (i.e. over 3 or 5 years). Also, ARMA returns are presumably gross of t-cost here, so strategy turnover is another very important consideration (are you able to share what it was?).

1. ivannp says:

Hi, in my old blog (http://theaverageinvestor.wordpress.com/2011/07/), I mentioned that there was one trade on average every 2.35 days. I remember counting the trades and dividing by the days.
The indicator for the series is available here: http://www.quintuitive.com/wp-content/uploads/2012/08/gspcInd3.csv. It needs to be matched against the S&P 500 cash index, no lagging, but then one can get all kinds of stats. I am certainly going to do that one day, just not sure when.
With this strategy, I am not too worried about transaction costs. Using a regular, retail account on Interactive Brokers, one can trade a share of SPY for $0.005. At the current price of $140, that’s negligible, unless done a few times a day.

13. survi says:

Hi,
Your post is not only interesting to read but also acts as a guide to people new to the field of quantitative finance.Being a beginner in this field,your blog seems to be a gold mine.I,have a few questions,however,i have used your Armasearch code on a specific instrument and found that with the indicators,it did not give a better performance than buy and hold,so,i have been trying to fit in the garchFit code using garch(1,1) as the garch errors, could you kindly guide me so that i would be able to do this?Relevant examples or links would be very helpful.
Also,i did not understand from your code,how exactly to execute the trade,i.e,entry and exit points,could you kindly guide me in the same?

14. survi says:

Hi,
Your blog is not only interesting but also informative for people new to the world of quantitative finance.I have a few questions,I have used the armasearch function for a certain instrument and upon backtesting found the results to be inferior to buy and hold,so i am trying to fit garch(1,1),could you kindly guide me regarding how to do the same?
Also,could you help me regarding entry and exit points for the indicator generated by you above?

1. ivannp says:

Hi, this is my best effort (without providing the source code itself) to explain how to use garchFit. You may want to try other arma approaches first; I would recommend the forecast package and its author’s book (http://otexts.com/fpp/), or the rugarch package. Both these packages provide a more scientific and advanced approach for arma model selection.

To apply the ideas on this blog in practice requires a significant amount of additional work. My only advice, which I have outlined in other posts, is to think about applying in real practice at every step.

2. Itay says:

Hi Ivan,

Thank you very much for the great introductions you provide for beginners (as myself) in quantitative finance.

In your work, you are walking the time series day by day, finding the best ARMA model – ARMA(p,q)
and then use the model to predict the next day’s direction.

Then to improve performance, you use the best arma parameters (p,q) for that time
with GARCH(1,1) to create a new model and use it to predict next day’s direction.

So you have a model with 4 parameters used in garchFit.

I am using a different GARCH library (not in R, it is in C#) and in it
the parameters for the model are only 2 (instead of 4) :

the number of auto-regressive (AR) parameters and the number of moving average (MA) parameters.

(as always creating a GARCH(1,1) without considering the ARMA(P,Q) is different).

1. faifeu says:

Hello Itay

It would seem that the reason you have only 2 parameters for your model is because you are trying to fit your data to an ARMA model without the heteroskedasticity component
The GarchFit method within the fGarch library in R allows to fit on a generalized autoregressive model (hence the 4 parameters)

Quick (related) question for you: could you point me to the C# library you are referring to? I, myself, am rather fond of C# (as I have a whole architecture built around it) and I would like to incorporate a data fitting library that allows to call for an ARMA model.

Thank you

15. MMP says:

Hi,

Your posts are really great and have a lot of valuable information. I tried looking at the daily indicator csv but it’s no longer up. Would I be able to have a copy to inspect? I’m currently testing the full arma code and want to know how to evaluate the results correctly before moving onto trying to implement the GARCH component.

1. ivannp says:

Updated the link – thanks for letting me know.

1. MMP says:

In your other post regarding backtesting rules, you made the good observation about signaling against prices inclusive of splits but not dividends while backtesting against the fully-adjusted prices (inclusive of splits AND dividends). How can you do the former using getSymbols and Yahoo as a datasource? It’s my impression that you can only adjust directly for both rather than just one.

1. ivannp says:

adjustOHLC from quantmod does that: adjustOHLC(x, adjust="split", use.Adjusted=FALSE). Use the symbol.name argument if the variable has a different name than the actual symbol.

1. MMP says:

I have specific questions about the GARCH implementation that you probably don’t want discussed in the comments section. If you can see my e-mail in the WP admin area, would you be open to discussing them privately?

16. Anthony says:

Hi,

Very interesting post, but the source code seems to be not available anymore… Can anyone send it to me? My email address: dentelle55@yahoo.fr

Thks a lot

1. ivannp says:

Hi, what source is not available anymore? Send me the link that is expired and I will update it.

17. Dansan says:

Hi!
This is really great work! But I have some questions about your model.
This approach is commonly used for volatility models (arma(p,q) + garch(1,1)). What is the difference between your model and volatility models? Most of them are forecasting the volatility and not like you the returns on the next day, aren’t they?… I dont get the difference so far… Have you ever considered to use an EGARCH or TGARCH model?

1. ivannp says:

Hi. I have wondered the same too, but I don’t see a reason why we cannot use the mean forecasts too. From my experiments ARMA+GARCH forecasts are superior in terms of predictive power compared to only ARMA forecasting.
One paper I know, which uses similar method (among other things) is “Technical Trading, Predictability and Learning in Currency Markets” … I haven’t used EGARCH/TGARCH models.

18. Alex says:

Hi, Ivan,
I've a question regarding choosing the model. You run through all combinations in (0,0) to (5,5) and choose the best based on AIC.
But what if this “best” model results in insignificant coefficients of arma or garch? And that happens quite often. Is that because of in-sample data then?
Thanks for the blog and the answer, really nice information in the blog on how to use academia in practice, as before I was only developing models for university.

1. ivannp says:

Zero-ing insignificant coefficients is one way to address this. I am sceptical how much improvement it provides, but I haven’t tested it seriously. A further question would be what to do when all coefficients are insignificant? Throw the model and use the next best based on AIC, or exit the market?

1. Alex says:

Indeed, it is often possible to proceed to another model with almost the same AIC. Actually the model estimation very much depends on the size of the in-sample data (depending on stage of volatility). One may repeat your estimation and forecasting procedure for different sizes of in-sample data expecting robust results.
Everywhere above you only mention the forecasting on n steps ahead. Have you tried simulations to price derivatives for instance? Do you think also it is possible to model short interest rate with arma-garch estimated on overnight or weekly data such that it would fit the current term structure?

1. ivannp says:

Running multiple windows simultaneously is a very interesting idea. I will definitely consider running a test. In fact, instead of choosing by AIC, one can probably use voting between all models that provide a prediction …
I never used ARMA/GARCH for pricing derivatives, but my understanding is that pricing derivatives is its main application. Weekly/monthly data is useful unless too volatile, even on daily, I have seen the models having troubles with some more volatile futures.

19. Yuri says:

Hi Ivan, how you doing?

I've been trying to replicate the spreadsheet with signals (that you've posted above) but I didn't succeed. Are you just running this Arma-garch model (information period of 500 trading days) and using the forecast of the fitted model to define your trade position for the following day? About the specification of your model…. related to the innovation terms … which distribution are you using? “Generalized Error”??

Thanks for your attention and I would like to congratulate you on the work that you've done at your blog…. I've been following your work for some time and the discussions here are very educative/constructive!

1. ivannp says:

Glad to hear the blog is interesting and people find it useful. Are you using my code, from the web site? I rarely use anything but the skewed generalized error distribution (“sged” is the garchFit parameter for it). If you see differences, send me a repro and I will take a look.

1. Yuri says:

Ivan… is there any email that I can send you an spreadsheet containing the backtest that i made? Send it to yuriverges@globo.com

Thanks!

20. sayantan says:

HI,
I am new to R, and this is a very helpful and very informative blog. Thanks.
Can you please give the dataset you used to determine the order of ARMA process (xx dataset). I ran the armaSearch function in the R console using my dataset but it did not give back any results.

(sayantanp@igidr.ac.in)

21. Nick says:

No need for the above, I can work it out. Are you running these simulations on a linux machine? As windows doesnt seem to allow one to use multi-core. Are you aware of any packages that will help one do that? I’ve found a few but not sure which one is best for fGarch.

1. ivannp says:

I opened the source a bit later – https://gist.github.com/ivannp/5198580. It’s in the post, probably a bit hard to find.
Yes, I am using linux, and yes, the parallel package used to have some issues on windows, but I think these have been cleared up. No?
Hope this helps!

22. Srini says:

Hi,

Thanks for the lovely post. It is well written and pretty useful for some one looking to branch out in this area.

BTW, this link no longer works.

Many Thanks and Regards,
Sundar

1. ivannp says:
23. Andy says:

Hi,

I am using the ARMA(P,Q)/GARCH(p,q) model in my dissertation but I don’t know how to choose my P,Q,p,q values. Under simple ARMA I know I just have to look at ACF/PACF of the Time Series but I’m lost for ARMA/GARCH model.

Thanks

1. ivannp says:

Hi, the method I use is to cycle through a set of models and to choose the “best” based on one of the well known stats – AIC/BIC/HIC/etc. I learned this approach for ARMA models from the “Introductory Time Series with R” (http://www.amazon.com/Introductory-Time-Series-Paul-Cowpertwait/dp/0387886974). The source code for my approach is from this post: http://www.quintuitive.com/2013/03/24/automatic-arma…on-in-parallel/.

1. Andy says:

I saw in papers that the garch model use error terms to calibrate itself but how do I get the error term when I don’t even have a model for the mean equation?

Thanks

24. Andy says:

Should garch model be applied on the residuals(Et) of a series or on the series itself(Xt)?

I saw in the book that you suggested that it applies the garch function to simulated errors then it applies it to the SP500 data, I got confused by that.

1. ivannp says:

I am pretty sure they are not in error on how to use GARCH, but you may want to check with the literature. In the chapter of selecting ARMA model (no GARCH) however they do cycle through various models and select the one based on AIC. The forecast package by Rob Hyndman has similar approach for ARMA. I simply am using the same approach for ARMA+GARCH. Cycling through a predefined set of models gives you an opportunity to look and compare other metrics too – confidence intervals for instance.

1. Andy says:

I tried using the fGarch package but I need to specify the parameters, isn’t there a function is the package that looks for the best ARMA-GARCH model?like the auto.arima do
What about you, what data do you feed into the GARCH model?residuals?or the returns?

thanks

1. ivannp says:

My autoGarch function is available here: https://gist.github.com/ivannp/5198580. It is based on what I described in the “ARMA models for Trading” post: http://www.quintuitive.com/2012/08/22/arma-models-for-trading/. Hope this helps.

1. Andy says:

Does it fit the ARMA first then use the residuals to calculate the GARCH?or does it do it in parallel?

I’m new to R so I don’t understand very much what is written in the codes.

25. lucas says:

Very interesting, thank you.

26. I do not know if it’s just me or if perhaps everyone else experiencing problems with your site.
It appears as if some of the text within your content are running
off the screen. Can someone else please provide feedback and let me know if this is happening to them as well?

This may be a issue with my internet browser
because I’ve had this happen previously. Thanks

1. ivannp says:

This is the first time I am hearing such a complaint. I will keep an eye out for similar reports.

27. Galapagos says:

Hi Ivan,

I loved reading your blog on this. I used the alternative auto.arima() function instead of your (much slower and more expensive) ARMAsearch function but that one gave drastically different backtests and performed worse than Buy-and-Hold. It didn’t replicate your results based on your ARMAsearch, but it did however capture a lot of profits around the ’08 crisis, much like your ARMAsearch did, but it still doesn’t really compare. That was interesting to me. For the moment I am reading the auto.arima() source code and comparing it to your ARMAsearch. It appears you did a grid search; auto.arima() does a local search (which explains the speed).

May I ask what sorts of hardware are you using nowadays? Do you do any GPU computations?

1. ivannp says:

Hello, glad you like my blog. For my use, I find the Intel CPUs to give sufficient performance and parallelization. The hardware I use is a quad-core i7 with hyperthreading, which makes it “almost” 8-way. On such a machine, an ARMA+GARCH backtest takes less than a day (if my memory is correct) for about 50 years of data. It does all the work for forecasting on-close decisions for a specific day (i.e. the work needed to prepare for a trading day) in about a couple of hours.

Indeed, you are right: the auto.arima function uses a different algorithm, which doesn’t analyze all outcomes. From my experience, it’s not straightforward to replicate results 100% between packages, especially when the distribution of the residuals is involved. I noticed the same when, at some point, I briefly tried the rugarch package.

28. gwwc says:

Hi Ivan,
I am a newbie to mathematical finance. I was just discussing the use of ARMA models in real trading with my professor last week. I found your detailed model very interesting, so I tried to study it line by line. I printed out the standard error along with the prediction and found that the magnitude of the standard error is far greater than the prediction. I was wondering whether that would pose much risk for an individual decision, limiting the model to working only over a large number of decisions, and perhaps not when using the strategy for a short period of time.
I hope to get your thoughts on this. Thanks.

1. ivannp says:

That’s a problem and it has been discussed in other comments already. If one doesn’t want to use such method because of lack of statistical merits – so be it. An alternative approach would be to develop a system that uses a method while “it works”.
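For readers who want to see the effect described in the question, the sketch below prints a one-step forecast next to its standard error using arima and predict from the stats package. The data is simulated with a deliberately weak AR(1) coefficient (an assumption meant to mimic weak daily predictability), so the numbers are illustrative only.

```r
# Simulated weakly autocorrelated "returns"; not real market data
set.seed(1)
rets <- as.numeric(arima.sim(model = list(ar = 0.1), n = 500, sd = 0.01))

# Fit an AR(1) model and forecast one step ahead
fit <- arima(rets, order = c(1, 0, 0))
fc <- predict(fit, n.ahead = 1)

pred <- as.numeric(fc$pred)  # point forecast for the next observation
se <- as.numeric(fc$se)      # standard error of that forecast

cat("forecast:", pred, " standard error:", se, "\n")
cat("approx. 95% interval:", pred - 1.96 * se, "to", pred + 1.96 * se, "\n")

# The rule from the post only uses the sign of the forecast
position <- if (pred < 0) -1 else 1
```

With predictability this weak, the interval typically straddles zero, which is the concern raised above: a single decision carries little statistical weight, and any edge can only show up over many decisions.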

29. Bill says:

Hey ivannp,
Great blog, thanks. I have been using your code for some research… would you be willing to post the source code for creating the indicator matrix? Thanks.

1. ivannp says:

Hi, is this link https://gist.github.com/ivannp/5198580 what you are looking for? It’s a stripped down and older version of what I actually use.

1. Bill says:

Thanks… The only thing that isn’t clear to me: in garchAutoTryFit, what does “ll” represent? Thanks!

1. ivannp says:

mclapply takes models as its first argument: a list of all the models we want to compute (each model is itself a list, so we have a list of lists). It then calls garchAutoTryFit for each individual model from this list, passing the model as its first argument.

The following line adds a new model to the list in garchAuto:

models[[length( models ) + 1]] = list( order=c( p, q, r, s ), dist=dist )

Each model is also a list, containing the order (accessed via $order) and the distribution (accessed via $dist).

Now I feel it’s a bit of an ugly way to do things, but it gets the work done. :)
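A small self-contained sketch of this dispatch pattern follows. The function tryFit is a hypothetical stand-in for garchAutoTryFit: it fits a plain ARMA model with arima from the stats package and ignores the GARCH part of the order and the distribution, so it runs without fGarch. The data is simulated and the "sged" value is only a placeholder.

```r
library(parallel)

# Hypothetical stand-in for garchAutoTryFit. "ll" is a single model
# specification (a list with $order and $dist); NULL is returned on failure.
tryFit <- function(ll, data) {
  tryCatch(
    arima(data, order = c(ll$order[1], 0, ll$order[2])),
    error = function(e) NULL
  )
}

# Simulated series standing in for the returns (xx)
set.seed(7)
xx <- as.numeric(arima.sim(model = list(ar = 0.3), n = 300))

# A list of lists: every element carries its own $order and $dist
models <- list()
models[[length(models) + 1]] <- list(order = c(0, 1, 1, 1), dist = "sged")
models[[length(models) + 1]] <- list(order = c(1, 1, 1, 1), dist = "sged")

# Each element of models arrives in tryFit as its first argument, ll.
# mc.cores = 1 keeps the example portable; raise it on multi-core Unix systems.
fits <- mclapply(models, tryFit, data = xx, mc.cores = 1)
```

The result, fits, is a list with one entry per model specification, in the same order as models.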

1. Bill says:

Ok… that makes sense to me, but what is actually building the ll? garchAutoTryFit and garchAuto allow you to optimize the parameters for the prediction you make with garchFit… I know that the “data” or “xx” in the code is the return series, but I don’t see how to execute the functions without an initial ll. Thanks!

2. ivannp says:

ll is constructed inside garchAuto, using min.order, max.order and a few other parameters passed to the routine by the user. If min.order is (0,0,1,1) and max.order is (5,5,1,1), garchAuto constructs an ll which contains all possible variations within these limits, for instance, it will contain (0,0,1,1), (0,1,1,1), etc. By default, the routine chooses the best model within (0,0,1,1) and (5,5,1,1).
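The enumeration can be sketched in a few lines. The helper buildModels below is hypothetical: it mirrors the description above but is not the code from the gist, and it leaves out the "few other parameters" mentioned there. The default distribution name is a placeholder as well.

```r
# Hypothetical helper: enumerate every (p, q, r, s) between min.order and max.order
buildModels <- function(min.order = c(0, 0, 1, 1),
                        max.order = c(5, 5, 1, 1),
                        dist = "sged") {
  models <- list()
  for (p in min.order[1]:max.order[1]) {
    for (q in min.order[2]:max.order[2]) {
      for (r in min.order[3]:max.order[3]) {
        for (s in min.order[4]:max.order[4]) {
          # Same append idiom as the line quoted earlier in this thread
          models[[length(models) + 1]] <- list(order = c(p, q, r, s), dist = dist)
        }
      }
    }
  }
  models
}

models <- buildModels()
length(models)     # 6 values of p times 6 values of q = 36 candidates
models[[1]]$order  # first candidate: 0 0 1 1
models[[2]]$order  # second candidate: 0 1 1 1
```

Each element of the resulting list is one ll, so the user never has to build an initial ll by hand.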

3. Bill says:

Ok… thanks. I have been trying to run garchAuto using a return series as the xx input, but I only receive NULL.

30. statisfied says:

Very informative blog! I am planning to use a similar strategy using auto.arima(), without success so far – just starting though.
– What was your approximate CAGR using only ARIMA models without GARCH?
– How do you decide which position to take: do you buy as soon as the forecast return is positive and sell if it is negative, or do you implement minimal thresholds (to avoid selling or buying if the difference is too small)? If so, how do you define these thresholds?
– Could you please cite some of the reasons why you don’t forecast on the original series? Is it a critical condition IYO?
– Can you advise on how I could proceed with my (currently) unsuccessful auto.arima() strategy?

Thanks!

1. quintuitive says:

ARIMA without GARCH is not very good on the SPY, nor on other ETFs. Even with GARCH, it needs additional work to come up with something trade-able.

I assume I am able to execute the trades at the close, which is achievable in real life. The easiest is to trade the futures (open 24/7); however, one needs to backtest it properly.

ARMA/GARCH are used on stationary time series. The returns are stationary, the closing prices are not.
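A rough way to see the difference is to compare the lag-1 sample autocorrelation of a price series with that of its log returns. The sketch below uses simulated prices (a geometric random walk, which is an assumption and not market data) and base R only. It is an informal diagnostic, not a formal unit-root test.

```r
set.seed(123)

# Simulated closing prices following a geometric random walk (illustration only)
prices <- 100 * exp(cumsum(rnorm(1000, mean = 0, sd = 0.01)))

# Daily log returns, computed the same way as in the post
rets <- diff(log(prices))

# Lag-1 sample autocorrelation; element 1 of $acf is lag 0, element 2 is lag 1
acf1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

cat("lag-1 autocorrelation, prices: ", acf1(prices), "\n")
cat("lag-1 autocorrelation, returns:", acf1(rets), "\n")
```

For a random walk, the price autocorrelation sits close to 1 and decays very slowly, while the returns show almost none, which is why the models are fitted to returns.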

31. Julien says:

IVANNP,

I am a novice trader looking to apply a degree in stats to the world of financial markets. I saw that you didn’t want to share the code a few years back, but if there is any form/script I could look through and use to better learn R, then I would be more than grateful if you could send it my way. Thanks again for the post, it was excellent.