Trading with SVMs: Performance

To get a feel for SVM performance in trading, I ran different setups on the S&P 500 historical data from … the 50s. The main motive for using this decade was to decide which parameters to vary and which to keep steady before running the most important tests. Treat it as an “in-sample” test to avoid (further ;)) over-fitting. First, the performance chart:

[Chart: S&P 500 Trading Performance]

Very nice! Using the 5 lagged daily returns shows similar performance to the ARMA+GARCH strategy, which I found very promising. If you wonder why I am so excited about this fact, it’s because here we are in the area where ARMA+GARCH is best, and yet, SVMs show comparable performance.

The indicators for the tested methods are: the ARMA+GARCH indicator, the SVM with statistics indicator, the SVM on the last 5 daily returns indicator and the SVM on the last 5 daily returns with greedy feature selection indicator.
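
As a rough illustration of the "last 5 daily returns" input, the sketch below pairs each day's return (the target) with the five returns that precede it (the features). It is written in Python purely for illustration; the code for this series is in R. The function name `lagged_features` and the sample numbers are made up for the example.

```python
def lagged_features(returns, n_lags=5):
    """Pair each return with the n_lags returns that precede it.

    Returns (features, targets): features[i] holds the n_lags returns
    before targets[i], most recent first.
    """
    features, targets = [], []
    for t in range(n_lags, len(returns)):
        window = returns[t - n_lags:t]      # the n_lags days before day t
        features.append(window[::-1])       # lag 1 first, lag n_lags last
        targets.append(returns[t])
    return features, targets


# Hypothetical daily returns, for illustration only.
daily = [0.010, -0.020, 0.005, 0.000, 0.015, -0.010, 0.020]
X, y = lagged_features(daily, n_lags=5)
```

With seven returns and five lags this yields two rows. Each row of `X` contains only information available before the corresponding entry of `y`, which is what keeps the forecast free of look-ahead.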

The statistics are also impressive:

                    Buy and Hold  ARMA+GARCH  SVM Lags  SVM Lags Greedy  SVM Stats
Cumulative Return:        95.93%     260.97%   218.96%          284.04%    274.21%
Annualized Return:        15.14%      30.88%    27.53%           32.59%     31.87%
Sharpe Ratio:               1.33        2.86      2.44             2.89       2.83
Winning Pct:              52.37%       51.4%    54.77%           54.93%     54.13%
Annualized SD:            0.1137      0.1078     0.113           0.1127     0.1127
Max Drawdown:            -14.82%     -15.44%   -16.18%          -11.85%    -10.27%
Avg Drawdown:             -3.43%      -1.61%    -2.08%           -1.77%     -1.63%
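
For readers who want to compute similar statistics on their own return series, here is a minimal Python sketch. The post does not state the exact formulas behind the table, so everything here is an assumption: 252 trading days per year, geometric annualization of the return, a Sharpe ratio with a zero risk-free rate, and a winning percentage defined as the share of periods with a positive return. The function name `performance_stats` is hypothetical, and the numbers it produces may differ from the table if other conventions were used.

```python
import math
import statistics


def performance_stats(returns, periods_per_year=252):
    """Summary statistics for a series of simple per-period returns."""
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r                       # compound the equity curve
        peak = max(peak, equity)                # running high-water mark
        max_dd = min(max_dd, equity / peak - 1.0)
    n = len(returns)
    ann_return = equity ** (periods_per_year / n) - 1.0   # geometric
    ann_sd = statistics.stdev(returns) * math.sqrt(periods_per_year)
    ann_mean = statistics.mean(returns) * periods_per_year
    return {
        "cumulative_return": equity - 1.0,
        "annualized_return": ann_return,
        "annualized_sd": ann_sd,
        "sharpe_ratio": ann_mean / ann_sd,      # risk-free rate assumed 0
        "winning_pct": sum(r > 0 for r in returns) / n,
        "max_drawdown": max_dd,
    }


# Four made-up returns, just to exercise the function.
stats = performance_stats([0.10, -0.05, 0.02, -0.10])
```

The drawdown is measured against the running peak of the compounded equity curve, so a strategy can have a high cumulative return and still show a deep maximum drawdown.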

While writing this post, I found another effort to use SVMs in trading by Quantum Financier. His approach uses RSI of different lengths as input to the SVM, and it also uses classification (it maps the returns to two values, short or long) instead of regression. Since I was planning to try classification anyway, his post inspired me to implement it and run an additional comparison, regression vs classification:

[Chart: S&P 500 SVM Trading – Regression vs Classification]

What can I say – they both seem to work very well. As a reader suggested in the comments, classification does exhibit more consistent returns.

                    Regression  Classification
Cumulative Return:     214.47%         214.42%
Annualized Return:      26.05%          26.05%
Sharpe Ratio:             2.32            2.32
Winning Pct:            56.54%          54.93%
Annualized SD:          0.1124          0.1124
Max Drawdown:          -16.18%          -8.22%
Avg Drawdown:           -2.18%          -1.72%

Looking at the table, classification cut the maximum drawdown in half but, interestingly, it didn’t improve the Sharpe ratio significantly. Nothing conclusive here though; it was a quick run of the fastest (in terms of running time) strategies.
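
The difference between the two setups can be sketched in a few lines of Python (an illustration only, not the R code behind the charts). In the classification case the training targets are mapped to two labels before fitting. In the regression case the model predicts a return and only its sign decides the position. How a zero return is treated is my assumption here (it counts as long), and all function names are hypothetical.

```python
def to_class_labels(returns):
    """Map each return to +1 (long) or -1 (short); zero counts as long here."""
    return [1 if r >= 0 else -1 for r in returns]


def positions_from_regression(predicted_returns):
    """Use the sign of a predicted return as the position."""
    return [1 if p >= 0 else -1 for p in predicted_returns]


def strategy_returns(positions, realized_returns):
    """Return earned each day by holding the given position."""
    return [pos * r for pos, r in zip(positions, realized_returns)]


# Made-up numbers for illustration.
realized = [0.01, -0.02, 0.005]
labels = to_class_labels(realized)              # training targets, classification
predicted = [0.003, 0.001, -0.002]              # hypothetical regression output
positions = positions_from_regression(predicted)
pnl = strategy_returns(positions, realized)
```

Either way the traded signal ends up as long or short. What changes is the training target, so the two models can disagree on individual days even when their aggregate statistics are close.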

There is still a long list of topics to explore. Just to give you an idea, in no particular order:

  • Add other features. I am mostly thinking of adding some Fed-related series. This data goes back to 1960, so it’s coming soon. :)
  • Try other SVM parameters: other regressions, other classifications, other kernels, etc. This is more like a stability test.
  • Try other error functions. The default is to use the mean square error, but in the case of regression, why not use the Sharpe ratio (in-sample)? The regression case is simpler, since we have the actual returns – check the input of tune.control.
  • Try longer periods instead of days. Weekly is a start, but ideally I’d like to implement two- or three-day periods.
  • Vary the lookback period.
  • Use more classes with classification: large days, medium days, etc.
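
To make the "other error functions" item concrete, here is a small Python sketch of scoring candidate parameter sets by in-sample Sharpe ratio instead of mean square error. It is plain Python and does not use e1071 or tune.control. The function names, the 252-day annualization, and the sample numbers are all assumptions made for the illustration. The Sharpe ratio is negated so that, like an error, lower is better.

```python
import math
import statistics


def negative_sharpe(predicted, actual, periods_per_year=252):
    """Error-style score: minus the annualized Sharpe ratio of trading
    the sign of each prediction against the realized returns."""
    traded = [(1 if p >= 0 else -1) * a for p, a in zip(predicted, actual)]
    sd = statistics.stdev(traded)
    if sd == 0:
        return float("inf")        # flat P&L: treat as the worst score
    sharpe = statistics.mean(traded) / sd * math.sqrt(periods_per_year)
    return -sharpe


def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and realized returns."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical in-sample predictions from two candidate parameter sets.
actual = [0.010, -0.020, 0.015, -0.005]
candidate_a = [0.05, -0.05, 0.05, -0.05]       # right signs, poor magnitudes
candidate_b = [0.001, 0.001, 0.001, 0.001]     # small errors, always long

mse_a = mean_squared_error(candidate_a, actual)
mse_b = mean_squared_error(candidate_b, actual)
score_a = negative_sharpe(candidate_a, actual)
score_b = negative_sharpe(candidate_b, actual)
```

In this toy sample the mean square error prefers `candidate_b` while the Sharpe-based score prefers `candidate_a`. That is the motivation for the idea: the criterion that picks the parameters can reward trading the right direction rather than forecasting the right magnitude.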

This will take time. As always, feedback and comments are welcome.

12 Response(s) for “Trading with SVMs: Performance”

  • MC says:

    Great work.

    1) Do you have Sharpe Ratios for SVM regression vs SVM classification? By eyeballing the chart, classification seems to give better risk adjusted returns.

    2) Have you heard of the Caret package? It seems to have already incorporated a lot of the work I see you use in your existing code. Another huge benefit is that you can easily swap the ML algorithm (e.g. neural networks) without having to recode everything.

    Very interesting blog!

    1. ivannp says:

      1) Good point, it cut the drawdown, but not the Sharpe ratio significantly – I updated the post.
      2) Thanks for bringing up the Caret package, this is the second time I hear about it, so about time to take a closer look.:) Looks quite promising, certainly a lot to learn from it.

  • Adrian says:

    Do you share your code? The results are really impressive. I’m trying to build an SVM classifier doing something similar, but I want to use more parameters than pricing. Although, maybe that isn’t so helpful, because it seems prices alone provide predictive value. Thanks.

    1. ivannp says:

      Hi Adrian,

      I am also planning to use more than just prices, but that type of data is not available for the 50s. In general, the 50s on the S&P 500 are quite predictive. More complex models are likely to be needed onwards.

      Check the previous post in the series – there is a link to the code I used, based on the e1071 package. Since I posted the code, I have moved to the caret package, which gives a unified interface to many models. It looks pretty good so far too.

  • Adrian says:

    Sounds good, thanks for sharing. All the best.

  • Rgis says:


    I am also trying to use SVM (SVR) to predict the closing price of stocks, i.e., index values such as CAC40, DJ, etc.

    My idea is very simple: I download the data from a broker website, where I have access to 3 years of data. The response is the closing value of the index. I assume that the features of the previous day have an impact on the closing value of the next day, i.e., the highest, smallest, and opening values for Monday are the features used to predict (to explain) the closing value for Tuesday. I build my dataset on this assumption, so I use features with lag 1; obviously I can add other features such as lag 2 and lag 3. Here is a sample of my data structure:

    openinglag1  highestlag1  smallestlag1  closing(response)  volume lag1
    3950.59      3959.2       3936.33       4013.97            589818

    In the end I have a data set of 764 observations. I use the whole data set to train the SVR, and I predict the next day as mentioned above.

    My questions are: how can I predict, for example, the next 5 days? Is my data structure right?


    1. ivannp says:

      Hi Rgis,

      Not happy with the rolling forecast in svmComputeForecasts? See what it does for modelPeriod='weeks', for instance.

      An alternative is to do weekly forecasts right up front. In other words, summarize the data into weeks (or three/four-day chunks), and call svmComputeForecasts using 'days' on this set. Each prediction then applies to the entire period.

      As far as I know, one cannot just do a five day ahead probabilistic prediction with SVMs (this is doable with ARMA techniques).
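
One way to summarize daily data into multi-day periods, as suggested in the reply above, is to compound consecutive daily returns into fixed-size chunks. The Python sketch below is only an illustration of that step. It is not part of svmComputeForecasts, the function name `compound_into_chunks` is hypothetical, and it groups by a fixed number of trading days rather than by calendar weeks.

```python
def compound_into_chunks(daily_returns, chunk_size=5):
    """Compound consecutive daily simple returns into multi-day returns.

    A trailing partial chunk is dropped so every period has equal length.
    """
    chunks = []
    usable = len(daily_returns) - len(daily_returns) % chunk_size
    for start in range(0, usable, chunk_size):
        growth = 1.0
        for r in daily_returns[start:start + chunk_size]:
            growth *= 1.0 + r       # compound within the chunk
        chunks.append(growth - 1.0)
    return chunks


# Hypothetical daily returns, grouped into three-day periods.
three_day = compound_into_chunks([0.01, 0.02, -0.01, 0.00, 0.03, 0.01, 0.02], 3)
```

The resulting series can then be treated as if each chunk were a single "day", so one forecast covers the whole period.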

      Hope this helps,

  • IT says:

    I’ve been enjoying your posts and have a question. I was wondering what kind of improvement you found when moving from the simple ARMA model to the ARMA-GARCH model? Have you tested any other rolling window training parameters? Also, did you find that the short side made much of a difference (i.e., is it much better or worse) compared with long only?


    1. ivannp says:

      Hi, I addressed some of these questions in a later post. Adding more statistics to the ARMA+GARCH tutorial is certainly on my list, but it will take time. One can do all this analysis by using the indicator together with the GSPC data from Yahoo. The indicator is already aligned – no need to lag.

  • ozooha says:

    Very impressive. However, have you tried using random forest? It claims to be superior to SVM, as it allows for implicit non-linear effects and interaction terms among the exogenous variables. It also whittles down the exogenous variables to the most important play-makers, and it’s rather fast as well, especially with your dataset.

  • IT says:

    Hi and thanks for the earlier reply. One thing I’m a bit confused about is that in the ARMA+GARCH post you mention 18.87% CAGR, and B&H looks to be about 7% CAGR from eyeballing the chart. Yet in the table above you show 30.88% and 15.4% for ARMA+GARCH and BH, respectively.
    Is it a different time-frame or am I missing something? Thanks again.

  • IT says:

    I see… it is only a 5 yr sample here. I was unable to delete the comment. Do you have CAGR for all systems across all years?
