Deep Learning for the Walk-Forward Loop

In the previous posts in this series (here, here and here) I used conventional machine learning to forecast trading opportunities. Lately, however, I have been moving more and more towards deep learning. My first step was to extend the walk-forward loop to support neural networks, the building blocks of deep learning.

To experiment with a neural network, I could have simply used the Multi-Layer Perceptron from the scikit-learn Python package. That would have been sufficient to get started, but my feeling is that in the long run I will want to use deep learning more, so I decided to go a bit more advanced right off the bat.

There are quite a few deep learning frameworks out there, and for a newcomer Google’s TensorFlow seemed to make the most sense. Since I work at Microsoft (hence a disclaimer: my opinions about Microsoft’s products may be biased), the only other framework I considered was Microsoft’s Cognitive Toolkit (also known as CNTK). I decided to go with the latter, but I will cover both in what follows.

Take a look at the walk-forward implementation from my original post. There are four calls to the classifier (QDA in that example), namely:

  • the constructor
  • fit
  • predict
  • predict_proba

So I needed to implement these four to get something working, and I created the CntkClassifier for that purpose. The classifier builds upon CNTK’s time series example. If you are new to deep learning, these Python notebook tutorials are a great place to get started.
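To make the four-call contract concrete, here is a minimal sketch of a stand-in classifier plugged into a simplified walk-forward loop. Both PriorClassifier and walk_forward are hypothetical names for illustration, and the expanding training window is an assumption; the loop from the original post may differ.

```python
import numpy as np


class PriorClassifier:
    """Hypothetical stand-in exposing the four calls the loop relies on."""

    def __init__(self):
        self.classes_ = None
        self.priors_ = None

    def fit(self, x, y):
        # Remember the distinct classes and how often each one occurred
        self.classes_, counts = np.unique(y, return_counts=True)
        self.priors_ = counts / counts.sum()
        return self

    def predict_proba(self, xx):
        # One row of class probabilities per input row
        return np.tile(self.priors_, (len(xx), 1))

    def predict(self, xx):
        # Pick the most probable class and map it back to the user's labels
        probs = self.predict_proba(xx)
        return self.classes_[np.argmax(probs, axis=1)]


def walk_forward(make_classifier, x, y, min_train):
    """Refit on an expanding window (an assumption) and forecast the next row."""
    forecasts = []
    for ii in range(min_train, len(x)):
        clf = make_classifier()
        clf.fit(x[:ii], y[:ii])
        forecasts.append(clf.predict(x[ii:ii + 1])[0])
    return np.array(forecasts)


x = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([1, 1, -1, 1, 1, 1, -1, 1, 1, -1])
forecasts = walk_forward(PriorClassifier, x, y, min_train=5)
```

Any object with the same four calls, a neural network wrapper included, can be passed in place of PriorClassifier.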

Without going into too much detail, my classifier first builds the computational graph (that’s why it’s deep learning – the depth and complexity of the algorithm are exposed to the user of the framework):

# The input and output tensors
input = Input(x.shape[1])
label = Input(yy.shape[1])

# Setup some default hidden layers
if self.hidden_layer_sizes is None:
    self.hidden_layer_sizes = np.full(2, x.shape[1]*2)

# Build the model: a stack of fully connected ReLU layers
hh = input
for size in self.hidden_layer_sizes:
    hh = Dense(size, init=glorot_uniform(), activation=C.relu)(hh)
# Output layer: raw scores (logits); the softmax is applied inside the loss
hh = Dense(yy.shape[1], init=glorot_uniform(), activation=None)(hh)

# Training loss and the evaluation metric reported during training
loss = C.cross_entropy_with_softmax(hh, label)
label_error = C.classification_error(hh, label)
# Plain SGD with a per-minibatch learning rate
lr_per_minibatch = learning_rate_schedule(self.learning_rate_init, UnitType.minibatch)
trainer = C.Trainer(hh, loss, label_error, [sgd(hh.parameters, lr=lr_per_minibatch)])
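The graph above consumes yy as a one-hot label matrix, and predict later maps back through self.classes_, but the encoding step itself is not shown. A minimal NumPy sketch of how it could be done follows, together with what a softmax cross-entropy loss computes on raw scores. The helper names one_hot and softmax_cross_entropy are hypothetical; this is not the CntkClassifier code.

```python
import numpy as np


def one_hot(y):
    """Return the sorted distinct classes and a one-hot label matrix."""
    classes, idx = np.unique(y, return_inverse=True)
    yy = np.eye(len(classes), dtype=np.float32)[idx]
    return classes, yy


def softmax_cross_entropy(logits, yy):
    """Per-row cross entropy between softmax(logits) and one-hot labels."""
    # Subtract the row maximum first for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(yy * log_probs).sum(axis=1)


classes, yy = one_hot(np.array([-1, 1, 0, 1]))
# With all-zero scores every class is equally likely, so each loss is log(3)
loss = softmax_cross_entropy(np.zeros((4, 3)), yy)
```

Keeping the sorted classes around is what lets an argmax over the output columns be translated back into the user's original labels.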

The rest of the fit code performs the training:

# Train our neural network
tf = np.array_split(x, num_batches)
tl = np.array_split(yy, num_batches)

for ii in range(num_batches*self.num_passes):
    features = np.ascontiguousarray(tf[ii % num_batches])
    labels = np.ascontiguousarray(tl[ii % num_batches])

    # Specify the mapping of input variables in the model to actual minibatch data to be trained with
    trainer.train_minibatch({input : features, label : labels})
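As a small illustration of the batching scheme above, the following sketch shows how np.array_split cuts the data and how the modulo index cycles through the batches once per pass. The array sizes and the num_batches and num_passes values are made up for the example.

```python
import numpy as np

x = np.arange(10).reshape(5, 2)   # 5 rows, 2 features (toy data)
num_batches, num_passes = 2, 2

# array_split tolerates uneven splits: 5 rows become batches of 3 and 2 rows
batches = np.array_split(x, num_batches)
sizes = [len(batch) for batch in batches]

# The modulo index visits every batch once per pass, in the same order
order = [ii % num_batches for ii in range(num_batches * num_passes)]
```

Note that the batches are contiguous slices in the original row order, so nothing is shuffled between passes.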

Last, predict and predict_proba implement the predictions:

def predict(self, xx):
    probs = self.predict_proba(xx)
    # Select the highest probability, and map to the user specified class
    return self.classes_[np.argmax(probs, 1)]

def predict_proba(self, xx):
    # Get the probabilities
    out = C.softmax(self.model)
    # Deal with tensors like [3,1,13] - get rid of the middle axes
    res = np.squeeze(out.eval({self.input: xx}))
    # Add a dimension if we squeezed too much (if we are predicting a single row)
    if len(res.shape) == 1:
        res = np.reshape(res, (1,-1))
    return res
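The shape handling in predict_proba can be tried out without CNTK. The sketch below uses plain NumPy arrays in place of the evaluated model output; tidy_probs is a hypothetical helper name and the class labels are made up.

```python
import numpy as np


def tidy_probs(raw):
    """Drop singleton axes, but keep the result two-dimensional."""
    res = np.squeeze(raw)
    # A single row like [1, 1, 3] squeezes down to [3]; restore the row axis
    if res.ndim == 1:
        res = res.reshape(1, -1)
    return res


classes = np.array([-1, 0, 1])

batch = np.zeros((3, 1, 3))
batch[:, 0, 0] = 1.0              # every row favours the first class
single = np.zeros((1, 1, 3))
single[0, 0, 2] = 1.0             # the only row favours the last class

batch_pred = classes[np.argmax(tidy_probs(batch), axis=1)]
single_pred = classes[np.argmax(tidy_probs(single), axis=1)]
```

Without the reshape, the argmax along axis 1 would fail for a single row, because the squeezed array would have only one axis.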

That’s pretty much all the boilerplate code needed. 🙂

Getting started with TensorFlow might have been even easier: instead of building the graph ourselves, we could have used skflow, which offers a scikit-learn-like interface for simpler networks. Again, my goal was to get started with arbitrary deep network models, so I wanted to experience the full API.
