Convolutional Neural Network for Time Series

Neural networks have been around for a while, and it’s fair to say that many successful practical applications use at least one convolutional layer. Convolutions are a natural fit for time series, so I went and added a few to the Walk-Forward Analysis.

To make the code easier to use, I ended up creating a self-contained GitHub repository.

CNTK’s code to create the network layers is trivial:

    hh = cntk.layers.Sequential([
             cntk.layers.Convolution1D(3, 32, activation=cntk.ops.relu, pad=True, reduction_rank=0),
             cntk.layers.MaxPooling((3, 1), 3),
             cntk.layers.Convolution1D(3, 32, activation=cntk.ops.relu, pad=True),
             cntk.layers.MaxPooling((3, 1), 3),
             cntk.layers.Dense(128, activation=cntk.ops.relu),
             cntk.layers.Dense(128, activation=cntk.ops.relu),
             cntk.layers.Dense(yy.shape[1], activation=None)
             ])(input)

The above code uses the primitives from cntk.layers, which provide higher-level abstractions. For a lower-level experience, similar to TensorFlow, one should use the functions provided by cntk.ops.
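To get a feel for what the convolution and pooling layers do to a series, here is a minimal pure-Python sketch. It is not CNTK code and makes no claim about CNTK's internals: the helper names (conv1d_same, relu, max_pool), the toy series and the kernel weights are all made up for illustration. It assumes an odd kernel width, zero padding at both ends (roughly what pad=True asks for), and a pooling window of 3 with stride 3.

```python
def conv1d_same(series, kernel):
    """Slide `kernel` over `series`; zero-pad the ends so the length is kept.

    Assumes an odd kernel width. Like most deep learning libraries, this
    applies the kernel without flipping it (cross-correlation).
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(series) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(series))
    ]


def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, window=3, stride=3):
    """Keep the largest value in each window; no padding at the end."""
    return [
        max(values[i:i + window])
        for i in range(0, len(values) - window + 1, stride)
    ]


# Toy data: nine observations and hypothetical weights.
# A trained layer learns its weights; these are picked by hand.
series = [1.0, 2.0, 4.0, 3.0, 1.0, 0.0, 2.0, 5.0, 4.0]
kernel = [-1.0, 0.0, 1.0]

features = relu(conv1d_same(series, kernel))  # same length as the input
pooled = max_pool(features)                   # one value per block of three

print(len(series), len(features), len(pooled))
print(pooled)
```

With padding the convolution keeps all nine points, and each pooling step then cuts the length by a factor of three. A real layer does the same thing with 32 learned filters at once instead of a single hand-picked kernel.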

So I created the network and ran the training. Boy, it was slow – about 16-20 secs per point (single day). And I want to run 8,000 of these, which adds up to roughly 36 to 44 hours if done one after another.

CNTK can train in parallel, both on GPUs and CPUs, but I decided to stay with CPUs for now (that’s how one creates desire for future posts) and to not use CNTK’s CPU parallelism. The latter sounds weird at first (actually both do), but as we have discussed previously here (see the old ARMA/GARCH posts), the bigger the chunks for each task, the better the parallel performance. In other words, running all days in parallel is better than running the days sequentially and parallelizing the computations within each day. That’s what worked for me in R, and now in Python.
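The "big chunks" idea can be sketched with the standard multiprocessing package: each worker process takes a whole day and does all of that day's work on its own. This is only an illustration under stated assumptions, not the code from the repository. The names fit_one_day and run_all_days are hypothetical, and the sum of squares merely stands in for training and predicting one point.

```python
import multiprocessing


def fit_one_day(day):
    """Hypothetical stand-in for training and predicting a single day.

    The real work would build and train the network for that day; here a
    small deterministic computation plays that role.
    """
    result = sum(i * i for i in range(day + 1))
    return day, result


def run_all_days(days, pool_size=4):
    """Hand out whole days to a pool of worker processes."""
    with multiprocessing.Pool(processes=pool_size) as pool:
        # Results arrive in completion order, so key them by day.
        return dict(pool.imap_unordered(fit_one_day, days))


if __name__ == "__main__":
    results = run_all_days(range(20), pool_size=4)
    print(len(results), "days processed")
```

Each task is large and independent, so the processes rarely need to talk to each other. That is where the good scaling comes from.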

The multiprocessing package had everything I needed. The main bump to overcome was the sharing of a global lock between processes – we need to synchronize the database access and the log file access.

When I start with a pool of four processes (controlled by the pool_size parameter below):

    ml = WalkForwardLoop('cntk_conv_self', log_file='ml.log', db_url='sqlite:///ml.sqlite')
    ml.run(features, response, fl, verbose=False, pool_size=4)

I see the following in the process explorer:

Pretty. Mission accomplished.
