Synchronization for R with the flock Package

Have you tried synchronizing R processes? I did and it wasn’t straightforward. In fact, I ended up creating a new package – flock.

One of the improvements I made not too long ago to my R back-testing infrastructure was to start storing the results in a database. This way I can compute all interesting models (see the “ARMA Models for Trading” series for an example) once and store the relevant information (mean forecast, variance forecast, AIC, etc.) in the database. Then, I can test whatever I want without further heavy lifting.

Easier said than done: it took me some time to implement in practice. My database of choice was SQLite, which doesn’t require a database server and stores the entire database in a single file. Choosing SQLite was probably part of the problem – it turned out that SQLite doesn’t support simultaneous database updates, i.e. the synchronization is left to the user. Here is a stripped-down version of a program which often FAILS on a multi-core system:

require(RSQLite)
require(parallel)

db.path <- tempfile()
con <- dbConnect(RSQLite::SQLite(), dbname=db.path)
df <- data.frame(value=0)
dbWriteTable(con, "test", df)
dbDisconnect(con)

write.one.value <- function(val) {
   con <- dbConnect(RSQLite::SQLite(), dbname=db.path)
   dbWriteTable(con, "test", data.frame(value=val), append=TRUE)
   dbDisconnect(con)
}

mclapply(1:100, write.one.value, mc.cores=2)

After some generous debugging I found out that the problem is the parallel writing occurring in write.one.value.

My first impulse was to find some scripting solution, using R’s system call, but that didn’t work very well. My next plan was to add support to RSQLite, but that seemed too limiting (synchronization can be useful elsewhere) and complicated. So, I moved to the next solution, which was to execute the write.one.value code within a critical section (take an exclusive lock at the beginning, unlock at the end).

Before rolling my own package, I decided to check what’s on CRAN, and I discovered the synchronicity package. A bit heavy for my taste (it uses the C++ library boost), but so what. It seemed to work at first, so I ran some of my long-ish simulations.

A day later, I observed something strange – the number of processes had dropped, and the simulation seemed to have hung. Some more debugging revealed that the underlying boost libraries were throwing an exception, and things were going sour afterwards. Fed up with debugging and the time wasted, I went back to the approach of writing my own package.

The result was the flock package. The working version of the above code follows:

require(RSQLite)
require(parallel)
require(flock)

db.path <- tempfile()
con <- dbConnect(RSQLite::SQLite(), dbname=db.path)
df <- data.frame(value=0)
dbWriteTable(con, "test", df)
dbDisconnect(con)

write.one.value <- function(val, lock.name) {
   # Take an exclusive lock
   ll = lock(lock.name)

   con <- dbConnect(RSQLite::SQLite(), dbname=db.path)
   dbWriteTable(con, "test", data.frame(value=val), append=TRUE)
   dbDisconnect(con)

   # Release the lock
   unlock(ll)
}

lock.name = tempfile()
# or lock.name = "~/file.lock"

mclapply(1:100, write.one.value, lock.name=lock.name, mc.cores=2)
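If the write fails partway through, the version above never releases the lock or closes the connection explicitly. Here is a minimal sketch of one way to guard against that with on.exit. It assumes R 3.5.0 or later (for the `after` argument) and a Unix-like system where mclapply forks. The smaller task count is only there to keep the run short.

```r
library(RSQLite)
library(parallel)
library(flock)

db.path <- tempfile()
lock.name <- tempfile()

# Create the table once, before any parallel work starts
con <- dbConnect(RSQLite::SQLite(), dbname = db.path)
dbWriteTable(con, "test", data.frame(value = 0))
dbDisconnect(con)

write.one.value <- function(val, lock.name) {
   # Take an exclusive lock; release it when the function exits,
   # whether it returns normally or signals an error
   ll <- lock(lock.name)
   on.exit(unlock(ll), add = TRUE)

   con <- dbConnect(RSQLite::SQLite(), dbname = db.path)
   # after = FALSE puts the disconnect ahead of the unlock,
   # so the connection is closed while the lock is still held
   on.exit(dbDisconnect(con), add = TRUE, after = FALSE)

   dbWriteTable(con, "test", data.frame(value = val), append = TRUE)
}

res <- mclapply(1:20, write.one.value, lock.name = lock.name, mc.cores = 2)
```

The critical section is the same as before. The only difference is that cleanup no longer depends on reaching the last line of the function.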

With RStudio and Rcpp the package development was a breeze. Besides database access, I started using the package to protect logging to files and other similar tasks. It’s new, but by the time you are reading this post I have probably executed millions of synchronizations, so it should be pretty stable. The only downside – Windows is not supported. I simply cannot afford the time to do that at the moment (it may work out-of-the-box, but that’s far from certain).
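Protecting a shared log file follows the same pattern. The sketch below is only an illustration: `log.message`, `log.path` and the message format are made-up names for this example and are not part of flock.

```r
library(parallel)
library(flock)

# Hypothetical helper: append one line to a shared log file while
# holding an exclusive lock, so lines from different processes
# cannot interleave.
log.message <- function(msg, log.path, lock.name) {
   ll <- lock(lock.name)
   on.exit(unlock(ll), add = TRUE)

   line <- paste(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), Sys.getpid(), msg)
   cat(line, "\n", sep = "", file = log.path, append = TRUE)
}

log.path <- tempfile(fileext = ".log")
lock.name <- tempfile()

# Ten tasks spread over two forked workers, all writing to the same file
res <- mclapply(1:10, function(i) {
   log.message(paste("finished task", i), log.path, lock.name)
}, mc.cores = 2)
```

Using a separate lock file rather than locking the log itself keeps the locking independent of how the log file is opened and written.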

To install the package:

install.packages("flock", repos="http://R-Forge.R-project.org")

Comments

  1. Michael Kane says:

    Rather than submitting a bug report you’ve re-created a subset of the functionality already provided by another package and you support fewer software platforms.

    1. ivannp says:

      From my experience synchronicity doesn’t have this functionality, didn’t you read my post? The fact that basic mutex synchronization failed, which most likely is a problem with the package rather than with boost, was simply a red flag for me. I wasted more than a day on synchronicity, sorry, didn’t have more time available.

      1. Michael Kane says:

        I did read your post, it is summarized in my original comment.

        Your vague description of the problem in synchronicity is not a bug report. Synchronization is often tricky, and inexperienced users often don’t understand the different types of synchronization mechanisms that are available or when to use them. As a result they often read exceptions thrown by the library as problems in the library and not in their own code. On the other hand, it may be a bug, and if I receive a bug report, I’m happy to fix it.

  2. Ryan says:

    A simple flock package seems like it could be useful in general, but I’m quite surprised that you needed it in this case, because SQLite should already be handling proper file locking for concurrent database operations: http://www.sqlite.org/faq.html#q5

    Was your SQLite database on a network filesystem or something, where file locking doesn’t work? If so, then the additional locking provided by flock probably isn’t guaranteed to eliminate the race condition, especially if the temp file being used for the lock is on a different file system (which is frequently the case for the system temporary directory). Although the locking and unlocking may take long enough to eliminate the race in practice for your particular case.

    1. ivannp says:

      “Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.” – pretty clear that only a single process can update (commit). From my debugging, the first process locks and if a second process tries to commit to the database before the first is done – the second process gets SQLITE_BUSY as a return.

      As for the network file systems, on Linux the package is using “fcntl”, which is guaranteed to work over NFS. Granted, I haven’t tried it over NFS. On Windows, I am using LockFileEx/UnlockFileEx, but can’t remember seeing any network specifics. Thus, no idea whether it would work there.
