Adding Comments to CSV Files

by ivannp on January 11, 2013

Several of my R scripts produce csv files as output. For instance, when I run a lengthy SVM back test, the end result is a csv file containing the indicator with some additional information. The problem is that over time one loses track of what exactly the file contains and what parameters were used to produce it.

I have considered various solutions. First, I do store some information in the file name. However, this has limited applications. Next, I considered using extended file system attributes (getfattr, setfattr, etc.). Although this is an exciting feature, it doesn’t seem to integrate easily with git.

The solution I ended up using is to add comments to the csv file. The implementation is surprisingly easy. R’s read.table supports comments starting with ‘#’ by default (read.csv turns them off, so pass comment.char="#" there). I only needed to create a decent function to write the comments:

writeIndicator = function( indicator, fileName, comments=c() )
{
   require( quantmod, quietly=TRUE )

   # First write the comments
   append = FALSE
   for( comment in comments ) { 
      line = paste( sep="", "# ", comment )
      write( line, file=fileName, append=append )
      append = TRUE
   }   

   # Write the series, but suppress warnings to avoid the following:
   #    Warning message:
   #    In write.table(dx, file = file, row.names = row.names, col.names = col.names,  :
   #    appending column names to file
   suppressWarnings( write.zoo( indicator, quote=FALSE, row.names=FALSE, sep=",", file=fileName, append=append ) )
}
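
Reading such a file back is the other half of the story. Here is a minimal sketch in base R, assuming a file shaped like the one writeIndicator produces. The file contents and the readComments helper are made up for illustration and are not part of the function above.

```r
# Sketch: read back a commented csv file using base R only.
# The file contents below are hypothetical sample data.
fileName <- tempfile(fileext = ".csv")
writeLines(c("# model: SVM with radial kernel",
             "# window: 200 days",
             "Index,Indicator",
             "2013-01-09,1",
             "2013-01-10,-1"),
           fileName)

# read.table() skips '#' comments by default; read.csv() does not,
# so comment.char has to be passed explicitly.
indicator <- read.csv(fileName, comment.char = "#", stringsAsFactors = FALSE)

# Hypothetical helper: return every line that starts with '#',
# with the marker and any following whitespace stripped.
readComments <- function(fileName) {
   lines <- readLines(fileName)
   sub("^#\\s*", "", lines[grepl("^#", lines)])
}
comments <- readComments(fileName)
```

This way the parameters travel with the data: the data frame comes from read.csv, and the comments can be inspected separately whenever I need to recall how the file was produced.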


Henrik B January 11, 2013 at 17:11

FYI, see writeDataFrame() in the R.utils package. From its help page:

writeDataFrame(data, file, path=NULL, sep="\t", quote=FALSE, row.names=FALSE, col.names=TRUE, ..., header=list(), createdBy=NULL, createdOn=format(Sys.time(), format = "%Y-%m-%d %H:%M:%S %Z"), nbrOfRows=nrow(data), headerPrefix="# ", headerSep=": ", append=FALSE, overwrite=FALSE)

Argument ‘header’ allows you to specify (named) header comments. Some common header comments are added automatically, e.g. createdOn, createdBy (if given) and nbrOfRows.

Relatedly, with the TabularTextFile class of R.filesets, you can parse any tabular text file and its header comments (tedious arguments such as ‘sep’ are automatically inferred), e.g.

df <- readDataFrame(TabularTextFile("foo.csv"))

My $.02
