  1. POR: Improving workflow and keeping track of output
<p>I have what I think is a common enough issue, on optimising workflow in R. Specifically, how can I avoid ending up with a folder full of output (plots, RData files, csv, etc.) and, after some time, having no clue where the files came from or how they were produced? In part, it surely involves being intelligent about folder structure. I have been looking around, but I'm unsure what the best strategy is. So far, I have tackled it in a rather unsophisticated (overkill) way: I created a function <code>MetaInfo</code> (see below) that writes a text file with metadata, with a given file name. The idea is that whenever a plot is produced, this command is issued to produce a text file with exactly the same file name as the plot (except, of course, the extension), with information on the system, session, packages loaded, R version, the function and file the metadata function was called from, etc. The questions are:</p>
<p>(i) How do people approach this general problem? Are there obvious ways to avoid the issue I mentioned?</p>
<p>(ii) If not, does anyone have any tips on improving this function? At the moment it's perhaps clunky and not ideal. In particular, getting the name of the file from which the plot is produced doesn't necessarily work (the solution I use is one provided by @hadley in <a href="https://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script">1</a>). Any ideas would be welcome!</p>
<p>The function assumes git, so please ignore the warning it will probably produce. This is the main function, stored in a file <code>metainfo.R</code>:</p>
<pre><code>MetaInfo &lt;- function(message=NULL, filename)
{
    # message  - character string - Any message to be written into the
    #            information file (e.g., data used).
    # filename - character string - the name of the txt file (including relative
    #            path). Should be the same as the output file it describes
    #            (RData, csv, pdf).
    #
    # filename has no default, so check missing() before is.null().
    if (missing(filename) || is.null(filename))
    {
        stop('Provide an output filename - parameter filename.')
    }
    filename &lt;- paste(filename, '.txt', sep='')
    # Try to get as close as possible to getting the file name from which the
    # function is called.
    source.file &lt;- lapply(sys.frames(), function(x) x$ofile)
    source.file &lt;- Filter(Negate(is.null), source.file)
    t.sf &lt;- try(source.file &lt;- basename(source.file[[length(source.file)]]),
                silent=TRUE)
    if (inherits(t.sf, 'try-error'))
    {
        source.file &lt;- NULL
    }
    func &lt;- deparse(sys.call(-1))
    # MetaInfo isn't always called from within another function, so func could
    # return as NULL or as general environment.
    if (any(grepl('eval', func, ignore.case=TRUE)))
    {
        func &lt;- NULL
    }
    time &lt;- strftime(Sys.time(), "%Y/%m/%d %H:%M:%S")
    git.h &lt;- system('git log --pretty=format:"%h" -n 1', intern=TRUE)
    meta &lt;- list(Message=message,
                 Source=paste(source.file, ' on ', time, sep=''),
                 Functions=func,
                 System=Sys.info(),
                 Session=sessionInfo(),
                 Git.hash=git.h)
    sink(file=filename)
    print(meta)
    sink(file=NULL)
}
</code></pre>
<p>which can then be called in another function, stored in another file, e.g.:</p>
<pre><code>source('metainfo.R')
RandomPlot &lt;- function(x, y)
{
    fn &lt;- 'random_plot'
    pdf(file=paste(fn, '.pdf', sep=''))
    plot(x, y)
    MetaInfo(message=NULL, filename=fn)
    dev.off()
}
x &lt;- 1:10
y &lt;- runif(10)
RandomPlot(x, y)
</code></pre>
<p>This way, a text file with the same file name as the plot is produced, with information that could hopefully help figure out how and where the plot was produced.</p>
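To make the "same file name, different extension" idea concrete, here is a minimal sketch of one way the plot and its metadata file could be written in a single call, so the two cannot drift apart. The helper `SaveWithMeta` and its arguments `stem`, `plot.fun` and `message` are hypothetical names introduced only for illustration; they are not part of `MetaInfo` above, and the metadata is deliberately reduced to a few base R calls.

```r
# Hypothetical helper (an illustration, not the function from the question):
# write a PDF plot and a metadata text file that share the same base name.
SaveWithMeta <- function(stem, plot.fun, message = NULL) {
    pdf.file <- paste0(stem, '.pdf')
    txt.file <- paste0(stem, '.txt')

    pdf(file = pdf.file)
    # Close the graphics device even if plot.fun() fails.
    on.exit(dev.off(), add = TRUE)
    plot.fun()

    # Collect a small amount of metadata as plain text lines.
    meta <- c(
        paste('Message:', if (is.null(message)) 'none' else message),
        paste('Written:', format(Sys.time(), '%Y/%m/%d %H:%M:%S')),
        paste('R version:', R.version.string),
        'Session:',
        capture.output(print(sessionInfo()))
    )
    writeLines(meta, txt.file)

    # Return both paths so the caller can log or check them.
    invisible(c(plot = pdf.file, meta = txt.file))
}

x <- 1:10
y <- runif(10)
# Write into a temporary directory so the example leaves no files behind
# in the working directory.
files <- SaveWithMeta(file.path(tempdir(), 'random_plot'),
                      function() plot(x, y),
                      message = 'ten uniform random values')
```

Passing the plotting code as a function means the caller never handles the two file names separately, and `on.exit` closes the device even when plotting fails.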