StackOverflow2013

plurals

POR: Improving workflow and keeping track of output
primarykey
Id
16864196
data
AcceptedAnswerId
0
AnswerCount
4
ClosedDate
CommentCount
3
CommunityOwnedDate
CreationDate
2013-05-31T19:08:22.190
FavoriteCount
2
LastActivityDate
2017-09-15T12:44:54.217
LastEditDate
2017-05-23T12:23:47.037
LastEditorUserId
-1
OwnerUserId
2441565
ParentId
0
PostTypeId
1
Score
1
ViewCount
643
LastEditorDisplayName
text
Body
I have what I think is a common workflow problem in R: how can I avoid ending up with a folder full of output (plots, RData files, CSVs, etc.) and, some time later, no idea where each file came from or how it was produced? Part of the answer is surely a sensible folder structure, but I have been looking around and am unsure what the best strategy is.

So far I have tackled it in a rather unsophisticated (overkill) way: I created a function <code>MetaInfo</code> (see below) that writes a text file of metadata under a given file name. The idea is that whenever a plot is produced, this function is called to write a text file with exactly the same file name as the plot (apart from the extension), containing information on the system, session, loaded packages, R version, the function and file from which the metadata function was called, and so on.

The questions are:

(i) How do people approach this general problem? Are there obvious ways to avoid the issue I mentioned?

(ii) If not, does anyone have tips on improving this function? At the moment it is perhaps clunky and not ideal. In particular, getting the name of the file from which the plot is produced doesn't necessarily work (the solution I use is the one provided by @hadley in <a href="https://stackoverflow.com/questions/1815606/rscript-determine-path-of-the-executing-script">1</a>).

Any ideas would be welcome!

The function assumes git, so please ignore the probable warning produced. This is the main function, stored in a file <code>metainfo.R</code>:

<pre><code>MetaInfo <- function(message=NULL, filename) {
    # message - character string - Any message to be written into the information
    #      file (e.g., data used).
    # filename - character string - the name of the txt file (including relative
    #      path). Should be the same as the output file it describes (RData,
    #      csv, pdf).
    #
    # filename has no default, so check missing() before is.null().
    if (missing(filename) || is.null(filename)) {
        stop('Provide an output filename - parameter filename.')
    }
    filename <- paste(filename, '.txt', sep='')

    # Try to get as close as possible to getting the file name from which the
    # function is called.
    source.file <- lapply(sys.frames(), function(x) x$ofile)
    source.file <- Filter(Negate(is.null), source.file)
    t.sf <- try(source.file <- basename(source.file[[length(source.file)]]),
                silent=TRUE)
    if (inherits(t.sf, 'try-error')) {
        source.file <- NULL
    }
    func <- deparse(sys.call(-1))

    # MetaInfo isn't always called from within another function, so func could
    # return as NULL or as general environment.
    if (any(grepl('eval', func, ignore.case=TRUE))) {
        func <- NULL
    }

    time <- strftime(Sys.time(), "%Y/%m/%d %H:%M:%S")
    git.h <- system('git log --pretty=format:"%h" -n 1', intern=TRUE)
    meta <- list(Message=message,
                 Source=paste(source.file, ' on ', time, sep=''),
                 Functions=func,
                 System=Sys.info(),
                 Session=sessionInfo(),
                 Git.hash=git.h)
    sink(file=filename)
    print(meta)
    sink(file=NULL)
}
</code></pre>

which can then be called in another function, stored in another file, e.g.:

<pre><code>source('metainfo.R')

RandomPlot <- function(x, y) {
    fn <- 'random_plot'
    pdf(file=paste(fn, '.pdf', sep=''))
    plot(x, y)
    MetaInfo(message=NULL, filename=fn)
    dev.off()
}

x <- 1:10
y <- runif(10)
RandomPlot(x, y)
</code></pre>

This way, a text file with the same file name as the plot is produced, with information that could hopefully help figure out how and where the plot was produced.
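The sidecar-file idea described above could also be sketched as a small wrapper that writes an output file and its metadata file in a single call, so the two are always created together. This is only an illustration under assumptions: <code>save_with_sidecar</code>, <code>write_fun</code> and the file name <code>random_numbers.csv</code> are hypothetical names that do not appear in the post, and the metadata recorded here is deliberately minimal compared with <code>MetaInfo</code>.

```r
# Hypothetical helper, not part of the original post: write an output file
# and a plain-text sidecar with the same base name in one call.
save_with_sidecar <- function(path, write_fun, message = NULL) {
  # write_fun is any function that takes the output path and writes the file,
  # e.g. function(p) write.csv(df, p).
  write_fun(path)

  # Same file name as the output, with the extension swapped for .txt.
  sidecar <- paste0(tools::file_path_sans_ext(path), ".txt")

  info <- c(
    paste("Output: ", basename(path)),
    paste("Created:", format(Sys.time(), "%Y-%m-%d %H:%M:%S")),
    paste("R:      ", R.version.string),
    paste("Message:", if (is.null(message)) "" else message)
  )
  writeLines(info, sidecar)

  # Return the sidecar path invisibly so callers can log or inspect it.
  invisible(sidecar)
}

# Usage: write a toy CSV into the session's temporary directory.
out <- file.path(tempdir(), "random_numbers.csv")
sidecar <- save_with_sidecar(
  out,
  function(p) write.csv(data.frame(x = 1:10), p, row.names = FALSE),
  message = "toy data"
)
```

Passing the writer as a function keeps the helper independent of the output type (CSV, RData, PDF device), at the cost of a slightly more verbose call than a bare <code>write.csv</code>.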
Tags
<r><workflow><metadata>
Title
R: Improving workflow and keeping track of output
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USCommunity
UserOwnerUserId
1. UStcam
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POR: Improving workflow and keeping track of output
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POR: Improving workflow and keeping track of output
 UserUserId
 USAnthony
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId
