StackOverflow2013


plurals

POHow do you combine "Revision Control" with "Workflow" for R?
primarykey
Id
2286831
data
AcceptedAnswerId
2290194
AnswerCount
5
ClosedDate
CommentCount
4
CommunityOwnedDate
CreationDate
2010-02-18T06:59:11.197
FavoriteCount
26
LastActivityDate
2012-07-20T06:46:49.757
LastEditDate
2017-05-23T12:18:27.453
LastEditorUserId
-1
OwnerUserId
256662
ParentId
0
PostTypeId
1
Score
22
ViewCount
5004
LastEditorDisplayName
text
Body
I remember coming across R users writing that they use "Revision control" (<a href="https://stackoverflow.com/questions/1056912/source-control-vs-revision-control">e.g., "Source control"</a>), and I am curious to know: How do you combine "Revision control" with your statistical analysis workflow? Two (very) interesting discussions talk about how to deal with the workflow, but neither of them refers to the revision control element: <ul> <li><a href="https://stackoverflow.com/questions/1266279/how-to-organize-large-r-programs">How to organize large R programs?</a></li> <li><a href="https://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing">Workflow for statistical analysis and report writing</a></li> </ul> A Long Update To The Question: Following some people's answers, and Dirk's question in the comment, I would like to focus my question a bit more. After reading the Wiki article about "<a href="http://en.wikipedia.org/wiki/Revision_control" rel="nofollow noreferrer">revision control</a>" (which I was previously not familiar with), it was clear to me that when using revision control, what one does is build a development structure for one's code. This structure either leads to a "final product" or to several branches. When building something like, say, a website, there is usually one end product you work towards (the website), with some prototypes along the way. But when doing a statistical analysis, the work (in my view) is different. Sometimes you know where you want to get to. But more often, you explore. Explore cleaning the dataset. Explore different methods for statistical analysis, and ask various questions of your data (and I am writing this knowing how Frank Harrell, and other experienced statisticians, feel about <a href="http://en.wikipedia.org/wiki/Data_dredging" rel="nofollow noreferrer">Data dredging</a>). 
That is why the workflow question with statistical programming is (in my view) a serious and deep question, raising many issues. The simpler ones are technical: <ul> <li>Which revision control software do you use (and why)?</li> <li>Which IDE do you use (and why)?</li> </ul> The more interesting questions are about the work process: <ul> <li>How do you structure your files?</li> <li>What do you keep as a separate file and what as a revision? Or, put differently: what should be a "branch" and what should be a "sub project" in your code? For example: when starting to explore your data, should a plot be created and then erased because it didn't lead anywhere (but kept as a revision), or should there be a backup file of that path?</li> </ul> How you solve this tension was my initial curiosity. The second question is "what might I be missing?". What rules (of thumb) should one follow to avoid common pitfalls when doing statistical programming with version control? My intuition is that statistical programming is inherently different from software development (I am writing this without being a real expert in statistical programming, and even less so in software development). That's why I am unsure which of the lessons I have read here about version control would be applicable. Thanks a lot, Tal
Tags
<version-control><r><workflow><statistics>
Title
How do you combine "Revision Control" with "Workflow" for R?
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USCommunity
UserOwnerUserId
1. USTal Galili
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
2. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
3. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
2. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
3. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POHow do you combine "Revision Control" with "Workflow" for R?
 UserUserId
 USdalloliogm
 VoteTypeVoteTypeId
 VTFavorite
2. VO
 singulars
 PostPostId
 POHow do you combine "Revision Control" with "Workflow" for R?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POHow do you combine "Revision Control" with "Workflow" for R?
 UserUserId
 USPaulHurleyuk
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId


Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data is linked in either a to-one or a to-many fashion. The caption of each to-many relation links to a list view showing a filtered view of the related table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.