R and version control for the solo data analyst
<p>Many data analysts whom I respect use version control. For example:</p>
<ul>
<li><a href="http://github.com/hadley/" rel="noreferrer">http://github.com/hadley/</a></li>
<li>See the comments on <a href="http://permut.wordpress.com/2010/04/21/revision-control-statistics-bleg/" rel="noreferrer">http://permut.wordpress.com/2010/04/21/revision-control-statistics-bleg/</a></li>
</ul>
<p>However, I'm evaluating whether adopting a version control system such as git would be worthwhile.</p>
<p><b>A brief overview:</b> I'm a social scientist who uses R to analyse data for research publications. I don't currently produce R packages. My R code for a project typically runs to a few thousand lines covering data input, cleaning, manipulation, analysis, and output generation. Publications are typically written using LaTeX.</p>
<p>I have read about many benefits of version control, yet they seem less relevant to the solo data analyst:</p>
<ul>
<li><b>Backup:</b> I already have a backup system in place.</li>
<li><b>Forking and rewinding:</b> I've never felt the need to do this, but I can see how it could be useful (e.g., when preparing multiple journal articles based on the same dataset, or a report that is updated monthly).</li>
<li><b>Collaboration:</b> Most of the time I analyse data by myself, so I wouldn't get the collaboration benefits of version control.</li>
</ul>
<p>There are also potential costs involved in adopting version control:</p>
<ul>
<li>Time to evaluate and learn a version control system</li>
<li>A possible increase in complexity over my current file management system</li>
</ul>
<p>However, I still have the feeling that I'm missing something. General guides on version control seem to be addressed more to computer scientists than to data analysts.</p>
<p>Thus, specifically <b>in relation to data analysts</b> in circumstances similar to those described above:</p>
<ol>
<li>Is version control worth the effort?</li>
<li>What are the main pros and cons of adopting version control?</li>
<li>What is a good strategy for getting started with version control for data analysis with R (e.g., examples, workflow ideas, software, links to guides)?</li>
</ol>
 
