    <p>As I said in <a href="https://stackoverflow.com/questions/2286831/how-do-you-combine-revision-control-with-workflow-for-r/2292913#2292913">my response to your other question</a>, what you're describing <em>is</em> programming. So the short answer is: there is no difference. The slightly longer answer is that statistical and scientific computing should require even more controls around development than other programming.</p>
    <p>A certain percentage of statistical analysis can be done using Excel, or in a point-and-click approach using SPSS, SAS, Matlab, or S-Plus (for instance). A more sophisticated analysis done using one of those programs (or R) that involves programming is clearly a form of software development. And this kind of statistical computing can benefit immensely from following all the best practices from software development: source control, documentation, a project plan, scope document, bug tracking/change control, etc.</p>
    <p>Moreover, there are different kinds of statistical analyses that can follow different approaches, as with any programming project:</p>
    <ul>
    <li>Exploratory data analysis should follow an iterative methodology, like <a href="http://en.wikipedia.org/wiki/Agile_software_development" rel="nofollow noreferrer">the Agile methodology</a>. In this case, when you don't explicitly know the steps involved up front, it's critical to use a development methodology that is adaptive and self-reflective.</li>
    <li>A more routine kind of analysis (e.g. a government annual survey such as the Census) could follow a more traditional methodology such as the <a href="http://en.wikipedia.org/wiki/Waterfall_model" rel="nofollow noreferrer">waterfall</a> approach, since it would be following a very clear set of steps that are mostly known in advance.</li>
    </ul>
    <p>I would suggest that any statistician would benefit from reading a book like <a href="http://rads.stackoverflow.com/amzn/click/0735619670" rel="nofollow noreferrer">"Code Complete"</a> (look at <a href="https://stackoverflow.com/questions/1711/what-is-the-single-most-influential-book-every-programmer-should-read">the other top books in this post</a>): the more organized you are with your analysis, the greater the likelihood of success.</p>
    <p>Statistical analysis in some sense requires <em>even more</em> good practices around version control and documentation than other programming. If your program is just serving some business need, then the algorithm or software used is really of secondary importance so long as the program functions the way the specifications require. On the other hand, with scientific and statistical computing, <strong><em>accuracy</em></strong> and <strong><em>reproducibility</em></strong> are paramount. This is one of the major emphases of <a href="http://stat.stanford.edu/~jmc4/" rel="nofollow noreferrer">John Chambers</a> (the creator of the S language) in <a href="http://rads.stackoverflow.com/amzn/click/0387759352" rel="nofollow noreferrer">"Software for Data Analysis"</a>. That is another reason to add literate programming (e.g. with <a href="http://en.wikipedia.org/wiki/Sweave" rel="nofollow noreferrer">Sweave</a>) as an important tool in the statistician's toolkit.</p>