<p>I'm a software engineer embedded in a team of research geophysicists, and we're currently (as always) working to improve our ability to reproduce results on demand. Here are a few pointers gleaned from our experience:</p>
<ol>
<li>Put everything under version control: source code, input data sets, makefiles, etc.</li>
<li>When building executables, we embed compiler directives in the executables themselves, tag the build log and the executable with the same UUID, compute checksums for executables, autobuild everything, and auto-update a database (OK, it's just a flat file really) with build details.</li>
<li>We use Subversion's keywords to include revision numbers (etc.) in every piece of source, and these are written into any output files generated.</li>
<li>We do lots of (semi-)automated regression testing to ensure that new versions of code, or new build variants, produce the same (or similar enough) results, and I'm working on a bunch of programs to quantify the changes which do occur.</li>
<li>My geophysicist colleagues analyse the programs' sensitivities to changes in inputs. I analyse their (the codes', not the geos') sensitivity to compiler settings, platform and suchlike.</li>
</ol>
<p>We're currently working on a workflow system which will record details of every job run: input datasets (including versions), output datasets, program (including version and variant) used, parameters, etc. -- what is commonly called provenance. Once this is up and running, the only way to publish results will be through the workflow system. Any output datasets will contain details of their own provenance, though we haven't done the detailed design of this yet.</p>
<p>We're quite (perhaps too) relaxed about reproducing numerical results to the least-significant digit. The science underlying our work, and the errors inherent in the measurements of our fundamental datasets, do not support the validity of any of our numerical results beyond 2 or 3 significant figures.</p>
<p>We certainly won't be publishing either code or data for peer review; we're in the oil business.</p>
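<p>A minimal sketch of the kind of regression comparison described in point 4, assuming each run's output can be read as a flat list of floats and that "similar enough" means agreement to about 3 significant figures. The names <code>rel_tol_for_sig_figs</code> and <code>compare_outputs</code> are hypothetical, and the relative-tolerance test is only a stand-in for whatever comparison a real test suite would use.</p>

```python
import math


def rel_tol_for_sig_figs(sig_figs: int) -> float:
    """Relative tolerance roughly equivalent to `sig_figs` significant figures.

    Assumption: values "agree to n s.f." when their relative difference
    is at most half a unit in the n-th significant digit.
    """
    return 0.5 * 10.0 ** (1 - sig_figs)


def compare_outputs(reference, candidate, sig_figs=3, abs_tol=0.0):
    """Return (index, reference_value, candidate_value) for each mismatch.

    `abs_tol` is needed when values close to zero should count as equal,
    because a purely relative test never passes for 0.0 vs. a tiny number.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs differ in length")
    tol = rel_tol_for_sig_figs(sig_figs)
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, candidate))
        if not math.isclose(r, c, rel_tol=tol, abs_tol=abs_tol)
    ]


if __name__ == "__main__":
    old_build = [1500.0, 2.345, 0.0]   # made-up values from a reference run
    new_build = [1501.0, 2.346, 0.0]   # made-up values from a new build
    mismatches = compare_outputs(old_build, new_build, sig_figs=3)
    print("mismatches:", mismatches)
```

<p>Returning the list of mismatches, rather than a bare pass/fail, also gives a starting point for quantifying the changes that do occur between builds.</p>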
 
