<p><strong>From the introduction chapter of <em>Implementing Reproducible Research</em>:</strong> Literate statistical programming is a concept introduced by Rossini that builds on the idea of literate programming as described by Donald Knuth. With literate statistical programming, one combines the description of a statistical analysis and the code for doing the statistical analysis into a single document. One can then take the combined document and produce either a human-readable document (e.g., a PDF) or a machine-readable code file. An early implementation of this concept was the Sweave system of Leisch, which uses R as its programming language and LaTeX as its documentation language. Yihui Xie describes his knitr package, which builds substantially on Sweave and incorporates many new ideas developed since Sweave was first released. Along these lines, Tanu Malik and colleagues describe the Science Object Linking and Embedding framework for creating interactive publications that allow authors to embed various aspects of computational research in a document, creating a complete research compendium.</p>
<p>A number of recently developed systems are designed to track the provenance of data analysis outputs and to manage a researcher's workflow. Juliana Freire and colleagues describe VisTrails, an open-source system for provenance management and scientific workflow creation. VisTrails interfaces with existing scientific software and captures the inputs, outputs, and code that produced a particular result, and can even present this workflow in flowchart form. Andrew Davison and colleagues describe the Sumatra toolkit for reproducible research. Their goal is to introduce a tool for reproducible research that minimizes the disruption to scientists' existing workflows, thereby maximizing uptake by working scientists. Their tool serves as a kind of "backend" that keeps track of the code, data, and dependencies as a researcher works. This makes it easy to reproduce specific analyses and to share them with colleagues.</p>
<p>Philip Guo takes the "backend tracking" idea one step further and describes his Code, Data, Environment (CDE) package, which is a minimal "virtual machine" for reproducing the environment as well as the analysis. This package keeps track of all files used by a given program (e.g., a statistical analysis program) and bundles everything, including dependencies, into a single package. This approach guarantees that all requirements are included and that a given analysis can be reproduced on another computer.</p>
<p>Peter Murray-Rust and Dave Murray-Rust introduce the Declaratron, a tool for the precise mapping of mathematical expressions to computational implementations. They present an example from materials science, defining what reproducibility means in this field, in particular for unstable dynamical systems.</p>
<p><a href="">Return to Table of Contents</a></p>
<p><a href="">View Tools chapters for download</a></p>