Introduction

This R extension package bundles an R analysis together with its required runtime environment in so-called software containers, specifically Docker containers. The package is intended as a building block for reproducible and archivable research. Development is supported by the DFG-funded project Opening Reproducible Research (http://o2r.info).

The core functionality is to create a Dockerfile from a given R session, script, or workspace directory. This Dockerfile contains all the R packages and their system dependencies required by the R workflow to be packaged.
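To make the starting point concrete, the following sketch lists the packages recorded in an R session, which is the kind of information a Dockerfile would be derived from. It uses only base R; `session_packages()` is a hypothetical helper written for illustration and is not part of containerit.

```r
# Hypothetical helper (not part of containerit): list the packages recorded
# in a sessionInfo object together with their versions.
session_packages <- function(info = utils::sessionInfo()) {
  # otherPkgs  = attached non-base packages
  # loadedOnly = packages loaded via a namespace but not attached
  # Either element may be NULL in a fresh session.
  descriptions <- c(info$otherPkgs, info$loadedOnly)
  data.frame(
    package = vapply(descriptions, function(d) d$Package, character(1),
                     USE.NAMES = FALSE),
    version = vapply(descriptions, function(d) d$Version, character(1),
                     USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

pkgs <- session_packages()
print(pkgs[order(pkgs$package), ], row.names = FALSE)
```

The attached base packages (stats, utils, and so on) are deliberately left out, since they ship with every R installation.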

The generated Dockerfiles are based on the rocker images (available on Docker Hub). Building images from scratch may become possible in the future.
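The example output in this document starts with `FROM rocker/r-ver:3.4.4` for a session running R 3.4.4, which suggests the base image tag follows the R version. A minimal sketch of that idea, assuming a hypothetical helper `base_image()` that is not part of containerit:

```r
# Sketch: derive a versioned rocker base image instruction from an R version.
# "rocker/r-ver" is the image name seen in the example output;
# base_image() itself is a hypothetical helper for illustration.
base_image <- function(r_version = getRversion()) {
  sprintf("FROM rocker/r-ver:%s", as.character(r_version))
}

base_image("3.4.4")
base_image()  # uses the version of the running R session
```

Pinning the tag to the exact R version is what makes the resulting image a reproducible match for the session it was created from.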

Dockerfile generation relies on the sysreqs package.
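Resolving system requirements can be thought of as a lookup from R packages to system packages. The sketch below uses a hand-written table as a stand-in; it is not the sysreqs database or its API, and the entries are illustrative assumptions loosely based on the libraries that appear in the example output of this document.

```r
# Illustrative lookup table (an assumption for this sketch, NOT the sysreqs
# database): R package -> Debian packages it needs.
sysreq_table <- list(
  rgdal = c("gdal-bin", "libgdal-dev", "libproj-dev"),
  xml2  = c("libxml2-dev")
)

# Collect the system packages for a set of R packages, deduplicated and sorted.
# Packages without an entry in the table contribute nothing.
system_requirements <- function(r_packages, table = sysreq_table) {
  known <- intersect(r_packages, names(table))
  sort(unique(as.character(unlist(table[known], use.names = FALSE))))
}

# Turn the result into a single apt-get instruction for a Dockerfile.
apt_instruction <- function(system_packages) {
  if (length(system_packages) == 0) return(character(0))
  paste("RUN apt-get update && apt-get install -y",
        paste(system_packages, collapse = " "))
}

deps <- system_requirements(c("sp", "rgdal", "xml2"))
cat(apt_instruction(deps), "\n")
```

Sorting and deduplicating keeps the generated instruction stable across runs, so two sessions with the same packages yield the same line.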

To build images and run containers, this package integrates with the harbor package.

For the finer details, such as reading, loading, and installing exact versions, including system dependencies and internal and external libraries, this project focuses on the geospatial domain.

tl;dr

Load the package, do your analysis, and create a Dockerfile.

library("containerit")
## 
## Attaching package: 'containerit'
## The following object is masked from 'package:base':
## 
##     Arg

container <- dockerfile(from = sessionInfo())
## Warning in FUN(X[[i]], ...): Failed to identify a source for package containerit. Therefore the package cannot be installed in the Docker image.
## Warning in FUN(X[[i]], ...): Failed to identify a source for package harbor. Therefore the package cannot be installed in the Docker image.
## INFO [2018-03-26 22:22:25] Going online? TRUE  ... to retrieve system dependencies (sysreq-api)
## INFO [2018-03-26 22:22:25] Trying to determine system requirements for the package(s) 'sp,gstat,Rcpp,futile.logger,harbor,plyr,xts,futile.options,sys,digest,jsonlite,evaluate,memoise,lattice,rlang,rstudioapi,commonmark,yaml,pkgdown,stringr,roxygen2,xml2,knitr,desc,fs,rprojroot,spacetime,R6,rmarkdown,lambda.r,semver,magrittr,intervals,backports,scales,htmltools,MASS,assertthat,colorspace,stringi,munsell,FNN,crayon,zoo,remotes' from sysreqs online DB
## INFO [2018-03-26 22:22:27] Adding CRAN packages: assertthat, backports, colorspace, commonmark, crayon, desc, digest, evaluate, FNN, fs, futile.logger, futile.options, gstat, htmltools, intervals, jsonlite, knitr, lambda.r, lattice, magrittr, MASS, memoise, munsell, plyr, R6, Rcpp, remotes, rlang, rmarkdown, roxygen2, rprojroot, rstudioapi, scales, semver, sp, spacetime, stringi, stringr, sys, xml2, xts, yaml, zoo
## INFO [2018-03-26 22:22:27] Adding GitHub packages: r-lib/pkgdown@5e4825875751c009444c56ce43d06324ec53e910
## INFO [2018-03-26 22:22:27] Created Dockerfile-Object based on sessionInfo

The Dockerfile object can be saved to a file or printed out.

## FROM rocker/r-ver:3.4.4
## LABEL maintainer="daniel"
## RUN export DEBIAN_FRONTEND=noninteractive; apt-get -y update \
##   && apt-get install -y git-core \
##  libapparmor-dev \
##  libxml2-dev \
##  make \
##  pandoc \
##  pandoc-citeproc
## RUN ["install2.r", "assertthat", "backports", "colorspace", "commonmark", "crayon", "desc", "digest", "evaluate", "FNN", "fs", "futile.logger", "futile.options", "gstat", "htmltools", "intervals", "jsonlite", "knitr", "lambda.r", "lattice", "magrittr", "MASS", "memoise", "munsell", "plyr", "R6", "Rcpp", "remotes", "rlang", "rmarkdown", "roxygen2", "rprojroot", "rstudioapi", "scales", "semver", "sp", "spacetime", "stringi", "stringr", "sys", "xml2", "xts", "yaml", "zoo"]
## RUN ["installGithub.r", "r-lib/pkgdown@5e4825875751c009444c56ce43d06324ec53e910"]
## WORKDIR /payload/
## CMD ["R"]
## INFO [2018-03-26 22:22:28] Writing dockerfile to /tmp/RtmpCaJgPD/file259d4fd715d8
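The structure of the output above (base image, package installation, working directory, default command) can be reproduced by hand to see how the pieces fit together. This is a sketch in base R only; `render_dockerfile()` is a hypothetical helper and does not reflect how containerit builds or writes its Dockerfile objects.

```r
# Sketch only: assemble Dockerfile lines by hand, print them, and save them.
# render_dockerfile() is a hypothetical helper, not containerit's implementation.
render_dockerfile <- function(image, cran_packages,
                              workdir = "/payload/", cmd = "R") {
  # Exec-form RUN instruction: a JSON-style array of quoted arguments.
  quoted <- paste0('"', c("install2.r", cran_packages), '"', collapse = ", ")
  c(
    paste("FROM", image),
    sprintf("RUN [%s]", quoted),
    paste("WORKDIR", workdir),
    sprintf('CMD ["%s"]', cmd)
  )
}

lines <- render_dockerfile("rocker/r-ver:3.4.4", c("sp", "gstat"))
cat(lines, sep = "\n")            # print to the console

path <- file.path(tempdir(), "Dockerfile")
writeLines(lines, path)           # save to a file
```

Keeping the Dockerfile as a character vector until the last step makes it easy to both inspect it in the console and write it to disk.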

Dockerfile examples

The following demos use the rgdal package because it has system library dependencies, namely GDAL and PROJ.4. Code snippets are taken from the sp gallery.

Here is some regular R code loading a file and plotting it.

library("rgdal")
library("maptools")

nc <- rgdal::readOGR(system.file("shapes/", package="maptools"), "sids", verbose = FALSE)
proj4string(nc) <- CRS("+proj=longlat +datum=NAD27")
plot(nc)

Create Dockerfile from session

## [1] "sessionInfo"
containerit::dockerfile(from = sessionInfo(), env = ls())
## Warning in FUN(X[[i]], ...): Failed to identify a source for package containerit. Therefore the package cannot be installed in the Docker image.
## Warning in FUN(X[[i]], ...): Failed to identify a source for package harbor. Therefore the package cannot be installed in the Docker image.
## INFO [2018-03-26 22:22:30] Going online? TRUE  ... to retrieve system dependencies (sysreq-api)
## INFO [2018-03-26 22:22:30] Trying to determine system requirements for the package(s) 'Rcpp,futile.logger,harbor,plyr,xts,futile.options,sys,digest,gstat,jsonlite,evaluate,memoise,lattice,rlang,rstudioapi,commonmark,rgdal,yaml,pkgdown,stringr,roxygen2,xml2,knitr,desc,fs,rprojroot,spacetime,R6,foreign,rmarkdown,sp,lambda.r,semver,magrittr,maptools,intervals,backports,scales,htmltools,MASS,assertthat,colorspace,stringi,munsell,rjson,FNN,crayon,zoo,remotes' from sysreqs online DB
## INFO [2018-03-26 22:22:32] Adding CRAN packages: assertthat, backports, colorspace, commonmark, crayon, desc, digest, evaluate, FNN, foreign, fs, futile.logger, futile.options, gstat, htmltools, intervals, jsonlite, knitr, lambda.r, lattice, magrittr, maptools, MASS, memoise, munsell, plyr, R6, Rcpp, remotes, rgdal, rjson, rlang, rmarkdown, roxygen2, rprojroot, rstudioapi, scales, semver, sp, spacetime, stringi, stringr, sys, xml2, xts, yaml, zoo
## INFO [2018-03-26 22:22:32] Adding GitHub packages: r-lib/pkgdown@5e4825875751c009444c56ce43d06324ec53e910
## INFO [2018-03-26 22:22:32] Created Dockerfile-Object based on sessionInfo
## An object of class "Dockerfile"
## Slot "image":
## An object of class "From"
## Slot "image":
## [1] "rocker/r-ver"
## 
## Slot "postfix":
## An object of class "Tag"
## [1] "3.4.4"
## 
## 
## Slot "maintainer":
## An object of class "Label"
## Slot "data":
## $maintainer
## [1] "daniel"
## 
## 
## Slot "multi_line":
## [1] FALSE
## 
## 
## Slot "instructions":
## [[1]]
## An object of class "Run_shell"
## Slot "commands":
## [1] "export DEBIAN_FRONTEND=noninteractive; apt-get -y update"
## [2] "apt-get install -y gdal-bin \\\n\tgit-core \\\n\tlibapparmor-dev \\\n\tlibgdal-dev \\\n\tlibproj-dev \\\n\tlibxml2-dev \\\n\tmake \\\n\tpandoc \\\n\tpandoc-citeproc"
## 
## 
## [[2]]
## An object of class "Run"
## Slot "exec":
## [1] "install2.r"
## 
## Slot "params":
##       assertthat        backports       colorspace       commonmark 
##     "assertthat"      "backports"     "colorspace"     "commonmark" 
##           crayon             desc           digest         evaluate 
##         "crayon"           "desc"         "digest"       "evaluate" 
##              FNN          foreign               fs    futile.logger 
##            "FNN"        "foreign"             "fs"  "futile.logger" 
##   futile.options            gstat        htmltools        intervals 
## "futile.options"          "gstat"      "htmltools"      "intervals" 
##         jsonlite            knitr         lambda.r          lattice 
##       "jsonlite"          "knitr"       "lambda.r"        "lattice" 
##         magrittr         maptools             MASS          memoise 
##       "magrittr"       "maptools"           "MASS"        "memoise" 
##          munsell             plyr               R6             Rcpp 
##        "munsell"           "plyr"             "R6"           "Rcpp" 
##          remotes            rgdal            rjson            rlang 
##        "remotes"          "rgdal"          "rjson"          "rlang" 
##        rmarkdown         roxygen2        rprojroot       rstudioapi 
##      "rmarkdown"       "roxygen2"      "rprojroot"     "rstudioapi" 
##           scales           semver               sp        spacetime 
##         "scales"         "semver"             "sp"      "spacetime" 
##          stringi          stringr              sys             xml2 
##        "stringi"        "stringr"            "sys"           "xml2" 
##              xts             yaml              zoo 
##            "xts"           "yaml"            "zoo" 
## 
## 
## [[3]]
## An object of class "Run"
## Slot "exec":
## [1] "installGithub.r"
## 
## Slot "params":
##                                                  pkgdown 
## "r-lib/pkgdown@5e4825875751c009444c56ce43d06324ec53e910" 
## 
## 
## [[4]]
## An object of class "Workdir"
## Slot "path":
## [1] "/payload/"
## 
## 
## 
## Slot "entrypoint":
## NULL
## 
## Slot "cmd":
## An object of class "Cmd"
## Slot "exec":
## [1] "R"
## 
## Slot "params":
## [1] NA
## 
## Slot "form":
## [1] "exec"

Workspace

Go through all .R files in a directory and create a Dockerfile with a runtime environment which can run all of them.

## [1] "character"
## [1] TRUE
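One way to picture the workspace scan is a search for package references across all .R files in a directory. The sketch below is deliberately naive and uses regular expressions in base R; `find_packages()` is a hypothetical helper, and containerit's own detection may work differently.

```r
# Naive sketch: find package names used in the .R files of a directory by
# matching library()/require() calls and pkg::fun references.
# find_packages() is a hypothetical helper for illustration.
find_packages <- function(dir) {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  code <- unlist(lapply(files, readLines, warn = FALSE))
  code <- sub("#.*$", "", code)  # drop comments (also cuts "#" inside strings)

  # library("pkg"), library(pkg), require('pkg'), ...
  attach_pattern <- "(library|require)\\(\\s*[\"']?[A-Za-z0-9.]+"
  attach_calls <- regmatches(code, gregexpr(attach_pattern, code, perl = TRUE))
  attached <- sub("^.*\\(\\s*[\"']?", "", unlist(attach_calls), perl = TRUE)

  # pkg::fun and pkg:::fun (lookahead keeps only the package name)
  ns_refs <- regmatches(code,
                        gregexpr("[A-Za-z0-9.]+(?=:::?)", code, perl = TRUE))

  sort(unique(c(attached, unlist(ns_refs))))
}

# Demo: a temporary workspace containing the example script from this document.
demo_dir <- file.path(tempdir(), "workspace_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c(
  'library("rgdal")',
  'library("maptools")',
  'nc <- rgdal::readOGR(system.file("shapes/", package = "maptools"), "sids", verbose = FALSE)'
), file.path(demo_dir, "plot_nc.R"))

found <- find_packages(demo_dir)
print(found)
```

A regular expression cannot see packages that are loaded indirectly, for example through `source()` or computed names, so parsing the code or running it in a clean session would be more reliable.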

sessionInfo()

## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.4 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] containerit_0.3.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.16         compiler_3.4.4       futile.logger_1.4.3 
##  [4] harbor_0.2.0         plyr_1.8.4           xts_0.10-2          
##  [7] futile.options_1.0.0 tools_3.4.4          sys_1.5             
## [10] digest_0.6.15        gstat_1.1-5          jsonlite_1.5        
## [13] evaluate_0.10.1      memoise_1.1.0        lattice_0.20-35     
## [16] rlang_0.2.0          rstudioapi_0.7       commonmark_1.4      
## [19] rgdal_1.2-16         yaml_2.1.18          pkgdown_0.1.0.9000  
## [22] stringr_1.3.0        roxygen2_6.0.1       xml2_1.2.0          
## [25] knitr_1.20           desc_1.1.1           fs_1.2.2            
## [28] rprojroot_1.3-2      grid_3.4.4           spacetime_1.2-1     
## [31] R6_2.2.2             foreign_0.8-69       rmarkdown_1.9       
## [34] sp_1.2-7             lambda.r_1.2         semver_0.2.0        
## [37] magrittr_1.5         maptools_0.9-2       intervals_0.15.1    
## [40] backports_1.1.2      scales_0.5.0         htmltools_0.3.6     
## [43] MASS_7.3-49          assertthat_0.2.0     colorspace_1.3-2    
## [46] stringi_1.1.6        munsell_0.4.3        rjson_0.2.15        
## [49] FNN_1.1              crayon_1.3.4         zoo_1.8-1