Motivation

R packages are, at their core, standardized collections of functions, data sets, and/or data formats. The advantage of packages is that they can be used across all platforms, meaning that your analyses or visualizations can be easily shared among collaborators (including your future self). If you’re reading this, then there is a good chance that you have written code that performs an analysis, creates a visualization, or helps you manage data. It is very likely that the tools you have authored will benefit the larger population genetics community. This short vignette will point you to the tools you need to get started on writing an R package for the population genetics community and submitting it to CRAN.

Tools/Resources

Recently, many resources have emerged for developing R packages. We will use Hadley Wickham’s R packages for reference, but it’s valuable to check out other tutorials.

Tools

Writing R packages has become much easier with the advent of the RStudio IDE and the package devtools. The two are designed to work together, so using them in conjunction will simplify developing your package.

You can install devtools by typing in your R console:

install.packages("devtools", repos = "http://cran.rstudio.com/")

You might also need some background noise to avoid distraction. I recommend the sound of a space ship control room.

Creating your new package

What is in an R package?

There are four files/folders needed to create an R package:

  • DESCRIPTION
  • R/
  • NAMESPACE
  • man/

These are detailed below.

DESCRIPTION

This file simply lists all the metadata needed for the package, such as its authors and the packages it needs to run. You can view any installed package’s DESCRIPTION file by typing packageDescription("package_name"):

packageDescription("poppr")
## Package: poppr
## Type: Package
## Title: Genetic Analysis of Populations with Mixed Reproduction
## Version: 2.3.0
## Date: 2016-11-16
## Authors@R: c(person(c("Zhian", "N."), "Kamvar", role = c("cre",
##        "aut"), email = "zkamvar@gmail.com"), person(c("Javier",
##        "F."), "Tabima", role = "aut", email =
##        "tabimaj@onid.orst.edu"), person(c("Sydney", "E."),
##        "Everhart", role = c("ctb", "dtc"), email =
##        "everhart@unl.edu"), person(c("Jonah", "C."), "Brooks",
##        role = "aut", email = "brookjon@onid.orst.edu"),
##        person(c("Stacy", "A."), "Krueger-Hadfield", role = "ctb",
##        email = "kruegersa@cofc.edu"), person(c("Erik"), "Sotka",
##        role = "ctb", email = "sotkae@cofc.edu"), person(c("Brian",
##        "J."), "Knaus", role = "ctb", email =
##        "briank.lists@gmail.com"), person(c("Niklaus", "J."),
##        "Grunwald", role = "ths", email =
##        "grunwaln@science.oregonstate.edu"))
## Maintainer: Zhian N. Kamvar <zkamvar@gmail.com>
## Encoding: UTF-8
## URL: http://github.com/grunwaldlab/poppr,
##        http://grunwaldlab.github.io/Population_Genetics_in_R/,
##        http://grunwaldlab.cgrb.oregonstate.edu/poppr-r-package-population-genetics
## Description: Population genetic analyses for hierarchical analysis
##        of partially clonal populations built upon the architecture
##        of the 'adegenet' package.
## MailingList: http://groups.google.com/group/poppr
## BugReports: https://github.com/grunwaldlab/poppr/issues
## Depends: R (>= 2.15.1), adegenet (>= 2.0.0)
## Imports: stats, graphics, grDevices, utils, vegan, ggplot2,
##        phangorn, ape (>= 3.1-1), igraph, methods, ade4, pegas,
##        reshape2, dplyr (>= 0.4), boot, shiny, magrittr
## Suggests: testthat, knitr, rmarkdown, knitcitations, polysat,
##        poweRlaw, cowplot
## License: GPL-2 | GPL-3
## VignetteBuilder: knitr
## RoxygenNote: 5.0.1
## NeedsCompilation: yes
## Packaged: 2016-11-23 04:11:40 UTC; zhian
## Author: Zhian N. Kamvar [cre, aut], Javier F. Tabima [aut], Sydney
##        E. Everhart [ctb, dtc], Jonah C. Brooks [aut], Stacy A.
##        Krueger-Hadfield [ctb], Erik Sotka [ctb], Brian J. Knaus
##        [ctb], Niklaus J. Grunwald [ths]
## Repository: CRAN
## Date/Publication: 2016-11-23 17:45:12
## Built: R 3.3.2; x86_64-pc-linux-gnu; 2017-01-04 10:24:27 UTC; unix
## 
## -- File: /usr/local/lib/R/site-library/poppr/Meta/package.rds
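
DESCRIPTION files use the Debian control file (DCF) format of "Field: value" pairs, so individual fields can also be pulled out programmatically. The following is a minimal sketch rather than part of the original workflow: the field values written to the temporary file are made up for illustration, and the stats package is used only because it ships with every R installation.

```r
# Ask for a single field of an installed package; with one field,
# packageDescription() returns a plain character string.
stats_version <- packageDescription("stats", fields = "Version")
stats_version

# Write a minimal, made-up DESCRIPTION to a temporary file.
desc_file <- file.path(tempdir(), "DESCRIPTION")
fields <- data.frame(
  Package = "myFirstPackage",
  Version = "0.0.0.9000",
  License = "GPL-3",
  Imports = "stats, utils",
  stringsAsFactors = FALSE
)
write.dcf(fields, file = desc_file)

# read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_file)
desc[, "Version"]

# Split the comma-separated dependency list into a character vector.
imports <- trimws(strsplit(desc[, "Imports"], ",")[[1]])
imports
```

Reading the file back this way is handy when you want to check dependencies or version numbers across several packages without opening each file by hand.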

R/

This is the folder in which all of your .R files live. There are different philosophies on how to order your files in this folder. Some like to have a separate R file for each function, while others like to have all functions in a single R file. However you want to organize the files in this directory is up to you.

NAMESPACE

The NAMESPACE file lists the functions exported by your package (e.g., export("myFun")) and the functions your package imports from other packages (e.g., importFrom("pegas", "amova")).

Exporting functions is important. If an R package is like a car, exported functions would be things like the steering wheel and pedals, while non-exported functions would be anything under the hood. Having non-exported functions makes it easier to improve your package. In keeping with the analogy, a driver will not notice if you change the alternator, but they will certainly notice if the steering wheel is different.

If you choose to use modern tools such as roxygen2 for documentation, you should not need to touch this file.
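
To see the difference between exported and non-exported functions in practice, you can inspect the namespace of a package that is already installed. This is only an illustrative sketch; stats is used because it comes with every R installation, not because it is special.

```r
# Names that the stats package exports (its "steering wheel and pedals").
exported <- getNamespaceExports("stats")

# Every object defined inside the namespace, exported or not.
all_objects <- ls(asNamespace("stats"), all.names = TRUE)

# Objects that live "under the hood": defined but not exported.
internal <- setdiff(all_objects, exported)

# median() is part of the public interface.
"median" %in% exported

# A few of the internal names, which users are not meant to rely on.
head(internal)
```

Internal objects can still be reached with the ::: operator, but because they are not part of the public interface, package authors are free to change or remove them.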

man/

All R packages need documentation. This folder contains all of the documents that will become manual pages for your R functions. They are written in the Rd format, which is similar to LaTeX. If you choose to do so, using the roxygen2 package will help immensely with writing documentation.
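
As a rough sketch of what roxygen2 documentation looks like, the comments beginning with #' below sit directly above the function they describe. The function allele_freq() and its argument are hypothetical and exist only for illustration; in a real package this would be saved as a file in R/, and running devtools::document() would generate the matching Rd file in man/ and the export entry in NAMESPACE.

```r
#' Calculate allele frequencies
#'
#' A hypothetical example function; the name and argument are made up
#' for illustration.
#'
#' @param alleles a character vector of allele names, one per observed copy.
#' @return a named numeric vector of frequencies that sum to one.
#' @export
#' @examples
#' allele_freq(c("A", "A", "B", "A"))
allele_freq <- function(alleles) {
  counts <- table(alleles)                    # tally each distinct allele
  freqs <- as.numeric(counts) / sum(counts)   # convert counts to proportions
  names(freqs) <- names(counts)
  freqs
}

allele_freq(c("A", "A", "B", "A"))
```

Keeping the documentation next to the code makes it harder for the two to drift apart when the function changes.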

Others

There are other folders/files that you can include, but they are completely optional. Files like NEWS or ChangeLog record changes to your package, and folders like inst/ and src/ allow you to include extra data or compiled code, respectively. See CRAN’s description for details. A good way to see what you can put in your package is to browse through the CRAN GitHub repository and see what other packages have.

Creating a package

Packages can be created automatically with the function create() from devtools, the function kitten() from pkgKitten, or package.skeleton(), which ships with base R in the utils package. As a rule, I avoid things that remind me of my own mortality, so in this tutorial I will use create().
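
For comparison, here is a minimal sketch of the base R route with package.skeleton(). It is not the approach used in the rest of this tutorial; the toy function myFun() and the temporary directory are assumptions made so the example can run anywhere without touching your working directory.

```r
# Put one toy function in its own environment so package.skeleton() can find it.
env <- new.env()
assign("myFun", function(x) x + 1, envir = env)

# Create the skeleton inside a fresh temporary directory.
pkg_parent <- tempfile("pkgs")
dir.create(pkg_parent)
package.skeleton(
  name = "myFirstPackage",
  list = "myFun",
  environment = env,
  path = pkg_parent
)

# The four components listed earlier should now exist.
pkg_dir <- file.path(pkg_parent, "myFirstPackage")
components <- c("DESCRIPTION", "NAMESPACE", "R", "man")
file.exists(file.path(pkg_dir, components))

# Everything that was generated, including the Rd stubs in man/.
list.files(pkg_dir, recursive = TRUE)
```

The generated Rd files are only templates and must be filled in by hand, which is one reason many developers prefer the devtools and roxygen2 workflow.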

Setting up your information

Devtools will create a pre-set package for you, but it will be easier for you if you set it up with your name and information, so that the DESCRIPTION file is automatically generated with the right information. Below are the default options for devtools. You can use ?devtools to find out more information:

library("devtools")
options()[grep("devtools", names(options()))]
## $devtools.desc
## list()
## 
## $devtools.desc.author
## [1] "person(\"First\", \"Last\", email = \"first.last@example.com\", role = c(\"aut\", \"cre\"))"
## 
## $devtools.desc.license
## [1] "What license is it under?"
## 
## $devtools.install.args
## [1] ""
## 
## $devtools.name
## [1] "Your name goes here"
## 
## $devtools.path
## [1] "~/R-dev"
## 
## $devtools.revdep.libpath
## [1] "/tmp/RtmpIRd2aU/R-lib"

After looking at the documentation, we know that we want to change the values of devtools.desc.author, devtools.desc.license, and devtools.name. We will also add a “Maintainer” field to the DESCRIPTION file (which tells users who to blame). The others don’t matter for now. We can add them in thusly (note that I am using my name; you should obviously use your own):

authors_at_r <- paste0(
  "'",
  person(
    "Zhian N.", 
    "Kamvar", 
    email = "kamvarz@science.oregonstate.edu", 
    role  = c("aut", "cre")
    ),
  "'"
)

options(devtools.desc.author = authors_at_r)
options(devtools.name = "Zhian N. Kamvar")
options(devtools.desc.license = "GPL-3")
options(devtools.desc = list("Maintainer" = "'Zhian N. Kamvar' <kamvarz@science.oregonstate.edu>"))
options()[grep("devtools", names(options()))]
## $devtools.desc
## $devtools.desc$Maintainer
## [1] "'Zhian N. Kamvar' <kamvarz@science.oregonstate.edu>"
## 
## 
## $devtools.desc.author
## [1] "'Zhian N. Kamvar <kamvarz@science.oregonstate.edu> [aut, cre]'"
## 
## $devtools.desc.license
## [1] "GPL-3"
## 
## $devtools.install.args
## [1] ""
## 
## $devtools.name
## [1] "Zhian N. Kamvar"
## 
## $devtools.path
## [1] "~/R-dev"
## 
## $devtools.revdep.libpath
## [1] "/tmp/RtmpIRd2aU/R-lib"
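
If you only want these settings for the current task, note that options() invisibly returns the previous values of whatever it changes, so they can be restored afterward. A minimal sketch, using the license option from above purely as an example:

```r
# Change one option and keep its previous value for later.
old <- options(devtools.desc.license = "GPL-3")

# getOption() retrieves a single option by name.
getOption("devtools.desc.license")

# Restore the earlier value (or unset the option if it was not set before).
options(old)
```

To make the settings permanent instead, the same options() calls can be placed in your .Rprofile so they are applied at the start of every R session.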

It is much easier to edit the DESCRIPTION file by hand afterward. This simply demonstrates how you can get started.

Now we can create a new package called myFirstPackage in our current working directory by typing:

library("devtools")
create("myFirstPackage")
## Package: myFirstPackage
## Title: What the Package Does (one line, title case)
## Version: 0.0.0.9000
## Authors@R: 'Zhian N. Kamvar <kamvarz@science.oregonstate.edu> [aut, cre]'
## Description: What the package does (one paragraph).
## Depends: R (>= 3.3.2)
## License: GPL-3
## Encoding: UTF-8
## LazyData: true
## Maintainer: 'Zhian N. Kamvar' <kamvarz@science.oregonstate.edu>

You can also use check() to see if your package passes R CMD check:

check("myFirstPackage")

Now all you have to do is add in your R files and document your functions.

Conclusions

This chapter describes the first steps to creating an R package and introduces further resources for doing so. One of the best ways to learn how to write R packages is to look at the source code of R packages that have already been published on CRAN. It’s easy to do this by searching the CRAN GitHub repository.

Session Information

This shows us useful information for reproducibility. Of particular importance are the versions of R and the packages used to create this workflow. It is considered good practice to record this information with every analysis.

options(width = 100)
devtools::session_info()
## Session info ---------------------------------------------------------------------------------------
##  setting  value                       
##  version  R version 3.3.2 (2016-10-31)
##  system   x86_64, linux-gnu           
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  tz       <NA>                        
##  date     2017-01-04
## Packages -------------------------------------------------------------------------------------------
##  package   * version date       source        
##  backports   1.0.4   2016-10-24 CRAN (R 3.3.2)
##  devtools  * 1.12.0  2016-12-05 CRAN (R 3.3.2)
##  digest      0.6.10  2016-08-02 CRAN (R 3.3.2)
##  evaluate    0.10    2016-10-11 CRAN (R 3.3.2)
##  htmltools   0.3.5   2016-03-21 CRAN (R 3.3.2)
##  knitr       1.15.1  2016-11-22 CRAN (R 3.3.2)
##  magrittr    1.5     2014-11-22 CRAN (R 3.3.2)
##  memoise     1.0.0   2016-01-29 CRAN (R 3.3.2)
##  Rcpp        0.12.8  2016-11-17 CRAN (R 3.3.2)
##  rmarkdown   1.3     2016-12-21 CRAN (R 3.3.2)
##  rprojroot   1.1     2016-10-29 CRAN (R 3.3.2)
##  stringi     1.1.2   2016-10-01 CRAN (R 3.3.2)
##  stringr     1.1.0   2016-08-19 CRAN (R 3.3.2)
##  whisker     0.3-2   2013-04-28 CRAN (R 3.3.2)
##  withr       1.0.2   2016-06-20 CRAN (R 3.3.2)
##  yaml        2.1.14  2016-11-12 CRAN (R 3.3.2)
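
If devtools is not available, base R offers sessionInfo(), which reports similar information. The sketch below saves that report next to an analysis; the file name session_info.txt and the use of a temporary directory are assumptions for illustration.

```r
# Collect information about the R version and attached packages.
si <- sessionInfo()

# The R version string on its own.
si$R.version$version.string

# Save the full printed report to a text file for later reference.
session_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(print(si)), session_file)
```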