this ⟵ R project
Most of the ideas presented here come from Jenny Bryan’s blog post, Project-oriented workflow.
Similar ideas can be found in chapter nine of R4DS (2e).
crooked paths in R
Imagine trying to cook dinner in a CDC lab…
That’s EXACTLY what it’s like working on multiple research projects in the same programming workspace.
Seriously, don’t do it.
cross-contamination in R
solution ⟵ R project
A typical project folder might look like this:
📁 my-r-project
If you open this in the RStudio IDE, the working directory will automatically be set to “root/path/to/my-r-project”.
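Because the working directory is the project root, scripts can refer to files with paths relative to it. A minimal sketch, assuming a temporary folder stands in for the project; the data folder and the cars.csv file name are hypothetical:

```r
# a temporary folder stands in for the project root (assumption for this demo)
project <- file.path(tempdir(), "my-r-project")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)

# RStudio does this step for you when you open the project
old <- setwd(project)

# relative paths now resolve against the project root
write.csv(cars, file.path("data", "cars.csv"), row.names = FALSE)
cars_again <- read.csv(file.path("data", "cars.csv"))

# restore the previous working directory
setwd(old)
```

Nothing in the read and write calls mentions a user-specific path, so the same lines would run unchanged on a collaborator's machine.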
📁 my-r-project
Need to distinguish the essentials from the inessentials!
x is product iff x can run without error
head(cars)
mean(cars$dist) + 1
# don't forget to do your laundry!
i <- sample(1:nrow(cars), size = 25, replace = FALSE)
cars2 <- cars[i, ]
plot(cars2)
Sys.Date()
bb8 <- lm(dist ~ speed, data = cars2)
summary(bb8)
But all this will 🏃🏃🏃…
x is product iff the goal requires x ✔
head(cars)
mean(cars$dist) + 1
# don't forget to do your laundry!
i <- sample(1:nrow(cars), size = 25, replace = FALSE)
cars2 <- cars[i, ]
plot(cars2) # <--- is this necessary?
Sys.Date()
bb8 <- lm(dist ~ speed, data = cars2)
summary(bb8)
But teleology means it just depends… 🤷
consider what details you’d include when giving directions
your code is like that, but from your raw data to your results
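As a sketch, suppose the goal is the regression summary (an assumption; the slides leave the goal open). Then the script shrinks to the steps on the path from data to result. The set.seed() call is an addition, not in the original script, so that the random subsample is repeatable:

```r
set.seed(42)  # added for repeatability; the seed value is arbitrary

# draw 25 rows of the built-in cars data without replacement
i <- sample(seq_len(nrow(cars)), size = 25, replace = FALSE)
cars2 <- cars[i, ]

# fit and report the model: stopping distance as a function of speed
bb8 <- lm(dist ~ speed, data = cars2)
summary(bb8)
```

If the goal were a figure instead, plot(cars2) would stay and summary(bb8) might go.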
“Wyman’s overpopulated universe is in many ways unlovely. It offends the aesthetic sense of us who have a taste for desert landscapes, but this is not the worst of it. Wyman’s slum of possibles is a breeding ground for disorderly elements.”
W. V. O. Quine, On What There Is (1948)
Translation: trust your R script! and be ruthless with your use of rm()!
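One way to read "ruthless", sketched with the objects from the earlier script: once an intermediate object has served its purpose, drop it, since the script can always recreate it.

```r
i <- sample(seq_len(nrow(cars)), size = 25, replace = FALSE)
cars2 <- cars[i, ]
bb8 <- lm(dist ~ speed, data = cars2)

# the row index and the subsample were only stepping stones to the model
rm(i, cars2)

# i and cars2 no longer appear in the workspace listing
ls()
```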
I am here()
Note that here() finds the path to the project folder, though RStudio will do this, too… here(), however, will also reference the top project directory no matter where you are in the project.
library(here)
# on blake's computer, in the R folder
here("data", "elevation.tiff")
#> [1] "C:/Users/blake/rstuff/our-r-project/data/elevation.tiff"
# on bob's computer, in the figures folder
here("data", "elevation.tiff")
#> [1] "C:/Users/bob/likes/subfolders/our-r-project/data/elevation.tiff"
# on simon's computer, in the _misc folder
here("data", "elevation.tiff")
#> [1] "?????/our-r-project/data/elevation.tiff"
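The idea behind this can be sketched in base R: start in the current folder and walk upward until a folder containing a marker file turns up. This is a simplified illustration, not the here package's actual implementation; find_root() is a hypothetical helper, and the only marker it checks for is an .Rproj file.

```r
# hypothetical helper: climb from `start` until a folder holds an .Rproj file
find_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/")
  repeat {
    if (length(list.files(dir, pattern = "\\.Rproj$")) > 0) return(dir)
    parent <- dirname(dir)
    # dirname() of the filesystem root returns itself, so stop there
    if (identical(parent, dir)) stop("no .Rproj file found above ", start)
    dir <- parent
  }
}

# demo: a temporary folder stands in for the project (assumption)
root <- file.path(tempdir(), "our-r-project")
dir.create(file.path(root, "figures"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "our-r-project.Rproj"))

# called from a subfolder, the helper still lands on the project root
found <- find_root(file.path(root, "figures"))
file.path(found, "data", "elevation.tiff")
```

Because the path is built from the discovered root, it comes out the same whichever subfolder the script happens to run from.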
📁 my-r-project
etc., etc., etc.
and we haven’t even gotten to drafts of our R scripts! hmmm… 🤔
Once you have Git and GitHub set up, RStudio makes version control super super easy.
See Happy Git with R for details.
“but I want to share data across projects,” you will inevitably find yourself saying
and now you’re on the cutting edge 🔪🔪🔪
pin() your data?
“The pins package publishes data, models, and other R objects, making it easy to share them across projects and with your colleagues.”
- From the package website
This looks promising, but I don’t have much experience with it. Need buy-in from the collabs on using projects first…
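For a taste of what that might look like, here is a minimal sketch using the pins package's board functions (board_temp(), pin_write(), pin_read()). The temporary board and the pin name "cars-sample" are assumptions for illustration; sharing across real projects would need a persistent board, for example one created with board_folder() on a shared drive. The guard skips the example if pins is not installed.

```r
if (requireNamespace("pins", quietly = TRUE)) {
  library(pins)

  # a throwaway board for the demo; a shared board would live somewhere permanent
  board <- board_temp()

  # publish a data frame under a name (the name is made up for this example)
  pin_write(board, head(cars), name = "cars-sample", type = "rds")

  # read it back by name from the same board
  cars_sample <- pin_read(board, "cars-sample")
}
```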
let’s make an R project!