Good Code

What is good code?

[10 minute discussion]

Automatic formatting

Use automatic formatting to clean up the layout of your code (indentation, spacing, line breaks).

In RStudio on Mac: Cmd + Shift + A
In RStudio on Windows: Ctrl + Shift + A

This helps with many things, but it is not a magic bullet.

# before automatic formatting
a_random_function <- function(something, something_else){result <- do_something(something) %>% 
  do_something_new(something_else)}

# after automatic formatting
a_random_function <-
  function(something, something_else) {
    result <- do_something(something) %>%
      do_something_new(something_else)
  }

The {styler} package

Tip

You can automatically style code and entire scripts using the {styler} package.
This is especially handy in .Rmd or .qmd files, where you can simply specify {styler} as the tidier via a knitr chunk option.

knitr::opts_chunk$set(tidy = "styler")

Note that this doesn’t affect the source document (i.e. the script), but only affects the knitted document.

an_unstylized_example <- "asdf"
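Outside of knitted documents, {styler} can also be called directly. The following is a minimal sketch, assuming the {styler} package is installed; the messy one-liner and the file path in the comment are made up for illustration.

```r
library(styler)

# A deliberately messy one-liner (hypothetical example code, kept as a string)
messy_code <- "a_random_function<-function(something,something_else){something+something_else}"

# style_text() returns the restyled code without touching any file
styled_code <- style_text(messy_code)
styled_code

# To restyle a script in place, style_file() takes a path.
# The path below is hypothetical, so the call is left commented out.
# style_file("scripts/01_data_cleaning.R")
```

Unlike the knitr chunk option, style_file() rewrites the source file itself, so it is worth running it on files that are under version control.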

Naming conventions

Consistency is key!

Use one system and stick to it. This helps not only with readability, but also with writing code, for example when selecting key variables of interest.

Compare:

my_data %>%
  select(
    "item_1",
    "Item_2",
    "This_is_the_3_item",
    "Yet_another_item",
    "What.is.this.item?"
  )

my_data %>%
  select(starts_with("scale_name"))
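To make the comparison concrete, here is a small sketch with made-up data, assuming the {dplyr} package is installed. The column names (extraversion_1 and so on) are hypothetical; the point is only that a shared prefix lets one selection helper pick up every item of a scale.

```r
library(dplyr)

# Hypothetical survey data with consistently named scale items
survey <- tibble(
  id = 1:3,
  extraversion_1 = c(4, 2, 5),
  extraversion_2 = c(3, 2, 4),
  extraversion_3 = c(5, 1, 4),
  age = c(23, 31, 27)
)

# One helper selects all items of the scale, however many there are
extraversion_items <- survey %>%
  select(starts_with("extraversion"))

names(extraversion_items)
```

If a fourth item is added later under the same prefix, the selection picks it up without any change to the code.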

My best practices for naming stuff

I like the following:

lowercase for all variables
snake_case
nouns for variables and datasets, verbs for functions
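A short sketch of these three conventions together, using base R only. All names and values are invented for illustration.

```r
# Noun, lowercase, snake_case: a dataset (hypothetical values)
reaction_times <- c(312, 287, 455, 390)

# Verb, lowercase, snake_case: a function that does something to its input
standardize_scores <- function(scores) {
  (scores - mean(scores)) / sd(scores)
}

# The result is a thing again, so it gets a noun
standardized_times <- standardize_scores(reaction_times)
```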

Tip: the `clean_names()` function from the {janitor} package

data_with_bad_names <- tibble(
  BAD_NAME = 1,
  `really bad name` = 2,
  `1 - another bad (name)` = 3
)

data_with_bad_names %>%
  janitor::clean_names() %>%
  names()

[1] "bad_name"            "really_bad_name"     "x1_another_bad_name"

# also works with other cases, if you prefer those
# (see ?snakecase::to_any_case)
data_with_bad_names %>%
  janitor::clean_names(case = "upper_camel") %>%
  names()

[1] "BadName"          "ReallyBadName"    "X1AnotherBadName"

Commenting code

Code commenting practices

more is not always better
general advice: comments shouldn’t focus on the how, but the why

# load in data
df <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-17/artists.csv")

# load tidytuesday dataset on artists
# see documentation: https://github.com/rfordatascience/tidytuesday/blob/master/data/2023/2023-01-17/readme.md
df <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-17/artists.csv")
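The same how-versus-why distinction applies to data cleaning steps. Below is a minimal sketch in base R; the data, the -99 missing-value code, and the codebook it refers to are all assumptions made for the example.

```r
# Hypothetical survey responses
responses <- data.frame(id = 1:4, age = c(34, -99, 51, 28))

# A "how" comment would only restate the code: "replace -99 with NA".
# A "why" comment records the reason instead:
# the (assumed) codebook uses -99 for "no answer", so these values
# must not enter the mean as if they were real ages.
responses$age[responses$age == -99] <- NA

mean_age <- mean(responses$age, na.rm = TRUE)
```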

Code commenting exercise (~15 min?)

Go back to an old script (e.g. for data cleaning, …) of yours (preferably older than 6 months) and take a look at it

What have you commented, what haven’t you commented?
Which comments make sense to you? Which don’t?
Show the code to your neighbor without explaining it. What can they understand, what don’t they understand?

Script etiquette

Workflow

General logic: all data cleaning / manipulation first, then analyses

For more complex projects, keep data cleaning and analysis in separate scripts.

Load all necessary packages at the start of the script

this makes it easier to see which packages are needed
if you only need a function from a specific package once, you do not necessarily have to load the package; you can also call the function with reference to its namespace using the :: notation

MASS::bcv()
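Put together, a script following this layout might look like the sketch below. It assumes {dplyr} and {MASS} are installed and uses the built-in mtcars data purely as a stand-in; the section comments and object names are illustrative.

```r
# --- Packages: attached once, at the top of the script ---
library(dplyr)

# --- Data cleaning ---
cars_clean <- mtcars %>%
  filter(!is.na(mpg)) %>%
  select(mpg, cyl)

# --- Analysis ---
# MASS is needed for a single call only, so the function is referenced
# with :: instead of attaching the whole package with library().
bandwidth <- MASS::bcv(cars_clean$mpg)
```

Calling MASS::bcv() this way also avoids attaching MASS::select(), which would otherwise mask dplyr::select().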

The {conflicted} package

Dealing with name conflicts

The {conflicted} package helps navigate name conflicts between functions from different packages (functions that share a name, such as stats::filter() and dplyr::filter()).

The conflicted::conflict_prefer() function lets you set defaults for which function you prefer in this case.

If you deal with packages that have naming conflicts, loading the {conflicted} package at the start of your R scripts is a good idea.
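A minimal sketch of this setup, assuming {conflicted} and {dplyr} are installed and again using mtcars as stand-in data. Once {conflicted} is loaded, an unqualified call to an ambiguous function such as filter() raises an error until a preference has been declared.

```r
library(conflicted)
library(dplyr)

# Declare once, near the top of the script, which filter() is meant
conflict_prefer("filter", "dplyr")

# Unqualified filter() now resolves to dplyr::filter()
four_cylinder_cars <- mtcars %>%
  filter(cyl == 4)

nrow(four_cylinder_cars)
```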
