Reproducible Manuscripts in R

Boston College

Jason Geller, Ph.D. (he/him)

2025-04-02

What is reproducibility?



|                    | Same data    | Different data |
|--------------------|--------------|----------------|
| Same analysis      | Reproducible | Replicable     |
| Different analysis | Robust       | Generalisable  |

  • If I take the same analysis pipeline and apply it to the same dataset, I get the same results

    • This is reproducibility
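
A minimal sketch of what "same pipeline, same data, same results" means in practice. The function name `run_analysis` and the bootstrap step are illustrative assumptions, and the built-in `mtcars` data stand in for a real dataset. The point is that fixing the random seed makes even a resampling-based analysis return identical output on every run.

```r
# Hypothetical analysis pipeline: a bootstrap estimate of mean mpg.
# Fixing the seed inside the function makes the random resampling repeatable.
run_analysis <- function(data, seed = 123) {
  set.seed(seed)
  idx <- sample(nrow(data), replace = TRUE)  # resample rows with replacement
  mean(data$mpg[idx])
}

first_run  <- run_analysis(mtcars)
second_run <- run_analysis(mtcars)

# Same analysis + same data -> identical result
identical(first_run, second_run)
```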

What is open science?


  • Openly sharing your research

    • Open publications

    • Open experimental protocols

    • Open software

    • Open data


Open Science + Reproducibility

  • Today we are going to focus on reproducible manuscripts

    • Text

    • Code/analyses

    • Citations

Packages

library(palmerpenguins) # penguins
library(quarto) # qmd 
library(rmarkdown) # markdown
library(tidyverse) # data wrangling
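
Before loading these packages, it can help to check which ones are missing. The helper below, `missing_packages`, is a hypothetical name used for illustration, not part of any package. This is a sketch and leaves the actual installation commented out.

```r
# Packages used in this workshop
needed <- c("palmerpenguins", "quarto", "rmarkdown", "tidyverse")

# Hypothetical helper: return the packages that cannot be loaded
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
}
```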

Packages

install.packages('tinytex') # for use with pdf 
tinytex::install_tinytex()
# to uninstall TinyTeX, run tinytex::uninstall_tinytex() 
  • You should also have Zotero installed along with Better BibTeX (nice, but not necessary)

Typical workflow

  1. Do your analyses
  2. Open a program (Word)
  3. Copy-paste results and figures/tables
  4. Manually format your results and citations

The Problem

Word

Inside

Word issues

  • A .docx file is a compressed folder with lots of files
    • Your text is buried in with a lot of formatting information
  • Not reproducible
    • Code is divorced from writing
  • Difficult to maintain
    • Errors!
  • What do I share?
    • Lack of transparency

What do we want?

  • Combine narrative with code

  • Automatically generate figures and tables

  • Automatically render results in text

  • Format the content into a scientific paper (including citations!)

  • Something that looks pretty!

  • Rinse & repeat
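
To illustrate "automatically render results in text", here is a small sketch. The `scores` vector is made-up data, and `result_sentence` is a hypothetical name. The sentence is built from computed values, so it updates whenever the data change. In a Quarto or R Markdown document, the same values would typically be placed in the prose with inline code rather than printed.

```r
# Made-up data for illustration only
scores <- c(12, 15, 9, 14, 11)

# Build the reporting sentence from computed values instead of typing numbers by hand
result_sentence <- sprintf(
  "The mean score was %.2f (SD = %.2f, n = %d).",
  mean(scores), sd(scores), length(scores)
)

print(result_sentence)
```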

Hello Quarto!

  • Authoring framework for data science, designed for reproducibility
  • Unifies and extends the R Markdown ecosystem.
  • Develop and switch formats without hassle.

The Quarto hexagon logo.

Hello Quarto!

Big universe

  • RMarkdown for EVERYONE

What is a Quarto?

How Quarto Works

Quarto handles literate programming by using a series of programs:

How Quarto Works (Source)

  • knitr executes all code chunks and creates a new markdown (.md) file
  • pandoc takes the markdown file generated and converts it to the desired format.
  • The Render button in RStudio coordinates these steps.
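
The same pipeline can also be started from the R console. The sketch below assumes the quarto R package and the Quarto command-line tool are installed. The file name `manuscript.qmd` and the wrapper `render_manuscript` are hypothetical. The wrapper only guards against a missing file or package before handing over to `quarto::quarto_render()`.

```r
# Hypothetical wrapper around quarto::quarto_render()
render_manuscript <- function(path, format = "html") {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(invisible(FALSE))
  }
  if (!requireNamespace("quarto", quietly = TRUE)) {
    message("The quarto package is not installed.")
    return(invisible(FALSE))
  }
  # Runs the code chunks (knitr), then converts the result (pandoc)
  quarto::quarto_render(input = path, output_format = format)
  invisible(TRUE)
}

# "manuscript.qmd" is a placeholder; replace it with your own file
rendered <- render_manuscript("manuscript.qmd")
```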

Advantages

(1) Eliminate/reduce human error

We found that half of all published psychology papers that use null-hypothesis significance testing (NHST) contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion.

Nuijten et al., 2016

https://michelenuijten.shinyapps.io/statcheck-web/
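
The kind of inconsistency described above can be checked by recomputing a p-value from the reported test statistic and degrees of freedom. The sketch below is a simplified check for a two-sided t-test, not statcheck's actual algorithm. The function names and the reported values are hypothetical.

```r
# Recompute a two-sided p-value from a t statistic and its degrees of freedom
recompute_p <- function(t_value, df) {
  2 * pt(-abs(t_value), df)
}

# Simplified consistency check: does the reported p match the recomputed p
# once both are considered at the reported number of decimals?
is_consistent <- function(t_value, df, reported_p, digits = 3) {
  abs(recompute_p(t_value, df) - reported_p) < 0.5 * 10^(-digits)
}

# Hypothetical reported result: t(28) = 2.00, p = .03
p_recomputed <- recompute_p(2.00, 28)
p_recomputed                      # slightly above .05, so not significant at .05
is_consistent(2.00, 28, 0.03)     # FALSE: the reported p does not match
```

Because the numbers in a reproducible manuscript come straight from the analysis code, this kind of mismatch cannot be introduced by retyping.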

In the Wild: Data Science Gone Wrong

  • Retraction Watch

  • A well-known case of an Excel error undermining a published result is Growth in a Time of Debt by Reinhart & Rogoff (2010)

    • The error was found by graduate student Thomas Herndon and his co-authors Michael Ash and Robert Pollin.

Advantages

(2) Easy revisions and specification of desired figures and tables

When revisions are requested, tables and figures often have to be tweaked by hand again and again. This creates a strong incentive never to rerun analyses, because doing so would mean re-pasting and re-illustrating every number and figure in the paper.
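
As a small illustration, a table generated from code is rebuilt on every render, so a revision only requires changing the data or the code. The sketch below uses the built-in `iris` data as a stand-in for a real dataset, and the column label is an assumption.

```r
# Summary table computed directly from the data (iris is a stand-in dataset)
summary_table <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
names(summary_table) <- c("Species", "Mean sepal length (cm)")

# Rerunning this after a data or analysis change regenerates the table
print(summary_table)
```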

Advantages

(3) Promote computational reproducibility

  • Easy verification and reproducibility of research findings

  • While programming environments may seem counter-intuitive for writing papers, they ultimately prevent mistakes and save time.

Let’s Get Started!

Getting started

Note

Always start a new project folder!

  • Start from scratch
    • Creating a Quarto manuscript

      • RStudio: New Project > New Directory > Quarto Manuscript

Overview of a Quarto Document

Create a Quarto Document

In the top left, click the White Plus and select “Quarto Document…”

Drop down menu containing Quarto Document creation button

Creating a new Quarto Document

In the new prompt, enter a title, author name, and press “Create”

New quarto document wizard allowing a title and author information to be set.

New Document Options

Source vs. Visual Mode

Figure showing what a Quarto document looks like in Source Editing Mode.

Source Editing Mode

Figure showing what a Quarto document looks like in Visual Editing Mode.

Visual Editing Mode

Getting started

  • Go to the Getting Started section of the website and complete each part

Annotated Quarto Document

Annotated figure that describes the different sections of a Quarto document while in the source editor mode.

Annotated sections of the “Hello Quarto” document related to document information, text formatting, and code execution

Output of a Quarto Document

Image showcasing how the source code of the document translated over into the rendered product.

Annotated source to output of the “Hello Quarto” document

Metadata & Header (YAML)

  • The YAML header contains basic metadata and rendering instructions
---
title: My Reproducible Manuscript
authors:
  - name: Norah Jones
    affiliation: The University
    roles: writing
    corresponding: true
bibliography: references.bib
format: html
---
  • Wait… what’s the YAML acronym?

    • Originally: “Yet Another Markup Language”

    • Later: “YAML Ain’t Markup Language”

  • Set global manuscript options with key-value pairs
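
To see the key-value structure more concretely, the header above can be parsed in R. This sketch assumes the yaml package is installed (it is a dependency of rmarkdown). The variable names are illustrative.

```r
# The YAML header from the slide, without the --- delimiters
header <- "
title: My Reproducible Manuscript
authors:
  - name: Norah Jones
    affiliation: The University
    roles: writing
    corresponding: true
bibliography: references.bib
format: html
"

# Parse the text into a nested R list of key-value pairs
meta <- yaml::yaml.load(header)

meta$title                # top-level key
meta$authors[[1]]$name    # nested key inside the first author entry
meta$format               # output format requested for rendering
```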

Code

```{r}
#| eval: true
1 + 1
```
[1] 2

Text

Section

This is a simple placeholder for the manuscript's main document [@knuth84].

Writing in Markdown (NEXT)