How to draw a growth curve in R

A bacterial growth curve plots optical density (OD) measurements against time. For each time point, we usually have a few technical replicates. Let’s take the following data, which has a time column and replicates A, B, and C.

We have saved this table in an Excel file named growth_curve.xlsx, stored in a folder called datasets inside the current working directory.

First, we load the tidyverse, which attaches the ggplot2 package that we will use to plot the growth curve.

library(tidyverse)

We first read the data from the Excel file into a tibble and look at it.

data <- readxl::read_excel('datasets/growth_curve.xlsx')
data
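If you do not have the Excel file at hand, a tibble with the same layout can be built directly in R. The sketch below assumes only the layout described above (a time column plus replicate columns A, B, and C). The OD values and the name example_data are made up for illustration and are not the data used in this post.

```r
library(tibble)

# Hypothetical OD600 readings, invented only to mimic the layout of the table:
# one row per time point, one column per technical replicate.
example_data <- tibble(
  time = c(0, 1, 2, 3, 4),                 # hours
  A    = c(0.05, 0.08, 0.15, 0.31, 0.60),
  B    = c(0.04, 0.09, 0.17, 0.29, 0.63),
  C    = c(0.05, 0.07, 0.16, 0.33, 0.58)
)

example_data
```

Any tibble of this shape can be passed through the rest of the code in place of data.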

Next, we show the complete code to process and plot the data, and then explain it step by step.

# calculate mean and standard deviations
data <- data %>%
  rowwise() %>%
  mutate(
    means = mean(c(A, B, C)),
    stdev = sd(c(A, B, C))
  )

# plot the means and standard deviations.
ggplot(data=data) +
  geom_point(mapping = aes(x = time, y = means), size=3) +
  geom_line(mapping = aes(x = time, y = means),
            linewidth=0.6) +
  geom_errorbar(mapping = aes(x=time, 
                              ymin=(means - stdev),
                              ymax=(means + stdev)),
                width = 0.2,
                linewidth=0.6) +
  xlab('Time (hr)') +
  ylab('O.D. 600nm') +
  theme_light()

Explanation of the code

We performed the following steps to make the plot:

  • Read the data from the Excel file into a tibble.
  • Add two new columns representing the mean and standard deviation of the replicates.
  • Draw the figure.

We will go through each of these steps one by one.

Read the data

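We used the read_excel function from the readxl package. readxl is installed along with the tidyverse but is not attached by library(tidyverse), which is why the function is called with the readxl:: prefix.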

data <- readxl::read_excel('datasets/growth_curve.xlsx')
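Reading can fail quietly further down the line if the file path is wrong or a column is named differently. As a rough sketch, the read step could be wrapped in a small helper that checks both. The function read_growth_data and its replicate_cols argument are hypothetical names introduced here, not part of readxl. The only readxl call used is read_excel with its sheet argument.

```r
# Hypothetical helper (not part of readxl): read the first sheet and fail
# early if the file or an expected column is missing.
read_growth_data <- function(path, replicate_cols = c("A", "B", "C")) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  data <- readxl::read_excel(path, sheet = 1)  # sheet = 1 is the first sheet
  missing_cols <- setdiff(c("time", replicate_cols), names(data))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  data
}

# Usage, assuming the file from this post exists at this path:
# data <- read_growth_data("datasets/growth_curve.xlsx")
```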

Calculate mean and standard deviations

Typically, a growth curve is drawn as a line plot of the replicate means with error bars. The data points represent the means of the replicates, and the error bars represent their standard deviations.

The rowwise function tells dplyr to carry out the following operations one row at a time.

The mutate function adds new columns to the tibble (data frame).

The mean and sd functions calculate the mean and standard deviation, respectively.

data <- data %>%
  rowwise() %>%
  mutate(
    means = mean(c(A, B, C)),
    stdev = sd(c(A, B, C))
  )
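To see why rowwise matters here, it may help to run the same mutate call with and without it on a tiny table. The sketch below uses a made-up two-row tibble called toy (hypothetical values chosen so the arithmetic is easy to check by hand). It also adds an ungroup() call at the end, which is not in the code above; it simply drops the row-wise grouping once the summary columns exist.

```r
library(dplyr)
library(tibble)

# Hypothetical values, chosen only to make the arithmetic easy to verify.
toy <- tibble(time = c(0, 1), A = c(1, 2), B = c(2, 4), C = c(3, 6))

# With rowwise(), mean() and sd() see the three values of one row at a time.
by_row <- toy %>%
  rowwise() %>%
  mutate(means = mean(c(A, B, C)), stdev = sd(c(A, B, C))) %>%
  ungroup()

# Without rowwise(), c(A, B, C) concatenates the full columns, so every row
# receives the same overall mean.
no_rowwise <- toy %>%
  mutate(means = mean(c(A, B, C)))

by_row$means      # 2 4
by_row$stdev      # 1 2
no_rowwise$means  # 3 3
```

Without rowwise, the plot would show a flat line at the overall mean, which is an easy mistake to miss.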

Plotting the data

We first make a basic scatter plot.

ggplot(data=data) +
  geom_point(mapping = aes(x = time, y = means), size=3)

The size argument sets the size of the point markers.

A layer of line plot is added over the scatter plot.

geom_line(mapping = aes(x = time, y = means),
          linewidth = 0.6)

Then we add the layer of error bars.

ymin and ymax define the range of the error bar. They are the mean minus and the mean plus the standard deviation, respectively.

The width argument sets the width of the whiskers of the error bars.

The linewidth argument sets the thickness of the error bar line.

geom_errorbar(mapping = aes(x = time,
                            ymin = means - stdev,
                            ymax = means + stdev),
              width = 0.2,
              linewidth = 0.6)

Finally, we add the labels for x and y axes.

xlab('Time (hr)') +
ylab('O.D. 600nm')

Finally, you may select a theme. We selected theme_light().
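The layers described above only produce a figure once they are chained together with +. As a minimal sketch, the assembled plot can be stored in an object and written to disk with ggsave. The tibble toy and its values are invented for illustration, and growth_plot and out_file are hypothetical names. Unlike the code above, the shared x and y aesthetics are declared once in ggplot() so that each layer inherits them; this is a stylistic choice and draws the same figure.

```r
library(ggplot2)
library(tibble)

# Hypothetical summary values, invented for illustration only.
toy <- tibble(
  time  = c(0, 1, 2, 3),
  means = c(0.05, 0.10, 0.22, 0.45),
  stdev = c(0.01, 0.01, 0.02, 0.03)
)

# x and y are set once here and inherited by every layer below.
growth_plot <- ggplot(data = toy, mapping = aes(x = time, y = means)) +
  geom_point(size = 3) +
  geom_line(linewidth = 0.6) +
  geom_errorbar(mapping = aes(ymin = means - stdev, ymax = means + stdev),
                width = 0.2, linewidth = 0.6) +
  xlab("Time (hr)") +
  ylab("O.D. 600nm") +
  theme_light()

# Written to a temporary file here; replace with a real path such as
# "growth_curve.png" to keep the figure.
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = growth_plot, width = 6, height = 4, dpi = 300)
```

The width and height are in inches by default, so this saves a 6 by 4 inch image at 300 dpi.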


