Plotting a growth curve using Python

Data plotting can be done easily in Excel. Excel is a simple and efficient tool for calculating and plotting biological data, and most people, including me, prefer it. With Excel, however, one has to plot the data and redo all the customization every time there is a new data set. Therefore, when it comes to plotting multiple data sets of a similar nature over and over again, using a programming language is more efficient. Once a template code for a plot is ready, one can plot any number of data sets with it in a few seconds. Here we will see how to make a simple plot, taking as an example the growth profile (i.e. time vs. O.D. data) of a cell culture. The readings are from three experiments. The O.D.s were taken from 0 to 6 hours at intervals of one hour.

First we need to import the packages required for plotting the data. The matplotlib package is used for plotting, and the pandas package is used for reading the data from the Excel sheet. pandas also has functions that plot data through matplotlib, but we will not use them here; we will use matplotlib directly. Below, the pyplot module from matplotlib is imported and named "plt", so anywhere in the program plt means pyplot. Similarly, the pandas package is imported as "pd". You can name an imported package or module as you like, but by convention pyplot and pandas are named "plt" and "pd" respectively. The commands for importing these packages are as follows.

from matplotlib import pyplot as plt
import pandas as pd

Now we have to read our data from the Excel sheet. I have saved the data in an Excel file named "growth_profile.xlsx". The screenshot of the Excel file is shown below.

The following code reads the Excel sheet. The "read_excel" function reads the data from the Excel sheet and converts it into a pandas data-frame. We will name this data-frame "readings". It looks similar to the Excel sheet, with the data arranged in columns. Each column is named after the label written in the first row of the Excel sheet.

readings = pd.read_excel('growth_profile.xlsx')
print(readings)

	Time	rep1	rep2	rep3
0	0	0.027	0.031	0.032
1	1	0.063	0.059	0.057
2	2	0.125	0.131	0.133
3	3	0.246	0.254	0.255
4	4	0.512	0.502	0.498
5	5	1.121	1.136	1.034
6	6	1.873	1.759	1.985
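If you want to follow along without the Excel file, the same data-frame can be typed in directly. The sketch below is only a stand-in for the "read_excel" step and uses the values printed above; it is not how the original data were loaded. (For reading ".xlsx" files, pandas generally also needs the openpyxl package to be installed.)

```python
import pandas as pd

# Same values as the printed data-frame above, typed in by hand
# instead of being read from growth_profile.xlsx
readings = pd.DataFrame({
    "Time": [0, 1, 2, 3, 4, 5, 6],
    "rep1": [0.027, 0.063, 0.125, 0.246, 0.512, 1.121, 1.873],
    "rep2": [0.031, 0.059, 0.131, 0.254, 0.502, 1.136, 1.759],
    "rep3": [0.032, 0.057, 0.133, 0.255, 0.498, 1.034, 1.985],
})

print(readings.shape)          # (number of rows, number of columns)
print(list(readings.columns))  # column labels taken from the dictionary keys
```

The rest of the code in this post should work unchanged with this data-frame.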

To plot the data, we need to separate the data for the x-axis (the time) from the data for the y-axis (the O.D.s). Here we will select the first column of "readings" and store it as "x_data".

x_data = readings['Time']
print(x_data)

0    0
1    1
2    2
3    3
4    4
5    5
6    6
Name: Time, dtype: int64

Similarly, we will select all the other columns, which hold the O.D.s, and name the result "y_data".

y_data = readings[readings.columns[1:]]
print(y_data)

	rep1	rep2	rep3
0	0.027	0.031	0.032
1	0.063	0.059	0.057
2	0.125	0.131	0.133
3	0.246	0.254	0.255
4	0.512	0.502	0.498
5	1.121	1.136	1.034
6	1.873	1.759	1.985

Note that for selecting x_data we used the column name, whereas for selecting y_data we used the indices of the columns. In Python, indices begin at 0 (zero), so the index of the first column is 0 and that of the second column is 1. The slice starting at index 1 therefore selects every column from the second one onwards.
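The same selection can also be written with position-based indexing. The following is a small sketch using "iloc", which was not used in the code above; the three-row data-frame is just the first three time points of the table, typed in so that the example runs on its own.

```python
import pandas as pd

# First three time points of the growth data, typed in for illustration
readings = pd.DataFrame({
    "Time": [0, 1, 2],
    "rep1": [0.027, 0.063, 0.125],
    "rep2": [0.031, 0.059, 0.131],
    "rep3": [0.032, 0.057, 0.133],
})

# iloc selects by position: [rows, columns]; ":" means "all rows"
x_data = readings.iloc[:, 0]    # first column (index 0)
y_data = readings.iloc[:, 1:]   # second column (index 1) onwards

# Selecting the replicate columns by name gives the same result
by_name = readings[["rep1", "rep2", "rep3"]]
print(y_data.equals(by_name))
```

Selecting by position is convenient when the number of replicates changes from one data set to another, while selecting by name is safer when the column order might change.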

For plotting we will use the pyplot module, which we imported earlier as "plt". Then we will customize the plot by adding a title and naming the axes.

  
plt.plot(x_data, y_data)
plt.title('Growth curve', fontsize=16)
plt.xlabel('Time (h)', fontsize=14)
plt.ylabel('O.D. 600nm', fontsize=14)
plt.show()
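When the replicates are drawn together it can be hard to tell which line is which. One possible variation, sketched below, draws each replicate separately so that it gets its own legend entry. The loop, the markers and the legend title are my additions for illustration; the data are typed in so the example runs on its own.

```python
import pandas as pd
from matplotlib import pyplot as plt

readings = pd.DataFrame({
    "Time": [0, 1, 2, 3, 4, 5, 6],
    "rep1": [0.027, 0.063, 0.125, 0.246, 0.512, 1.121, 1.873],
    "rep2": [0.031, 0.059, 0.131, 0.254, 0.502, 1.136, 1.759],
    "rep3": [0.032, 0.057, 0.133, 0.255, 0.498, 1.034, 1.985],
})
x_data = readings["Time"]
y_data = readings[readings.columns[1:]]

# Draw one line per replicate so each can carry its own legend label
for column in y_data.columns:
    plt.plot(x_data, y_data[column], marker="o", label=column)

plt.title("Growth curve", fontsize=16)
plt.xlabel("Time (h)", fontsize=14)
plt.ylabel("O.D. 600nm", fontsize=14)
legend = plt.legend(title="Replicate")
```

Add plt.show() at the end to display the figure, as in the code above.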
  

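The above plot shows the growth curves of the individual experiments. In practice, however, a report usually shows the mean and standard deviation of the independent experiments. We will now calculate the mean and standard deviation of the O.D.s across the three replicates at each time point (axis=1 means the calculation is done row-wise) and store them as separate columns in the "readings" data-frame. Note that the pandas "std" function returns the sample standard deviation (it divides by n-1) by default.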

readings['mean'] = y_data.mean(axis=1)
readings['std'] = y_data.std(axis=1)
print(readings)

	Time	rep1	rep2	rep3	mean	std
0	0	0.027	0.031	0.032	0.030000	0.002646
1	1	0.063	0.059	0.057	0.059667	0.003055
2	2	0.125	0.131	0.133	0.129667	0.004163
3	3	0.246	0.254	0.255	0.251667	0.004933
4	4	0.512	0.502	0.498	0.504000	0.007211
5	5	1.121	1.136	1.034	1.097000	0.055073
6	6	1.873	1.759	1.985	1.872333	0.113001
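To see what the "std" column contains, the sketch below compares the pandas default with the population standard deviation and with numpy. The "ddof" argument and the numpy cross-check are not part of the code above; they are shown only to make the n-1 convention explicit.

```python
import numpy as np
import pandas as pd

y_data = pd.DataFrame({
    "rep1": [0.027, 0.063, 0.125, 0.246, 0.512, 1.121, 1.873],
    "rep2": [0.031, 0.059, 0.131, 0.254, 0.502, 1.136, 1.759],
    "rep3": [0.032, 0.057, 0.133, 0.255, 0.498, 1.034, 1.985],
})

# pandas default: sample standard deviation (ddof=1, divides by n-1)
sample_std = y_data.std(axis=1)

# Population standard deviation (ddof=0, divides by n) for comparison
population_std = y_data.std(axis=1, ddof=0)

# numpy defaults to ddof=0, so ddof=1 must be passed to match pandas
numpy_sample_std = np.std(y_data.to_numpy(), axis=1, ddof=1)

print(pd.DataFrame({"sample": sample_std, "population": population_std}))
```

With only three replicates the two conventions differ noticeably, so it is worth stating in a report which one was used.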

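Finally, we will plot the means with the standard deviations as error bars. The figure should also be saved so that it can be shared with others. Here we save it as "growth_curve.png". Note that plt.savefig is called before plt.show, because once the plot window is closed the figure may no longer be available for saving.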

plt.errorbar(readings['Time'], readings['mean'],
             yerr=readings['std'],
             fmt='-o',
             capsize=5)
plt.title('Growth curve', fontsize=16)
plt.xlabel('Time (h)', fontsize=14)
plt.ylabel('O.D. 600nm', fontsize=14)
plt.savefig('growth_curve.png', dpi=200)
plt.show()

Note that the file names used above are relative paths. The Excel file should therefore be in the folder from which the Python script is run (usually the folder containing the script itself), and the ".png" file will be saved in that same folder.
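Since the whole point of using code is to reuse it, the steps above can be wrapped in a function. The following is one possible sketch, not a definitive template: the function name "plot_growth_curve", its parameters and the demo file name are hypothetical, and it assumes that every column other than the time column is a replicate.

```python
import pandas as pd
from matplotlib import pyplot as plt


def plot_growth_curve(readings, output_file, time_column="Time"):
    """Plot mean +/- standard deviation of the replicate columns and save it."""
    # Assumption: every column except the time column is a replicate
    replicates = readings.drop(columns=[time_column])
    mean = replicates.mean(axis=1)
    std = replicates.std(axis=1)  # sample standard deviation (n-1)

    fig, ax = plt.subplots()
    ax.errorbar(readings[time_column], mean, yerr=std, fmt="-o", capsize=5)
    ax.set_title("Growth curve", fontsize=16)
    ax.set_xlabel("Time (h)", fontsize=14)
    ax.set_ylabel("O.D. 600nm", fontsize=14)
    fig.savefig(output_file, dpi=200)
    plt.close(fig)  # free the figure once it has been saved
    return mean, std


# Demo with the first three time points of the data used in this post
demo = pd.DataFrame({
    "Time": [0, 1, 2],
    "rep1": [0.027, 0.063, 0.125],
    "rep2": [0.031, 0.059, 0.131],
    "rep3": [0.032, 0.057, 0.133],
})
mean, std = plot_growth_curve(demo, "growth_curve_demo.png")
print(mean.round(6).tolist())
```

For a real data set, the data-frame would come from pd.read_excel as before, for example plot_growth_curve(pd.read_excel('growth_profile.xlsx'), 'growth_curve.png').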
