Posts

Showing posts from March, 2024

Fill missing values using SimpleImputer

Data often contain missing values. Sometimes it makes sense to fill the missing values with an appropriate value, for example the mean of the available values. We can fill such missing values by calculating the mean of the column and using the fillna() function. However, if several columns have missing values, we would have to repeat this process several times or write a loop. Scikit-learn offers a class called SimpleImputer to easily fill the missing values.
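As a minimal sketch of the idea above, the following fills missing values in two columns with each column's mean. The column names and numbers are made up for illustration; only SimpleImputer and its strategy="mean" option come from scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical data: each column has one missing value
df = pd.DataFrame({
    "height": [150.0, np.nan, 170.0, 160.0],
    "weight": [50.0, 60.0, np.nan, 70.0],
})

# Replace every NaN with the mean of the available values in its column
imputer = SimpleImputer(strategy="mean")
filled = pd.DataFrame(
    imputer.fit_transform(df), columns=df.columns, index=df.index
)
print(filled)
```

The imputer handles all columns in one call, so no loop over columns is needed.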

Divide numerical data into categories

Sometimes we need to group numerical values into categories. For example, the population of a town might need to be categorized into different income groups. Or, the marks of students might need to be categorized into different grade levels. Pandas’ cut() function can be used to categorize numerical values very easily.
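A short sketch of the grade example, assuming made-up marks, bin edges and grade labels. By default cut() treats each bin as open on the left and closed on the right, so a mark of 60 falls in the 40 to 60 bin.

```python
import pandas as pd

# Hypothetical marks and grade boundaries
marks = pd.Series([35, 58, 72, 91, 100], name="marks")
bins = [0, 40, 60, 80, 100]
labels = ["F", "C", "B", "A"]

# include_lowest=True makes the first bin include its left edge (a mark of 0)
grades = pd.cut(marks, bins=bins, labels=labels, include_lowest=True)
print(pd.DataFrame({"marks": marks, "grade": grades}))
```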

Python classes - Inheritance

In the previous post we saw how to create Python classes and methods under them. We created a DNA class representing a DNA sequence. However, we can treat a DNA sequence as a string: a special kind of string that consists of only four letters, namely ‘A’, ‘T’, ‘G’ and ‘C’, representing the nucleotides adenine, thymine, guanine and cytosine, respectively. The DNA sequence should not contain any other characters. For ease of use we will allow entry of lowercase and uppercase letters, which will be converted to uppercase inside the class definition. Here we will create a class that inherits properties from the built-in str class.

class subclass(parent_class):
    # class definition

To do so, we just have to put the parent class in parentheses while defining our current class. In the same way, we can create as many subclasses as we like, including ones that themselves inherit from other subclasses.

class subclass(parent_class):
    # class definition

class subclass_2(subclass):
    # class def...
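One way such a class could look is sketched below. This is an illustration rather than the class from the post: the validation logic and error message are assumptions. Because str is immutable, the conversion to uppercase is done in __new__ rather than __init__.

```python
class DNA(str):
    """A string restricted to the nucleotide letters A, T, G and C."""

    def __new__(cls, sequence):
        # Accept lowercase input but store the sequence in uppercase
        sequence = sequence.upper()
        invalid = set(sequence) - set("ATGC")
        if invalid:
            raise ValueError(f"Invalid characters in DNA sequence: {sorted(invalid)}")
        return super().__new__(cls, sequence)


seq = DNA("atgc")
print(seq)                   # ATGC
print(isinstance(seq, str))  # True
print(seq.count("A"))        # inherited str method: 1
```

Since DNA inherits from str, all the usual string methods such as count() and slicing are available without writing them again.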

Python classes : introduction

Classes are user-defined object types. Python has several built-in object types such as integers, floats and strings. Programmers can also create the object types required for their program. We have seen some of the Python objects previously. For example:

a = 3
print(type(a))
<class 'int'>

Here, a is an object of type int.
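To illustrate a user-defined type next to the built-in int, here is a minimal sketch. The class name SimpleDNA and its method are hypothetical and are not taken from the post.

```python
class SimpleDNA:
    """Hypothetical user-defined type holding a DNA sequence."""

    def __init__(self, sequence):
        # Store the sequence in uppercase
        self.sequence = sequence.upper()

    def length(self):
        """Return the number of nucleotides in the sequence."""
        return len(self.sequence)


dna = SimpleDNA("atgc")
print(type(dna))      # reports the SimpleDNA class, just as type(3) reports int
print(dna.length())   # 4
```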

Chi-square distribution and acceptable range.

When to use the χ² test? The χ² test is used to check the goodness of fit of data points calculated by a function against the observed data points. One application I know of is when unknown parameters, passed to a function, give rise to the observed data: function(parameters) --> data points. In such cases we can back-calculate the parameters from the observed data points. The way to solve these problems is to optimize the parameters by passing them to the function and trying to minimize the sum of squared residuals (SSR), that is, the sum of squared differences between the calculated and observed values. Following is a rough pseudo-code for this.
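A minimal Python sketch of the SSR minimization, which is not the pseudo-code from the post. It assumes a hypothetical straight-line model, made-up observations, and SciPy's general-purpose minimize() as the optimizer; it covers only the parameter fitting, not the χ² acceptable range.

```python
import numpy as np
from scipy.optimize import minimize


def model(params, x):
    """Hypothetical model: a straight line y = slope * x + intercept."""
    slope, intercept = params
    return slope * x + intercept


def ssr(params, x, observed):
    """Sum of squared residuals between calculated and observed values."""
    residuals = model(params, x) - observed
    return np.sum(residuals ** 2)


# Made-up observations for illustration
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
observed = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Start from an initial guess and let the optimizer minimize the SSR
result = minimize(ssr, x0=[1.0, 0.0], args=(x, observed))
best_slope, best_intercept = result.x
print(best_slope, best_intercept, result.fun)
```

For a model that is linear in its parameters a closed-form fit exists, but the same optimize-and-compare loop applies unchanged to models where it does not.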

Functions in python - args and kwargs

Earlier we saw how to write functions in Python and how to call them in our program. For those functions, the number of inputs, or arguments, was fixed. They took a fixed number of positional or keyword arguments, or had a default value assigned to one or more of the arguments.
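In contrast to a fixed signature, the sketch below accepts any number of positional and keyword arguments through *args and **kwargs. The function name and the arguments passed to it are made up for illustration.

```python
def describe(*args, **kwargs):
    """Accept any number of positional and keyword arguments."""
    # args is a tuple of the positional arguments
    total = sum(args)
    # kwargs is a dict of the keyword arguments
    options = dict(kwargs)
    return total, options


total, options = describe(1, 2, 3, unit="g/l", strain="A")
print(total)    # 6
print(options)  # {'unit': 'g/l', 'strain': 'A'}
```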

Functions in python

Functions are sets of statements that perform a specific task. They can be called by name, more than once in a program. Inputs can be given to functions, based on which they perform a task and can give back the result. Functions avoid writing the same code over and over again, thus reducing redundancy in the code. They also modularize programs by assigning one task to one function. This also makes a program easy to correct: when a function is not working as desired, we need to correct the code in only one place, as compared to a situation where every place where that code was written has to be edited. The basic syntax of a function is as follows:

def func(inputs):
    statement 1
    .
    .
    .
    return result

For example, a function can be written that returns the result of adding two numbers. It takes two numbers, a and b, as input and returns the result. It can be written in a few ways:

def add_num(a,b):
    c = a + ...
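One possible complete version of such an addition function is sketched here. It is an illustration of the syntax above and not necessarily one of the variants shown in the full post.

```python
def add_num(a, b):
    """Return the sum of two numbers."""
    return a + b


# The function can be called by name as many times as needed
result = add_num(2, 3)
print(result)              # 5
print(add_num(1.5, 2.5))   # 4.0
```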

How to plot product concentrations in different strains using python?

The most common type of graph that we, as experimental biologists, make is the bar graph. We typically plot a bar graph when we want to compare:
- the amount of a product secreted under different conditions or by different cells
- the enzyme activity in different conditions or cells
or in similar cases where we want to compare the value of an observation under different conditions. Also, with replicates of experiments, we plot the mean and standard deviation of the experiment. Excel is perhaps the quickest way to draw a single such graph, but if you want to make similar graphs for several observations, or plot two or more such graphs in one figure as subplots, Python may be a better choice, unless we want to spend time adjusting each graph in a PowerPoint slide or in Inkscape to make a collage. Here we will see how to plot these kinds of graphs using Python. We will use the numpy, pandas and matplotlib packages to do this. We will take an example of observations depicting the concentration (g/l) ...
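A rough sketch of such a plot, assuming made-up triplicate measurements for three hypothetical strains. The strain names, values and output file name are illustrative only and do not come from the post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up triplicate measurements of product concentration (g/l) per strain
data = pd.DataFrame({
    "strain_A": [1.0, 1.2, 1.1],
    "strain_B": [2.0, 2.4, 2.2],
    "strain_C": [0.5, 0.7, 0.6],
})

means = data.mean()
stds = data.std()  # sample standard deviation (pandas default, ddof=1)

# One bar per strain, with the standard deviation as the error bar
x = np.arange(len(means))
fig, ax = plt.subplots()
ax.bar(x, means.values, yerr=stds.values, capsize=4)
ax.set_xticks(x)
ax.set_xticklabels(means.index)
ax.set_xlabel("Strain")
ax.set_ylabel("Concentration (g/l)")
fig.savefig("concentrations.png")
```

To place several such graphs in one figure, plt.subplots() can be called with a grid size, for example plt.subplots(1, 2), and each returned axis drawn on in the same way.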