Python classes - Inheritance
In the previous post we saw how to create Python classes and define methods on them. We created a DNA class representing a DNA sequence.
However, we can treat a DNA sequence as a string: a special kind of string that consists of only four letters, namely ‘A’, ‘T’, ‘G’ and ‘C’, representing the nucleotides adenine, thymine, guanine and cytosine, respectively.
The DNA sequence should not contain any other characters. For ease of use we will allow both lowercase and uppercase letters as input, and convert them to uppercase inside the class definition.
Here we will create a class that inherits properties from the built-in str class.
class subclass(parent_class):
    # class definition
To do so, we just have to put the parent class in parentheses after the class name when defining our class. In the same way, a subclass can itself be the parent of another subclass, and such chains can be as long as we need.
class subclass(parent_class):
    # class definition

class subclass_2(subclass):
    # class definition

class subclass_3(subclass_2):
    # class definition
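As a small runnable illustration of such a chain, here is a sketch using hypothetical class names (Sequence, NucleicAcid and RNA are made up for this example and are not used elsewhere in this post). A method defined on the topmost class is available on the class two levels below it.

```python
# Hypothetical classes, used only to illustrate a chain of inheritance.
class Sequence:
    def describe(self):
        return "a biological sequence"


class NucleicAcid(Sequence):
    pass


class RNA(NucleicAcid):
    pass


rna = RNA()
print(rna.describe())                         # method defined two levels up
print(isinstance(rna, Sequence))              # True
print([cls.__name__ for cls in RNA.__mro__])  # order in which Python looks up attributes
```

The last line prints the method resolution order, which is the list of classes Python searches, in order, when we access a method or attribute on the instance.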
For the purpose of this write-up, we will only make one subclass, DNA, which inherits from the class str.
class DNA(str):
    def __init__(self, seq):
        self.seq = seq.upper()
We will then create an instance of DNA.
seq1 = DNA('atgcttaacggcattggcat')
To see what methods it has inherited from the parent class, we will use the dir() function.
print(dir(seq1))
['__add__', '__class__', '__contains__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__module__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'seq', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
We can see that, even though we have not defined any methods in the DNA class like we did previously, a lot of methods are available for use with this newly created class.
All these methods are inherited from the parent class str.
This is called class inheritance, and it is a very powerful technique when building our own software with custom classes. Methods of a parent class can be used in its child classes without being written again, which reduces redundancy in the code.
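To make this concrete, the sketch below calls a few inherited str methods on a DNA instance. It also shows one subtlety: because strings are immutable, the value of the string is fixed when the object is created, so calling upper() inside __init__ only affects the seq attribute and not the string itself. The UpperDNA class is a hypothetical variant, not part of the original code, that overrides __new__ so that the string value is uppercase as well.

```python
class DNA(str):
    def __init__(self, seq):
        self.seq = seq.upper()


seq1 = DNA('atgcttaacggcattggcat')

print(len(seq1))                        # inherited from str
print(seq1.startswith('atg'))           # inherited from str
print(seq1.seq.count('G'))              # count on the uppercase attribute
print(seq1 == 'atgcttaacggcattggcat')   # the string value itself is still lowercase


# Hypothetical variant: str is immutable, so its value has to be set in __new__.
class UpperDNA(str):
    def __new__(cls, seq):
        return super().__new__(cls, seq.upper())


seq_upper = UpperDNA('atgc')
print(seq_upper == 'ATGC')              # the string value is now uppercase
print(seq_upper.count('G'))             # inherited methods see the uppercase value
```

Whether to keep a separate seq attribute or to normalise the string value itself is a design choice. The rest of this post keeps the seq attribute.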
Validity of DNA class
An instance of DNA can be created using the above code, but the sequence can still contain any character that is allowed in a string.
seq2 = DNA('fghr$@fhakf_fmg')
seq2.seq
'FGHR$@FHAKF_FMG'
But a DNA sequence can contain only four characters: A, T, G and C. Validation of the DNA sequence can be implemented as below.
class DNA(str):
    def __init__(self, seq):
        self.seq = seq.upper()
        self.check_validity()

    def check_validity(self):
        if self.seq.count('A') + self.seq.count('T') + self.seq.count('G') + self.seq.count('C') != len(self.seq):
            raise Exception('The DNA sequence is not valid')
See how it works with the same sequence, seq2.
seq2 = DNA('fghr$@fhakf_fmg')
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
Cell In[26], line 1
----> 1 seq2 = DNA('fghr$@fhakf_fmg')
Cell In[25], line 6, in DNA.__init__(self, seq)
3 def __init__(self, seq):
4 self.seq = seq.upper()
----> 6 self.check_validity()
Cell In[25], line 10, in DNA.check_validity(self)
8 def check_validity(self):
9 if self.seq.count('A') + self.seq.count('T') + self.seq.count('G') + self.seq.count('C') != len(self.seq):
---> 10 raise Exception('The DNA sequence is not valid')
Exception: The DNA sequence is not valid
The check_validity method raises an exception if the sequence contains any character other than A, T, G or C.
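The same check could also be written with a set difference, which has the side benefit of telling us which characters were rejected. This is only a sketch of an alternative: the VALID_BASES name and the use of ValueError instead of the generic Exception are assumptions for illustration, not what the code above does.

```python
VALID_BASES = set('ATGC')  # hypothetical module-level constant


class DNA(str):
    def __init__(self, seq):
        self.seq = seq.upper()
        self.check_validity()

    def check_validity(self):
        # Characters present in the sequence that are not valid bases
        invalid = set(self.seq) - VALID_BASES
        if invalid:
            raise ValueError(
                f'The DNA sequence is not valid; unexpected characters: {sorted(invalid)}'
            )


try:
    DNA('atgx')
except ValueError as err:
    print(err)
```

Raising a more specific exception type such as ValueError lets calling code catch invalid sequences without also catching unrelated errors.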
Similarly, we can write other relevant methods for the DNA class. A few that come to mind would serve the following purposes:
- GC content
- Complement
- Reverse complement
- Transcribed RNA sequence
- Translated sequence
- Find recognition sites and count them.
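As a rough starting point, the first few ideas in this list could be sketched as follows. The method names (gc_content, complement, reverse_complement and transcribe) are assumptions chosen for illustration. The transcribe method also assumes that the stored sequence is the coding strand, so it simply replaces T with U.

```python
class DNA(str):
    def __init__(self, seq):
        self.seq = seq.upper()
        self.check_validity()

    def check_validity(self):
        if self.seq.count('A') + self.seq.count('T') + self.seq.count('G') + self.seq.count('C') != len(self.seq):
            raise Exception('The DNA sequence is not valid')

    def gc_content(self):
        # Fraction of bases that are G or C; 0.0 for an empty sequence
        if not self.seq:
            return 0.0
        return (self.seq.count('G') + self.seq.count('C')) / len(self.seq)

    def complement(self):
        # Map each base to its pairing partner using inherited str tools
        return self.seq.translate(str.maketrans('ATGC', 'TACG'))

    def reverse_complement(self):
        return self.complement()[::-1]

    def transcribe(self):
        # Assumes self.seq is the coding strand
        return self.seq.replace('T', 'U')


seq3 = DNA('atgc')
print(seq3.gc_content())
print(seq3.complement())
print(seq3.reverse_complement())
print(seq3.transcribe())
```

Note that these methods return plain strings. Returning DNA instances instead (where the result is still DNA) would be another reasonable design.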