Posts

Showing posts from 2023

How to convert categorical text data into numerical data using OneHotEncoder

Most machine learning algorithms work with numerical data rather than text. A dataset can contain categorical data in text form, such as gender, food_type, taxonomic_class, etc. To make full use of machine learning algorithms, we have to convert categorical data in text form into numerical form. This can be done using encoders. There are a few types of encoders in scikit-learn that convert categorical data into either binary or numerical data. Here we will learn about OneHotEncoder in scikit-learn. OneHotEncoder converts categorical data into binary data: each category in a dataframe column becomes a separate column whose value is 1 in the rows where that category is present and 0 elsewhere. For example, if the category of gender in row number 12 of a dataset is 'male', then the column that OneHotEncoder creates for the 'male' category will have 1 in row number 12. We will see an example how to encod...
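As a minimal sketch of the idea described above, the snippet below one-hot encodes a single gender column. The sample values are made up for illustration and are not taken from the post, and it assumes scikit-learn is installed. The default output of OneHotEncoder is a sparse matrix, so it is converted with toarray() for printing.

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical sample data: one categorical column with four rows
genders = [["male"], ["female"], ["female"], ["male"]]

encoder = OneHotEncoder()
# fit_transform learns the categories and returns a sparse matrix;
# toarray() turns it into a dense array that is easier to inspect
encoded = encoder.fit_transform(genders).toarray()

# Categories are sorted, so column 0 is 'female' and column 1 is 'male'
categories = encoder.categories_[0].tolist()
print(categories)
print(encoded)
```

Each row has a 1 in the column of its own category and 0 in the other column, which matches the 'male' example in the text.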

Reading and writing files using Python

Python can be used to open, read and write files. Let's take the example of a simple CSV file. CSV stands for comma-separated values. CSV files are plain text files in which each line holds one record and the elements within a line are separated by commas. Below is an example of a CSV file open in Notepad.
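As a rough sketch of writing and then reading such a file with Python's built-in csv module, the snippet below uses a temporary directory so nothing is left on disk. The file name example.csv and the rows are hypothetical and chosen only for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical rows: a header line followed by two records
rows_to_write = [
    ["name", "food_type"],
    ["apple", "fruit"],
    ["carrot", "vegetable"],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "example.csv"  # hypothetical file name

    # Write: newline="" lets the csv module control line endings itself
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows_to_write)

    # The file is plain text, one comma-separated record per line
    raw_text = csv_path.read_text()

    # Read: each line comes back as a list of strings
    with open(csv_path, newline="") as f:
        rows_read = list(csv.reader(f))

print(raw_text)
print(rows_read)
```

Note that csv.reader returns every element as a string, so numeric columns need to be converted explicitly after reading.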

Iterating over lists in Python

The range(len()) method uses the range() and len() functions in Python. The len() function returns the number of elements in the list, and the range() function generates a sequence of integers that match the indices of the list elements. We can then iterate over the indices using a for loop.

    # range(len())
    # create lists
    a = [1, 2, 3, 4, 5]
    b = [4, 2, 7, 1, 9]
    c = [9, 8, 4, 7, 6]
    # create empty list to store results
    d = []
    # loop over the indices and multiply the elements at each position
    for i in range(len(a)):
        d.append(a[i] * b[i] * c[i])
    print(f'{d=}')

Output:

    d=[36, 32, 84, 28, 270]

The zip() method uses the zip() function. It takes any number of iterables, in this example lists, and returns an iterator of tuples, where each tuple holds the elements found at the same position in each iterable.
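As a short sketch of the zip() approach, the snippet below repeats the element-wise product using the same three lists as the range(len()) example. The loop structure and variable names x, y and z are assumptions for illustration, since the excerpt does not show the post's own zip() code.

```python
# Same lists as in the range(len()) example
a = [1, 2, 3, 4, 5]
b = [4, 2, 7, 1, 9]
c = [9, 8, 4, 7, 6]

# zip() yields one tuple per position: (a[0], b[0], c[0]), (a[1], b[1], c[1]), ...
d = []
for x, y, z in zip(a, b, c):
    d.append(x * y * z)

print(f'{d=}')
```

Unlike range(len()), zip() stops at the shortest iterable, so lists of unequal length are silently truncated rather than raising an IndexError.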