Hello everyone, and welcome to my blog! If you work with data, you have almost certainly seen categorical columns such as Gender (Male, Female) or Education (Ph.D., Master’s, Bachelor’s). Machine learning models are mathematical, so these categories have to be converted into numbers before we can use them to train a model.

In this blog, we’ll look at what categorical variables are, the different types of categorical variables, and several approaches to handling categorical data, with code samples.
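As a quick taste of what converting categories into numbers means, here is a minimal sketch using only the Python standard library. The toy lists, the `EDUCATION_RANK` mapping, and both helper functions are hypothetical names made up for this illustration, and the two techniques shown (ordinal and one-hot encoding) are common choices rather than necessarily the ones covered in the full article.

```python
# Hypothetical toy data built from the category examples mentioned above.
education = ["Bachelor's", "Ph.D.", "Master's", "Bachelor's"]
gender = ["Male", "Female", "Female", "Male"]

# Assumed ordering for this sketch: Bachelor's < Master's < Ph.D.
EDUCATION_RANK = {"Bachelor's": 0, "Master's": 1, "Ph.D.": 2}


def ordinal_encode(values, rank):
    """Replace each category with its rank (suits ordered categories)."""
    return [rank[value] for value in values]


def one_hot_encode(values):
    """Return the sorted category names and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


education_codes = ordinal_encode(education, EDUCATION_RANK)
gender_categories, gender_one_hot = one_hot_encode(gender)

print(education_codes)
print(gender_categories)
print(gender_one_hot)
```

Ordinal encoding keeps the order between education levels, while one-hot encoding avoids implying any order between values such as Male and Female.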

All the code samples and datasets are…

Hello everyone! Handling outliers is one of the most important steps in feature engineering, because it helps ensure that our model is trained on accurate data, which in turn leads to more accurate models.

Today we’ll look at what outliers are, their causes and consequences, various ways of identifying them, and finally various methods for dealing with them, with code samples.

The code sample and dataset for this article are available here.

What Is an Outlier?

An outlier is a data point that differs greatly from the other observations in the data.
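To make the definition concrete, here is a minimal sketch that flags values lying far from the rest of a small sample. It assumes the widely used 1.5 × IQR rule, which may not be the method the full article uses. The sample values and the `find_outliers_iqr` helper are hypothetical, invented for this illustration.

```python
import statistics


def find_outliers_iqr(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [value for value in values if value < lower or value > upper]


# Hypothetical sample: six similar readings and one very different one.
sample = [10, 12, 11, 13, 12, 11, 95]
outliers = find_outliers_iqr(sample)
print(outliers)
```

The multiplier `k=1.5` is a convention, not a law; a larger value flags fewer points.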

An outlier may also be described as an observation in our data that is incorrect or abnormal as compared…

Ashutosh Sahu

Learning, Implementing and Sharing
