Hello Everyone!!! Welcome to my blog. If you work with data, you have almost certainly seen categorical columns, for instance Gender (Male, Female) or Education (Ph.D., Master’s, Bachelor’s). Since machine learning models are mathematical, it is important to convert these categories into numbers before using them to train a model.
In this blog, we’ll look at what categorical variables are, their various types, and different approaches to handling categorical data, with code samples.
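To make the idea of turning categories into numbers concrete, here is a minimal sketch using only the Python standard library. It assumes two common approaches: one-hot encoding for Gender (no natural order) and ordinal encoding for Education (ordered levels). The toy records, the `EDUCATION_RANK` mapping, and the helper names `one_hot` and `encode` are made up for illustration and are not taken from the article's dataset.

```python
# Toy records; the category values mirror the examples in the text,
# but the rows themselves are invented for illustration.
records = [
    {"Gender": "Male", "Education": "Bachelor's"},
    {"Gender": "Female", "Education": "Ph.D."},
    {"Gender": "Female", "Education": "Master's"},
]

# Ordinal encoding: Education levels have a natural order, so each level
# is mapped to a rank. The ranks 1, 2, 3 are an assumption.
EDUCATION_RANK = {"Bachelor's": 1, "Master's": 2, "Ph.D.": 3}


def one_hot(column, value, categories):
    """Return one 0/1 indicator per category, keyed as '<column>_<category>'."""
    return {f"{column}_{c}": int(value == c) for c in categories}


def encode(rows):
    """One-hot encode Gender and ordinal encode Education for every row."""
    # Collect the distinct Gender values seen in the data, in a stable order.
    genders = sorted({r["Gender"] for r in rows})
    encoded_rows = []
    for r in rows:
        row = one_hot("Gender", r["Gender"], genders)
        row["Education"] = EDUCATION_RANK[r["Education"]]
        encoded_rows.append(row)
    return encoded_rows


encoded = encode(records)
for row in encoded:
    print(row)
```

One-hot encoding avoids implying an order between Male and Female, while the ordinal mapping keeps the order that Education does have. Libraries such as pandas and scikit-learn provide their own versions of both encodings.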
Hello Everyone!!!! Handling outliers is one of the most important phases of feature engineering, because it helps ensure that our model is trained on reliable data, which in turn leads to more accurate models.
Today we’ll look at what outliers are, their causes and consequences, various ways of identifying them, and finally various methods for dealing with them, using code samples.
The code sample and dataset for this article are available here.
A data point that differs greatly from the other observations is referred to as an outlier.
An outlier may also be described as an observation in our data that is incorrect or abnormal as compared…
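As a rough illustration of the definition above, here is a minimal sketch that flags outliers with the interquartile range (IQR) rule. This is one common way of identifying outliers and is an assumption here, not necessarily the method the article uses. The `ages` values, the helper name `iqr_outliers`, and the multiplier `k=1.5` are illustrative choices.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Toy data (made up): one age is far from the rest.
ages = [22, 25, 27, 29, 30, 31, 33, 35, 120]
outliers = iqr_outliers(ages)
print(outliers)  # [120]
```

With these values the quartiles are 26 and 34, so anything below 14 or above 46 is flagged, and only 120 falls outside that range. Whether a flagged point is truly incorrect or just unusual still needs a look at where the data came from.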