Null Values

Null values must be handled before the data can be used. There are several popular choices when dealing with them:

  1. Eliminate the rows: A great approach if null values are a very small percentage of the total dataset, such as 1%. However, if data is limited, it is not wise to lose precious rows.
  2. Replace with a significant value, such as the median or the mean: A great approach if the rows are valuable and the column is reasonably balanced. However, every missing entry is imputed with the same value, which is unlikely to match the true values.
  3. Replace with the most likely value, perhaps a 0 or 1: Preferable to option 2 when an average would be meaningless, for example in a binary column. Otherwise, the median can often work.
  4. Other advanced techniques include estimating the missing data with a machine learning model such as KNN or Random Forest. These are outside the scope of this book.

Filling null values with an appropriate value, as in options 2 to 4, is known as imputation of the data.

Start by checking the size of the dataset and which columns contain null values:

df_housing.shape  # (rows, columns); shape is an attribute, not a method
df_housing.isnull().any()  # True for each column with at least one null
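
The first three options above can be sketched in pandas as follows. This is a minimal illustration, not the original author's code: the toy DataFrame `df` and its column names `price` and `has_garage` are hypothetical stand-ins for `df_housing`, and the median and mode are just one reasonable choice of fill value for each column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for df_housing (column names are assumptions).
df = pd.DataFrame({
    "price": [200.0, 250.0, np.nan, 400.0, 1000.0],
    "has_garage": [1.0, np.nan, 1.0, 0.0, 1.0],
})

# Fraction of nulls per column, to judge whether dropping rows is affordable.
null_share = df.isnull().mean()

# Option 1: eliminate every row that contains at least one null.
df_dropped = df.dropna()

# Option 2: replace nulls in a numeric column with its median.
price_median = df["price"].median()
df_median = df.assign(price=df["price"].fillna(price_median))

# Option 3: replace nulls in a binary column with its most frequent value.
garage_mode = df["has_garage"].mode()[0]
df_mode = df.assign(has_garage=df["has_garage"].fillna(garage_mode))
```

Each option returns a new DataFrame and leaves `df` unchanged, so the results can be compared side by side before committing to one strategy.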
