Write a complete program in Python to impute null values in a data frame with the mean and the median. Please also explain the logic, because I do not understand it.
To impute null values in a DataFrame, we use the fillna(value) method.
This method replaces the NA/NaN entries in the DataFrame with the value passed to it.
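As a quick illustration of what fillna(value) does on its own, here is a minimal sketch with a single fixed value. The Series `s` and the fill value 0 are made up for this example and are not part of the program below.

```python
import numpy as np
import pandas as pd

# A small Series with one missing entry (illustrative data only)
s = pd.Series([1.0, np.nan, 3.0])

# fillna returns a new object; the original Series is left unchanged
filled = s.fillna(0)

print(filled.tolist())
print(s.isna().sum())
```

Because fillna returns a new object by default, the result has to be assigned to a variable, as the program below does with df_mean and df_median.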
A simple program is
import numpy as np
import pandas as pd
# make some data with NaN values in it
data = {"Column1": [np.nan, 7 , 610 , 757, np.nan,
49, 57, np.nan, 27, 503],
"Column2": [50, 71, 48, 232, 951 ,
97 , np.nan, 994, 29, np.nan ]}
# create a DataFrame from the data above
df = pd.DataFrame(data)
print(df)
# Now replace all NaN values with the mean of the respective column
df_mean = df.fillna(df.mean())
print(df_mean)
# All NaN values in Column1 are replaced by 287.142857 and those in Column2 by 309.0
# Now replace all NaN values with the median of the respective column
df_median = df.fillna(df.median())
print(df_median)
# All NaN values in Column1 are replaced by 57.0 and those in Column2 by 84.0
The logic is:
df.mean() and df.median() compute the mean and the median of each column separately (NaN values are skipped by default) and return one value per column. When that result is passed to fillna(), each NaN is replaced with the mean or median of its own column only, not with a value from another column.
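To make the column-wise matching easier to see, here is a small sketch on a tiny made-up DataFrame. The column names "a" and "b" and their values are assumptions chosen so the arithmetic can be checked by hand.

```python
import numpy as np
import pandas as pd

# Tiny illustrative DataFrame with one NaN per column
df = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                   "b": [np.nan, 10.0, 20.0]})

# df.mean() returns a Series indexed by column name:
# a -> (1 + 3) / 2 = 2.0, b -> (10 + 20) / 2 = 15.0
col_means = df.mean()
print(col_means)

# fillna matches the Series index against the column names,
# so each column's NaN gets that column's own mean
filled = df.fillna(col_means)
print(filled)
```

The NaN in column "a" becomes 2.0 and the NaN in column "b" becomes 15.0, which is the same mechanism the program above relies on.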
So, to impute null values in a data frame, we can simply use the fillna() method together with mean() or median().
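If you ever want the mean for one column and the median for another, fillna() also accepts a dictionary that maps column names to fill values. The sketch below reuses the same data as the program above. Using the mean for Column1 and the median for Column2 is just an example choice, not something the question asks for.

```python
import numpy as np
import pandas as pd

data = {"Column1": [np.nan, 7, 610, 757, np.nan, 49, 57, np.nan, 27, 503],
        "Column2": [50, 71, 48, 232, 951, 97, np.nan, 994, 29, np.nan]}
df = pd.DataFrame(data)

# Map each column name to the value that should replace its NaN entries
fill_values = {"Column1": df["Column1"].mean(),
               "Column2": df["Column2"].median()}
df_mixed = df.fillna(fill_values)
print(df_mixed)
```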
Thank You