is the most frequent value in the column.
import seaborn
from sklearn.preprocessing import LabelEncoder
df = seaborn.load_dataset("titanic")
df["embark_town"].fillna(value="Southampton", inplace=True)
print("Missing values after filling missing values: \n", df.isnull().sum())
After data imputation, the number of missing values in the embark town column becomes zero.
Missing values after filling in missing values: survived 0 pclass 0 sex 0 age 177 sibsp 0 parch 0 fare 0 embarked 2 class 0 who 0 adult_male 0 deck 688 embark_town 0 alive 0 alone 0 dtype: int64
Now, let’s start with label encoding. We can use the following Python code for label encoding.
import seaborn
from sklearn.preprocessing import LabelEncoder
df = seaborn.load_dataset("titanic")
df["embark_town"].fillna(value="Southampton", inplace=True)
print("Value counts before label encoding: \n", df.embark_town.value_counts())
label_encoder = LabelEncoder()
df["embark_town"] = label_encoder.fit_transform(df["embark_town"])
print("Value counts after label encoding: \n", df.embark_town.value_counts())
Here, we are using the LabelEncoder class from the sklearn.preprocessing module to perform the label encoding. The …








































0 Comments