The fit_transform() method learns the distinct categorical values in the column and then replaces each value with an integer code.
The output of the above program will be:
Value counts before label encoding:
Southampton 646
Cherbourg 168
Queenstown 77
Name: embark_town, dtype: int64
Value counts after label encoding:
2 646
0 168
1 77
As we can see, Southampton has been replaced with 2, Cherbourg with 0, and Queenstown with 1. LabelEncoder sorts the distinct values and numbers them from 0, so the codes follow alphabetical order rather than frequency.
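To see where these integers come from, we can inspect the classes_ attribute of a fitted encoder. The following is a minimal sketch on a small hand-made list (the list and the variable names towns, codes and mapping are made up for illustration, and are not the Titanic data):

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy column, used only for illustration
towns = ["Southampton", "Cherbourg", "Queenstown", "Southampton"]

label_encoder = LabelEncoder()
codes = label_encoder.fit_transform(towns)

# classes_ stores the learned categories in sorted order;
# the position of each category is its integer code
mapping = {str(name): int(code) for code, name in enumerate(label_encoder.classes_)}

print(mapping)         # {'Cherbourg': 0, 'Queenstown': 1, 'Southampton': 2}
print(codes.tolist())  # [2, 0, 1, 2]
```

Building the mapping from classes_ avoids having to read the codes off the value counts by hand.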
Please note that we can also use the inverse_transform() method to replace the encoded integers with the original categorical values.
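import seaborn
from sklearn.preprocessing import LabelEncoder

# Load the Titanic dataset that ships with seaborn
df = seaborn.load_dataset("titanic")
# Fill missing towns by assignment; fillna(inplace=True) on a selected column is unreliable in recent pandas versions
df["embark_town"] = df["embark_town"].fillna("Southampton")
print("Value counts before label encoding: \n", df.embark_town.value_counts())
# Learn the categories and replace each one with its integer code
label_encoder = LabelEncoder()
df["embark_town"] = label_encoder.fit_transform(df["embark_town"])
print("Value counts after label encoding: \n", df.embark_town.value_counts())
# Map the integer codes back to the original town names
df["embark_town"] = label_encoder.inverse_transform(df["embark_town"])
print("Value counts after inverse transform: \n", df.embark_town.value_counts())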
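In the output of the above program, the value counts after the inverse transform match those before label encoding: Southampton 646, Cherbourg 168, and Queenstown 77.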
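Separately from the Titanic program, the round trip can be checked directly on a tiny example. This is only a sketch under stated assumptions: the array original and the names encoded, restored and round_trip_ok are hypothetical and chosen for illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data, not taken from the Titanic dataset
original = np.array(["Southampton", "Cherbourg", "Queenstown", "Cherbourg"])

encoder = LabelEncoder()
encoded = encoder.fit_transform(original)      # categories -> integer codes
restored = encoder.inverse_transform(encoded)  # integer codes -> categories

# The restored values should be identical to the original ones
round_trip_ok = bool((restored == original).all())

print(encoded.tolist())   # [2, 0, 1, 0]
print(restored.tolist())  # ['Southampton', 'Cherbourg', 'Queenstown', 'Cherbourg']
print(round_trip_ok)      # True
```

Note that inverse_transform() has to be called on the same fitted encoder, because the mapping between integers and categories lives in that object.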