import seaborn
from sklearn.preprocessing import StandardScaler

# Load the Titanic example dataset via seaborn
df = seaborn.load_dataset("titanic")

# Learn the mean and variance of the "age" column (missing values are ignored)
standard_scaler = StandardScaler()
standard_scaler.fit(df[["age"]])
print("Mean age: ", standard_scaler.mean_)
print("Variance: ", standard_scaler.var_)

# Replace "age" with its standardized values: (age - mean) / standard deviation
df[["age"]] = standard_scaler.transform(df[["age"]])
print(df.head())
print("Mean age after standardization: ", df["age"].mean())
print("Variance after standardization: ", df["age"].var())
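Here, we use the StandardScaler class from the sklearn.preprocessing module to standardize the age column: fit() learns the column's mean and variance, and transform() subtracts that mean and divides by the standard deviation. The script prints: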
Mean age: [29.69911765]
Variance: [210.72357975]
   survived  pclass     sex       age  ...  deck  embark_town  alive  alone
0         0       3    male -0.530377  ...   NaN  Southampton     no  False
1         1       1  female  0.571831  ...     C    Cherbourg    yes  False
2         1       3  female -0.254825  ...   NaN  Southampton    yes   True
3         1       1  female  0.365167  ...     C  Southampton    yes  False
4         0       3    male  0.365167  ...   NaN  Southampton     no   True

[5 rows x 15 columns]
Mean age after standardization: 2.388378943731429e-16
Variance after standardization: 1.0014025245441796
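As we can see, after standardization the mean age is effectively 0 (a value around 2.4e-16 is floating-point rounding error) and the variance is close to 1. The variance is not exactly 1 because StandardScaler scales by the population variance (dividing by n), whereas pandas' var() computes the sample variance (dividing by n - 1) by default.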
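
To see where a value like 1.0014 comes from, the following sketch repeats the standardization by hand using only the Python standard library. The list of ages is made up for illustration (it is not taken from the Titanic data), and the variable names are hypothetical. It assumes the same conventions as StandardScaler: missing values are skipped when fitting, and the scaling uses the population variance.

```python
import math
import statistics

# Hypothetical ages; NaN marks a missing value, as in the Titanic "age" column.
ages = [22.0, 38.0, 26.0, float("nan"), 35.0, 54.0, 2.0, float("nan"), 27.0, 14.0]

# Fit on the observed values only (missing values are skipped).
observed = [a for a in ages if not math.isnan(a)]
mean = statistics.fmean(observed)
pop_var = statistics.pvariance(observed, mu=mean)  # divides by n
std = math.sqrt(pop_var)

# Transform: z = (x - mean) / std. NaN inputs stay NaN.
scaled = [(a - mean) / std for a in ages]
scaled_observed = [z for z in scaled if not math.isnan(z)]

n = len(observed)
print("Mean after scaling:", statistics.fmean(scaled_observed))
print("Population variance (divide by n):", statistics.pvariance(scaled_observed))
print("Sample variance (divide by n - 1):", statistics.variance(scaled_observed))
print("n / (n - 1):", n / (n - 1))
```

The population variance of the scaled values is 1 up to rounding, while the sample variance equals n / (n - 1). With many observations that ratio is only slightly above 1, which matches the small excess seen in the output above.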







































