What is min-max scaling?
As we discussed before, if one column of a dataset has a large range of numerical values and another column has a small range, the column with the large range may dominate the column with the small range in certain machine learning models. To address that problem, we use feature scaling. Min-max scaling is one such feature scaling technique.
How to perform min-max scaling on data in a column of a dataset?
Let’s say a column of a dataset contains numerical values x. The minimum value in the column is min(x) and the maximum value in the column is max(x). In min-max scaling, we take each value x from the column and perform the following operation:
x_scaled = (x - min(x)) / (max(x) - min(x))
In other words, we take each value from the column, subtract the minimum value of the column from it, and then divide the result by the difference between the maximum and minimum values of the column. The resulting value always lies between 0 and 1: the minimum of the column maps to 0 and the maximum maps to 1.
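The operation described above can be sketched in plain Python, without any library. This is only a minimal illustration: the function name min_max_scale, the sample values, and the handling of a constant column (where the maximum equals the minimum) are assumptions made for this example, not part of the original article.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] with min-max scaling."""
    lo = min(values)
    hi = max(values)
    span = hi - lo
    if span == 0:
        # Assumption of this sketch: a constant column is mapped to all
        # zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    # Subtract the minimum, then divide by (maximum - minimum).
    return [(v - lo) / span for v in values]


# Hypothetical column values, chosen only for illustration.
column = [10, 20, 30, 50]
scaled = min_max_scale(column)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

Here the minimum (10) becomes 0, the maximum (50) becomes 1, and the values in between keep their relative spacing.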
How to perform min-max scaling using sklearn?
Let’s load the Titanic dataset using seaborn. Let’s say we want to perform min-max scaling on the age column of the dataset. We can use the following Python code for that purpose.
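import seaborn
from sklearn.preprocessing import MinMaxScaler

# Load the Titanic dataset as a pandas DataFrame
df = seaborn.load_dataset("titanic")

# Fit the scaler on the age column and replace the column with the scaled values
min_max_scaler = MinMaxScaler()
df[["age"]] = min_max_scaler.fit_transform(df[["age"]])

print(df.head())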
Here, we are using the MinMaxScaler class from the sklearn.preprocessing module to perform the min-max scaling. We are using …
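As a quick sanity check, the output of MinMaxScaler can be compared with the formula applied by hand. The sketch below uses a small, made-up age column (hypothetical values, not taken from the Titanic dataset) so that it runs without downloading anything. It assumes pandas, NumPy, and scikit-learn are installed.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data standing in for the "age" column.
df = pd.DataFrame({"age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0]})

# The default feature_range of MinMaxScaler is (0, 1).
scaler = MinMaxScaler()
scaled = scaler.fit_transform(df[["age"]])  # 2-D array of shape (6, 1)

# The same scaling computed directly from the formula.
age = df["age"]
manual = (age - age.min()) / (age.max() - age.min())

# Both approaches should agree up to floating-point rounding.
print(np.allclose(scaled.ravel(), manual.to_numpy()))

# inverse_transform maps the scaled values back to the original scale.
restored = scaler.inverse_transform(scaled)
```

Note that fit_transform expects a two-dimensional input, which is why the column is selected with double brackets (df[["age"]]) rather than df["age"].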





