What is Robust Scaler?
Suppose a dataset has a numerical column that contains outliers. Scaling such a column with the mean and standard deviation lets the outliers distort the result, because both statistics are pulled toward extreme values. A robust scaler instead uses statistics that are robust to outliers: the median and the interquartile range.
Robust scaling subtracts the median of the column from each value and divides the result by the interquartile range (IQR), which is the difference between the third and first quartiles. In short: scaled value = (value - median) / (Q3 - Q1). The scaled value then replaces the original value in the column.
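To make the calculation concrete, here is a minimal sketch that applies it by hand with NumPy. The sample values are made up for illustration (five ordinary values and one outlier), and the quartiles use NumPy's default linear interpolation.

```python
import numpy as np

# Hypothetical sample data: five typical values and one outlier.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])

median = np.median(values)                 # centre that ignores the outlier's size
q1, q3 = np.percentile(values, [25, 75])   # first and third quartiles
iqr = q3 - q1                              # interquartile range

# Subtract the median, then divide by the IQR.
scaled = (values - median) / iqr
print("median:", median, "IQR:", iqr)
print(scaled)
```

The typical values end up close to zero, while the outlier stays far away from them, so it is still easy to spot after scaling.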
How to use Robust Scaler in sklearn?
Let’s look at an example using the Titanic dataset. Its age column holds the ages of the passengers and contains some outliers. The following Python code draws a box plot of the column so that we can see them.
import seaborn
from matplotlib import pyplot

df = seaborn.load_dataset("titanic")
seaborn.boxplot(data=df, x="age")
pyplot.show()
The output is a box plot of the age column, in which the outliers are drawn as individual points beyond the whiskers.






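As a rough sketch of the scaling step itself, the snippet below applies sklearn's RobustScaler to a small made-up column of ages rather than to the Titanic data, so the numbers are purely illustrative. With its default settings the scaler centres on the median and scales by the range between the 25th and 75th percentiles; the fitted values are available in its center_ and scale_ attributes.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical ages: one column with a single outlier.
# sklearn expects a 2-D array of shape (n_samples, n_features).
ages = np.array([[22.0], [25.0], [28.0], [30.0], [35.0], [80.0]])

scaler = RobustScaler()              # defaults: centre on median, scale by IQR
scaled = scaler.fit_transform(ages)

print("median used for centring:", scaler.center_)
print("IQR used for scaling:", scaler.scale_)
print(scaled.ravel())
```

For a pandas DataFrame such as the one loaded above, the same call should work on a single-column selection like df[["age"]], which keeps the two-dimensional shape the scaler expects.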