A heatmap can be used to plot the correlation between numeric columns in a dataset. Let's take the example of the Titanic dataset. The dataset contains various numerical information, such as the age of each passenger, the fare paid, and the number of siblings, spouses, parents, or children on board with the passenger.
We can use the following Python code to calculate the correlation matrix for the columns that contain numerical values. We can then pass the correlation matrix to a heatmap.
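    import seaborn
    from matplotlib import pyplot

    # Load the Titanic dataset that ships with seaborn's example data
    df = seaborn.load_dataset("titanic")

    # Correlate only the numeric columns; recent pandas versions raise an
    # error if text columns are included without numeric_only=True
    corr_values = df.corr(numeric_only=True)

    # Draw the heatmap and write each coefficient inside its cell
    seaborn.heatmap(corr_values, annot=True)
    pyplot.show()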
Here, we are using the seaborn.heatmap() function to plot the heatmap. The annot=True argument tells seaborn to write the numeric value of each cell, here the correlation coefficient, inside that cell.
Please note that the correlation coefficient between any two variables X and Y is a number between -1 and +1. A negative sign indicates the variables are negatively related, i.e. as X increases, Y tends to decrease. A positive sign indicates the variables are positively related, i.e. as X increases, Y tends to increase as well. The closer the magnitude of the number is to 1, the stronger the linear relationship between the variables; a value near 0 means there is little or no linear relationship. Correlation alone does not show that changing one variable causes the other to change.
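To make the sign and magnitude of the coefficient concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand for three small made-up lists. The pearson() helper and the sample values are illustrative assumptions and are not part of the Titanic example; in practice df.corr() does this calculation for every pair of columns.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up sample data, chosen only to show the three typical cases
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # grows in step with x
falling = [10, 8, 6, 4, 2]   # shrinks as x grows
mixed = [2, 1, 4, 1, 2]      # no linear trend against x

r_rising = pearson(x, rising)
r_falling = pearson(x, falling)
r_mixed = pearson(x, mixed)

print("rising: ", r_rising)
print("falling:", r_falling)
print("mixed:  ", r_mixed)
```

The first pair should give a coefficient at the positive end of the range, the second at the negative end, and the third a value at or near zero, which matches how the cells of the heatmap are read.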
The resulting heatmap in this example looks like the following: