In a hexagonal plot (also called a hexbin plot), the X-Y plane is first divided into hexagons that resemble a honeycomb. Each hexagon is then colored according to how many observations fall within it. With hundreds of thousands or millions of observations, scatter plots become hard to read: the points overlap so heavily that the plot looks overplotted. Hexagonal plots are used to address this overplotting problem.
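To make the overplotting problem concrete, here is a minimal sketch that draws the same points as a scatter plot and as a hexagonal plot. The data is synthetic and made up for illustration, and the variable names and the output file name are assumptions, not part of the example that follows.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import numpy as np
from matplotlib import pyplot

# Synthetic, made-up data: 100,000 correlated points (not a real dataset).
rng = np.random.default_rng(0)
n_points = 100_000
x = rng.normal(size=n_points)
y = 0.5 * x + rng.normal(scale=0.5, size=n_points)

fig, (ax_scatter, ax_hex) = pyplot.subplots(1, 2, figsize=(10, 4))

# Scatter plot: markers pile up on top of each other in the dense center.
ax_scatter.scatter(x, y, s=1)
ax_scatter.set_title("Scatter plot")

# Hexagonal plot: each hexagon is colored by the number of points inside it.
hexes = ax_hex.hexbin(x, y, gridsize=30)
ax_hex.set_title("Hexagonal plot")
fig.colorbar(hexes, ax=ax_hex, label="points per hexagon")

# The per-hexagon counts should add up to the number of observations.
counts = np.asarray(hexes.get_array())
total_counted = int(np.nansum(counts))

fig.savefig("scatter-vs-hexbin.png")
```

In the scatter panel the dense center turns into a solid blob, while the hexagonal panel still shows where most of the points are concentrated.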
Let’s look at an example using the “tips” dataset. Along with other data, the dataset contains the total bill amount and the tip amount. In this example, we will draw a hexagonal plot where the x-axis represents the total bill and the y-axis represents the tip amount.
import pandas
from matplotlib import pyplot

df = pandas.read_csv("tips.csv")
df.plot.hexbin(x="total_bill", y="tip", gridsize=20)
pyplot.savefig("hexagonal-plot.png")
Here, df.plot.hexbin(x="total_bill", y="tip", gridsize=20) draws the hexagonal plot. The x-axis represents the total bill and the y-axis represents the tip amount. We also set gridsize, the number of hexagons in the x-direction, to 20 (the pandas default is 100), so that the hexagons are large enough to make the relationship between the total bill and the tip easy to see.
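To see how gridsize changes the picture, the following sketch draws the same data with two different values. It uses a small synthetic DataFrame instead of tips.csv, so the values are made up; only the column names mirror the example above, and the output file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import numpy as np
import pandas
from matplotlib import pyplot

# Made-up stand-in for the tips data: tips of roughly 15% of the bill, plus noise.
rng = np.random.default_rng(1)
total_bill = rng.uniform(5, 50, size=500)
tip = 0.15 * total_bill + rng.normal(scale=1.0, size=500)
df = pandas.DataFrame({"total_bill": total_bill, "tip": tip})

fig, axes = pyplot.subplots(1, 2, figsize=(11, 4))
hexagon_counts = {}
for ax, gridsize in zip(axes, (10, 40)):
    # gridsize is the number of hexagons along the x-direction.
    df.plot.hexbin(x="total_bill", y="tip", gridsize=gridsize, ax=ax)
    ax.set_title(f"gridsize={gridsize}")
    # The first collection on the axes holds the hexagons drawn by hexbin.
    hexagon_counts[gridsize] = len(ax.collections[0].get_array())

fig.savefig("hexbin-gridsize.png")
```

A smaller gridsize gives fewer, larger hexagons, which smooths out noise but hides detail. A larger gridsize gives more, smaller hexagons, many of which may be empty when the dataset is small.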
The hexagonal plot looks like the following:





