A scatter plot is often used to examine the relationship between two variables. For example, suppose we are working with the tips dataset, which contains information such as the total bill amount and the tip amount. To find out whether the tip amount is related to the total bill amount, we can use a scatter plot.
Note that a line plot is typically used when the data is distributed evenly along the horizontal axis, whereas a scatter plot is used to compare numeric values when the data is not distributed evenly along the x-axis. In particular, when a single value on the x-axis can have multiple values on the y-axis, a scatter plot is usually the better choice.
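To make the last point concrete, here is a minimal sketch that groups y-values by their x-value. The (total_bill, tip) pairs are made up for illustration and are not taken from the tips dataset; the names `points`, `ys_by_x` and `repeated` are likewise hypothetical.

```python
from collections import defaultdict

# Made-up (total_bill, tip) pairs for illustration only; not from tips.csv.
points = [(10.0, 1.5), (10.0, 2.0), (20.0, 3.0), (20.0, 2.5), (20.0, 4.0), (35.0, 5.0)]

# Collect every y-value observed for each x-value.
ys_by_x = defaultdict(list)
for x, y in points:
    ys_by_x[x].append(y)

# Keep only the x-values that have more than one y-value.
repeated = {x: ys for x, ys in ys_by_x.items() if len(ys) > 1}
print(repeated)  # {10.0: [1.5, 2.0], 20.0: [3.0, 2.5, 4.0]}
```

Because some x-values carry several y-values, a single line through these points would be hard to read, while a scatter plot simply shows each pair as its own marker.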
Now, let’s return to our example. We can use the following Python code to draw a scatter plot of the tip amount against the total bill amount.
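import pandas
from matplotlib import pyplot

# Load the tips dataset from a CSV file in the current directory
df = pandas.read_csv("tips.csv")

# info() prints its summary itself and returns None, so no print() is needed
df.info()

# Draw total_bill on the x-axis against tip on the y-axis, then save the figure
df.plot.scatter(x="total_bill", y="tip")
pyplot.savefig("tips-scatter.png")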
Here, the DataFrame.plot.scatter() method is used to draw the scatter plot. The x parameter names the column whose values go on the x-axis, and the y parameter names the column whose values go on the y-axis. The resulting plot is saved as tips-scatter.png, with total_bill on the x-axis and tip on the y-axis.
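For experimenting without tips.csv, the following is a minimal sketch of the same call on a small in-memory DataFrame. The values are made up for illustration and do not come from the tips dataset. The axis labels, the title, the output file name and the use of the non-interactive Agg backend are assumptions added here, not part of the original example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import pandas
from matplotlib import pyplot

# Made-up values for illustration only; the real tips.csv is not reproduced here.
df = pandas.DataFrame({
    "total_bill": [10.5, 15.0, 20.0, 20.0, 32.5, 48.0],
    "tip": [1.5, 2.5, 3.0, 4.0, 5.0, 6.5],
})

# plot.scatter() returns the matplotlib Axes, which can be customized further.
ax = df.plot.scatter(x="total_bill", y="tip")
ax.set_xlabel("Total bill")
ax.set_ylabel("Tip")
ax.set_title("Tip vs. total bill")

# Save to a temporary location (hypothetical file name), then release the figure.
out_path = os.path.join(tempfile.gettempdir(), "tips-scatter-sketch.png")
ax.figure.savefig(out_path)
pyplot.close(ax.figure)
```

Keeping the returned Axes in a variable is a design choice: it lets you adjust labels and titles before saving, instead of relying on pyplot's implicit current figure.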





