A scatter plot is often used to show the relationship between two variables. A line plot suits data that is evenly spaced along the horizontal axis, whereas a scatter plot compares numeric values when the data is not evenly spaced along the x-axis. In particular, a scatter plot is the usual choice when a single value on the x-axis can have multiple values on the y-axis.
Let’s look at an example using the “tips” dataset, which contains information such as the total bill amount and the tip amount. Suppose we want to see the relationship between the total bill amount and the tip amount using a scatter plot. We can use the following Python code for that purpose:
import pandas
from matplotlib import pyplot

df = pandas.read_csv("tips.csv")
pyplot.scatter(x=df["total_bill"], y=df["tip"])
pyplot.xlabel("Total Bill")
pyplot.ylabel("Tips")
pyplot.savefig("tips-scatter-matplotlib.png")
pyplot.close()
Here, the pyplot.scatter() function draws the scatter plot. Its first parameter, x, supplies the x values and its second parameter, y, supplies the y values. The pyplot.xlabel() and pyplot.ylabel() functions label the x-axis and the y-axis, respectively. Finally, pyplot.savefig() writes the figure to an image file and pyplot.close() closes the figure.
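If tips.csv is not at hand, the same calls can be tried on a small in-memory table. The following is only a minimal sketch under stated assumptions: the five rows are invented for illustration and are not taken from the real dataset, the output filename tips-scatter-sketch.png is hypothetical, and the Agg backend is selected so that the script can run without a display. Two of the total bill values repeat on purpose, so some x values carry more than one y value, which is the situation where a scatter plot fits better than a line plot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so no display is needed

import pandas
from matplotlib import pyplot

# Hypothetical stand-in for tips.csv: these values are invented for illustration.
df = pandas.DataFrame({
    "total_bill": [10.0, 10.0, 20.0, 20.0, 30.0],
    "tip": [1.5, 2.0, 3.0, 3.5, 4.0],
})

# x and y are passed positionally here; scatter() returns the collection of markers.
points = pyplot.scatter(df["total_bill"], df["tip"])
pyplot.xlabel("Total Bill")
pyplot.ylabel("Tips")
pyplot.savefig("tips-scatter-sketch.png")

# One marker is drawn per row of the table.
n_points = len(points.get_offsets())
pyplot.close()
```

Passing the columns positionally is equivalent to the keyword form x=..., y=... used with the real dataset; the keyword form is simply more explicit about which column goes on which axis.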
The scatter plot saved as tips-scatter-matplotlib.png will look like the following:





