distribution plot.
import pandas
from matplotlib import pyplot

# Load the Titanic data and keep only the columns we need.
df1 = pandas.read_csv("titanic.csv")
df2 = df1[["age", "survived"]]

# mean() and median() skip missing values by default.
print("Mean Age: ", df2["age"].mean())
print("Median Age: ", df2["age"].median())

# Plot the kernel density estimate (KDE) of the age column.
fig = pyplot.figure()
ax = fig.add_subplot(111)
df2["age"].plot.kde(ax=ax)
lines, labels = ax.get_legend_handles_labels()
ax.legend(lines, labels, loc="best")
pyplot.savefig("titanic-age-kde.png")
pyplot.close()
The output KDE plot will look like the following:
Now, let’s add two columns to the DataFrame (How to add columns in a DataFrame?). The “mean_age” column holds the ages with the missing values filled in with the mean age, and the “median_age” column holds the ages with the missing values filled in with the median age. After that, we will plot the KDE plots of the age, mean_age, and …
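As a rough illustration of the two columns described above, here is a minimal sketch. The helper name `add_imputed_age_columns` and the tiny sample DataFrame are made up for this example and are not part of the original code; with the real data you would pass in the DataFrame read from titanic.csv instead.

```python
import pandas as pd


def add_imputed_age_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with "mean_age" and "median_age" columns added.

    Hypothetical helper for illustration only. Working on a copy avoids
    pandas' SettingWithCopyWarning when df is a slice of another DataFrame.
    """
    out = df.copy()
    # Series.mean() and Series.median() ignore missing values by default.
    out["mean_age"] = out["age"].fillna(out["age"].mean())
    out["median_age"] = out["age"].fillna(out["age"].median())
    return out


# Made-up sample data, standing in for the "age" and "survived" columns.
sample = pd.DataFrame(
    {
        "age": [20.0, None, 30.0, 70.0, None],
        "survived": [1, 0, 1, 0, 1],
    }
)

result = add_imputed_age_columns(sample)
print(result)
```

The original "age" column is left untouched, so the three distributions can still be compared side by side, for example with `result[["age", "mean_age", "median_age"]].plot.kde()`.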





