distribution plot.
import pandas
from matplotlib import pyplot
# Load the Titanic data and keep an independent copy of the two columns we need.
df1 = pandas.read_csv("titanic.csv")
df2 = df1[["age", "survived"]].copy()
print("Mean Age: ", df2["age"].mean())
print("Median Age: ", df2["age"].median())
# Plot the KDE of the age column on an explicit axes; missing values are skipped.
fig = pyplot.figure()
ax = fig.add_subplot(111)
df2["age"].plot.kde(ax=ax)
lines, labels = ax.get_legend_handles_labels()
ax.legend(lines, labels, loc="best")
pyplot.savefig("titanic-age-kde.png")
pyplot.close()
The output KDE plot will look like the following:

Now, let’s add two columns to the DataFrame (see How to add columns in a DataFrame?). The “mean_age” column fills in the missing age values with the mean age, and the “median_age” column fills them in with the median age. After that, we will plot the KDE plots of the age, mean_age, and …
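As a rough sketch of the step described above, the two imputed columns can be built with `fillna`. The helper names `add_imputed_age_columns` and `plot_age_kdes` and the small `sample` frame are assumptions made for illustration; they are not part of the original code, and the real data would come from titanic.csv as before.

```python
import pandas as pd


def add_imputed_age_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with mean_age and median_age columns added."""
    out = df.copy()
    # Fill the missing ages with the mean and the median of the observed ages.
    out["mean_age"] = out["age"].fillna(out["age"].mean())
    out["median_age"] = out["age"].fillna(out["age"].median())
    return out


def plot_age_kdes(df: pd.DataFrame, path: str) -> None:
    """Draw KDE curves for age, mean_age and median_age on one axes."""
    # Imported here so the imputation helper works without matplotlib.
    from matplotlib import pyplot

    fig, ax = pyplot.subplots()
    for column in ["age", "mean_age", "median_age"]:
        # Series.plot.kde skips NaN values and needs SciPy installed.
        df[column].plot.kde(ax=ax, label=column)
    ax.legend(loc="best")
    fig.savefig(path)
    pyplot.close(fig)


# Hypothetical stand-in for the two columns read from titanic.csv.
sample = pd.DataFrame({
    "age": [22.0, None, 30.0, None, 80.0],
    "survived": [0, 1, 1, 0, 1],
})
imputed = add_imputed_age_columns(sample)
print(imputed)
```

With the real data, passing `add_imputed_age_columns(df2)` to `plot_age_kdes` together with an output filename of your choice would draw the three curves on one plot. The original `age` column is left untouched, so its KDE can be compared against the two imputed versions.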