same:
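import pandas
import numpy

# Load the Titanic dataset from a local CSV file
df = pandas.read_csv("titanic.csv")
# Summarize column names, non-null counts and dtypes
print(df.info())
# Share of missing values per column, as a percentage
print("Percentage of missing values: \n", df.isnull().mean()*100)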
The output of the above program will be:
RangeIndex: 891 entries, 0 to 890
Data columns (total 15 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   survived     891 non-null    int64
 1   pclass       891 non-null    int64
 2   sex          891 non-null    object
 3   age          714 non-null    float64
 4   sibsp        891 non-null    int64
 5   parch        891 non-null    int64
 6   fare         891 non-null    float64
 7   embarked     889 non-null    object
 8   class        891 non-null    object
 9   who          891 non-null    object
 10  adult_male   891 non-null    bool
 11  deck         203 non-null    object
 12  embark_town  889 non-null    object
 13  alive        891 non-null    object
 14  alone        891 non-null    bool
dtypes: bool(2), float64(2), int64(4), object(7)
memory usage: 92.4+ KB
None
Percentage of missing values: 
 survived        0.000000
pclass          0.000000
sex             0.000000
age            19.865320
sibsp           0.000000
parch           0.000000
fare            0.000000
embarked        0.224467
class           0.000000
who             0.000000
adult_male      0.000000
deck           77.216611
embark_town     0.224467
alive           0.000000
alone           0.000000
dtype: float64
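As we can see, about 19.87% of the values in the age column are missing and about 77.22% of the values in the deck column are missing. The embarked and embark_town columns are each missing about 0.22% of their values, and the remaining columns have no missing values.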
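To make the percentages easier to read, the same calculation can be wrapped in a small helper that reports both the count and the percentage of missing values, sorted from most to least affected. The following is only a sketch: the helper name missing_report and the tiny hand-made DataFrame are assumptions for illustration and are not part of the Titanic dataset.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame standing in for titanic.csv (values are illustrative only)
df = pd.DataFrame({
    "age": [22.0, np.nan, 26.0, 35.0],
    "deck": [np.nan, "C", np.nan, np.nan],
    "fare": [7.25, 71.28, 7.92, 53.10],
})


def missing_report(frame):
    """Return the count and percentage of missing values per column,
    with the most affected columns first. (Hypothetical helper.)"""
    report = pd.DataFrame({
        "missing": frame.isnull().sum(),          # number of NaN entries
        "percent": frame.isnull().mean() * 100,   # share of NaN entries in %
    })
    return report.sort_values("percent", ascending=False)


report = missing_report(df)
print(report.round(2))
```

Sorting by percentage puts columns such as deck at the top, which makes it easier to spot the columns that need attention first.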







































