Let’s say a column in a DataFrame has some duplicate values, and we want to know how many unique values the column has and what those values are. The pandas Python library provides the nunique() and unique() functions for this: nunique() returns the number of distinct values, and unique() returns the distinct values themselves.
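As a quick sketch of the two calls, here is a minimal example on a small in-memory DataFrame. The column name "color" and its values are made up for illustration and are not part of the iris dataset.

```python
import pandas

# Hypothetical toy data, used only to illustrate the two calls
df = pandas.DataFrame({"color": ["red", "blue", "red", "green", "blue"]})

# nunique() counts the distinct values in the column
count = df["color"].nunique()

# unique() returns the distinct values in order of first appearance
values = df["color"].unique()

print("Number of unique values:", count)
print("Unique values:", list(values))
```

Here the column holds five entries but only three distinct ones, so nunique() gives 3 and unique() gives red, blue, and green.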
Let’s take the example of the iris dataset for this purpose.
import pandas

df = pandas.read_csv("iris.csv")
print(df.head())
The output shows:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa
So, the dataset contains the sepal length, sepal width, petal length, and petal width of some flowers. The type of flower can be determined from these features, and the species column records that type.
Now, let’s say we want to know how many different species of flowers there are and what those species are. We can find both using the nunique() and unique() functions.
import pandas

df = pandas.read_csv("iris.csv")
print(df.head())

# Count the distinct values in the species column
number = df["species"].nunique()
# Collect the distinct values themselves
unique_values = pandas.unique(df["species"])

print("Total number of unique values: ", number)
print("The unique values: ", unique_values)
Here, df[“species”] returns a …