    0   1   2
0   1   4   7
1   2   5   8
2   3   6   9
3   2   5   8
4  10  12  14
Now, we can remove the duplicate rows from the transposed DataFrame df.T using the DataFrame.drop_duplicates() function. After removing the duplicate rows, we transpose the result again, which restores the original orientation with the duplicate columns gone.
df2 = df.T.drop_duplicates(keep="first", ignore_index=False).T
Please also note that the parameter keep="first" indicates that if two or more rows are duplicates, then the first of them will be kept and the rest will be removed. If we provide keep="last", then the last of the duplicate rows will be kept and the rest will be removed. And if we provide keep=False, then every row that has a duplicate will be removed, including its first occurrence.
So, the output of the given program will be:
df before removing duplicate columns: 
    0  1  2  3   4
0  1  2  3  2  10
1  4  5  6  5  12
2  7  8  9  8  14
df2 after removing duplicate columns: 
    0  1  2   4
0  1  2  3  10
1  4  5  6  12
2  7  8  9  14
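To make the effect of the keep parameter described above more concrete, here is a minimal sketch that applies all three settings to the same data. The variable names kept_first, kept_last and kept_none are only illustrative and do not appear in the original program.

```python
import pandas

# Same data as in the example: columns 1 and 3 hold identical values.
list1 = [[1, 2, 3, 2, 10], [4, 5, 6, 5, 12], [7, 8, 9, 8, 14]]
df = pandas.DataFrame(list1)

# keep="first": column 1 survives, column 3 is dropped.
kept_first = df.T.drop_duplicates(keep="first").T

# keep="last": column 3 survives, column 1 is dropped.
kept_last = df.T.drop_duplicates(keep="last").T

# keep=False: both column 1 and column 3 are dropped.
kept_none = df.T.drop_duplicates(keep=False).T

print(kept_first.columns.tolist())  # [0, 1, 2, 4]
print(kept_last.columns.tolist())   # [0, 2, 3, 4]
print(kept_none.columns.tolist())   # [0, 2, 4]
```

In all three cases the surviving columns keep their original labels, because ignore_index is left at its default value of False.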
Please also note that we have provided ignore_index=False. As a result, the columns are not relabeled after removal of the duplicate columns. If we want to relabel the columns after removing duplicate columns, we can use the following Python code:
import pandas
list1 = [[1, 2, 3, 2, 10], [4, 5, 6, 5, 12], [7, 8, 9, 8, 14]]
df = pandas.DataFrame(list1)
print("df before removing duplicate columns: \n", df)
df3 = df.T.drop_duplicates(keep="first", ignore_index=True).T
print("df3 after removing duplicate columns: \n", df3)
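The output of the above program will be:
df before removing duplicate columns: 
    0  1  2  3   4
0  1  2  3  2  10
1  4  5  6  5  12
2  7  8  9  8  14
df3 after removing duplicate columns: 
    0  1  2   3
0  1  2  3  10
1  4  5  6  12
2  7  8  9  14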