Sometimes, we want to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas.
In this article, we'll look at how to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas.
How to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas?
To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort by column B with the sort_values method and then call the drop_duplicates method.
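For instance, we write
df.sort_values('B').drop_duplicates(subset='A', keep='last')
to sort the df data frame by column B in ascending order and then call drop_duplicates with the subset argument set to 'A' to remove rows with duplicate values in column A. Setting keep to 'last' keeps the last row of each group of duplicates, which after sorting is the row with the highest value in column B. Calling drop_duplicates on its own keeps the last row in the original order, which isn't necessarily the one with the highest value in B.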
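To make this concrete, here is a minimal sketch on a small made-up data frame. The sample values and the variable names result and alt are assumptions for illustration only. It also shows one possible alternative that uses groupby with idxmax instead of sorting.

```python
import pandas as pd

# Hypothetical sample data: column A contains duplicates, column B holds the values to compare.
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3],
    "B": [10, 20, 30, 40, 10],
})

# Sort by B so the highest B within each A group comes last,
# keep that last row, then restore the original row order.
result = (
    df.sort_values("B")
      .drop_duplicates(subset="A", keep="last")
      .sort_index()
)

# Alternative: look up the index label of the largest B within each A group.
alt = df.loc[df.groupby("A")["B"].idxmax()]

print(result)
print(alt)
```

Both approaches keep one row per value of A. If several rows in a group tie for the highest B, which of the tied rows survives can differ between the two approaches.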
Conclusion
To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort by column B with the sort_values method and then call the drop_duplicates method with keep set to 'last'.