Sometimes, we want to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas.
In this article, we’ll look at how to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas.
How to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas?
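To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort the data frame by B with sort_values and then call the drop_duplicates method.
For instance, we write
df.sort_values('B').drop_duplicates(subset='A', keep='last')
to call sort_values on the df data frame to sort the rows by B in ascending order. Then we call drop_duplicates with the subset argument set to 'A' to remove rows with duplicate values in A, and we set keep to 'last' so that the row kept for each value of A is the one with the highest B value.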
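To see this on concrete data, here is a minimal sketch. The sample values in df are made up for illustration. The groupby and idxmax lines show an alternative approach that is not part of the steps above and should select the same rows.

```python
import pandas as pd

# Hypothetical sample data: A contains duplicates, B holds the values to compare.
df = pd.DataFrame({"A": [1, 1, 2, 2, 3], "B": [10, 20, 40, 30, 10]})

# Sort by B ascending so the highest B for each A comes last,
# drop the earlier duplicates of A, then restore the original row order.
result = df.sort_values("B").drop_duplicates(subset="A", keep="last").sort_index()

# Alternative: look up the index label of the highest B within each A group.
alt = df.loc[df.groupby("A")["B"].idxmax()]

print(result)
print(alt)
```

If several rows share the same highest B for a given A, only one of them is kept, so we may want to sort by an extra column first when the choice matters.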
Conclusion
To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort the data frame by B with sort_values and then call the drop_duplicates method with keep set to 'last'.