How to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas?

Sometimes, we want to remove rows with duplicate values in column A, keeping the row with the highest value in column B with Python Pandas.

In this article, we’ll look at how to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas.

How to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas?

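To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort the data frame by B with the sort_values method and then call the drop_duplicates method.

For instance, we write

df.sort_values('B').drop_duplicates(subset='A', keep='last')

to sort the rows of the df data frame by B in ascending order and then call drop_duplicates with the subset argument set to 'A' to remove rows with duplicate values in A. Setting keep to 'last' keeps the last row for each value of A, which is the row with the highest B after sorting. Without the sort, keep='last' only keeps the last occurrence in the existing row order, which is not necessarily the row with the highest B. The call returns a new data frame and leaves df unchanged.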
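As a minimal sketch of sorting by B before dropping duplicates, the example below builds a small data frame and keeps the highest B for each A. The sample data and the variable names result and alt are assumptions made for illustration; the groupby and idxmax variant is an alternative that is not part of the approach described above.

```python
import pandas as pd

# Hypothetical sample data: A has repeated keys, B holds the values to compare.
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3],
    "B": [10, 20, 40, 30, 50],
})

# Sort ascending by B so the largest B for each A comes last,
# then keep only that last row for each value of A.
result = df.sort_values("B").drop_duplicates(subset="A", keep="last")

# Alternative sketch: find the index label of the largest B in each A group
# and select those rows. This assumes the index labels are unique.
alt = df.loc[df.groupby("A")["B"].idxmax()]

print(result)
print(alt)
```

If several rows share the same highest B for a given A, which of them survives depends on the sort order of the tied rows, so it may be worth passing kind="stable" to sort_values when that matters.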

Conclusion

To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort the data frame by B with the sort_values method and then call the drop_duplicates method with keep set to 'last'.

By John Au-Yeung

Web developer specializing in React, Vue, and front end development.
