How to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas?

Sometimes, we want to remove rows that have duplicate values in column A, keeping the row with the highest value in column B with Python Pandas.

In this article, we’ll look at how to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas.

How to remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas?

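To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort the data frame by column B with the sort_values method and then call the drop_duplicates method.

For instance, we write

df.sort_values('B').drop_duplicates(subset='A', keep='last')

to sort the df data frame by column B in ascending order, so that the row with the highest B value comes last within each group of duplicate A values. Then we call drop_duplicates with the subset argument set to 'A' to compare rows by column A only, and we set keep to 'last' to keep the last row of each group, which is the one with the highest B value. On its own, drop_duplicates keeps rows by position rather than by value, so the sort is what ties the result to column B. The call returns a new data frame and leaves df unchanged.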
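As a minimal sketch, the example below uses a small made-up data frame (the values are hypothetical; only the column names A and B come from the article) to show that sorting by B before dropping duplicates keeps the row with the highest B for each value of A. It also shows one possible alternative using groupby and idxmax, which is not the approach described above but should give the same rows.

```python
import pandas as pd

# Hypothetical sample data; column names A and B follow the article.
df = pd.DataFrame({
    "A": [1, 1, 2, 2, 3],
    "B": [10, 20, 40, 30, 50],
})

# Sort by B ascending so the largest B in each A group comes last,
# then keep only that last row for each value of A.
result = df.sort_values("B").drop_duplicates(subset="A", keep="last")

# Alternative sketch: look up the row label of the maximum B per A group.
alt = df.loc[df.groupby("A")["B"].idxmax()]

print(result.sort_values("A"))
print(alt)
```

Both results keep the original row labels, so we can call reset_index(drop=True) afterwards if we want a fresh index.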

Conclusion

To remove duplicates by column A, keeping the row with the highest value in column B with Python Pandas, we can sort the data frame by column B with sort_values and then call the drop_duplicates method with keep set to 'last'.

By John Au-Yeung