How to select rows in Python Pandas MultiIndex DataFrame?

We can select rows from a MultiIndex DataFrame in Pandas using the .loc[] accessor.

We pass it a label for the first level to get every row under that label, or a tuple of labels, one per level, to narrow the selection further.

For instance, we write

import pandas as pd

# Create a MultiIndex DataFrame
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'data': [1, 2, 3, 4]}, index=index)

# Select rows for first level 'A'
rows_A = df.loc['A']
print("Rows for first level 'A':")
print(rows_A)

# Select the single row with first level 'B' and second level 1 (returned as a Series)
rows_B_1 = df.loc[('B', 1)]
print("\nRow for first level 'B' and second level 1:")
print(rows_B_1)

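df.loc['A'] returns a DataFrame with the two 'A' rows, indexed by the remaining second level. df.loc[('B', 1)] returns the single matching row as a Series, whose data value is 3.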

Keep in mind that .loc[] selects by index label, while .iloc[] selects by integer position and ignores the labels entirely.
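To make the difference concrete, here is a minimal sketch that selects the same row both ways. It rebuilds the DataFrame from the example above. The variable names by_label, by_position and first_two are only illustrative.

```python
import pandas as pd

# Same MultiIndex DataFrame as in the example above
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'data': [1, 2, 3, 4]}, index=index)

# Label-based: the row whose index labels are ('B', 1)
by_label = df.loc[('B', 1), :]

# Position-based: the third row (position 2), whatever its labels are
by_position = df.iloc[2]

# Position-based slice: the first two rows
first_two = df.iloc[0:2]

print(by_label)
print(by_position)
print(first_two)
```

Both by_label and by_position point at the same row here, but only because ('B', 1) happens to sit at position 2. If the rows were reordered, the .iloc[] result would change while the .loc[] result would not.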

In the case of MultiIndex DataFrames, .loc[] is generally more convenient, since it lets us select by level labels without knowing the row positions.
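As a further sketch, .loc[] can also select on the second level alone. The approach below is one option rather than the only one: it uses slice(None) or pd.IndexSlice to mean "every label" on the first level, and it calls sort_index() first on the assumption that slicing a MultiIndex is safest on a sorted index. The variable names are illustrative.

```python
import pandas as pd

# Same MultiIndex DataFrame as in the example above, sorted for slicing
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'data': [1, 2, 3, 4]}, index=index).sort_index()

# slice(None) keeps every label on the first level,
# so this selects all rows whose second level is 1
second_is_1 = df.loc[(slice(None), 1), :]

# pd.IndexSlice expresses the same selection with slice syntax
idx = pd.IndexSlice
same_rows = df.loc[idx[:, 1], :]

# xs() selects the same rows but drops the 'second' level from the result
cross_section = df.xs(1, level='second')

print(second_is_1)
print(cross_section)
```

The first two selections keep both index levels in the result, while xs() returns a DataFrame indexed by the first level only.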

By John Au-Yeung

Web developer specializing in React, Vue, and front end development.
