We can select rows from a MultiIndex DataFrame in pandas with the .loc[]
accessor.
We pass a label for the outermost level, or a tuple of labels to match several levels at once.
For instance, we write:
import pandas as pd
# Create a MultiIndex DataFrame
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'data': [1, 2, 3, 4]}, index=index)
# Select rows for first level 'A'
rows_A = df.loc['A']
print("Rows for first level 'A':")
print(rows_A)
# Select the row for first level 'B' and second level 1
# (a full key matches one row here, so the result is a Series)
rows_B_1 = df.loc[('B', 1)]
print("\nRow for first level 'B' and second level 1:")
print(rows_B_1)
Here, df.loc['A'] returns a DataFrame with the two rows under 'A', indexed by the remaining second level, and df.loc[('B', 1)] returns the single matching row as a Series.
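Selecting on the inner level alone is a common follow-up. The sketch below rebuilds the same sample frame and shows two ways that should work for it: DataFrame.xs and .loc[] combined with pd.IndexSlice. The variable names second_1_xs and second_1_slice are only illustrative.

```python
import pandas as pd

# Same sample frame as in the example above
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'data': [1, 2, 3, 4]}, index=index)

# Cross-section: rows where 'second' is 1; the 'second' level is dropped
second_1_xs = df.xs(1, level='second')
print(second_1_xs)

# Same rows via .loc[] and pd.IndexSlice; both index levels are kept.
# Slicing a MultiIndex like this expects a sorted index, which this one is.
second_1_slice = df.loc[pd.IndexSlice[:, 1], :]
print(second_1_slice)
```

Which form to use mostly depends on whether the matched level should stay in the result: xs drops it by default, while the IndexSlice form keeps the full MultiIndex.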
Keep in mind that pandas offers different indexers for different needs. The .loc[] accessor selects by label, while .iloc[] selects by integer position.
For MultiIndex DataFrames, .loc[] is generally more convenient because it works directly with the level labels.
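To make the difference concrete, here is a small sketch on the same sample frame that fetches one row both by label and by position. The names by_label, by_position and first_two are assumptions made for this illustration.

```python
import pandas as pd

# Same sample frame as in the example above
arrays = [['A', 'A', 'B', 'B'], [1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'data': [1, 2, 3, 4]}, index=index)

# Label-based: the row labelled ('B', 1)
by_label = df.loc[('B', 1)]

# Position-based: the third row (positions start at 0)
by_position = df.iloc[2]

# Check whether both lookups returned the same row
print(by_label.equals(by_position))

# Position-based slice: the first two rows, with both index levels kept
first_two = df.iloc[0:2]
print(first_two)
```

With .iloc[] we have to know where a row sits in the frame, so the label-based lookup tends to hold up better when rows are added or reordered.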