Multi-indexing in pandas

pandas, a popular data manipulation library in Python, offers multi-indexing, also called multilevel or hierarchical indexing. A multi-index labels each row (or column) with a tuple of values, one per level, so that each outer-level label groups the inner-level labels beneath it. This is useful for representing data with more than two dimensions, such as measurements grouped by several keys, in a Series or DataFrame.
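As a rough illustration of how the levels work together, the sketch below builds a small Series with a two-level index and selects from it. The scores are made-up values used only for demonstration.

```python
import pandas as pd

# Hypothetical data: one score per (Name, Number) pair.
index = pd.MultiIndex.from_tuples(
    [("John", 30), ("John", 45), ("Nick", 30), ("Nick", 45)],
    names=["Name", "Number"],
)
scores = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Selecting on the outer level returns a Series indexed by the inner level.
john = scores.loc["John"]

# Selecting with a full tuple returns a single value.
nick_45 = scores.loc[("Nick", 45)]

print(john)
print(nick_45)
```

Selecting by the outer label alone drops that level from the result, which is why john is indexed only by Number.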

We can create a multi-index in pandas using the following methods:

The MultiIndex.from_arrays() method

This method creates a multi-index from a list of arrays, each representing a different index level.

Syntax

pandas.MultiIndex.from_arrays(arrays, names=None)

The parameters involved are as follows:

  • arrays: A list of arrays where each array represents one level of the MultiIndex. All arrays must have the same length.

  • names: An optional list of names for the levels of the MultiIndex.

Code example

import pandas as pd
# One array per level: the first holds the names, the second holds the numbers
arrays = [['John', 'John', 'Nick', 'Nick'], [30, 45, 30, 45]]
multi_index = pd.MultiIndex.from_arrays(arrays, names=('Name', 'Number'))
print(multi_index)
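A minimal sketch of how such an index might be used follows. It attaches the index to a DataFrame and shows that arrays of unequal length are rejected. The Score column is made-up data, not part of the original example.

```python
import pandas as pd

arrays = [["John", "John", "Nick", "Nick"], [30, 45, 30, 45]]
multi_index = pd.MultiIndex.from_arrays(arrays, names=("Name", "Number"))

# Attach the index to a DataFrame; the "Score" values are hypothetical.
df = pd.DataFrame({"Score": [10, 20, 30, 40]}, index=multi_index)

# Look up one row by giving a label for every level.
nick_30 = df.loc[("Nick", 30), "Score"]

# Arrays of unequal length raise a ValueError.
try:
    pd.MultiIndex.from_arrays([["John", "Nick"], [30]])
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(nick_30)
print(length_error)
```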

The MultiIndex.from_tuples() method

This method creates a multi-index from a list of tuples, where each tuple holds the labels of one index entry, one label per level.

Syntax

pandas.MultiIndex.from_tuples(tuples, names=None)

The parameters involved are as follows:

  • tuples: A list of tuples, where each tuple represents one entry in the MultiIndex, each element corresponding to a level in the MultiIndex.

  • names: An optional list of names for the levels of the MultiIndex.

Code example

import pandas as pd
# One tuple per index entry: (name, number)
tuples = [('John', 30), ('John', 45), ('Nick', 30), ('Nick', 45)]
multi_index = pd.MultiIndex.from_tuples(tuples, names=('Name', 'Number'))
print(multi_index)
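The tuple form is the row-wise view of the same data that from_arrays() takes level by level. As a small sketch, zipping the per-level lists into tuples should produce an index equal to the one built from the arrays:

```python
import pandas as pd

names = ["John", "John", "Nick", "Nick"]
numbers = [30, 45, 30, 45]

# Row-wise: one (name, number) tuple per entry.
from_tuples = pd.MultiIndex.from_tuples(
    list(zip(names, numbers)), names=("Name", "Number")
)

# Level-wise: one array per level.
from_arrays = pd.MultiIndex.from_arrays([names, numbers], names=("Name", "Number"))

# equals() compares the entries of the two indexes.
same = from_tuples.equals(from_arrays)
print(same)
```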

The MultiIndex.from_product() method

This method generates a multi-index by taking the Cartesian product of iterables, which helps create all possible combinations of index entries.

Syntax

pandas.MultiIndex.from_product(iterables, names=None)

The parameters involved are as follows:

  • iterables: A list of iterables, each holding the labels for one level of the MultiIndex. The from_product() method pairs every label in one iterable with every label in the others.

  • names: An optional list of names for the levels of the MultiIndex.

Code example

import pandas as pd
names = ['John', 'Nick']
numbers = [30, 45]
multi_index = pd.MultiIndex.from_product([names, numbers], names=('Name', 'Number'))
print(multi_index)
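One place where the full set of combinations can help is filling in missing entries. The sketch below, which uses made-up observations, builds the complete grid and reindexes a sparse Series onto it, so the missing combinations show up as NaN.

```python
import pandas as pd

names = ["John", "Nick"]
numbers = [30, 45]
multi_index = pd.MultiIndex.from_product([names, numbers], names=("Name", "Number"))

# 2 names x 2 numbers = 4 entries; the last level varies fastest.
entries = multi_index.tolist()

# Hypothetical sparse data covering only two of the four combinations.
observed = pd.Series(
    [1.0, 2.0],
    index=pd.MultiIndex.from_tuples(
        [("John", 30), ("Nick", 45)], names=("Name", "Number")
    ),
)

# Reindexing onto the full grid inserts NaN for the missing combinations.
full = observed.reindex(multi_index)
print(full)
```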

The MultiIndex.from_frame() method

This method directly creates a multi-index from an existing DataFrame. It uses the DataFrame's columns to form the levels of the MultiIndex.

Syntax

pandas.MultiIndex.from_frame(df, names=None)

The parameters involved are as follows:

  • df: The DataFrame from which the MultiIndex will be created. Each column of the DataFrame becomes one level of the MultiIndex.

  • names: An optional list of names for the levels of the MultiIndex. If omitted, the DataFrame's column names are used.

Code example

import pandas as pd
data = {
    'Name': ['John', 'John', 'Nick', 'Nick'],
    'Number': [30, 45, 30, 45]
}
frame = pd.DataFrame(data)
multi_index = pd.MultiIndex.from_frame(frame)
print(multi_index)
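By default, the level names come from the DataFrame's column names. The following sketch shows that default next to an explicit override. The replacement names Person and Value are arbitrary choices for illustration.

```python
import pandas as pd

frame = pd.DataFrame(
    {"Name": ["John", "John", "Nick", "Nick"], "Number": [30, 45, 30, 45]}
)

# Without names, the level names are taken from the column names.
default_names = pd.MultiIndex.from_frame(frame)

# Passing names overrides the column names; these names are hypothetical.
renamed = pd.MultiIndex.from_frame(frame, names=["Person", "Value"])

print(list(default_names.names))
print(list(renamed.names))
```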

Conclusion

Multilevel indexing in pandas gives data scientists and analysts a structured way to organize and analyze multidimensional data. The four methods covered here, from_arrays(), from_tuples(), from_product(), and from_frame(), build the same kind of MultiIndex from different starting points, so the right choice depends on the shape of the data at hand.
