pandas, a popular data manipulation library for Python, supports multi-indexing, also called multilevel or hierarchical indexing. In a multi-index, each label is a tuple with one element per level: an outer-level label groups one or more inner-level labels, and each complete tuple refers to a value or a row. This makes it possible to represent data with more than two dimensions in a Series or DataFrame.
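As a minimal sketch of this idea, the example below builds a small Series whose index has two levels and then selects from it by the outer level and by a full tuple. The values are hypothetical sample data, and passing a list of arrays as the index is just one convenient way to obtain a multi-index.

```python
import pandas as pd

# Passing a list of arrays as the index makes pandas build a MultiIndex.
# The values 1.0 to 4.0 are hypothetical sample data.
scores = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=[["John", "John", "Nick", "Nick"], [30, 45, 30, 45]],
)

# Selecting on the outer level returns every inner-level entry under it.
john_scores = scores.loc["John"]

# A full tuple identifies exactly one value.
nick_45 = scores.loc[("Nick", 45)]

print(john_scores)
print(nick_45)
```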
We can create a multi-index in pandas using several methods:
MultiIndex.from_arrays() method
This method creates a multi-index from a list of arrays, each representing a different index level.
pandas.MultiIndex.from_arrays(arrays, names=None)
The parameters involved are as follows:
arrays: A list of arrays where each array represents one level of the MultiIndex. The arrays should have the same length.
names: An optional list of names for the levels of the MultiIndex.
import pandas as pd
arrays = [['John', 'John', 'Nick', 'Nick'], [30, 45, 30, 45]]
multi_index = pd.MultiIndex.from_arrays(arrays, names=('Name', 'Number'))
print(multi_index)
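To show how such an index might be used once it exists, the sketch below attaches it to a DataFrame and selects rows by the outer level. The Score column and its values are hypothetical and are not part of the original example.

```python
import pandas as pd

arrays = [["John", "John", "Nick", "Nick"], [30, 45, 30, 45]]
multi_index = pd.MultiIndex.from_arrays(arrays, names=("Name", "Number"))

# Hypothetical values, one row per index entry.
df = pd.DataFrame({"Score": [88, 92, 75, 81]}, index=multi_index)

# Each level can be inspected by name.
names_level = multi_index.get_level_values("Name")

# .loc with an outer-level label selects all rows under that label.
nick_rows = df.loc["Nick"]

print(names_level)
print(nick_rows)
```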
MultiIndex.from_tuples() method
This method creates a multi-index from a list of tuples, where each tuple represents one index entry across all levels.
pandas.MultiIndex.from_tuples(tuples, names=None)
The parameters involved are as follows:
tuples: A list of tuples, where each tuple represents one entry in the MultiIndex, each element corresponding to a level in the MultiIndex.
names: An optional list of names for the levels of the MultiIndex.
import pandas as pd
tuples = [('John', 30), ('John', 45), ('Nick', 30), ('Nick', 45)]
multi_index = pd.MultiIndex.from_tuples(tuples, names=('Name', 'Number'))
print(multi_index)
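The tuple form is row-wise, while the array form is level-wise. As a rough illustration of how the two relate, the sketch below transposes the tuples with zip and checks that both constructors describe the same index.

```python
import pandas as pd

tuples = [("John", 30), ("John", 45), ("Nick", 30), ("Nick", 45)]
from_tuples = pd.MultiIndex.from_tuples(tuples, names=("Name", "Number"))

# zip(*tuples) transposes the row-wise tuples into one sequence per level.
arrays = [list(level) for level in zip(*tuples)]
from_arrays = pd.MultiIndex.from_arrays(arrays, names=("Name", "Number"))

# equals() compares the entries of the two indexes.
same_index = from_tuples.equals(from_arrays)
print(same_index)
```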
MultiIndex.from_product() method
This method generates a multi-index by taking the Cartesian product of iterables, which helps create all possible combinations of index entries.
pandas.MultiIndex.from_product(iterables, names=None)
The parameters involved are as follows:
iterables: A list of iterables, each representing one level of the MultiIndex. The from_product() method takes the Cartesian product of these iterables to generate all possible combinations.
names: An optional list of names for the levels of the MultiIndex.
import pandas as pd
names = ['John', 'Nick']
numbers = [30, 45]
multi_index = pd.MultiIndex.from_product([names, numbers], names=('Name', 'Number'))
print(multi_index)
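One place the Cartesian product can be handy is filling in combinations that are missing from the data. The sketch below assumes a hypothetical Series that holds only two of the four possible entries and reindexes it against the full product.

```python
import pandas as pd

names = ["John", "Nick"]
numbers = [30, 45]
full_index = pd.MultiIndex.from_product([names, numbers], names=("Name", "Number"))

# The number of entries is the product of the lengths of the iterables.
print(len(full_index))

# Hypothetical sparse data: only two of the four combinations are present.
observed = pd.Series(
    [1.0, 2.0],
    index=pd.MultiIndex.from_tuples(
        [("John", 30), ("Nick", 45)], names=("Name", "Number")
    ),
)

# Reindexing against the full product inserts NaN for missing combinations.
completed = observed.reindex(full_index)
print(completed)
```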
MultiIndex.from_frame() method
This method directly creates a multi-index from an existing DataFrame. It uses the DataFrame's columns to form the levels of the MultiIndex.
pandas.MultiIndex.from_frame(df, names=None)
The parameters involved are as follows:
df: The DataFrame from which the MultiIndex will be created. Each of its columns becomes one level of the MultiIndex.
names: An optional list of names for the levels of the MultiIndex. If omitted, the DataFrame's column names are used.
import pandas as pd
data = {'Name': ['John', 'John', 'Nick', 'Nick'],
        'Number': [30, 45, 30, 45]}
frame = pd.DataFrame(data)
multi_index = pd.MultiIndex.from_frame(frame)
print(multi_index)
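The sketch below illustrates how the level names are chosen: by default they come from the column names, and passing names replaces them. The names Person and Value are hypothetical, and to_frame() is shown only as one way to turn the index back into a DataFrame.

```python
import pandas as pd

frame = pd.DataFrame(
    {"Name": ["John", "John", "Nick", "Nick"], "Number": [30, 45, 30, 45]}
)

# Without names, the level names come from the column names.
default_names = pd.MultiIndex.from_frame(frame)

# Passing names overrides the column names (hypothetical names here).
renamed = pd.MultiIndex.from_frame(frame, names=["Person", "Value"])

# to_frame(index=False) converts the index back into a DataFrame.
round_trip = default_names.to_frame(index=False)

print(list(default_names.names))
print(list(renamed.names))
print(round_trip)
```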
Multilevel indexing in pandas lets data scientists and analysts organize multidimensional data under hierarchical labels, which simplifies the manipulation, analysis, and visualization of complex datasets.