34. Multi-level Indexing in Pandas
By Bernd Klein. Last modified: 26 Apr 2023.
Introduction
In the previous chapter of our Pandas tutorial we covered the basic concepts of Pandas and introduced its two main data structures:
- Series and
- DataFrame
We also learned how to create and manipulate Series and DataFrame objects in numerous Python programs.
In this chapter we turn to some further aspects of these data structures.
We will start with the advanced indexing possibilities in Pandas.
Advanced or Multi-Level Indexing
Advanced or multi-level indexing is available for both Series and DataFrames. It lets us store and manipulate data of arbitrarily high dimension in 1-dimensional (Series) and 2-dimensional tabular (DataFrame) structures. In other words, we can work with higher-dimensional data in lower dimensions. The following example builds a two-level index from a list of city names and a list of attribute names:
import pandas as pd
cities = ["Vienna", "Vienna", "Vienna",
          "Hamburg", "Hamburg", "Hamburg",
          "Berlin", "Berlin", "Berlin",
          "Zürich", "Zürich", "Zürich"]
index = [cities, ["country", "area", "population",
                  "country", "area", "population",
                  "country", "area", "population",
                  "country", "area", "population"]]
print(index)
OUTPUT:
[['Vienna', 'Vienna', 'Vienna', 'Hamburg', 'Hamburg', 'Hamburg', 'Berlin', 'Berlin', 'Berlin', 'Zürich', 'Zürich', 'Zürich'], ['country', 'area', 'population', 'country', 'area', 'population', 'country', 'area', 'population', 'country', 'area', 'population']]
data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]
city_series = pd.Series(data, index=index)
print(city_series)
OUTPUT:
Vienna   country           Austria
         area                414.6
         population        1805681
Hamburg  country           Germany
         area                755.0
         population        1760433
Berlin   country           Germany
         area               891.85
         population        3562166
Zürich   country       Switzerland
         area                87.88
         population         378884
dtype: object
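The same kind of index can also be built explicitly as a MultiIndex object. The following is a minimal sketch, not part of the original example: it uses `pd.MultiIndex.from_product`, and the level names `city` and `field` as well as the variable name `named_series` are assumptions chosen for illustration.

```python
import pandas as pd

cities = ["Vienna", "Hamburg", "Berlin", "Zürich"]
fields = ["country", "area", "population"]

# from_product creates every (city, field) pair; the first list varies slowest
index = pd.MultiIndex.from_product([cities, fields], names=["city", "field"])

data = ["Austria", 414.60, 1805681,
        "Germany", 755.00, 1760433,
        "Germany", 891.85, 3562166,
        "Switzerland", 87.88, 378884]

named_series = pd.Series(data, index=index)
print(named_series.index.names)    # the two level names
print(named_series.index.nlevels)  # number of index levels
```

Naming the levels is optional, but it makes later operations that refer to a level easier to read.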
We can access the data of a city in the following way:
print(city_series["Vienna"])
OUTPUT:
country       Austria
area            414.6
population    1805681
dtype: object
We can also access the information about the country, area or population of a city. We can do this in two ways:
print(city_series["Vienna"]["area"])
OUTPUT:
414.6
The other way is to pass both keys at once:
print(city_series["Vienna", "area"])
OUTPUT:
414.6
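Both forms return the same value, but they work differently: the chained form first extracts the sub-series of the city and then looks up the field, while the tuple form is a single lookup. The sketch below uses a small purely numeric series (the name `s` and the shortened data are assumptions for illustration) and also shows that `.loc` with the full key can be used for assignment.

```python
import pandas as pd

index = [["Hamburg", "Hamburg", "Vienna", "Vienna"],
         ["area", "population", "area", "population"]]
s = pd.Series([755.00, 1760433, 414.60, 1805681], index=index)

chained = s["Vienna"]["area"]        # two lookups: first the city, then the field
direct = s.loc[("Vienna", "area")]   # one lookup with the complete key tuple
print(chained, direct)

# Assigning through .loc with the full key changes s itself
s.loc[("Vienna", "area")] = 415.0
print(s.loc[("Vienna", "area")])
```

For reading a single value either form is fine. For writing, the single `.loc` lookup is usually the safer choice, because a chained expression may operate on an intermediate object.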
We can also select all entries of a city by combining the city name with a slice over the inner level:
city_series["Hamburg",:]
OUTPUT:
country       Germany
area            755.0
population    1760433
dtype: object
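To fetch several cities at once, a list of outer-level labels can be passed to `.loc`. This is a small sketch with shortened data; the names `s` and `two_cities` are assumptions for illustration.

```python
import pandas as pd

index = [["Berlin", "Berlin", "Hamburg", "Hamburg", "Vienna", "Vienna"],
         ["area", "population", "area", "population", "area", "population"]]
s = pd.Series([891.85, 3562166, 755.00, 1760433, 414.60, 1805681], index=index)

# A list of outer-level labels selects all rows of those cities;
# the result keeps both index levels
two_cities = s.loc[["Berlin", "Vienna"]]
print(two_cities)
```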
If the index is sorted, we can also apply a slicing operation:
city_series = city_series.sort_index()
print("city_series with sorted index:")
print(city_series)
print("\n\nSlicing the city_series:")
city_series["Berlin":"Vienna"]
OUTPUT:
city_series with sorted index:
Berlin   area               891.85
         country           Germany
         population        3562166
Hamburg  area                755.0
         country           Germany
         population        1760433
Vienna   area                414.6
         country           Austria
         population        1805681
Zürich   area                87.88
         country       Switzerland
         population         378884
dtype: object


Slicing the city_series:
Berlin   area               891.85
         country           Germany
         population        3562166
Hamburg  area                755.0
         country           Germany
         population        1760433
Vienna   area                414.6
         country           Austria
         population        1805681
dtype: object
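Whether an index is sorted can be checked before slicing. The following sketch assumes a small series named `s` with deliberately unsorted cities; label slicing on such an unsorted MultiIndex is generally rejected by pandas with an error, which is why the index is sorted first.

```python
import pandas as pd

index = [["Vienna", "Vienna", "Berlin", "Berlin", "Hamburg", "Hamburg"],
         ["area", "population", "area", "population", "area", "population"]]
s = pd.Series([414.60, 1805681, 891.85, 3562166, 755.00, 1760433], index=index)

print(s.index.is_monotonic_increasing)         # is the index sorted?

s_sorted = s.sort_index()                      # sorts by all index levels
print(s_sorted.index.is_monotonic_increasing)

# Label slices include both end points
sliced = s_sorted.loc["Berlin":"Hamburg"]
print(sliced)
```

Note that, unlike positional slices, a label slice contains the rows of the end label as well.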
In the next example, we show that it is possible to access the inner keys as well:
print(city_series[:, "area"])
OUTPUT:
Berlin     891.85
Hamburg     755.0
Vienna      414.6
Zürich      87.88
dtype: object
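An alternative for selecting by an inner key is the method `xs` (cross-section), which takes the key and the level it belongs to. A minimal sketch with shortened data follows; the names `s`, `areas` and `same` are assumptions for illustration.

```python
import pandas as pd

index = [["Berlin", "Berlin", "Vienna", "Vienna"],
         ["area", "population", "area", "population"]]
s = pd.Series([891.85, 3562166, 414.60, 1805681], index=index)

# Select every row whose second index level (level=1) equals "area";
# the matched level is dropped from the result by default
areas = s.xs("area", level=1)
print(areas)

# The slice notation gives the same values
same = s.loc[:, "area"]
print(same)
```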
Swapping MultiIndex Levels
It is possible to swap the levels of a MultiIndex with the method swaplevel:
swaplevel(self, i=-2, j=-1, copy=True)
    Swap levels i and j in a MultiIndex

    Parameters
    ----------
    i, j : int, string (can be mixed)
        Level of index to be swapped. Can pass level name as string.
    The indexes 'i' and 'j' are optional, and default to
    the two innermost levels of the index

    Returns
    -------
    swapped : Series
city_series = city_series.swaplevel()
city_series.sort_index(inplace=True)
city_series
OUTPUT:
area        Berlin          891.85
            Hamburg          755.0
            Vienna           414.6
            Zürich           87.88
country     Berlin         Germany
            Hamburg        Germany
            Vienna         Austria
            Zürich     Switzerland
population  Berlin         3562166
            Hamburg        1760433
            Vienna         1805681
            Zürich          378884
dtype: object
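As the docstring says, the levels can also be given by name. The sketch below assumes an index with the hypothetical level names `city` and `field`; it also shows that swapping twice and sorting again leads back to the original order.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Berlin", "Vienna"], ["area", "population"]],
    names=["city", "field"])
s = pd.Series([891.85, 3562166, 414.60, 1805681], index=index)

# Swap the two levels by name, then sort so that equal fields are adjacent
swapped = s.swaplevel("city", "field").sort_index()
print(swapped)
print(list(swapped.index.names))

# Swapping again (default: the two innermost levels) and sorting
# restores the original arrangement
restored = swapped.swaplevel().sort_index()
print(restored)
```

swaplevel returns a new Series and leaves the original one unchanged, which is why the result is assigned to a new name here.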