GeoPandas library in Python

Key takeaways:

  • GeoPandas extends the pandas library by adding features for handling geospatial data, making it useful for spatial analysis.

  • GeoPandas versions before 1.0 include built-in datasets, such as world-country boundaries and cities, for easy geospatial analysis.

  • Performing spatial analysis is straightforward with GeoPandas, enabling tasks like identifying relationships between cities and countries on maps.

  • Dissolving and aggregating data helps merge geospatial data based on attributes like continent, allowing for the analysis of aggregated populations or areas.

  • GeoPandas integrates with Matplotlib for easy and flexible visualization of geospatial data, allowing you to create and save customized maps.

  • Overlaying datasets allows for combining multiple layers of geospatial data, like cities and country borders, in a single visual representation.

What is the GeoPandas library in Python?

GeoPandas is a Python library that extends pandas with support for geospatial data. It combines pandas data structures with geospatial libraries such as Shapely (geometry operations) and Fiona (file access), and it plots spatial data through Matplotlib. GeoPandas is used in urban planning, environmental science, and other domains that require spatial data analysis.
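
To make the relationship between pandas and Shapely concrete, here is a minimal sketch that builds a GeoDataFrame by hand. The city names, coordinates, and population figures are hypothetical sample values chosen for illustration, not data from the datasets used later in this answer.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical sample data: three cities with approximate lon/lat coordinates.
cities = gpd.GeoDataFrame(
    {
        "city": ["Lahore", "Karachi", "Islamabad"],
        "population_m": [13.0, 16.0, 1.2],  # illustrative values, not official figures
    },
    geometry=[Point(74.36, 31.52), Point(67.01, 24.86), Point(73.05, 33.68)],
    crs="EPSG:4326",  # WGS84 longitude/latitude
)

# Ordinary pandas operations such as boolean filtering still work.
large = cities[cities["population_m"] > 5]

# Every row also carries a Shapely geometry with spatial properties.
print(large[["city", "geometry"]])
print(cities.geometry.geom_type.unique())
print(cities.total_bounds)  # [minx, miny, maxx, maxy]
```

The filter is plain pandas, while geom_type and total_bounds come from the geometry column that GeoPandas adds.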

In this answer, we will explore various practical use cases of GeoPandas that demonstrate geospatial analysis.

Applications of GeoPandas

Built-in dataset

The code imports GeoPandas and reads world data from the built-in naturalearth_lowres dataset into a GeoDataFrame named world. It then prints the first few rows of this dataset to give a quick look at its contents. Note that the geopandas.datasets module used throughout this answer was deprecated in GeoPandas 0.14 and removed in 1.0, so these examples require an earlier version.

import geopandas as gpd
# Read the world data and show the header
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
print(world.head())

Basic visualization

The following code reads the built-in naturalearth_cities dataset and stores it in a GeoDataFrame named capitals. It then prints the first few rows to give an overview of the dataset, which includes the capitals of countries.

import geopandas as gpd
capitals = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
print(capitals.head())

We can use the world dataset to plot countries by GDP per capita. We first filter out Antarctica and countries with zero or missing population. Then, we calculate GDP per capita by dividing estimated GDP (gdp_md_est, given in millions of US dollars) by population (pop_est). The plot() method of the GeoDataFrame draws the map with Matplotlib, coloring each country by its gdp_per_cap value. Finally, the plot is saved and displayed.

import geopandas as gpd
import matplotlib.pyplot as plt
# Read the world data
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Omit unpopulated countries and Antarctica; copy() avoids modifying a slice of the original
world = world[(world.pop_est>0) & (world.name!="Antarctica")].copy()
# Calculate GDP per capita (in millions of US dollars per person) by dividing GDP by population
world['gdp_per_cap'] = world.gdp_md_est / world.pop_est
world.plot(column='gdp_per_cap')
plt.savefig("./output/Plot.png")
plt.show()

We can also plot the map of a single country. To plot a different country, change the country’s name in the filter shown in the code below. The map uses the Greens_r color map, and matplotlib.pyplot handles figure customization.

import geopandas as gpd
import matplotlib.pyplot as plt
# Use pyplot (plt) to create the figure and axes for the country map
fig, ax = plt.subplots(figsize=(8,6))
countries = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
# Select a single country by name (here, Pakistan) and draw it on the axes
countries[countries["name"] == "Pakistan"].plot(cmap='Greens_r', ax=ax)
plt.savefig("./output/Plot.png")
plt.show()

We can visualize the geographical distribution of countries worldwide and customize the plot by adjusting the color map, adding legends, or modifying the style as required. The following code plots the world map with the default styling and adds a title.

import geopandas as gpd
import matplotlib.pyplot as plt
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world_data.plot()
plt.title('World Countries Data')
plt.savefig("./output/Plot.png")
plt.show()
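
The customization options mentioned above (color map, legend, and style) can be sketched as follows. To keep the example independent of the built-in datasets, it uses three hypothetical square regions with a made-up value column; the same plot() arguments should apply to the world data.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical data: three square "regions" with a made-up numeric attribute.
regions = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"], "value": [10, 25, 40]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)

fig, ax = plt.subplots(figsize=(8, 4))
regions.plot(
    column="value",     # color each polygon by this attribute
    cmap="viridis",     # any Matplotlib colormap name
    legend=True,        # add a colorbar for the numeric column
    edgecolor="black",  # outline each polygon
    linewidth=0.5,
    ax=ax,
)
ax.set_title("Regions colored by value")
ax.set_axis_off()  # hide the coordinate axes for a cleaner map
```

Call plt.show() to display the figure or fig.savefig() with a path of your choice to write it to disk.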

Filtering and querying

The following code demonstrates filtering and querying the world countries’ data using GeoPandas. It keeps only the countries with a population greater than 100 million and then prints their names and populations.

import geopandas as gpd
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Filter countries with a population greater than 100 million
filtered_data = world_data[world_data['pop_est'] > 100000000]
print(filtered_data[['name', 'pop_est']])
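
Besides attribute filters, GeoPandas also supports filtering by location. The sketch below combines several attribute conditions and then uses the .cx coordinate indexer to select by bounding box. The place names, coordinates, and population values are hypothetical sample data.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical sample points with approximate lon/lat coordinates.
places = gpd.GeoDataFrame(
    {
        "name": ["Cairo", "Nairobi", "Lima", "Oslo"],
        "pop_m": [10.0, 4.4, 9.7, 0.7],  # illustrative values only
    },
    geometry=[
        Point(31.24, 30.04),
        Point(36.82, -1.29),
        Point(-77.04, -12.05),
        Point(10.75, 59.91),
    ],
    crs="EPSG:4326",
)

# Attribute filter: combine conditions with & (and) or | (or).
big_places = places[(places["pop_m"] > 1) & (places["name"] != "Lima")]

# Spatial filter: .cx[xmin:xmax, ymin:ymax] keeps geometries that
# intersect the bounding box, here everything south of the equator.
southern = places.cx[:, :0]

print(big_places["name"].tolist())
print(southern["name"].tolist())
```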

Geometric and spatial operations

Using GeoPandas, we can calculate each country’s area and find the centroid (geometric center) of each country’s shape. Such operations are useful for comparing region sizes, labeling maps, spatial clustering, and land-use analysis. A Coordinate Reference System (CRS) defines how spatial data is represented on the Earth’s surface using coordinates and a datum; examples include WGS84 (used in GPS) and EPSG:3395 (World Mercator). To measure area in linear units rather than degrees, geometries are first reprojected to a projected CRS (e.g., EPSG:3395), whose units are meters. Keep in mind that Mercator projections inflate areas away from the equator, so an equal-area CRS is a better choice when accurate areas matter. The code below demonstrates these steps.

import geopandas as gpd
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Reproject geometries to a projected CRS (e.g., World Mercator - EPSG:3395)
world_data = world_data.to_crs(epsg=3395)
# Calculate each country's area; EPSG:3395 units are meters, so convert m² to km²
world_data['area_sqkm'] = world_data.geometry.area / 10**6
# Find the centroid of each country
world_data['centroid'] = world_data.geometry.centroid
print(world_data.head())
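
The choice of projected CRS affects the computed areas. The following sketch compares World Mercator (EPSG:3395) with an equal-area CRS for two cells of one degree by one degree. Using EPSG:6933 as the equal-area CRS is an assumption made for this illustration; other equal-area projections would serve the same purpose.

```python
import geopandas as gpd
from shapely.geometry import box

# Two 1-degree-by-1-degree cells: one at the equator, one at 60 degrees north.
cells = gpd.GeoDataFrame(
    {"cell": ["equator", "60N"]},
    geometry=[box(0, 0, 1, 1), box(0, 60, 1, 61)],
    crs="EPSG:4326",
)

# .area returns square meters in both CRSs, so divide by 1e6 for square kilometers.
mercator_km2 = cells.to_crs(epsg=3395).area / 1e6
equal_area_km2 = cells.to_crs(epsg=6933).area / 1e6  # assumption: EPSG:6933 as the equal-area CRS

cells["mercator_km2"] = mercator_km2
cells["equal_area_km2"] = equal_area_km2
cells["inflation"] = mercator_km2 / equal_area_km2
print(cells[["cell", "mercator_km2", "equal_area_km2", "inflation"]].round(2))
```

The two projections should roughly agree at the equator, while the Mercator area of the northern cell comes out several times larger than its equal-area value.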

Spatial joins and merging

GeoPandas can perform a spatial join between two datasets, here one containing world countries and the other containing cities. The goal is to identify which country each city belongs to by checking whether a city’s coordinates fall within a country’s boundaries.

import geopandas as gpd
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
cities_data = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
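# Match each country to the cities it contains (the predicate keyword replaced op in GeoPandas 0.10)
countries_with_cities = gpd.sjoin(world_data, cities_data, how='left', predicate='contains')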
print(countries_with_cities[['name_left', 'name_right']].drop_duplicates())

The output contains two key columns:

  • name_left: The name of the country from the world_data dataset.

  • name_right: The name of the city from the cities_data dataset.

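This spatial join matches each city to its corresponding country, resulting in a dataset that associates countries with the cities they contain. Because how='left' keeps every row of world_data, countries that contain no city from cities_data appear with a missing name_right. The drop_duplicates() call ensures each unique city-country pair is listed only once.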
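
The join can also be run from the cities’ side, so that each output row is a city. Here is a minimal sketch under stated assumptions: the country and city names and their geometries are hypothetical, and the predicate keyword requires GeoPandas 0.10 or later.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical polygons standing in for two countries.
countries = gpd.GeoDataFrame(
    {"country": ["Westland", "Eastland"]},
    geometry=[box(0, 0, 5, 5), box(5, 0, 10, 5)],
    crs="EPSG:4326",
)

# Hypothetical points standing in for cities; "Offshore" lies outside both polygons.
cities = gpd.GeoDataFrame(
    {"city": ["Alpha", "Beta", "Offshore"]},
    geometry=[Point(1, 1), Point(7, 3), Point(20, 20)],
    crs="EPSG:4326",
)

# Left join from the cities' side: every city is kept, and a city that
# falls within no polygon gets a missing value in the country column.
cities_with_country = gpd.sjoin(cities, countries, how="left", predicate="within")
print(cities_with_country[["city", "country"]])
```

Which side to put first depends on the question being asked: one row per city as here, or one row per country-city match as in the code above.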

Map overlay and spatial analysis

The code below shows the locations of cities within each country, which helps identify the spatial relationship between cities and countries. The countries are plotted as filled polygons, and the cities are drawn as red dots on the same map.

import geopandas as gpd
import matplotlib.pyplot as plt
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
cities_data = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
# Overlay the cities on the country polygons by drawing both layers on the same axes
ax = world_data.plot()
cities_data.plot(ax=ax, color='red', markersize=5)
plt.title('World Countries and Cities Data')
plt.savefig("./output/Plot.png")
plt.show()

Dissolving and aggregating

The following code uses the dissolve() method in GeoPandas to merge and aggregate geospatial data based on a specific attribute. In this case, it merges the countries’ boundaries by continent and calculates the total population of each continent.

import geopandas as gpd
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Keep only the grouping column, the geometry, and the numeric column to aggregate,
# so that aggfunc='sum' is not applied to text columns such as country names
continent_data = world_data[['continent', 'geometry', 'pop_est']]
# Dissolve countries by continent and sum the population of each continent
continent_population = continent_data.dissolve(by='continent', aggfunc='sum', as_index=False)
print(continent_population)
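
To see what dissolve() does on a dataset small enough to check by hand, consider this sketch. The four unit squares, their region labels, and their population values are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical data: four adjacent unit squares grouped into two regions.
parcels = gpd.GeoDataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "population": [100, 250, 75, 25],
    },
    geometry=[box(0, 1, 1, 2), box(1, 1, 2, 2), box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# dissolve() unions the geometries in each group and aggregates the
# remaining columns with aggfunc.
regions = parcels.dissolve(by="region", aggfunc="sum")

print(regions["population"])
print(regions.geometry.area)  # each merged region covers two unit squares
```

Each output row holds one merged polygon per region, and its population is the sum over the squares that were merged.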

Voronoi diagram

We can construct and visualize a Voronoi diagram for the centroids of world countries using geopandas, scipy.spatial, and matplotlib. The Voronoi diagram is a geometric representation that partitions the plane into regions based on the proximity to a set of input points. In this case, the input points are the world countries’ centroids.

import geopandas as gpd
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
import numpy as np
world_data = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
points = np.array(world_data['geometry'].apply(lambda geom: (geom.centroid.x, geom.centroid.y)).tolist())
voronoi_diagram = Voronoi(points)
# Plot the Voronoi diagram
fig, ax = plt.subplots(figsize=(10, 6))
# Plot the Voronoi regions
voronoi_plot_2d(voronoi_diagram, ax=ax, show_vertices=False)
# Add world countries as an overlay
world_data.plot(ax=ax, edgecolor='black', facecolor='none')
plt.title('Voronoi Diagram of World Countries')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.savefig("./output/Plot.png")
plt.show()
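
The structure returned by scipy.spatial.Voronoi is easier to inspect on a handful of points. This sketch uses five hypothetical seed points (the corners of a square and its center) and checks which regions are bounded.

```python
import numpy as np
from scipy.spatial import Voronoi

# Five seed points: the corners of a square plus its center.
points = np.array([[0, 0], [2, 0], [0, 2], [2, 2], [1, 1]], dtype=float)
vor = Voronoi(points)

# point_region maps each input point to the index of its Voronoi region;
# a region is a list of vertex indices, where -1 marks an unbounded region.
center_region = vor.regions[vor.point_region[4]]
corner_region = vor.regions[vor.point_region[0]]

print("Vertices:\n", vor.vertices)
print("Center region is bounded:", -1 not in center_region)
print("Corner region is bounded:", -1 not in corner_region)
```

Only the center point is fully surrounded by other seeds, so its region is closed, while the corner regions extend to infinity. The same applies to the country centroids above: regions on the outer edge of the point set are unbounded.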

Conclusion

GeoPandas makes it straightforward to work with geospatial data in Python. With it, we can map attributes such as GDP per capita, filter and query countries, compute areas and centroids, join and overlay datasets, dissolve regions by attribute, and build Voronoi diagrams. In short, this answer covered the main uses of GeoPandas for analyzing data on a map.


1. Which of the following is a valid GeoPandas geometry type?

A) LinePoint
B) Polygon
C) Sphere
D) Circle


Frequently asked questions



How to install GeoPandas for Python?

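You can install GeoPandas by running the command pip install geopandas in your terminal or command prompt. To run the examples in this answer, which rely on the geopandas.datasets module, install a version earlier than 1.0, for example with pip install "geopandas<1.0".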


What is the difference between pandas and geopandas?

pandas handles regular tabular data, while GeoPandas extends it with a geometry column and spatial operations for geographic data.


What is the use of GeoPandas in Python?

GeoPandas is used to work with geographic data, allowing you to perform spatial analysis, plot maps, and manage geometry like points, lines, and polygons.



Copyright ©2025 Educative, Inc. All rights reserved