GeoPandas extends pandas to enable spatial operations on geometric types. It combines the capabilities of pandas and shapely for geospatial data analysis.
uv pip install geopandas
# For interactive maps
uv pip install folium
# For classification schemes in mapping
uv pip install mapclassify
# For faster I/O operations (2-4x speedup)
uv pip install pyarrow
# For PostGIS database support
uv pip install psycopg2
uv pip install geoalchemy2
# For basemaps
uv pip install contextily
# For cartographic projections
uv pip install cartopy
import geopandas as gpd
# Read spatial data
gdf = gpd.read_file("data.geojson")
# Basic exploration
print(gdf.head())
print(gdf.crs)
print(gdf.geometry.geom_type)
# Simple plot
gdf.plot()
# Reproject to different CRS
gdf_projected = gdf.to_crs("EPSG:3857")
# Calculate area in CRS units. EPSG:3857 inflates areas away from the equator,
# so prefer an equal-area or local UTM CRS when the numbers matter.
gdf_projected['area'] = gdf_projected.geometry.area
# Save to file
gdf.to_file("output.gpkg")
See data-structures.md for details.
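The quick start above measures area in EPSG:3857 (Web Mercator). As a rough illustration of why the choice of projected CRS matters, the sketch below assumes the spherical Mercator approximation, in which lengths are stretched by about 1/cos(latitude) and areas by about the square of that. The function name is hypothetical and the figures are approximations, not GeoPandas output.

```python
import math


def mercator_area_inflation(lat_deg: float) -> float:
    """Approximate factor by which Web Mercator inflates areas at a latitude.

    Assumes the spherical Mercator point scale k = 1 / cos(latitude);
    areas scale by k squared.
    """
    k = 1.0 / math.cos(math.radians(lat_deg))
    return k * k


# Print the approximate inflation at a few latitudes
for lat in (0, 30, 45, 60):
    print(f"{lat:>2} deg: areas inflated about {mercator_area_inflation(lat):.2f}x")
```

Under this approximation the factor reaches about 4 at 60 degrees latitude, so an equal-area CRS or a local UTM zone is usually a safer choice for area measurements.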
GeoPandas reads/writes multiple formats: Shapefile, GeoJSON, GeoPackage, PostGIS, Parquet.
# Read with filtering
gdf = gpd.read_file("data.gpkg", bbox=(xmin, ymin, xmax, ymax))
# Write with Arrow acceleration (requires pyarrow and the pyogrio engine)
gdf.to_file("output.gpkg", use_arrow=True)
See data-io.md for comprehensive I/O operations.
Always check and manage CRS for accurate spatial operations:
# Check CRS
print(gdf.crs)
# Reproject (transforms coordinates)
gdf_projected = gdf.to_crs("EPSG:3857")
# Set CRS (only when metadata missing)
gdf = gdf.set_crs("EPSG:4326")
See crs-management.md for CRS operations.
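The difference between set_crs and to_crs is easy to mix up, so here is a small sketch with a single made-up point. It assumes the raw coordinates are known to be longitude/latitude in EPSG:4326.

```python
import geopandas as gpd
from shapely.geometry import Point

# One point at longitude 1, latitude 0, created without CRS metadata
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[Point(1, 0)])
print(gdf.crs)  # nothing is known about the coordinates yet

# set_crs only attaches the label; the coordinate values are untouched
labelled = gdf.set_crs("EPSG:4326")
print(labelled.geometry.iloc[0].x)

# to_crs recomputes the coordinates in the target CRS
mercator = labelled.to_crs("EPSG:3857")
print(round(mercator.geometry.iloc[0].x, 1))
```

After set_crs the x value is still 1 degree; after to_crs it becomes roughly 111 km expressed in Web Mercator units. Calling set_crs with the wrong code would silently mislabel the data, which is why it is reserved for missing metadata.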
Buffer, simplify, centroid, convex hull, affine transformations:
# Buffer by 10 CRS units (metres only if the CRS is projected in metres)
buffered = gdf.geometry.buffer(10)
# Simplify with tolerance
simplified = gdf.geometry.simplify(tolerance=5, preserve_topology=True)
# Get centroids
centroids = gdf.geometry.centroid
See geometric-operations.md for all operations.
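To see what these operations return, the sketch below applies them to shapes whose answers can be checked by hand. No CRS is set, so every result is in plain coordinate units; the geometries are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# A 10 x 10 square with no CRS, so results are in plain coordinate units
squares = gpd.GeoSeries([box(0, 0, 10, 10)])

buffered = squares.buffer(1)         # grow outward by 1 unit
centroid = squares.centroid.iloc[0]  # geometric centre of the square
hull = squares.convex_hull           # a square is already convex

print(squares.area.iloc[0])
print(round(buffered.area.iloc[0], 2))
print(centroid.x, centroid.y)

# A nearly straight line: the middle vertex deviates by only 0.1 units,
# well inside the tolerance of 1, so simplify is free to drop it
line = gpd.GeoSeries([LineString([(0, 0), (5, 0.1), (10, 0)])])
simplified = line.simplify(tolerance=1)
print(line.length.iloc[0], simplified.length.iloc[0])
```

The buffered area comes out slightly below 100 + 40 + pi because the rounded corners are approximated with straight segments.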
Spatial joins, overlay operations, dissolve:
# Spatial join (intersects)
joined = gpd.sjoin(gdf1, gdf2, predicate='intersects')
# Nearest neighbor join
nearest = gpd.sjoin_nearest(gdf1, gdf2, max_distance=1000)
# Overlay intersection
intersection = gpd.overlay(gdf1, gdf2, how='intersection')
# Dissolve by attribute
dissolved = gdf.dissolve(by='region', aggfunc='sum')
See spatial-analysis.md for analysis operations.
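As a small illustration of dissolve and overlay, the sketch below uses hand-made squares so the results can be verified mentally. The column names (region, population, site) and the geometries are assumptions for the example only.

```python
import geopandas as gpd
from shapely.geometry import box

# Two adjacent squares in region A and one distant square in region B
zones = gpd.GeoDataFrame(
    {"region": ["A", "A", "B"], "population": [1, 2, 5]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)],
)
study_area = gpd.GeoDataFrame({"site": ["s1"]}, geometry=[box(0.5, 0, 3, 1)])

# Dissolve merges geometries that share a region and sums numeric columns
by_region = zones.dissolve(by="region", aggfunc="sum")
print(by_region["population"].to_dict())

# Overlay keeps only the parts of zones that fall inside study_area,
# carrying attributes from both layers
clipped = gpd.overlay(zones, study_area, how="intersection")
print(clipped[["region", "site"]].values.tolist(), clipped.area.sum())
```

Dissolve reduces the three rows to one row per region, while the overlay returns one row per intersecting pair, here the two region A squares trimmed to the study area.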
Create static and interactive maps:
# Choropleth map
gdf.plot(column='population', cmap='YlOrRd', legend=True)
# Interactive map
gdf.explore(column='population', legend=True).save('map.html')
# Multi-layer map
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
gdf1.plot(ax=ax, color='blue')
gdf2.plot(ax=ax, color='red')
See visualization.md for mapping techniques.
# 1. Load data
gdf = gpd.read_file("data.shp")
# 2. Check and transform CRS
print(gdf.crs)
gdf = gdf.to_crs("EPSG:3857")
# 3. Perform analysis (results are in CRS units; Web Mercator distorts them away from the equator)
gdf['area'] = gdf.geometry.area
buffered = gdf.copy()
buffered['geometry'] = gdf.geometry.buffer(100)
# 4. Export results
gdf.to_file("results.gpkg", layer='original')
buffered.to_file("results.gpkg", layer='buffered')
# Join points to polygons
points_in_polygons = gpd.sjoin(points_gdf, polygons_gdf, predicate='within')
# Aggregate by polygon
aggregated = points_in_polygons.groupby('index_right').agg(
    value=('value', 'sum'),   # total of the value column per polygon
    count=('value', 'size'),  # number of points per polygon
)
# Merge back to polygons (how='left' keeps polygons that contain no points)
result = polygons_gdf.merge(aggregated, left_index=True, right_index=True, how='left')
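A self-contained version of the point-in-polygon workflow above, with made-up data, may help when checking the intermediate shapes. It assumes the polygon layer has an unnamed default index, so the join column is called index_right, and it uses pandas named aggregation for the per-polygon totals; the output column names are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, box

polygons_gdf = gpd.GeoDataFrame(
    {"name": ["west", "east", "empty"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)],
)
points_gdf = gpd.GeoDataFrame(
    {"value": [10, 20, 5]},
    geometry=[Point(0.2, 0.5), Point(0.8, 0.5), Point(1.5, 0.5)],
)

# Each matched point gains an index_right column naming its polygon
joined = gpd.sjoin(points_gdf, polygons_gdf, predicate="within")

# Named aggregation: total value and number of points per polygon
stats = joined.groupby("index_right").agg(
    value_sum=("value", "sum"),
    n_points=("value", "size"),
)

# Index join keeps the polygon without points; fill its gaps with zero
result = polygons_gdf.join(stats)
result[["value_sum", "n_points"]] = result[["value_sum", "n_points"]].fillna(0)
print(result[["name", "value_sum", "n_points"]])
```

Joining on the polygon index rather than filtering keeps the empty polygon in the result with a count of zero, which is usually what a choropleth of counts needs.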
# Read from different sources
roads = gpd.read_file("roads.shp")
buildings = gpd.read_file("buildings.geojson")
# engine: an existing SQLAlchemy engine connected to the PostGIS database
parcels = gpd.read_postgis("SELECT * FROM parcels", con=engine, geom_col='geom')
# Ensure matching CRS
buildings = buildings.to_crs(roads.crs)
parcels = parcels.to_crs(roads.crs)
# Perform spatial operations (50 is in CRS units, so roads.crs should be projected)
buildings_near_roads = buildings[buildings.geometry.distance(roads.union_all()) < 50]
- Use bbox, mask, or where parameters to load only needed data
- Pass use_arrow=True for 2-4x faster reading/writing
- Use .simplify() to reduce complexity when precision isn't critical
- Check .is_valid before operations
- Use .copy() when modifying geometry columns to avoid side effects
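Two of the practices listed above, checking validity and working on a copy, can be sketched as follows. The bowtie polygon is a made-up example of an invalid geometry, and make_valid is one possible repair among several.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# A self-intersecting "bowtie" ring: a classic invalid polygon
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[bowtie])
print(gdf.is_valid.tolist())

# Work on a copy so the original geometry column is left as it was
fixed = gdf.copy()
fixed["geometry"] = fixed.geometry.make_valid()
print(fixed.is_valid.tolist(), fixed.area.iloc[0])

# The original frame still holds the invalid geometry
print(gdf.is_valid.tolist())
```

The repaired geometry is split into two triangles that meet at the crossing point, so downstream operations such as overlay or area no longer operate on a self-intersecting ring.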