Merging Data
There are two ways to combine datasets in geopandas – attribute joins and spatial joins.
In an attribute join, a GeoSeries or GeoDataFrame is combined with a regular pandas Series or DataFrame based on a common variable. This is analogous to normal merging or joining in pandas.
In a spatial join, observations from two GeoSeries or GeoDataFrames are combined based on their spatial relationship to one another.
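As a rough illustration of the difference, the sketch below runs both kinds of join on toy data. The names regions, points, and labels and all of their values are hypothetical and are not part of the datasets used in this guide. The spatial join relies on the default behaviour of geopandas.sjoin, which matches geometries that intersect.

```python
import geopandas
import pandas as pd
from shapely.geometry import Point, box

# Hypothetical toy data: two square regions and two points, one inside each.
regions = geopandas.GeoDataFrame(
    {"region": ["west", "east"]},
    geometry=[box(0, 0, 1, 1), box(2, 0, 3, 1)],
)
points = geopandas.GeoDataFrame(
    {"site": ["a", "b"], "code": [1, 2]},
    geometry=[Point(0.5, 0.5), Point(2.5, 0.5)],
)
# A plain pandas DataFrame that shares the `code` column with `points`.
labels = pd.DataFrame({"code": [1, 2], "label": ["alpha", "beta"]})

# Attribute join: rows are matched on the shared `code` column.
by_attribute = points.merge(labels, on="code")

# Spatial join: rows are matched where geometries intersect (the default test).
by_location = geopandas.sjoin(points, regions, how="inner")

print(by_attribute[["site", "label"]])
print(by_location[["site", "region"]])
```

In the attribute join the geometries play no part in the matching; in the spatial join no shared column is needed at all.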
In the following examples, we use these datasets:
# Assumes geopandas has already been imported with: import geopandas
In [1]: world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
In [2]: cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
# For attribute join
In [3]: country_shapes = world[['geometry', 'iso_a3']]
In [4]: country_names = world[['name', 'iso_a3']]
# For spatial join
In [5]: countries = world[['geometry', 'name']]
In [6]: countries = countries.rename(columns={'name':'country'})
Attribute Joins
Attribute joins are accomplished using the merge method. In general, it is recommended to call the merge method on the spatial dataset itself. That said, the stand-alone merge function also works if the GeoDataFrame is passed as the left argument; if a DataFrame is passed as the left argument and a GeoDataFrame as the right argument, the result will no longer be a GeoDataFrame.
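The effect of argument order can be checked with a small sketch like the one below. The shapes and names frames are hypothetical toy data, and the last line shows one possible way to rebuild a GeoDataFrame from the plain result, offered as an assumption rather than a recommended workflow.

```python
import geopandas
import pandas as pd
from shapely.geometry import Point

# Hypothetical toy data: a spatial table and a plain table sharing `code`.
shapes = geopandas.GeoDataFrame(
    {"code": ["A", "B"]}, geometry=[Point(0, 0), Point(1, 1)]
)
names = pd.DataFrame({"code": ["A", "B"], "name": ["Alpha", "Beta"]})

# Recommended form: call merge on the spatial dataset.
from_method = shapes.merge(names, on="code")

# Stand-alone function with the GeoDataFrame as the left argument.
geo_left = pd.merge(shapes, names, on="code")

# Stand-alone function with the GeoDataFrame as the right argument.
geo_right = pd.merge(names, shapes, on="code")

for label, result in [
    ("shapes.merge(names)", from_method),
    ("pd.merge(shapes, names)", geo_left),
    ("pd.merge(names, shapes)", geo_right),
]:
    print(label, "->", type(result).__name__)

# One way to get a GeoDataFrame back from the plain result (an assumption
# about what you may want, not a step taken from this guide).
restored = geopandas.GeoDataFrame(geo_right, geometry="geometry")
```

Only the last merge loses the GeoDataFrame type, because the type of the result follows the left argument.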
For example, consider the following merge that adds full names to a GeoDataFrame that initially has only ISO codes for each country by merging it with a pandas DataFrame.
# `country_shapes` is a GeoDataFrame with country shapes and ISO codes
In [7]: country_shapes.head()
Out[7]:
geometry iso_a3
0 (POLYGON ((180 -16.06713266364245, 180 -16.555... FJI
1 POLYGON ((33.90371119710453 -0.950000000000000... TZA
2 POLYGON ((-8.665589565454809 27.65642588959236... ESH
3 (POLYGON ((-122.84 49.00000000000011, -122.974... CAN
4 (POLYGON ((-122.84 49.00000000000011, -120 49.... USA
# `country_names` is a DataFrame with country names and ISO codes
In [8]: country_names.head()