Use mapping from one Pandas DataFrame to add columns to another

Python

Pandas

Join

One-to-many merge

Luc B.

Python

Pandas

DataFrame

illustration of one-to-many merge

Consider the situation depicted above where we have two DataFrames, one that lists the cities several people live in and another that specifies the state each city is in. How do we create a single DataFrame with three columns, name, city, and state that lists the city and state for each person?

In database terminology, this is called an "one-to-many join" (or "one-to-many merge") because one row from one DataFrame (the mapping) is related to many rows in another DataFrame (the data). Pandas has a robust suite of algorithms for performing operations like this.

Code Example

Use the DataFrame.merge() method to perform an one-to-many merge on two DataFrames. This code example implements the exact situation described above.

import pandas as pd

people = pd.DataFrame({
    'name': ['Abe', 'Sally', 'Mark', 'Kenny', 'Sheela'],
    'city': ['New York', 'Austin', 'Atlanta', 'San Francisco', 'New York']
})

cities = pd.DataFrame({
    'city': ['New York', 'Austin', 'Atlanta', 'San Francisco', 'Chicago'],
    'state': ['NY', 'TX', 'GA', 'CA', 'IL']
})

people.merge(cities, on='city')
name city state
0 Abe New York NY
1 Sheela New York NY
2 Sally Austin TX
3 Mark Atlanta GA
4 Kenny San Francisco CA