Merge Pandas DataFrames with corresponding keys




One-to-one merge

Luc B.




illustration of one-to-one merge

Consider the situation depicted above where we have two DataFrames, one that lists the cities several people live and another that lists the jobs of the same several people. How do we create a single DataFrame with three columns, name, city, and job that lists the city and job of each person?

In database terminology, this is called an "one-to-one join" (or "one-to-one merge") because only one row from each DataFrame are related. Pandas has a robust suite of algorithms for performing operations like this.

Code Example

Use the DataFrame.merge() method to perform an one-to-one merge on two DataFrames. This code example implements the exact situation described above.

import pandas as pd

cities = pd.DataFrame({
    'name': ['Abe', 'Sally', 'Mark', 'Kenny', 'Sheela'],
    'city': ['New York', 'Austin', 'Atlanta', 'San Francisco', 'New York']

jobs = pd.DataFrame({
    'name': ['Mark', 'Abe', 'Sally', 'Kenny', 'Sheela'],
    'job': ['Sales', 'Dancer', 'Stock Trader', 'Janitor', 'Lawyer']

cities.merge(jobs, on='name')
name city job
0 Abe New York Dancer
1 Sally Austin Stock Trader
2 Mark Atlanta Sales
3 Kenny San Francisco Janitor
4 Sheela New York Lawyer