How to Join Pandas DataFrames using Merge?
Last Updated :
05 May, 2022
Joining and merging DataFrames is a core step in data analysis and machine learning work. Every data analyst or data scientist should master it, because in most cases data comes from multiple sources and files. In this tutorial, you'll learn how to join DataFrames in pandas using merge(). More specifically, we will practice the inner, outer, left and right joins that merge() supports.
Getting Started
Merging is one of the most widely used DataFrame operations. Two DataFrames might hold different kinds of information about the same entity and share one or more columns, so we combine them on those shared columns to work with a single table. To join DataFrames, pandas provides several functions, such as join(), concat() and merge(). In this section, you will practice using the merge() function of pandas. Its how parameter selects the join type, and the four most common ones are:
- inner join
- outer join
- right join
- left join
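As a quick orientation, the four join types listed above can be compared by the number of rows each one returns. This is a minimal sketch, assuming two small illustrative dataframes that share a 'Sr.no' column; the name row_counts is only for illustration.

```python
import pandas as pd

# Illustrative data: keys '2' and '4' appear in both dataframes
left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John', 'Kshitu', 'Bresha']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F']})

# Number of rows returned by each join type on the shared 'Sr.no' column
row_counts = {how: len(pd.merge(left, right, how=how, on='Sr.no'))
              for how in ['inner', 'outer', 'left', 'right']}
print(row_counts)
```

The inner join returns the fewest rows and the outer join the most, because the first keeps only shared keys and the second keeps every key from either side.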
Inner join
As the name suggests, an inner join keeps only the rows whose “on” value exists in both the left and right dataframes.
Now let us create two dataframes and merge them with an inner join.
Python3
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John',
                              'Kshitu', 'Bresha'],
                     'Roll No': ['1', '2', '3', '4', '5']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F'],
                      'Interest': ['Writing', 'Cricket', 'Dancing',
                                   'Chess', 'Sleeping']})

# Merging the dataframes
pd.merge(left, right, how='inner', on='Sr.no')
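Output: two rows, for Sr.no 2 and 4, the only keys present in both dataframes. Each row carries Name and Roll No from the left dataframe and Gender and Interest from the right dataframe.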
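One way to convince yourself of what the inner join keeps is to compare its keys with a plain set intersection of the two key columns. The sketch below assumes trimmed-down versions of the two dataframes; the names inner and common_keys are illustrative.

```python
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John', 'Kshitu', 'Bresha']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F']})

inner = pd.merge(left, right, how='inner', on='Sr.no')

# Keys that occur in both dataframes
common_keys = set(left['Sr.no']) & set(right['Sr.no'])

print(sorted(common_keys))           # ['2', '4']
print(sorted(inner['Sr.no']))        # ['2', '4']
```

Both lines print the same keys, which is exactly the "exists in both" rule described above.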
Outer join
An outer join returns all the rows from the left dataframe and all the rows from the right dataframe, matching rows up where possible and filling the gaps with NaN. If every key appears in both dataframes, the result is the same as an inner join.
Python3
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John',
                              'Kshitu', 'Bresha'],
                     'Roll No': ['1', '2', '3', '4', '5']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F'],
                      'Interest': ['Writing', 'Cricket', 'Dancing',
                                   'Chess', 'Sleeping']})

# Merging the dataframes
pd.merge(left, right, how='outer', on='Sr.no')
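Output: eight rows, one for each Sr.no from 1 to 8. The rows for Sr.no 1, 3 and 5 have NaN in Gender and Interest, and the rows for Sr.no 6, 7 and 8 have NaN in Name and Roll No.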
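To see where each row of an outer join came from, merge() accepts an indicator parameter that adds a '_merge' column. The following is a small sketch on trimmed-down versions of the two dataframes; the name source_counts is illustrative.

```python
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John', 'Kshitu', 'Bresha']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F']})

# indicator=True adds a '_merge' column holding
# 'left_only', 'right_only' or 'both' for every row
outer = pd.merge(left, right, how='outer', on='Sr.no', indicator=True)

# Count how many rows fall into each category
source_counts = {label: int((outer['_merge'] == label).sum())
                 for label in ['left_only', 'right_only', 'both']}
print(source_counts)
```

Rows marked 'both' are the ones an inner join would return; the other two categories are the rows that get NaN on one side.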
Left join
With a left join, all the records from the first (left) dataframe are kept, whether or not their keys can be found in the second dataframe. From the second (right) dataframe, only the records whose keys also appear in the first dataframe are kept.
Python3
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John',
                              'Kshitu', 'Bresha'],
                     'Roll No': ['1', '2', '3', '4', '5']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F'],
                      'Interest': ['Writing', 'Cricket', 'Dancing',
                                   'Chess', 'Sleeping']})

# Merging the dataframes
pd.merge(left, right, how='left', on='Sr.no')
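Output: five rows, one for each Sr.no in the left dataframe (1 to 5).
Note the output carefully: the rows for Sr.no 1, 3 and 5 have NaN in Gender and Interest, because those keys are missing from the right dataframe.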
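The left rows that found no partner can be picked out by looking for NaN in a column that comes from the right dataframe. This is a minimal sketch on trimmed-down versions of the two dataframes; the names result and unmatched are illustrative.

```python
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John', 'Kshitu', 'Bresha']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F']})

result = pd.merge(left, right, how='left', on='Sr.no')

# 'Gender' comes from the right dataframe, so NaN there means
# the key was not found on the right side
unmatched = result[result['Gender'].isna()]
print(sorted(unmatched['Sr.no']))    # ['1', '3', '5']
```

This check assumes the right dataframe has no missing values of its own in that column; otherwise a NaN could also come from the original data.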
Right join
For a right join, all the records from the second (right) dataframe are kept. From the first (left) dataframe, only the records whose keys also appear in the second dataframe are kept.
Python3
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John',
                              'Kshitu', 'Bresha'],
                     'Roll No': ['1', '2', '3', '4', '5']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F'],
                      'Interest': ['Writing', 'Cricket', 'Dancing',
                                   'Chess', 'Sleeping']})

# Merging the dataframes
pd.merge(left, right, how='right', on='Sr.no')
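Output: five rows, for Sr.no 2, 4, 6, 7 and 8. The rows for Sr.no 6, 7 and 8 have NaN in Name and Roll No, because those keys are missing from the left dataframe.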
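A right join can be read as a left join with the two dataframes swapped. The sketch below checks this on trimmed-down versions of the two dataframes; the helper normalise is hypothetical and only aligns column order and row order before comparing.

```python
import pandas as pd

left = pd.DataFrame({'Sr.no': ['1', '2', '3', '4', '5'],
                     'Name': ['Rashmi', 'Arun', 'John', 'Kshitu', 'Bresha']})
right = pd.DataFrame({'Sr.no': ['2', '4', '6', '7', '8'],
                      'Gender': ['F', 'M', 'M', 'F', 'F']})

right_join = pd.merge(left, right, how='right', on='Sr.no')
swapped_left = pd.merge(right, left, how='left', on='Sr.no')

def normalise(df, columns):
    # Hypothetical helper: same column order, rows sorted by key, fresh index
    return df[columns].sort_values('Sr.no').reset_index(drop=True)

columns = list(right_join.columns)

# Raises AssertionError if the two results differ
pd.testing.assert_frame_equal(normalise(right_join, columns),
                              normalise(swapped_left, columns))
print('right join matches left join with the dataframes swapped')
```

The only difference between the two results is the column order, which follows the order in which the dataframes are passed to merge().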