Python | Pandas DataFrame.set_index()

Last Updated : 11 Jul, 2025

The Pandas set_index() method sets one or more columns of a DataFrame as its index. A meaningful index, built from identifiers such as employee names, IDs or dates, makes label-based retrieval, alignment and merging more direct than working with the default integer index.

Let's see a basic example:

Here we use an Employee Dataset stored in the file employees.csv. Let's first load the dataset to see how to use set_index().

Python

import pandas as pd

data = pd.read_csv("/content/employees.csv")
print("Employee Dataset:")
display(data.head(5))

Output:

Now we use DataFrame.set_index() to set a single column as the index.

Python

data.set_index("First Name", inplace=True)
print("\nEmployee Dataset with 'First Name' as Index:")
display(data.head(5))

Output:

We set the "First Name" column as the index, so rows can now be selected by the employee's first name using label-based lookups such as data.loc[...].
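As a rough illustration of that label-based access, the sketch below builds a small stand-in DataFrame instead of reading employees.csv. The rows and the Salary column are invented for the example; only the "First Name" and "Gender" column names come from the article.

```python
import pandas as pd

# Hypothetical stand-in for the employee data (values are made up).
employees = pd.DataFrame({
    "First Name": ["Douglas", "Thomas", "Maria"],
    "Gender": ["Male", "Male", "Female"],
    "Salary": [97308, 61933, 130590],
})

indexed = employees.set_index("First Name")

# A single label returns that row as a Series.
maria = indexed.loc["Maria"]
print(maria["Salary"])

# A list of labels returns a DataFrame in the requested order.
subset = indexed.loc[["Thomas", "Douglas"]]
print(subset)
```

If the first names were not unique, indexed.loc["Maria"] would return every matching row as a DataFrame rather than a single Series.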

Syntax:

DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)

Parameters:

keys: A single column name, a list of column names, or array-like values (such as a Series or Index) to set as the index.
drop: Boolean (default: True). If True, the columns used as the index are removed from the DataFrame. If False, they are also retained as regular columns.
append: Boolean (default: False). If True, the new levels are added to the existing index instead of replacing it, creating a multi-level index.
inplace: Boolean (default: False). If True, modifies the original DataFrame and returns None instead of a new DataFrame.
verify_integrity: Boolean (default: False). If True, checks the new index for duplicate values and raises a ValueError if any are found.
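The effect of verify_integrity is easiest to see with a column that contains a repeated value. The sketch below uses a small made-up DataFrame (the names and ages are illustrative only).

```python
import pandas as pd

# Made-up data: "Asha" appears twice in the Name column.
df = pd.DataFrame({"Name": ["Asha", "Ben", "Asha"], "Age": [25, 30, 35]})

# Default behaviour: duplicate index values are accepted silently.
loose = df.set_index("Name")
print(loose.index.is_unique)

# With verify_integrity=True, the duplicates raise a ValueError.
try:
    df.set_index("Name", verify_integrity=True)
    raised = False
except ValueError as exc:
    raised = True
    print("Rejected:", exc)
```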

Return: A new DataFrame with the specified index. If inplace=True, the original DataFrame is modified directly and None is returned.

Now let's see some practical examples to better understand how to use the Pandas set_index() function.

1. Setting Multiple Columns as Index (MultiIndex)

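In this example, we add both First Name and Gender to the index by passing a list of columns to set_index(). Because append=True is used, the existing integer index is kept as the first level, so the result has three index levels; omitting append would replace the existing index with just the two columns. Because drop=False is used, First Name and Gender also remain as regular columns. A MultiIndex is useful when we want to organize data by multiple columns.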

Python

import pandas as pd
data = pd.read_csv("employees.csv")

data.set_index(["First Name", "Gender"], inplace=True, append=True, drop=False)
data.head()

Output:

Figure: Set Multiple Columns as MultiIndex
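To make the role of append concrete, here is a minimal sketch on a small stand-in DataFrame. The rows and the Team column are assumptions made for illustration; they are not taken from employees.csv.

```python
import pandas as pd

# Hypothetical stand-in for the employee data (values are made up).
employees = pd.DataFrame({
    "First Name": ["Douglas", "Thomas", "Maria"],
    "Gender": ["Male", "Male", "Female"],
    "Team": ["Marketing", "Finance", "Finance"],
})

# Without append, the two columns replace the existing index.
two_level = employees.set_index(["First Name", "Gender"])
print(two_level.index.nlevels)

# With append=True, the original integer index is kept as the first level.
three_level = employees.set_index(["First Name", "Gender"], append=True)
print(three_level.index.nlevels)
print(list(three_level.index.names))

# Rows of a MultiIndex are addressed with a tuple, one entry per level.
row = two_level.loc[("Maria", "Female")]
print(row["Team"])
```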

2. Setting a Float Column as Index

In some cases, we may want to use a numeric or float column as the index, for example in datasets where rows are naturally looked up or ordered by a score. Note that set_index() does not require the values to be unique unless verify_integrity=True is passed. Here, we set Agg_Marks (a float column) as the index for a DataFrame containing student data.

Python

import pandas as pd

students = [['jack', 34, 'Sydney', 'Australia', 85.96],
            ['Riti', 30, 'Delhi', 'India', 95.20],
            ['Vansh', 31, 'Delhi', 'India', 85.25],
            ['Nanyu', 32, 'Tokyo', 'Japan', 74.21],
            ['Maychan', 16, 'New York', 'US', 99.63],
            ['Mike', 17, 'Las Vegas', 'US', 47.28]]

df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Country', 'Agg_Marks'])

df.set_index('Agg_Marks', inplace=True)
display(df)

Output:

Figure: Setting a Float Column as Index
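One practical use of a float index is selecting rows by value range. The sketch below reuses a few of the student rows from the example; the 80 to 90 band is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["jack", "Riti", "Vansh", "Nanyu"],
    "Agg_Marks": [85.96, 95.20, 85.25, 74.21],
})

# Sort the float index so that label slices behave as value ranges.
by_marks = df.set_index("Agg_Marks").sort_index()

# .loc slicing is label-based, so this selects marks between 80 and 90,
# even though neither boundary value appears in the index.
mid_band = by_marks.loc[80.0:90.0]
print(mid_band)
```

Exact lookups such as by_marks.loc[95.20] also work, but they depend on exact floating-point equality, so range slices tend to be the safer pattern for float indexes.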

3. Setting Index of Specific Column (with drop=False)

By default, set_index() removes the column used as the index. However, if we want to keep the column after it’s set as the index, we can use the drop=False parameter.

Python

import pandas as pd

data = pd.read_csv("/content/employees.csv")

data.set_index("First Name", drop=False, inplace=True)

print(data.head())

Output:

Using drop=False ensures that the "First Name" column is retained even after it is set as the index.
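The difference between the two settings can be checked by comparing the remaining columns. This is a small sketch on made-up data rather than the employee file.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [25, 30]})

dropped = df.set_index("Name")           # default drop=True
kept = df.set_index("Name", drop=False)  # column is also retained

print(list(dropped.columns))
print(list(kept.columns))

# With drop=False, the values are available both as index labels
# and as an ordinary column.
print(kept.loc["Ben", "Age"])
print(kept[kept["Name"] == "Ben"])
```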

4. Setting Index Using inplace=True

When we want to modify the original DataFrame directly rather than creating a new DataFrame, we can use inplace=True.

Python

import pandas as pd

data = {'Name': ['Geek1', 'Geek2', 'Geek3'],
        'Age': [25, 30, 35],
        'City': ['New York', 'San Francisco', 'Los Angeles']}

df = pd.DataFrame(data)

df.set_index('Name', inplace=True)
display(df)

Output:

Figure: Setting Index Using inplace=True
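For comparison, the sketch below contrasts inplace=True with the alternative of assigning the returned DataFrame to a new name. It uses a shortened version of the data above; the variable names indexed and result are chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Geek1", "Geek2"], "Age": [25, 30]})

# Without inplace, set_index() returns a new DataFrame
# and leaves the original untouched.
indexed = df.set_index("Name")
print(df.index.name)       # None
print(indexed.index.name)  # Name

# With inplace=True, the original is modified and the call returns None.
result = df.set_index("Name", inplace=True)
print(result)              # None
print(df.index.name)       # Name
```

Because the in-place call returns None, writing df = df.set_index("Name", inplace=True) would replace df with None, which is a common mistake.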

With set_index(), we can turn meaningful columns into row labels, which makes the data simpler to look up, align and analyze.

Kartikaybhutani
