
25+ Useful Pandas Snippets in 2025

Last Updated : 29 Jan, 2025

Pandas is an open-source Python library for data manipulation and analysis, initially created by Wes McKinney in 2008. Its two primary data structures are the Series and the DataFrame. It simplifies many cleaning and transformation tasks, making it well suited to structured data from sources such as CSV files and SQL databases.


In this article, we will walk through a collection of useful Pandas snippets.

What is Pandas?

Pandas is a powerful open-source Python library used for data analysis and manipulation. Its core data structures are the efficient, flexible one-dimensional Series and the two-dimensional DataFrame, which are well suited to structured data such as spreadsheets, SQL tables, and time series. Pandas simplifies tasks ranging from cleaning and filtering to transformation and analysis.

It can import and export data in multiple formats, including CSV, Excel, JSON, and SQL, and provides built-in methods for handling missing values, reshaping datasets, and performing complex operations such as grouping and aggregation. By integrating with libraries like NumPy, Matplotlib, and Scikit-learn, Pandas is a workflow-enabling productivity booster that is indispensable in data science, machine learning, and analytics for professionals and newcomers alike.
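As a quick illustration of these two core structures, here is a minimal sketch that builds a Series and a DataFrame from plain Python data (the column names and values are made up for the example):

Python
import pandas as pd

# One-dimensional labelled data: a Series.
s = pd.Series([10, 20, 30], name='sales')

# Two-dimensional labelled data: a DataFrame.
df = pd.DataFrame({'city': ['Delhi', 'Mumbai', 'Chennai'], 'sales': [10, 20, 30]})
print(s)
print(df)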

25+ Useful Pandas Snippets

Importing Pandas

1. Importing pandas

To use the Pandas library in Python, you first need to install it from a command prompt or terminal with `pip install pandas`. After installation, import the library into your script:

Python
import pandas as pd

Creating DataFrames

2. Creating DataFrames

DataFrames in Pandas can be created in a variety of ways, which makes structuring data very flexible. A dictionary can be used, where keys are column names and values are lists of data. You can also use the `zip()` function to combine several lists into tuples and build a DataFrame from them, as sketched after the example below.

Python
data = {'Column1': [1, 2], 'Column2': [3, 4]}
df = pd.DataFrame(data)
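And here is a minimal sketch of the `zip()` approach mentioned above; the list names and column labels are illustrative:

Python
names = ['Alice', 'Bob']
scores = [85, 90]

# zip() pairs the lists element-wise; each resulting tuple becomes one row.
df_from_zip = pd.DataFrame(list(zip(names, scores)), columns=['Name', 'Score'])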

Reading CSV files

To read a CSV file, use the `read_csv()` function, which loads the data into a DataFrame, a powerful structure for data manipulation and analysis. Make sure you have imported Pandas first with `import pandas as pd`. The function accepts many parameters, such as `sep` to define the delimiter, `header` to specify the row number to use for column names, and `usecols` to select specific columns; a sketch of these follows the basic example below.

Python
df = pd.read_csv('file.csv')
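A sketch of the parameters mentioned above; the file name and column names are assumptions for illustration:

Python
df = pd.read_csv(
    'file.csv',
    sep=',',                          # Delimiter between fields.
    header=0,                         # Use the first row for column names.
    usecols=['Column1', 'Column2'],   # Load only these (assumed) columns.
)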

3. Filtering Data Frames

Python
filtered_df = df[df['Column1'] > 1]      # Keep only rows where Column1 is greater than 1.

4. Parse Dates on Read

Python
df = pd.read_csv('file.csv', parse_dates=['date_column'])      # Parse date_column as datetime while reading.

5. Specify Data Types

Python
df = pd.read_csv('file.csv', dtype={'column_name': 'int32'})      # Force column_name to be read as 32-bit integers.

6. Set Index

Python
df.set_index('index_column', inplace=True)      # Use index_column as the DataFrame index (modified in place).

7. No. of Rows to Read

Python
df = pd.read_csv('file.csv', nrows=100)      # Read only the first 100 rows.

8. Skip Rows

Python
df = pd.read_csv('file.csv', skiprows=5)      # Skip the first 5 rows of the file.

9. Specify NA Values

Python
df = pd.read_csv('file.csv', na_values=['NA', 'NULL'])      # Treat 'NA' and 'NULL' strings as missing values.

10. Setting Boolean Values

Python
df['bool_column'] = df['bool_column'].astype(bool)      # Convert the column to boolean values.

11. Read From Multiple Files

Python
import glob
files = glob.glob("data/*.csv")                                        # All CSV files in the data folder.
df = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)     # Read and stack them into one DataFrame.

12. Copy and Paste into Data Frames

Python
# Copy tabular data (e.g. from a spreadsheet) to the clipboard, then:
df = pd.read_clipboard()

13. Read Tables from Pdf Files

Python
import camelot          # Third-party library, installed separately (e.g. pip install camelot-py).
tables = camelot.read_pdf('file.pdf')
df = tables[0].df       # Get the first table as a DataFrame

Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) is the principal first step of data analysis; it aims to summarize and visualize datasets to detect patterns, relationships, and anomalies. EDA involves examining a dataset's structure, detecting missing values, and using graphical methods such as histograms and scatter plots to understand distributions (a plotting sketch follows the cheat sheet below).

14. EDA Cheat Sheet

Python
# Common EDA functions:
print(df.describe())
print(df.info())
print(df.isnull().sum())
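For the graphical methods mentioned above, a minimal sketch using pandas' built-in plotting (which relies on Matplotlib) might look like this; the column names are assumptions:

Python
import matplotlib.pyplot as plt

df.hist(column='Column1')                    # Histogram of one column's distribution.
df.plot.scatter(x='Column1', y='Column2')    # Scatter plot of two columns.
plt.show()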

Data Types (dtypes)

In Python, the values a variable can hold are classified into data types. The principal built-in types are the numeric types `int`, `float`, and `complex`; the text type `str`; the sequence types `list` and `tuple`; and the Boolean type `bool`. There are also binary types and `NoneType` (the absence of a value). Knowing these data types is crucial for working effectively with Pandas.
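Within a DataFrame these values are stored as pandas/NumPy dtypes such as int64, float64, bool, and object; a quick way to inspect them (the column name is assumed) is:

Python
print(df.dtypes)               # Dtype of every column.
print(df['Column1'].dtype)     # Dtype of a single column.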

15. Filter column by Dtype

Python
numeric_df = df.select_dtypes(include=['number'])      # Keep only numeric columns.

16. Infer Dtype Automatically

Python
df = pd.read_csv('file.csv')      # Dtypes are inferred automatically on read.

17. Downcasting Data Types

Python
df['int_column'] = pd.to_numeric(df['int_column'], downcast='integer')      # Downcast to the smallest integer type that fits.

18. Manual Conversion of Dtypes

Python
df['column'] = df['column'].astype('float64')      # Explicitly cast the column to float64.

19. Convert All Dtype at Once

Python
df = df.astype({'col1': 'int32', 'col2': 'float64'})      # Convert several columns in a single call.

Column Operations

Column operations in Pandas are central to manipulating data. Arithmetic can be performed directly between columns (for example, `df['A'] + df['B']`), and new columns can be created from existing ones (for example, `df['C'] = df['A'] * 2`); a small runnable sketch follows.
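Here is that sketch, using a throwaway DataFrame with illustrative columns A and B:

Python
demo = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

demo['C'] = demo['A'] * 2         # New column derived from an existing one.
sums = demo['A'] + demo['B']      # Element-wise arithmetic between columns.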

20. Renaming Column

Python
df.rename(columns={'old_name': 'new_name'}, inplace=True)

21. Add Suffix and Prefix to Column Names

Python
df.columns = df.columns + '_suffix'    # Add suffix to all column names.
# or 
df.rename(columns=lambda x: 'prefix_' + x, inplace=True)    # Add prefix.

22. Create New Columns

Python
df['new_column'] = df['col1'] + df['col2']      # Element-wise sum of two existing columns.

23. Insert Columns at specific Positions

Python
df.insert(loc=0, column='new_col', value=[1, 2, 3])      # Insert new_col at position 0.

24. If-Then-Else Logic Using np.where()

Python
import numpy as np
df['new_col'] = np.where(df['col1'] > 0, 'Positive', 'Negative')      # Label each row based on a condition.

25. Dropping Columns or Rows

Python
df.drop(columns=['unwanted_column'], inplace=True)     # Drop column.
df.drop(index=0, inplace=True)                         # Drop row by index.

Missing Values

Handling missing values in Pandas is essential for reliable data analysis. Missing values appear as `NaN` or `None` and can be detected with `isnull()` and `notnull()`. Rows or columns containing missing data can be dropped with `dropna()`, values can be filled with `fillna()`, and forward fill and backward fill can be used for flexible imputation (see the sketch after snippet 27 below). Proper handling of missing values is important for sound analysis and model performance.

26. Checking for Missing Values

Python
missing_values_count = df.isnull().sum()      # Number of missing values in each column.

27. Dealing with Missing Values

Python
df.fillna(0, inplace=True)       # Fill missing values with zero.
# or 
df.dropna(inplace=True)          # Drop rows with any missing values.
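The forward fill and backward fill imputation mentioned earlier can be sketched as follows; the column name is an assumption:

Python
df['col'] = df['col'].ffill()    # Propagate the last valid value forward.
df['col'] = df['col'].bfill()    # Fill any remaining gaps from the next valid value.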


Conclusion

In summary, Pandas in 2025 remains an essential tool for cleaning, merging, and analyzing datasets. This set of more than 25 useful Pandas snippets will save time and increase efficiency, letting data professionals focus on problem solving. Whether you are a beginner or an advanced user, knowing these techniques puts you in a strong position to handle real-world data challenges in the dynamic landscape of data.

