     A  B  C
101  1  2  3
102  4  5  6
103  7  8  9
pandas is an open-source library and one of the most widely used
Python tools for data wrangling and analytics. It is fast, intuitive,
and reads and writes many data formats, such as CSV, Excel, JSON,
HTML, and SQL.
Creating DataFrames
Create a pandas DataFrame object by specifying the column names
and index. The commands on this sheet assume pandas is imported
as pd (import pandas as pd).
From dictionary:
df = pd.DataFrame( {"A" : [1, 4, 7], "B" : [2, 5, 8],
"C" : [3, 6, 9]}, index=[101, 102, 103])
From multi-dimensional list:
df = pd.DataFrame( [[1, 2, 3], [4, 5, 6],[7, 8, 9]],
index=[101, 102, 103], columns=['A', 'B', 'C'])
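As a quick check of the two constructors above, the sketch below builds the same table both ways and compares the results. The variable names df_from_dict and df_from_list are illustrative and not part of the original sheet.

```python
import pandas as pd

# Build the same 3x3 table from a dictionary of columns...
df_from_dict = pd.DataFrame(
    {"A": [1, 4, 7], "B": [2, 5, 8], "C": [3, 6, 9]},
    index=[101, 102, 103],
)

# ...and from a list of rows with explicit column names.
df_from_list = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=[101, 102, 103],
    columns=["A", "B", "C"],
)

# equals() compares shape, values, and dtypes.
same = df_from_dict.equals(df_from_list)
print(df_from_dict)
print(same)
```

The dictionary form is column-oriented and the list form is row-oriented, so pick whichever matches how the raw data arrives.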
Importing Data
Import data from a text file, Excel workbook, website, database, or
nested JSON file.
pd.read_csv(file_location) # import tabular CSV file
pd.read_table(file_location) # import delimited text file (tab-separated by default)
pd.read_excel(file_location) # import Excel file
# connect and extract the data from SQL database
pd.read_sql(query, connection_object)
# import from JSON file or URL; recent pandas versions deprecate
# passing a literal JSON string, so wrap one in io.StringIO
pd.read_json(json_string)
# extract tables from HTML file, URL
pd.read_html(url)
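The readers above accept file-like objects as well as paths, so they can be tried without any files on disk. The following is a minimal sketch under that assumption; the sample text and the names csv_text, json_text, df_csv, and df_json are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; a real call would pass a file path or URL.
csv_text = "id,A,B,C\n101,1,2,3\n102,4,5,6\n103,7,8,9\n"
df_csv = pd.read_csv(io.StringIO(csv_text), index_col="id")

# Hypothetical JSON content in "records" layout (one object per row).
json_text = '[{"A": 1, "B": 2}, {"A": 4, "B": 5}]'
df_json = pd.read_json(io.StringIO(json_text), orient="records")

print(df_csv)
print(df_json)
```

Wrapping the text in io.StringIO keeps the example self-contained and avoids passing a literal JSON string directly.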
Exporting Data
These commands export a dataframe to common file formats. You can
also export a dataframe to binary Feather, HDF5, a BigQuery table,
or a Pickle file.
df.to_csv(filename) # export CSV tabular file
df.to_excel(filename) # export Excel file
# write the dataframe to a table in a SQL database
df.to_sql(table_name, connection_object)
df.to_json(filename) # export JSON format file
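To see what the exporters produce, the sketch below writes to strings instead of files (to_csv and to_json return text when no path is given) and reads the CSV back. The small dataframe and the names csv_text, json_text, parsed, and restored are illustrative assumptions.

```python
import io
import json
import pandas as pd

df = pd.DataFrame({"A": [1, 4, 7], "B": [2, 5, 8]}, index=[101, 102, 103])

# With no path argument, the exporters return the text they would write.
csv_text = df.to_csv()
json_text = df.to_json(orient="split")

# Parse the JSON text to inspect its structure.
parsed = json.loads(json_text)

# Round trip: read the CSV text back, using the first column as the index.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(csv_text)
print(parsed["columns"])
print(restored.equals(df))
```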
Inspecting Data
Use these commands to understand the data and its distribution.
# view first n rows or use df.tail(n) for last n rows
df.head(n)
# show the n rows with the largest values in the 'value' column;
# use df.nsmallest(n, 'value') for the n smallest
df.nlargest(n, 'value')
df.sample(n=10) # randomly select 10 rows
df.shape # view number of rows and columns
# view the index, datatype and memory information
df.info()
# view statistical summary of numerical columns
df.describe()
# view unique values and counts of the city column
df.city.value_counts()
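The inspection commands can be tried on a small table. The sketch below uses a made-up dataframe with city and value columns; the data and variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "city": ["Lahore", "Karachi", "Lahore", "Multan", "Lahore"],
    "value": [30, 10, 50, 20, 40],
})

first_two = df.head(2)               # first two rows in stored order
top_two = df.nlargest(2, "value")    # two rows with the largest 'value'
n_rows, n_cols = df.shape            # shape is an attribute, not a method
city_counts = df["city"].value_counts()  # occurrences of each city
summary = df.describe()              # statistics for numeric columns

print(top_two)
print(city_counts)
print(summary)
```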
Subsetting
Select a single row or column, or multiple rows or columns, using
these commands.
df['sale'] # select a single column
df[['sale', 'profit']] # select multiple columns
df.iloc[10 : 20] # select rows at positions 10 to 19 (end excluded)
# select all rows with columns at position 2, 4, and 5
df.iloc[ : , [2, 4, 5]]
# select all rows with columns from sale to profit
df.loc[ : , 'sale' : 'profit']
# filter the dataframe using a logical condition and select the
# sale and profit columns
df.loc[df['sale'] > 10, ['sale', 'profit']]
df.iat[1, 2] # select a single value using positioning
df.at[4, 'sale'] # select single value using label
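The difference between position-based (iloc, iat) and label-based (loc, at) selection is easiest to see when the index labels do not start at 0. The sketch below assumes a small made-up dataframe whose index runs from 1 to 4; all names and values are illustrative.

```python
import pandas as pd

# Hypothetical data with index labels 1..4 (positions are 0..3).
df = pd.DataFrame(
    {"sale": [5, 15, 25, 35], "cost": [1, 2, 3, 4], "profit": [4, 13, 22, 31]},
    index=[1, 2, 3, 4],
)

by_position = df.iloc[1:3]   # positions 1 and 2; the end is excluded
by_label = df.loc[1:3]       # labels 1 through 3; the end is included

# Boolean filter on rows combined with a column selection.
filtered = df.loc[df["sale"] > 10, ["sale", "profit"]]

one_by_position = df.iat[1, 2]   # second row, third column
one_by_label = df.at[4, "sale"]  # row labeled 4, column 'sale'

print(by_position)
print(by_label)
print(filtered)
```

Note that the same slice bounds return two rows with iloc and three rows with loc, because label slices include the end point.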
Querying
Filter rows using logical conditions. query() evaluates a boolean
expression written as a string and returns the rows that satisfy it.
df.query('sale > 20') # filters rows using logical conditions
df.query('sale > 20 and profit < 30') # combining conditions
# string logical condition
df.query('company.str.startswith("ab")', engine="python")
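A query() call can be compared with the equivalent boolean mask. The sketch below assumes a small made-up dataframe with company, sale, and profit columns; the data and variable names are illustrative.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "company": ["abc", "xyz", "abd", "qrs"],
    "sale": [25, 30, 10, 40],
    "profit": [20, 35, 5, 25],
})

# String expression evaluated against the column names.
queried = df.query("sale > 20 and profit < 30")

# The same filter written as an explicit boolean mask.
masked = df[(df["sale"] > 20) & (df["profit"] < 30)]

# String methods inside query() use the python engine.
starts_ab = df.query('company.str.startswith("ab")', engine="python")

print(queried)
print(starts_ab)
```

Both forms return a filtered dataframe; query() is often easier to read when several conditions are combined.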
Reshaping Data
Change the layout, rename the column names, append rows and
columns, and sort values and index.
pd.melt(df) # unpivot columns into rows (wide to long)
# spread the values of the 'var' column into new columns (long to wide)
df.pivot(columns='var', values='val')
pd.concat([df1,df2], axis = 0) # appending rows
pd.concat([df1,df2], axis = 1) # appending columns
# sort values by sale column from high to low
df.sort_values('sale', ascending=False)
df.sort_index() # sort the index
df.reset_index() # move the index to columns
# rename a column using dictionary
df.rename(columns = {'sale':'sales'})
# removing sales and profit columns from dataframe
df.drop(columns=['sales', 'profit'])
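The reshaping commands can be chained on a small table to see how melt and pivot undo each other. This is a sketch on made-up data; the id column and all variable names are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per id, one column per measure.
wide = pd.DataFrame({"id": [1, 2], "sale": [10, 30], "profit": [3, 8]})

# Wide to long: one row per (id, measure) pair.
long = pd.melt(wide, id_vars="id", var_name="var", value_name="val")

# Long back to wide: values of 'var' become column names.
back = long.pivot(index="id", columns="var", values="val")

# Append rows, renumbering the index.
stacked = pd.concat([wide, wide], axis=0, ignore_index=True)

# Sort from high to low, rename a column, then drop columns.
sorted_desc = wide.sort_values("sale", ascending=False)
renamed = wide.rename(columns={"sale": "sales"})
trimmed = renamed.drop(columns=["sales", "profit"])

print(long)
print(back)
```

These methods return new dataframes and leave the original unchanged, so assign the result if you want to keep it.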
Abid Ali Awan, 2022


Python Programming.pptx
