Data Frames
A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with
rows and columns.
import pandas as pd
data = {
    "Marks": [80, 75, 90],
    "Sub": ['Python', 'Java', 'Database']
}
# Load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Locate Row
As you can see from the result above, the DataFrame is like a table with rows and
columns.
Pandas uses the loc attribute to return one or more specified rows.
import pandas as pd
data = {
"Marks": [80, 75, 90],
"Sub": ['Python', 'Java', 'Database']
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
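# A single label returns that row as a Series:
print(df.loc[0])
# A list of labels returns those rows as a DataFrame:
# print(df.loc[[0, 1]])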
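The two loc calls shown above return different types, which a short sketch can make visible. The variable names single_row and two_rows are illustrative only and do not come from the slides.

```python
import pandas as pd

data = {
    "Marks": [80, 75, 90],
    "Sub": ["Python", "Java", "Database"],
}
df = pd.DataFrame(data)

# A single label returns one row as a Series, indexed by column name.
single_row = df.loc[0]
print(type(single_row).__name__)

# A list of labels returns a DataFrame containing those rows.
two_rows = df.loc[[0, 1]]
print(type(two_rows).__name__)
print(two_rows)
```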
Named Index
import pandas as pd
data = {
"Marks": [80, 75, 90],
"Sub": ['Python', 'Java', 'Database']
}
#load data into a DataFrame object:
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)
Locate Named Indexes
Use the named index in the loc attribute to return the specified row(s).
Example
Return "day2":
#refer to the named index:
print(df.loc["day2"])
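Beyond a single label, loc also accepts a list of labels, a label slice, or a row and column pair. The following is a minimal sketch using the same data; the names subset, span and day2_marks are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Marks": [80, 75, 90], "Sub": ["Python", "Java", "Database"]},
    index=["day1", "day2", "day3"],
)

# A list of labels selects several rows at once.
subset = df.loc[["day1", "day3"]]
print(subset)

# A label slice includes both endpoints, unlike ordinary Python slices.
span = df.loc["day1":"day2"]
print(span)

# A row label plus a column label picks a single cell.
day2_marks = df.loc["day2", "Marks"]
print(day2_marks)
```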
Load Files Into a DataFrame
If your data sets are stored in a file, Pandas can load them into a DataFrame.
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
import pandas as pd
# Check the maximum number of rows pandas prints before it truncates the output:
print(pd.options.display.max_rows)
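To see what the display.max_rows setting does, the sketch below compares the printed form of a 100-row DataFrame under two values of the option. It uses pd.option_context so the change is temporary; the sample frame and the variable names are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"n": range(100)})

# With more rows than display.max_rows, the printed form is truncated.
with pd.option_context("display.max_rows", 10):
    short_text = repr(df)

# Setting the option to None removes the row limit for printing.
with pd.option_context("display.max_rows", None):
    full_text = repr(df)

print(len(short_text.splitlines()), len(full_text.splitlines()))
```

Using option_context rather than assigning to pd.options.display.max_rows keeps the change local to the with block.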
Read JSON
Large data sets are often stored or exchanged as JSON.
JSON is plain text in the format of an object, and it is widely supported across
programming tools, including Pandas.
In our examples we will be using a JSON file called 'data.json'.
Use to_string() to print the entire DataFrame.
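As a minimal sketch of reading JSON and printing it in full: the JSON text below is a shortened, made-up stand-in for 'data.json', passed through io.StringIO so that no file is needed. With a real file, the path would be given to read_json instead.

```python
from io import StringIO

import pandas as pd

# Stand-in for the contents of 'data.json' (two columns only, to keep it short).
json_text = (
    '{"Duration": {"0": 60, "1": 60, "2": 45},'
    ' "Pulse": {"0": 110, "1": 117, "2": 109}}'
)

# read_json accepts a path or a file-like object.
df = pd.read_json(StringIO(json_text))

# to_string() renders every row and column as text, without truncation.
print(df.to_string())
```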
import pandas as pd
data = {
"Duration":{
"0":60,
"1":60,
"2":60,
"3":45,
"4":45,
"5":60
},
"Pulse":{
"0":110,
"1":117,
"2":103,
"3":109,
"4":117,
"5":102
},
"Maxpulse":{
"0":130,
"1":145,
"2":135,
"3":175,
"4":148,
"5":127
},
"Calories":{
"0":409,
"1":479,
"2":340,
"3":282,
"4":406,
"5":300
}
}
# A dictionary with the same structure as the JSON data can be loaded directly:
df = pd.DataFrame(data)
print(df)
Viewing the Data
• One of the most used methods for getting a quick overview of a
DataFrame is the head() method.
• The head() method returns the headers and a specified number of rows,
starting from the top.
import pandas as pd
df = pd.read_csv('data.csv')
# Print the first 10 rows of the DataFrame:
print(df.head(10))
# Print the first 5 rows of the DataFrame (the default when no number is given):
print(df.head())
• There is also a tail() method for viewing the last rows of the
DataFrame.
• The tail() method returns the headers and a specified number of
rows, starting from the bottom.
• Example
• Print the last 5 rows of the DataFrame:
print(df.tail())
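The head() and tail() calls above depend on a data.csv file. The sketch below shows the same methods on a small in-memory DataFrame so it can run anywhere; the column name n and the variable names are assumptions for illustration.

```python
import pandas as pd

# A small in-memory frame stands in for data.csv.
df = pd.DataFrame({"n": range(1, 21)})

first_three = df.head(3)   # the first three rows
last_three = df.tail(3)    # the last three rows
default_head = df.head()   # with no argument, five rows are returned

print(first_three)
print(last_three)
print(len(default_head))
```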
import pandas as pd
# Make a DataFrame from a CSV file, using the "Name" column as the index
data = pd.read_csv("nba.csv", index_col="Name")
# Retrieve a row by integer position with iloc; positions start at 0, so 3 is the fourth row
row = data.iloc[3]
print(row)
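Since iloc selects by integer position while loc selects by index label, the two can name the same row in different ways. The sketch below uses a small made-up table in place of nba.csv; the names and ages are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv, indexed by a "Name" column.
data = pd.DataFrame(
    {"Name": ["A", "B", "C", "D"], "Age": [25, 31, 22, 28]}
).set_index("Name")

# iloc selects by integer position: position 3 is the fourth row.
by_position = data.iloc[3]

# loc selects by index label; here the label "D" names the same row.
by_label = data.loc["D"]

print(by_position)
print(by_position.equals(by_label))
```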
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named scores to avoid shadowing the built-in dict)
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, np.nan, 45, 56],
          'Third Score': [52, 40, 80, 98],
          'Fourth Score': [np.nan, np.nan, np.nan, 65]}
# creating a dataframe from dictionary
df = pd.DataFrame(scores)
print(df)
• Dropping missing values using dropna():
• To drop null values from a DataFrame, use the dropna() function. It drops rows or columns that contain null
values, and its parameters control which ones are dropped.
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named scores to avoid shadowing the built-in dict)
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, np.nan, 45, 56],
          'Third Score': [52, 40, 80, 98],
          'Fourth Score': [np.nan, np.nan, np.nan, 65]}
# creating a dataframe from dictionary
df = pd.DataFrame(scores)
print(df)
• Now we drop rows with at least one NaN (null) value.
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named scores to avoid shadowing the built-in dict)
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, np.nan, 45, 56],
          'Third Score': [52, 40, 80, 98],
          'Fourth Score': [np.nan, np.nan, np.nan, 65]}
# creating a dataframe from dictionary
df = pd.DataFrame(scores)
# using dropna() function; it returns a new DataFrame and leaves df unchanged
print(df.dropna())
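The slides mention that dropna() can drop rows or columns in different ways. As a sketch of what that can look like on the same data, the calls below use the how, axis and thresh parameters; the result variable names are illustrative.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({
    "First Score": [100, 90, np.nan, 95],
    "Second Score": [30, np.nan, 45, 56],
    "Third Score": [52, 40, 80, 98],
    "Fourth Score": [np.nan, np.nan, np.nan, 65],
})

# Default (how="any"): drop rows that contain at least one NaN.
rows_any = scores.dropna()

# how="all": drop a row only if every value in it is NaN.
rows_all = scores.dropna(how="all")

# axis=1: drop columns that contain at least one NaN.
cols_any = scores.dropna(axis=1)

# thresh=3: keep only rows with at least three non-NaN values.
rows_thresh = scores.dropna(thresh=3)

print(rows_any)
print(cols_any)
print(rows_thresh)
```

Each call returns a new DataFrame, so the original scores frame is left untouched.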
• Iterating over rows and columns
• Iteration is a general term for taking each item of something, one
after another. A Pandas DataFrame consists of rows and columns, and
iterating over a DataFrame directly yields its column labels, much like
iterating over a dictionary yields its keys.
• Iterating over rows:
• To iterate over rows, use iterrows() or itertuples(). The related
items() method (called iteritems() before it was removed in pandas 2.0)
iterates over columns rather than rows.
# importing pandas as pd
import pandas as pd
# dictionary of lists (named students to avoid shadowing the built-in dict)
students = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
            'degree': ["MBA", "BCA", "M.Tech", "MBA"],
            'score': [90, 40, 80, 98]}
# creating a dataframe from a dictionary
df = pd.DataFrame(students)
print(df)
Now we apply the iterrows() function to get each row in turn, as an index label and a Series of that row's values.
# importing pandas as pd
import pandas as pd
# dictionary of lists (named students to avoid shadowing the built-in dict)
students = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
            'degree': ["MBA", "BCA", "M.Tech", "MBA"],
            'score': [90, 40, 80, 98]}
# creating a dataframe from a dictionary
df = pd.DataFrame(students)
# iterating over rows using iterrows(); i is the index label, j is the row as a Series
for i, j in df.iterrows():
    print(i, j)
    print()
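Besides iterrows(), the slides name itertuples() for rows, and columns can be walked with items(). The following is a minimal sketch on the same data; the names high_scorers and column_lengths are illustrative, not from the slides.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["aparna", "pankaj", "sudhir", "Geeku"],
    "degree": ["MBA", "BCA", "M.Tech", "MBA"],
    "score": [90, 40, 80, 98],
})

# itertuples() yields one namedtuple per row; fields are read by attribute.
high_scorers = [row.name for row in df.itertuples(index=False) if row.score >= 80]
print(high_scorers)

# items() iterates over columns, yielding (column label, Series) pairs.
column_lengths = {label: len(column) for label, column in df.items()}
print(column_lengths)
```

For simple per-row reads, itertuples() is often the lighter choice because it does not build a Series for every row.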
Data Frame Data structure in Python pandas.pptx
