Visvesvaraya Technological University
"Jnana Sangama", Belagavi, Karnataka
India
SECAB Institute of Engineering & Technology, Vijayapur
Department Of Master Of Computer Applications 2022-2023
A Seminar On
Web Scraping And Numerical Analysis
By
Mohammad Azeem Maniyar 2SA22MC013
Course Co-Ordinator
Prof. Nazeera Madabhavi
Web Scraping
Web scraping in Python is a technique used to extract data from websites. It's a valuable skill in data analytics because it lets you collect large amounts of data from the web for analysis.
Several Python libraries are commonly used for web scraping. Some of the most popular are:
- Beautiful Soup
- lxml
- Requests
- Scrapy
- Selenium
- html5lib
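As a rough illustration of what these libraries do, the sketch below uses Beautiful Soup to pull structured records out of a small HTML snippet. The HTML, the `listing` and `price` class names, and the `extract_listings` helper are all made up for this example; in practice the page would first be downloaded, for instance with Requests.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (not from the slides).
HTML = """
<html><body>
  <h1>Listings</h1>
  <ul>
    <li class="listing"><a href="/p/35237">Flat A</a> <span class="price">7020000</span></li>
    <li class="listing"><a href="/p/32238">Flat B</a> <span class="price">10000000</span></li>
  </ul>
</body></html>
"""


def extract_listings(html):
    """Return one dict per <li class="listing"> element in the page."""
    # "html.parser" is the parser backend from the Python standard library.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all("li", class_="listing"):
        link = item.find("a")
        price = item.find("span", class_="price")
        rows.append({
            "name": link.get_text(strip=True),
            "url": link["href"],
            "price": int(price.get_text(strip=True)),
        })
    return rows


listings = extract_listings(HTML)
print(listings)
```

The same pattern (parse, locate elements, pull out text and attributes) applies whichever parser backend is chosen.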
Parsing XML with lxml.objectify
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <room>
    <n35237 type="number">1.0</n35237>
    <n32238 type="number">3.0</n32238>
    <n44699 type="number">nan</n44699>
  </room>
  <price>
    <n35237 type="number">7020000.0</n35237>
    <n32238 type="number">10000000.0</n32238>
    <n44699 type="number">4128000.0</n44699>
  </price>
  <property_id>
    <n35237 type="number">35237.0</n35237>
    <n32238 type="number">32238.0</n32238>
    <n44699 type="number">44699.0</n44699>
  </property_id>
</root>
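A minimal sketch of how lxml.objectify exposes a document like the one above: child elements can be reached as attributes named after their tags, and `pyval` gives the value as a Python object. The XML here is a shortened inline copy, and the variable names are chosen only for illustration.

```python
from lxml import objectify

# Shortened copy of the XML above, passed as bytes because lxml rejects
# str input that carries an encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<root>
  <room>
    <n35237 type="number">1.0</n35237>
    <n32238 type="number">3.0</n32238>
  </room>
  <price>
    <n35237 type="number">7020000.0</n35237>
    <n32238 type="number">10000000.0</n32238>
  </price>
</root>
"""

root = objectify.fromstring(XML)

# Child elements are reachable as attributes named after their tags;
# .pyval returns the element's value as a plain Python object.
rooms_35237 = root.room.n35237.pyval
price_32238 = root.price.n32238.pyval

# iterchildren() walks the direct children of an element in document order.
top_level_tags = [child.tag for child in root.iterchildren()]

print(rooms_35237, price_32238, top_level_tags)
```

Note that iterating directly over an objectified element visits siblings with the same tag rather than its children, which is why `iterchildren()` is used here.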
Program
from lxml import objectify
import pandas as pd

# Parse the XML file
xml_data = objectify.parse('properties.xml')
root = xml_data.getroot()  # Root element

# Extract data and column names
data = []
cols = []
for child in root.iterchildren():
    # Each child of <root> (room, price, property_id) becomes one column
    data.append([subchild.text for subchild in child.iterchildren()])
    cols.append(child.tag)

# Create the DataFrame and transpose it so that each tag is a column
df = pd.DataFrame(data).T

# Set column names
df.columns = cols

# Print the DataFrame
print(df)
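Because the program reads each cell through `.text`, every value in the resulting DataFrame is a string. The sketch below is one possible follow-up, not part of the original slides: it builds the same table from an inline copy of the XML (with the `type` attributes left out for brevity) and converts the columns to numbers with `pd.to_numeric`. The `xml_to_frame` helper is a name made up for this example.

```python
from lxml import objectify
import pandas as pd

# Inline copy of the sample data so the example runs without properties.xml.
XML = b"""<root>
  <room><n35237>1.0</n35237><n32238>3.0</n32238><n44699>nan</n44699></room>
  <price><n35237>7020000.0</n35237><n32238>10000000.0</n32238><n44699>4128000.0</n44699></price>
  <property_id><n35237>35237.0</n35237><n32238>32238.0</n32238><n44699>44699.0</n44699></property_id>
</root>"""


def xml_to_frame(xml_bytes):
    """Build a numeric DataFrame with one column per child of <root>."""
    root = objectify.fromstring(xml_bytes)
    columns = {
        child.tag: [cell.text for cell in child.iterchildren()]
        for child in root.iterchildren()
    }
    frame = pd.DataFrame(columns)
    # errors="coerce" turns any cell that cannot be parsed as a number into NaN.
    return frame.apply(pd.to_numeric, errors="coerce")


df = xml_to_frame(XML)
print(df.dtypes)
print(df)
```

With numeric columns, the missing `room` value shows up as NaN and can be handled with the usual pandas tools such as `isna()` or `fillna()`.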
Output
Python Seminar of Data analytics using python