What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial | Simplilearn
What’s in it for you?
1. History of Spark
2. What is Spark?
3. Hadoop vs Spark
4. Components of Apache Spark
5. Spark Architecture
6. Applications of Spark
7. Spark Use Case
History of Apache Spark
2009: Started as a project at UC Berkeley’s AMPLab
2010: Open sourced under a BSD license
2013: Donated to the Apache Software Foundation; Spark became an Apache top-level project in early 2014
2014: Used by Databricks to sort large-scale datasets and set a new world record
What is Apache Spark?
Apache Spark is an open-source data processing engine that processes data, in batch or in near real time, across clusters of computers using simple programming constructs
It supports various programming languages
Developers and data scientists incorporate Spark into their applications to rapidly query, analyze, and transform data at scale
Hadoop vs Spark
Hadoop: Processing data using MapReduce is slow, since intermediate results are written to disk
Spark: Processes data up to 100 times faster than MapReduce for some workloads, as it is done in memory
Hadoop: Performs batch processing of data
Spark: Performs both batch processing and real-time processing of data
Hadoop: MapReduce programs are typically written in Java, need more lines of code, and take longer to write
Spark: Needs fewer lines of code; Spark itself is implemented in Scala
Hadoop: Supports Kerberos authentication, which is difficult to manage
Spark: Supports authentication via a shared secret; it can also run on YARN, leveraging the capability of Kerberos
Spark Features
• Fast processing: Spark’s Resilient Distributed Datasets (RDDs) save time on read and write operations, so it can run roughly ten to a hundred times faster than Hadoop MapReduce on some workloads
• In-memory computing: In Spark, data is kept in RAM, so it can be accessed quickly, which speeds up analytics
• Flexible: Spark supports multiple languages and lets developers write applications in Java, Scala, R, or Python
• Fault tolerance: RDDs are designed to handle the failure of any worker node in the cluster, so data lost with a failed node can be rebuilt rather than lost for good
• Better analytics: Spark offers SQL queries, machine learning algorithms, and complex analytics, so richer analysis can be performed in one place
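One way to picture the fault tolerance described above: an RDD keeps a record of the steps used to build it (its lineage), so a partition that disappears with a failed worker can be recomputed from the source data. The sketch below is plain Python, not the Spark API; `build_partition`, `source`, and `lineage` are hypothetical names chosen only for this illustration.

```python
# Plain-Python illustration only; none of these names are Spark APIs.

def build_partition(source_rows, steps):
    """Rebuild one partition by replaying its recorded steps (its lineage)."""
    rows = list(source_rows)
    for step in steps:
        rows = [step(r) for r in rows]
    return rows

# Source data split into two partitions, plus the steps applied to each one.
source = {0: [1, 2, 3], 1: [4, 5, 6]}
lineage = [lambda x: x * 10, lambda x: x + 1]

# Materialize both partitions in (simulated) worker memory.
in_memory = {pid: build_partition(rows, lineage) for pid, rows in source.items()}

# Simulate a worker failure that loses partition 1.
lost = in_memory.pop(1)

# Recover by replaying the lineage on the source data for that partition only.
in_memory[1] = build_partition(source[1], lineage)
recovered_ok = in_memory[1] == lost
```

Only the lost partition is recomputed; partition 0 stays untouched in memory.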
Components of Apache Spark
• Spark Core
• Spark SQL
• Spark Streaming
• MLlib
• GraphX
Components of Spark – Spark Core
Spark Core is the base engine for large-scale parallel and distributed data processing
It is responsible for:
• memory management
• fault recovery
• scheduling, distributing, and monitoring jobs on a cluster
• interacting with storage systems
Resilient Distributed Dataset
Spark Core is built around RDDs (Resilient Distributed Datasets): immutable, fault-tolerant, distributed collections of objects that can be operated on in parallel
RDDs support two kinds of operations:
• Transformations: operations (such as map, filter, join, union) that are performed on an RDD and yield a new RDD containing the result
• Actions: operations (such as reduce, first, count) that return a value after running a computation on an RDD
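The split between transformations and actions can be sketched in a few lines of plain Python. `MiniRDD` below is a hypothetical toy class, not Spark's RDD; it assumes, as Spark does, that transformations are lazy and only describe a new dataset, while actions trigger the computation and return a value.

```python
from functools import reduce as _reduce

class MiniRDD:
    """Toy stand-in for an RDD: immutable, and transformations are lazy."""

    def __init__(self, compute):
        # compute is a zero-argument function that yields the elements
        self._compute = compute

    @classmethod
    def from_list(cls, items):
        data = tuple(items)  # copy, so later changes to items do not leak in
        return cls(lambda: iter(data))

    # Transformations: return a new MiniRDD and run nothing yet
    def map(self, fn):
        return MiniRDD(lambda: (fn(x) for x in self._compute()))

    def filter(self, pred):
        return MiniRDD(lambda: (x for x in self._compute() if pred(x)))

    # Actions: run the computation and return a value
    def count(self):
        return sum(1 for _ in self._compute())

    def first(self):
        return next(self._compute())

    def reduce(self, fn):
        return _reduce(fn, self._compute())

numbers = MiniRDD.from_list(range(1, 6))  # 1, 2, 3, 4, 5
evens_squared = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)

total = evens_squared.reduce(lambda a, b: a + b)  # action: 4 + 16
```

Nothing is computed when `filter` and `map` are chained; the work happens only when `reduce`, `count`, or `first` is called, and `numbers` itself is never modified.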
Components of Spark – Spark SQL
Spark SQL is the framework component used for structured and semi-structured data processing
Spark SQL Architecture (top layer to bottom layer):
• DataFrame DSL, Spark SQL and HQL
• DataFrame API
• Data Source API
• Data sources: CSV, JSON, JDBC
Components of Spark – Spark Streaming
Spark Streaming is a lightweight API that lets developers apply Spark’s batch processing to real-time data streams with ease
It provides secure, reliable, and fast processing of live data streams
The input data stream is divided into batches of input data, which the engine processes to produce batches of processed data
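The flow from an input stream to batches of processed data can be sketched in plain Python. This is an illustration under simplifying assumptions, not the Spark Streaming API: batches are cut by item count here, whereas Spark Streaming cuts them by time interval, and `micro_batches` and `process_batch` are hypothetical names.

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Group an incoming stream into small batches of at most batch_size items."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def process_batch(batch):
    """Hypothetical batch job: count events per key within one batch."""
    counts = {}
    for key in batch:
        counts[key] = counts.get(key, 0) + 1
    return counts

# Input data stream -> batches of input data -> batches of processed data
events = ["play", "pause", "play", "stop", "play"]
processed = [process_batch(b) for b in micro_batches(events, batch_size=2)]
```

Each batch is handled by the same kind of job that would process a static dataset, which is why one engine can serve both batch and streaming workloads.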
Components of Spark – Spark MLlib
MLlib is a low-level machine learning library that is simple to use, scalable, and compatible with various programming languages
MLlib eases the development and deployment of scalable machine learning algorithms
It contains implementations of various machine learning algorithms, including:
• Clustering
• Classification
• Collaborative filtering
Components of Spark – GraphX
GraphX is Spark’s own graph computation engine
It provides a uniform tool for ETL, exploratory data analysis, and interactive graph computations
Spark Architecture
Apache Spark uses a master-slave architecture that consists of a driver, which runs on a master node, and multiple executors, which run across the worker nodes in the cluster
Diagram: Master Node (Driver Program, SparkContext), Cluster Manager, Worker Nodes (each with an Executor that holds a Cache and runs Tasks)
• The master node hosts the driver program
• The Spark code behaves as a driver program and creates a SparkContext, which is a gateway to all Spark functionality
• Spark applications run as independent sets of processes on a cluster
• The driver program and SparkContext take care of job execution within the cluster
• A job is split into multiple tasks that are distributed over the worker nodes
• When an RDD is created in the SparkContext, it can be distributed across various nodes
• Worker nodes are slaves that run the different tasks
• The executor is responsible for the execution of these tasks
• Worker nodes run on resources allocated by the cluster manager, execute the tasks assigned to them, and return the results to the SparkContext
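The driver and executor roles above can be sketched with Python's standard library. This is a single-machine illustration, not Spark: a thread pool stands in for the executors on worker nodes, and `split_into_tasks`, `run_task`, and `run_job` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tasks(data, num_tasks):
    """Driver side: cut the input into roughly equal partitions, one per task."""
    size = max(1, -(-len(data) // num_tasks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]

def run_task(partition):
    """Executor side: compute a partial result for one partition."""
    return sum(x * x for x in partition)

def run_job(data, num_tasks=4, num_workers=2):
    tasks = split_into_tasks(data, num_tasks)
    # The thread pool stands in for executors running on worker nodes
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(run_task, tasks))
    # Partial results come back to the driver, which combines them
    return sum(partial_results)

result = run_job(list(range(10)))  # sum of squares of 0..9
```

The job is defined once on the driver side, while the number of tasks and workers can change without touching the task logic.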
Spark Cluster Managers
1. Standalone mode: By default, applications submitted to a standalone mode cluster run in FIFO order, and each application tries to use all available nodes
2. Apache Mesos: An open-source project to manage computer clusters; it can also run Hadoop applications
3. Apache Hadoop YARN: The cluster resource manager of Hadoop 2; Spark can be run on YARN
4. Kubernetes: An open-source system for automating deployment, scaling, and management of containerized applications
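The FIFO behaviour of standalone mode can be pictured with a simplified model in plain Python. It assumes applications run strictly one at a time and that each takes as many cores as it asks for, up to the cluster total; `run_fifo` and the application names are hypothetical, and real scheduling depends on configuration.

```python
from collections import deque

def run_fifo(applications, total_cores):
    """Run applications in submission order; each takes every core it can.

    applications: list of (name, cores_wanted) in submission order.
    Returns (name, cores_granted) in the order the applications ran.
    """
    queue = deque(applications)
    schedule = []
    while queue:
        name, cores_wanted = queue.popleft()  # oldest submission first
        granted = min(cores_wanted, total_cores)
        schedule.append((name, granted))
        # In this model cores are released only when the application
        # finishes, so the next one waits even if it needs very little.
    return schedule

submitted = [("etl", 16), ("report", 2), ("training", 8)]
schedule = run_fifo(submitted, total_cores=8)
```

The small "report" application has to wait behind "etl" here, which is the main drawback of FIFO ordering when one application occupies the whole cluster.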
Applications of Spark
• Banking: JPMorgan uses Spark to detect fraudulent transactions, analyze an individual’s business spending to suggest offers, and identify patterns to decide how much to invest and where to invest
• E-Commerce: Alibaba uses Spark to analyze large sets of data, such as real-time transaction details and browsing history, in the form of Spark jobs, and provides recommendations to its users
• Healthcare: IQVIA is a leading healthcare company that uses Spark to analyze patients’ data, identify possible health issues, and diagnose them based on medical history
• Entertainment: Entertainment and gaming companies like Netflix and Riot Games use Apache Spark to show relevant advertisements to their users based on the videos that they watch, share, and like
Spark Use Case
Conviva is one of the world’s leading video streaming analytics companies
Video streaming is a challenge, especially with increasing demand for high-quality streaming experiences
Conviva collects data about video streaming quality to give its customers visibility into the end-user experience they are delivering
Using Apache Spark, Conviva delivers a better quality of service to its customers by reducing screen buffering and learning in detail about network conditions in real time
This information is stored in the video player to manage live video traffic coming from 4 billion video feeds every month, to ensure maximum retention
Using Apache Spark, Conviva has created an auto diagnostics alert
• It automatically detects anomalies along the video streaming pipeline and diagnoses the root cause of the issue
• It reduces waiting time before the video starts
• It avoids buffering and recovers the video from a technical error
• The goal is to maximize viewer engagement
How To Become An AI And ML Engineer In 2025 | AI Engineer Roadmap | AI ML Car...
Simplilearn
 
What Is GitHub Copilot? | How To Use GitHub Copilot? | How does GitHub Copilo...
What Is GitHub Copilot? | How To Use GitHub Copilot? | How does GitHub Copilo...What Is GitHub Copilot? | How To Use GitHub Copilot? | How does GitHub Copilo...
What Is GitHub Copilot? | How To Use GitHub Copilot? | How does GitHub Copilo...
Simplilearn
 
Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...
Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...
Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...
Simplilearn
 
Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...
Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...
Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...
Simplilearn
 
Top 7 High Paying AI Certifications Courses For 2025 | Best AI Certifications...
Top 7 High Paying AI Certifications Courses For 2025 | Best AI Certifications...Top 7 High Paying AI Certifications Courses For 2025 | Best AI Certifications...
Top 7 High Paying AI Certifications Courses For 2025 | Best AI Certifications...
Simplilearn
 
Data Cleaning In Data Mining | Step by Step Data Cleaning Process | Data Clea...
Data Cleaning In Data Mining | Step by Step Data Cleaning Process | Data Clea...Data Cleaning In Data Mining | Step by Step Data Cleaning Process | Data Clea...
Data Cleaning In Data Mining | Step by Step Data Cleaning Process | Data Clea...
Simplilearn
 
Top 10 Data Analyst Projects For 2025 | Data Analyst Projects | Data Analysis...
Top 10 Data Analyst Projects For 2025 | Data Analyst Projects | Data Analysis...Top 10 Data Analyst Projects For 2025 | Data Analyst Projects | Data Analysis...
Top 10 Data Analyst Projects For 2025 | Data Analyst Projects | Data Analysis...
Simplilearn
 
AI Engineer Roadmap 2025 | AI Engineer Roadmap For Beginners | AI Engineer Ca...
AI Engineer Roadmap 2025 | AI Engineer Roadmap For Beginners | AI Engineer Ca...AI Engineer Roadmap 2025 | AI Engineer Roadmap For Beginners | AI Engineer Ca...
AI Engineer Roadmap 2025 | AI Engineer Roadmap For Beginners | AI Engineer Ca...
Simplilearn
 
Machine Learning Roadmap 2025 | Machine Learning Engineer Roadmap For Beginne...
Machine Learning Roadmap 2025 | Machine Learning Engineer Roadmap For Beginne...Machine Learning Roadmap 2025 | Machine Learning Engineer Roadmap For Beginne...
Machine Learning Roadmap 2025 | Machine Learning Engineer Roadmap For Beginne...
Simplilearn
 
Kotter's 8-Step Change Model Explained | Kotter's Change Management Model | S...
Kotter's 8-Step Change Model Explained | Kotter's Change Management Model | S...Kotter's 8-Step Change Model Explained | Kotter's Change Management Model | S...
Kotter's 8-Step Change Model Explained | Kotter's Change Management Model | S...
Simplilearn
 
Gen AI Engineer Roadmap For 2025 | How To Become Gen AI Engineer In 2025 | Si...
Gen AI Engineer Roadmap For 2025 | How To Become Gen AI Engineer In 2025 | Si...Gen AI Engineer Roadmap For 2025 | How To Become Gen AI Engineer In 2025 | Si...
Gen AI Engineer Roadmap For 2025 | How To Become Gen AI Engineer In 2025 | Si...
Simplilearn
 
Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...
Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...
Top 10 Data Analyst Certification For 2025 | Best Data Analyst Certification ...
Simplilearn
 
Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...
Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...
Complete Data Science Roadmap For 2025 | Data Scientist Roadmap For Beginners...
Simplilearn
 

What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial | Simplilearn

  • 2-8. What’s in it for you? 1. History of Spark 2. What is Spark? 3. Hadoop vs Spark 4. Components of Apache Spark (Spark Core, Spark SQL, Spark Streaming, Spark MLlib, GraphX) 5. Spark Architecture 6. Applications of Spark 7. Spark Use Case
  • 9-12. History of Apache Spark. 2009: started as a project at the UC Berkeley AMPLab. 2010: open sourced under a BSD license. 2013: donated to the Apache Software Foundation; Spark became an Apache top-level project in 2014. 2014: used by Databricks to sort large-scale datasets and set a new world record.
  • 13-16. What is Apache Spark? Apache Spark is an open-source data processing engine for processing data, in batch or near real time, across clusters of computers using simple programming constructs. It supports various programming languages. Developers and data scientists incorporate Spark into their applications to rapidly query, analyze, and transform data at scale.
  • 17-21. Hadoop vs Spark. Speed: processing data with MapReduce in Hadoop is comparatively slow because intermediate results are written to disk; Spark can process data up to 100 times faster than MapReduce when the work is done in memory. Processing model: Hadoop MapReduce performs batch processing of data; Spark performs both batch processing and real-time processing of data. Code: MapReduce jobs, typically written in Java, tend to need more lines of code; Spark programs need fewer, as Spark is implemented in Scala and exposes concise APIs. Security: Hadoop supports Kerberos authentication, which is difficult to manage; Spark supports authentication via a shared secret and can also run on YARN, leveraging Kerberos.
  • 22-27. Spark Features. Fast processing: Resilient Distributed Datasets (RDDs) save time on read and write operations, so Spark can run roughly ten to one hundred times faster than Hadoop MapReduce. In-memory computing: data is kept in RAM, so Spark can access it quickly and speed up analytics. Flexible: Spark supports multiple languages and lets developers write applications in Java, Scala, R, or Python. Fault tolerance: RDDs are designed to handle the failure of any worker node in the cluster; lost partitions can be recomputed from their lineage, which keeps data loss to a minimum. Better analytics: Spark offers SQL queries, machine learning algorithms, and other complex analytics, which together support richer analysis.
  • 28-33. Components of Apache Spark: Spark Core, Spark SQL, Spark Streaming, MLlib, GraphX
  • 34-36. Components of Spark: Spark Core. Spark Core is the base engine for large-scale parallel and distributed data processing. It is responsible for memory management; fault recovery; scheduling, distributing, and monitoring jobs on a cluster; and interacting with storage systems.
  • 37. Resilient Distributed Dataset. Spark Core is built around RDDs (Resilient Distributed Datasets): immutable, fault-tolerant, distributed collections of objects that can be operated on in parallel. RDDs support two kinds of operations. Transformations (such as map, filter, join, union) are performed on an RDD and yield a new RDD containing the result. Actions (such as reduce, first, count) return a value after running a computation on an RDD.
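
Slide 37's split between transformations and actions can be illustrated with a small sketch. This is a toy model in plain Python, not Spark's API: the class name ToyRDD and its internals are assumptions made for illustration. Transformations only record a step and return a new object; nothing is computed until an action is called.

```python
from functools import reduce


class ToyRDD:
    """Toy stand-in for an RDD: an immutable source plus a chain of lazy steps."""

    def __init__(self, source, steps=()):
        self._source = tuple(source)  # never mutated after creation
        self._steps = tuple(steps)    # recorded transformations (the lineage)

    # Transformations: return a new ToyRDD and compute nothing yet.
    def map(self, fn):
        return ToyRDD(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._source, self._steps + (("filter", pred),))

    def _compute(self):
        # Replay the recorded steps over the source, lazily.
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return data

    # Actions: run the recorded steps and return a plain value.
    def count(self):
        return sum(1 for _ in self._compute())

    def first(self):
        return next(self._compute())

    def reduce(self, fn):
        return reduce(fn, self._compute())

    def collect(self):
        return list(self._compute())


numbers = ToyRDD(range(1, 11))
# Two transformations chained; still no computation has happened.
even_squares = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
```

Calling even_squares.count() or even_squares.reduce(...) is what triggers the work. Because the source and the recorded steps are kept, the result can always be recomputed, which is the idea behind the fault tolerance described on the Spark Features slides.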
  • 38-40. Components of Spark: Spark SQL. The Spark SQL framework component is used for structured and semi-structured data processing. Spark SQL architecture layers: DataFrame DSL, Spark SQL and HQL; DataFrame API; Data Source API; data sources such as CSV, JSON, and JDBC.
  • 41-43. Components of Spark: Spark Streaming. Spark Streaming is a lightweight API that allows developers to perform batch processing and real-time streaming of data with ease. It provides secure, reliable, and fast processing of live data streams. Diagram: an input data stream is divided into batches of input data, which the streaming engine turns into batches of processed data.
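
The batch-oriented flow on the Spark Streaming slides (input stream, batches of input data, batches of processed data) can be sketched in plain Python. This is a simplified illustration, not Spark's API: the function names are hypothetical, and batches are cut by item count here, whereas Spark Streaming cuts them by time interval.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Cut a (possibly unbounded) iterator into lists of at most batch_size items."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def process(batch):
    """Hypothetical per-batch job: count events by type."""
    counts = {}
    for event in batch:
        counts[event] = counts.get(event, 0) + 1
    return counts


# Assumed sample input; in practice this would be a live source.
events = ["play", "pause", "play", "stop", "play", "play", "pause"]
processed = [process(b) for b in micro_batches(events, batch_size=3)]
```

Each element of processed corresponds to one batch of processed data, so the same batch-style code handles a stream one small slice at a time.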
  • 44-46. Components of Spark: Spark MLlib. MLlib is a low-level machine learning library that is simple to use, scalable, and compatible with various programming languages. It eases the development and deployment of scalable machine learning algorithms, and it includes implementations of algorithms for clustering, classification, and collaborative filtering.
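
As an illustration of the clustering category named on the MLlib slides, here is a minimal single-machine k-means sketch on one-dimensional points. It is not MLlib's implementation or API: the function name, the fixed starting centroids, and the sample data are assumptions chosen to keep the example deterministic.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D points with caller-supplied starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster mean (unchanged if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

A distributed library would run the assignment step in parallel over partitions of the data and combine the per-cluster sums; the sketch keeps only the algorithmic core.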
  • 47-49. Components of Spark: GraphX. GraphX is Spark’s own graph computation engine and data store. It provides a uniform tool for ETL, exploratory data analysis, and interactive graph computations.
  • 50-54. Spark Architecture. Apache Spark uses a master-slave architecture that consists of a driver, which runs on a master node, and multiple executors, which run across the worker nodes in the cluster. The master node has a driver program: the Spark code behaves as a driver program and creates a SparkContext, which is a gateway to all Spark functionality. Spark applications run as independent sets of processes on a cluster, and the driver program and SparkContext take care of job execution within the cluster. A job is split into multiple tasks that are distributed over the worker nodes; when an RDD is created in the SparkContext, it can be distributed across various nodes. The executor on each worker node is responsible for executing these tasks. Worker nodes run the tasks on resources allocated by the cluster manager and return the results to the SparkContext. Diagram: a master node (driver program with SparkContext) connects through a cluster manager to worker nodes, each running an executor with a cache and tasks.
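
The flow on the Spark Architecture slides (a job split into tasks, tasks run by executors, results returned to the driver) can be sketched on a single machine. This is an analogy under stated assumptions, not how Spark is implemented: a thread pool stands in for the executors, and the function names and the sum-of-squares task are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Driver side: slice the dataset into roughly equal partitions."""
    size = max(1, -(-len(data) // num_partitions))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def task(partition):
    """Executor side: the work done on one partition (a partial sum of squares)."""
    return sum(x * x for x in partition)


def run_job(data, num_workers=2, num_partitions=4):
    partitions = split_into_partitions(data, num_partitions)
    # The thread pool plays the role of the executors on worker nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as executors:
        partial_results = list(executors.map(task, partitions))
    # Partial results come back to the driver, which combines them.
    return partitions, sum(partial_results)


partitions, total = run_job(list(range(1, 9)))
```

With eight numbers and four partitions, each task handles two values, and the driver adds the four partial sums into the final result.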
  • 55-58. Spark Cluster Managers. 1. Standalone mode: by default, applications submitted to a standalone mode cluster run in FIFO order, and each application tries to use all available nodes. 2. Apache Mesos: an open-source project to manage computer clusters, which can also run Hadoop applications. 3. Apache YARN: the cluster resource manager of Hadoop 2; Spark can run on YARN. 4. Kubernetes: an open-source system for automating deployment, scaling, and management of containerized applications.
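
The standalone-mode default described on the Spark Cluster Managers slides (FIFO order, each application trying to use every node) can be modelled with a short simulation. The function, the application names, and the "core-seconds" unit are hypothetical; the sketch only shows the queueing behaviour, not the real scheduler.

```python
from collections import deque


def run_fifo(apps, total_cores):
    """Run queued apps one at a time in submission order, each taking every core.

    apps: list of (name, core_seconds_of_work) pairs, a hypothetical workload unit.
    Returns a list of (name, start_time, end_time) tuples.
    """
    queue = deque(apps)
    clock = 0.0
    timeline = []
    while queue:
        name, work = queue.popleft()   # first in, first out
        duration = work / total_cores  # the app occupies all cores while it runs
        timeline.append((name, clock, clock + duration))
        clock += duration
    return timeline


timeline = run_fifo([("etl", 40), ("report", 8), ("adhoc", 4)], total_cores=4)
```

In this model the short "adhoc" application cannot start until both earlier applications finish, which is the practical consequence of the FIFO default.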
  • 59-63. Applications of Spark. Banking: JPMorgan uses Spark to detect fraudulent transactions, analyze an individual’s business spending to suggest offers, and identify patterns to decide how much and where to invest. E-Commerce: Alibaba runs Spark jobs to analyze large sets of data, such as real-time transaction details and browsing history, and provides recommendations to its users. Healthcare: IQVIA is a leading healthcare company that uses Spark to analyze patients’ data, identify possible health issues, and diagnose them based on medical history. Entertainment: entertainment and gaming companies like Netflix and Riot Games use Apache Spark to show relevant advertisements to their users based on the videos that they watch, share, and like.
  • 64-67. Spark Use Case. Conviva is one of the world’s leading video streaming companies. Video streaming is a challenge, especially with increasing demand for high-quality streaming experiences. Conviva collects data about video streaming quality to give its customers visibility into the end-user experience they are delivering.
  • 68-69. Spark Use Case. Using Apache Spark, Conviva delivers a better quality of service to its customers by removing screen buffering and learning in detail about network conditions in real time. This information is stored in the video player to manage live video traffic coming from 4 billion video feeds every month, to ensure maximum retention.
  • 70-74. Spark Use Case. Using Apache Spark, Conviva has created an auto diagnostics alert. It automatically detects anomalies along the video streaming pipeline and diagnoses the root cause of the issue, reduces waiting time before the video starts, avoids buffering, and recovers the video from a technical error. The goal is to maximize viewer engagement.

Editor's Notes