Spark as a Service with Azure Databricks

Apache Spark Core APIs
RDDs, DataFrame, Datasets
Spark SQL | GraphX (graph) | Structured Streaming | MLlib (machine learning)
Spark: The Definitive Guide

Source: http://spark.apache.org/

Structured Streaming | Advanced Analytics | Libraries & Ecosystem
Structured APIs: Datasets, DataFrames, SQL
Low Level APIs: RDDs, Distributed Variables

RDD → Transformations → RDD → Actions → Value

Transformations    Actions
select             show
distinct           count
groupBy            collect
sum                save
orderBy            first
filter
limit
summarize
… and much more
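
The split between transformations and actions can be illustrated with a small pure-Python toy. This is a sketch only, not the Spark API: `LazyDataset` and its methods are hypothetical names chosen to mirror a few entries in the table above. Transformations only record a step in a plan; nothing runs until an action such as `collect`, `count` or `first` is called.

```python
import itertools
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class LazyDataset:
    """Toy stand-in for a Spark dataset (hypothetical, not the Spark API)."""

    def __init__(self, source: Iterable[Any], plan: Tuple = ()):
        self._source = list(source)
        self._plan = plan  # recorded (kind, argument) steps, not yet executed

    # Transformations: return a new LazyDataset and compute nothing.
    def filter(self, predicate: Callable[[Any], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("filter", predicate),))

    def select(self, mapper: Callable[[Any], Any]) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("select", mapper),))

    def limit(self, n: int) -> "LazyDataset":
        return LazyDataset(self._source, self._plan + (("limit", n),))

    # Internal helper: chain the recorded steps into one iterator pipeline.
    def _run(self) -> Iterator[Any]:
        rows: Iterator[Any] = iter(self._source)
        for kind, arg in self._plan:
            if kind == "filter":
                rows = filter(arg, rows)
            elif kind == "select":
                rows = map(arg, rows)
            elif kind == "limit":
                rows = itertools.islice(rows, arg)
        return rows

    # Actions: execute the recorded plan and return a plain value.
    def collect(self) -> List[Any]:
        return list(self._run())

    def count(self) -> int:
        return sum(1 for _ in self._run())

    def first(self) -> Any:
        return next(self._run(), None)


calls: List[int] = []  # records every row the predicate actually sees


def is_even(x: int) -> bool:
    calls.append(x)
    return x % 2 == 0


# Building the pipeline runs no user code yet.
ds = LazyDataset(range(10)).filter(is_even).select(lambda x: x * x).limit(3)
calls_before_action = len(calls)

# The action triggers execution of the whole plan.
result = ds.collect()
print(calls_before_action, result)
```

In real Spark the recorded plan is also optimized and distributed before it runs; the toy keeps only the "record now, execute on action" behaviour.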

Driver: Spark Session, User code
Cluster Manager
Executor | Executor | Executor
Distributed Data Structure: 6 Partitions
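
The driver, executor and partition roles in this diagram can be mimicked with the Python standard library. The sketch below is an analogy under stated assumptions, not how Spark is implemented: threads stand in for executors, list slices stand in for partitions, and the function names (`split_into_partitions`, `executor_task`, `driver_sum`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def split_into_partitions(rows: Sequence[int], num_partitions: int) -> List[List[int]]:
    """Round-robin split of the data into num_partitions chunks."""
    return [list(rows[i::num_partitions]) for i in range(num_partitions)]


def executor_task(partition: List[int]) -> int:
    """Work done on one partition: here, a partial sum."""
    return sum(partition)


def driver_sum(rows: Sequence[int], num_partitions: int = 6, num_executors: int = 3) -> int:
    """Driver side: partition the data, hand partitions to workers, merge results."""
    partitions = split_into_partitions(list(rows), num_partitions)
    # Threads play the role of executors in this analogy.
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_sums = list(pool.map(executor_task, partitions))
    return sum(partial_sums)


total = driver_sum(range(1, 101))
print(total)
```

The defaults of six partitions and three workers follow the counts shown in the diagram. In a real cluster the cluster manager allocates the executors and the partitions live on different machines.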

Managed Apache Spark platform optimized for Azure
Microsoft Azure

AZURE DATABRICKS
Data sources: Cloud storage, Data warehouses, Hadoop storage, IoT / streaming data
Collaborative Workspace: DATA ENGINEER, DATA SCIENTIST, BUSINESS ANALYST
Deploy Production Jobs & Workflows: MULTI-STAGE PIPELINES, JOB SCHEDULER, NOTIFICATION & LOGS
Optimized Databricks Runtime Engine: DATABRICKS I/O, APACHE SPARK, SERVERLESS
Outputs: REST APIs, Machine learning models, BI tools, Data exports, Data warehouses
Enhance Productivity | Build on secure & trusted cloud | Scale without limits

AZURE DATABRICKS
Services shown: Cosmos DB, Kafka on HDInsight, Event Hubs, Power BI, SQL DW
ORCHESTRATION: Data Factory
STORAGE: Storage (Azure), Azure Data Lake
INGEST | VISUALIZE
SECURE: Azure Active Directory

https://movielens.org/
F. Maxwell Harper and Joseph A. Konstan. 2015.
The MovieLens Datasets: History and Context.
ACM Transactions on Interactive Intelligent
Systems (TiiS) 5, 4, Article 19 (December 2015), 19
pages. DOI=http://dx.doi.org/10.1145/2827872

https://github.com/devlace/azure-databricks-recommendation-system

Official Apache Spark website
Azure Databricks Documentation
[Book] Spark: The Definitive Guide
