Vegas: The Missing Matplotlib for Scala/Spark
DB Tsai
Roger Menezes
Homepage Kids Page Downloads Page
Netflix Recommendations
Every aspect of the experience is machine learned
2017
> 100M members
> 190 countries
Multiple Devices
Genres: 23 rows/page average
Sims: 10 rows/page average
Row types: My List, Continue Watching, Popular on Netflix, Trending Now, Watch It Again, Top Picks, Because You Watched, Genres, New Releases, Recently Added, Originals Row, Billboard
Machine Learning at Netflix
● Optimize for the experimentation use case rather than productionization
● Experimentation
○ Opportunity sizing, Data Exploration
○ Feature Identification and Selection
○ Tweaks to ML algorithms
○ Model Evaluation
Experimenter’s loop
Problem → Explore Data → Identify Features → Produce Model → Evaluate Model → Share Findings
Notebooks
● Optimal for Experimentation
● Sharing reproducible research
○ Facilitates feedback loop with Product Managers
● End-to-end ML experiments
○ Interactivity drives productivity
Python Notebooks
● Seamless Experience - ML experimentation
● Well known Scientific computing libraries
● Huge catalog of Visualization plotting libraries
○ Matplotlib, Seaborn, Bokeh, BQPlot, Lightning, etc.
Scala Notebooks
● Zeppelin, Jupyter, Databricks, Spark-Notebooks, ...
● The gap in computing libraries is closing
● Lack of Visualization Libraries
○ Main friction point in adoption
○ End to End ML use case not convincing
Introducing Vegas
● Visualization Library in Scala
● Mainly built for the notebook use case
● Scala wrapper around Vega-Lite
○ The missing Matplotlib for the Scala/Spark world
DECLARATIVE STATISTICAL VISUALIZATION GRAMMAR IN SCALA
You tell it WHAT should be done with the data, and it knows HOW to do it!
Operations such as filtering, aggregation, and faceting are built into the visualization, rather than putting the burden on the user to massage the data into shape.
Complex visualizations can be built with a few high-level abstractions: DATA, TRANSFORMS, SCALES, GUIDES, MARKS
cf: Altair talk by Brian Granger at PyData 2016, https://youtu.be/v5mrwq7yJc4
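To make the declarative idea concrete, here is a minimal sketch in plain Scala. It is not the Vegas API: `Channel`, `Spec`, and `DeclarativeSketch` are hypothetical names invented for this illustration. The spec only says what to draw (a bar per age, with population summed), and the aggregation is requested inside the encoding instead of being computed beforehand. The emitted JSON follows the general shape of a Vega-Lite unit spec.

```scala
// Hypothetical mini-DSL for illustration only (not the Vegas API).
// Field names are assumed to contain no characters that need JSON escaping.
final case class Channel(field: String, dataType: String, aggregate: Option[String] = None) {
  def toJson: String = {
    val base = Seq("\"field\":\"" + field + "\"", "\"type\":\"" + dataType + "\"")
    val agg  = aggregate.map(a => "\"aggregate\":\"" + a + "\"").toSeq
    (base ++ agg).mkString("{", ",", "}")
  }
}

final case class Spec(mark: String, x: Channel, y: Channel) {
  def toJson: String =
    "{\"mark\":\"" + mark + "\",\"encoding\":{\"x\":" + x.toJson + ",\"y\":" + y.toJson + "}}"
}

object DeclarativeSketch {
  // Total population per age: the sum is part of the spec,
  // so the caller never groups or sums the rows by hand.
  val spec: Spec = Spec(
    mark = "bar",
    x = Channel("age", "ordinal"),
    y = Channel("population", "quantitative", aggregate = Some("sum"))
  )

  def main(args: Array[String]): Unit = println(spec.toJson)
}
```

Changing the aggregate or the mark is a one-word edit to the spec; no data-preparation code has to change.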
Added bonus of declarative visualizations: INTERACTIVITY!
D3JS
VEGAS
VEGAS CODE EXPANDS OUT TO D3JS CODE!
Anatomy of a plot: Channels
X/Y channel
Shape Channel
Size Channel
Color Channel
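The channel idea can be sketched without any plotting library: each channel binds one data column to one visual property of a mark. The code below is an illustration under assumptions, not Vegas code. `Mark`, `ChannelSketch`, the palette, and the sample columns `sex` and `weight` are all made up for the example. A shape channel would work the same way as the color channel shown here.

```scala
// Hypothetical types for illustration; not part of Vegas or Vega-Lite.
final case class Mark(x: Double, y: Double, color: String, size: Double)

object ChannelSketch {
  type Row = Map[String, Any]

  // A fixed palette, cycled if there are more categories than colors.
  private val palette = Vector("steelblue", "orange", "green")

  /** Bind four columns to the x, y, color, and size channels. */
  def encode(rows: Seq[Row], x: String, y: String, color: String, size: String): Seq[Mark] = {
    // Ordinal color scale: each distinct category gets the next palette entry.
    val categories = rows.map(r => r(color).toString).distinct
    val colorOf = categories.zipWithIndex.map { case (c, i) => c -> palette(i % palette.size) }.toMap
    rows.map { r =>
      Mark(num(r(x)), num(r(y)), colorOf(r(color).toString), num(r(size)))
    }
  }

  // Convert a numeric cell to Double; other values are parsed from their string form.
  private def num(v: Any): Double = v match {
    case n: Int    => n.toDouble
    case n: Long   => n.toDouble
    case n: Double => n
    case other     => other.toString.toDouble
  }

  // Made-up sample rows.
  val rows: Seq[Row] = Seq(
    Map("age" -> 30, "population" -> 120, "sex" -> "F", "weight" -> 2),
    Map("age" -> 40, "population" -> 90, "sex" -> "M", "weight" -> 3)
  )

  def main(args: Array[String]): Unit =
    encode(rows, x = "age", y = "population", color = "sex", size = "weight").foreach(println)
}
```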
Features…
1. Supports most plot types
2. Trellis plots
3. Layers (Layer 1, Layer 2, Layer 3)
4. Notebooks and consoles
5. Built-in Spark support
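import vegas._
import vegas.sparkExt._              // adds withDataFrame

Vegas("Population by age")           // chart title is illustrative
  .withDataFrame(myDataFrame)        // pass in the Spark DataFrame
  .encodeX("population", Quant)      // DataFrame columns are mapped to channels
  .encodeY("age", Quant)
  .mark(Point)                       // mark type assumed; the slide omits it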
6. Visual statistics
● Advanced Binning
● Sorting
● Scaling
● Custom Transforms
● Time Series
● Aggregation
● Filtering
● Math functions (log, etc)
● Descriptive Statistics
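As a rough picture of what a binned count does on the user's behalf, the sketch below computes a histogram by hand in plain Scala. It assumes fixed-width bins with a caller-supplied `step`; Vega-Lite picks bin boundaries itself, so this is an approximation of the idea rather than its actual algorithm. `BinningSketch` and the sample ages are hypothetical. In a Vega-Lite spec the equivalent request is `"bin": true` on one channel and `"aggregate": "count"` on the other.

```scala
object BinningSketch {
  /** Count values per fixed-width bin; keys are the lower bin edges. */
  def binCounts(values: Seq[Double], step: Double): Map[Double, Int] = {
    require(step > 0, "step must be positive")
    values
      .groupBy(v => math.floor(v / step) * step) // lower edge of the bin holding v
      .map { case (lower, vs) => lower -> vs.size }
  }

  def main(args: Array[String]): Unit = {
    val ages = Seq(3.0, 7.0, 12.0, 18.0, 25.0, 29.0) // made-up sample data
    // Print bins in ascending order of their lower edge.
    binCounts(ages, step = 10.0).toSeq.sortBy(_._1).foreach { case (lower, n) =>
      println("bin starting at " + lower + ": " + n)
    }
  }
}
```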
How It Works
1. Specify in Scala
2. Embed HTML (iFrame)
3. Render within iFrame using JS
Abstraction stack, most abstract first:
● VEGAS: a Scala DSL for Vega-Lite. The Scala DSL emits type-checked Vega-Lite JSON.
● VEGA-LITE: converts internally to a Vega JSON spec.
● VEGA: translates JSON to D3JS code.
● D3JS: code that can be very verbose.
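The embed-and-render steps can be sketched as two small string-building functions. This is not how Vegas itself is implemented; it is a sketch under assumptions. The names `EmbedSketch`, `page`, and `iframe` are hypothetical, the page relies on the publicly documented `vegaEmbed(selector, spec)` call from the vega-embed project, and the CDN versions are placeholders that a real notebook integration would pin deliberately.

```scala
object EmbedSketch {
  /** Wrap a Vega-Lite JSON spec in a standalone HTML page.
    * Assumes the spec does not contain the text "</script>". */
  def page(specJson: String): String =
    Seq(
      "<!DOCTYPE html>",
      "<html><head>",
      "<script src=\"https://cdn.jsdelivr.net/npm/vega@5\"></script>",
      "<script src=\"https://cdn.jsdelivr.net/npm/vega-lite@5\"></script>",
      "<script src=\"https://cdn.jsdelivr.net/npm/vega-embed@6\"></script>",
      "</head><body>",
      "<div id=\"vis\"></div>",
      "<script>vegaEmbed('#vis', " + specJson + ");</script>",
      "</body></html>"
    ).mkString("\n")

  /** Put the page inside an iframe via srcdoc, escaping characters
    * that would otherwise end the attribute value early. */
  def iframe(specJson: String, width: Int = 600, height: Int = 400): String = {
    val escaped = page(specJson).replace("&", "&amp;").replace("\"", "&quot;")
    "<iframe width=\"" + width + "\" height=\"" + height + "\" srcdoc=\"" + escaped + "\"></iframe>"
  }

  def main(args: Array[String]): Unit =
    println(iframe("{\"mark\":\"bar\"}"))
}
```

A notebook front end would display the returned iframe markup as HTML output; the browser then loads the scripts and renders the chart inside the frame.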
What’s coming
1. Interactive selections
2. Selection transforms
Contributors
Aish, DB, Roger, Sudeep, Jeremy
Thank you.
@NetflixResearch
@rogermenezes @dbtsai
The missing Matplotlib for Scala/Spark
http://vegas-viz.org