DISTRIBUTED DEEP
LEARNING WITH KERAS AND
TENSORFLOW ON APACHE
SPARK: YES, YOU CAN!
GUGLIELMO IOZZIA
MSD
MADRID, NOVEMBER 21ST 2019
#guglielmoiozzia
ABOUT ME
Currently at
Previously at
I got some awards lately
Author
I love cooking
DataOps Champion
MSD IRELAND
50+ years
Approx. 2,000 employees
$2.5 billion investment to date
Approx. 50% of MSD’s top 20 products manufactured here
Export to 60+ countries
€6.1 billion turnover in 2017
2017 + 300 jobs & €280m investment
MSD Biotech, Dublin, coming in 2021
https://www.msd-ireland.com/
THE DUBLIN TECH HUB
CORE TOPICS
• Deep Learning: What is it?
• Keras and TensorFlow: 2 of the most popular frameworks for DL
• Why Distributed Deep Learning on Spark? Why is it so difficult?
• DL in Python on the JVM: Why and How?
DEEP LEARNING
It is a subset of Machine
Learning which is based on
Multilayer Neural Networks
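To make "multilayer" concrete, here is a minimal sketch of a forward pass through a two-layer network in plain Python. The weights, layer sizes and function names are made-up illustrative values, not taken from the talk.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ x + b), using plain lists."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x):
    # Hidden layer: 2 inputs -> 2 units (illustrative weights).
    hidden = dense(x, [[1.0, -1.0], [0.5, 0.5]], [0.0, -0.25], relu)
    # Output layer: 2 hidden units -> 1 unit squashed into (0, 1).
    return dense(hidden, [[1.0, -2.0]], [0.0], sigmoid)[0]
```

Stacking more `dense` calls between input and output is what makes the network deeper; training adjusts the weights instead of hard-coding them.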
DEEP LEARNING
http://www.asimovinstitute.org/wp-content/uploads/2019/04/NeuralNetworkZoo20042019.png
DL FRAMEWORKS POPULARITY
TENSORFLOW
It is an end-to-end open source
platform for ML. It has a
comprehensive, flexible
ecosystem of tools, libraries and
community resources for
researchers and developers.
https://www.tensorflow.org/
KERAS
Keras is a high-level neural
networks API, written in Python
and capable of running on top of
TensorFlow, CNTK, or Theano.
It allows for easy prototyping
and runs seamlessly on CPUs
and GPUs.
https://keras.io/
KERAS & TENSORFLOW
Starting from TensorFlow r1.14
APACHE SPARK
Speed
It achieves high performance for
both batch and streaming data,
using a state-of-the-art DAG
scheduler, a query optimizer, and a
physical execution engine.
Ease of Use
It offers over 80 high-level
operators that make it easy to build
parallel apps. And you can use it
interactively from the Scala,
Python, R, and SQL shells.
Generality
Combine SQL, streaming, and
complex analytics.
Runs Everywhere
It runs on Hadoop, Apache
Mesos, Kubernetes,
standalone, or in the cloud. It
can access diverse data
sources.
WHEN WOULD YOU NEED TO TRAIN
MNNS IN SPARK?
• Availability of a cluster of machines for training
• Scarcity of GPUs
• Very large networks
• Huge data sets
By the way, DL4J isn’t for Spark only: you can use it on a single machine
with multiple GPUs or multiple physical processors.
CHALLENGES OF TRAINING MNNS
IN SPARK
• Different execution models between Spark and the DL frameworks
• GPU configuration and management
• Performance
• Accuracy
WHY DISTRIBUTED DL ON THE JVM?
DEEPLEARNING4J
It is an Open Source,
distributed, Deep Learning
framework written for JVM
languages.
It is integrated with
Hadoop and Apache
Spark.
It can be used on
distributed GPUs and
CPUs.
WHY DISTRIBUTED DL ON THE JVM?
TensorFlow
DL4J MODULES
• DataVec
• Arbiter
• NN
• Datasets
• RL4J
• DL4J-Spark
• Model Import
• ND4J
ND4J is an Open Source linear algebra
and matrix manipulation library which
supports n-dimensional arrays and
is integrated with Apache Hadoop
and Spark.
DL4J + APACHE SPARK
• DL4J provides a high-level API to design, configure, train and evaluate
MNNs.
• Spark's performance is excellent, in particular for ETL/streaming, but
in terms of computation, in an MNN training context, some data
transformation/aggregation needs to be done using a low-level
language.
• DL4J uses ND4J, which is a C++ library that provides a high-level Scala
API to developers.
MODEL IMPORT IN DL4J
Keras: Train the Model -> Save it as .h5 -> Load Model and Weights (KerasModelImport) -> Load New Data -> Predict
TensorFlow: Train the Model -> Save it as .pb -> Load Model and Weights (TFGraphMapper) -> Load New Data -> Predict
Transfer Learning
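The first two steps of the Keras path (train the model, save it as .h5) can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.x with its bundled `tf.keras` is installed; the data is random, and the layer sizes and the `train_and_save` name are hypothetical.

```python
def train_and_save(path="model.h5"):
    """Train a tiny Keras Sequential model on random data and save it as HDF5."""
    # Imported inside the function so the file loads even without TensorFlow.
    import numpy as np
    import tensorflow as tf

    # Synthetic data: 64 samples, 4 features, 3 classes (illustrative only).
    rng = np.random.default_rng(0)
    features = rng.random((64, 4)).astype("float32")
    labels = rng.integers(0, 3, size=64)

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    model.fit(features, labels, epochs=1, batch_size=16, verbose=0)

    # The .h5 extension selects the HDF5 format: architecture and weights in one file.
    model.save(path)
    return path
```

The resulting file is the kind of artifact the KerasModelImport step on the JVM side is meant to load.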
KERAS MODEL IMPORT: SUPPORTED
FEATURES
• Layers
• Losses
• Activations
• Initializers
• Regularizers
• Constraints
• Metrics
• Optimizers
MODEL IMPORT IN DL4J: EXAMPLE
Keras: Train the Model -> Save it as .h5 -> Load Model and Weights -> Load New Data -> Predict
Import the VGG16 Model.
Test it.
DL4J MODEL IMPORT IN ACTION
DATA PARALLELISM AND MODEL
PARALLELISM
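In data parallelism every worker holds a full copy of the model and sees only a shard of the data; in model parallelism the network itself is split across workers. The sketch below covers data parallelism only, using a one-parameter model `y = w * x` and hypothetical function names. It assumes the batch divides evenly among the workers.

```python
def gradient(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, num_workers):
    """Split the batch into equal shards, compute one gradient per shard, average."""
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)]
    # In a cluster each shard gradient would be computed on a different machine.
    return sum(gradient(w, shard) for shard in shards) / num_workers


data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
```

With equal-sized shards the averaged gradient matches the gradient over the whole batch, which is why the work can be spread out without changing the update.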
HOW TRAINING HAPPENS IN SPARK
WITH DL4J
Parameter Averaging
(DL4J 1.0.0-alpha)
Asynchronous SGD
(DL4J 1.0.0-beta+)
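The idea behind parameter averaging can be sketched in a few lines: broadcast the current parameters, let each worker train on its own shard, then average the results. This is a simplified illustration on a one-parameter model with hypothetical names and made-up data, not DL4J code. Asynchronous SGD differs in that workers exchange updates without waiting for a synchronized averaging step.

```python
def local_sgd(w, shard, lr, steps):
    """Run a few SGD steps on one worker's shard for the 1-D model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in shard) / len(shard)
        w -= lr * grad
    return w


def parameter_averaging_round(w, shards, lr=0.01, steps=5):
    """Broadcast w, train independently on each shard, then average the results."""
    worker_weights = [local_sgd(w, shard, lr, steps) for shard in shards]
    return sum(worker_weights) / len(worker_weights)


# Two workers, each holding samples of y = 2 * x.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0
for _ in range(20):
    w = parameter_averaging_round(w, shards)
```

The `steps` value plays the role of an averaging frequency: a larger value means less communication but more drift between workers before each average.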
HOW TRAINING HAPPENS IN SPARK
WITH DL4J
The key classes users should be familiar with to get started with distributed
training in DL4J are:
• TrainingMaster: It specifies how distributed training will be conducted in
practice. Implementations include Gradient Sharing or Parameter Averaging.
• SparkDl4jMultiLayer and SparkComputationGraph: They are wrappers
around the MultiLayerNetwork and ComputationGraph classes in DL4J that
enable the functionality related to distributed training.
• RDD<DataSet> and RDD<MultiDataSet>: Spark RDDs with DL4J’s
DataSet or MultiDataSet classes that define the source of the training or
evaluation data.
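DL4J's documentation describes its gradient sharing implementation as exchanging threshold-encoded, quantized updates between workers. The sketch below is a simplified illustration of that idea, assuming a fixed threshold; the function names are hypothetical and it is not DL4J's implementation.

```python
def encode_update(gradient, residual, threshold):
    """Threshold-encode one worker's update.

    Adds the gradient to the residual carried over from earlier steps, emits
    (index, sign) pairs for entries whose magnitude reaches the threshold, and
    keeps whatever is left as the new residual.
    """
    message, new_residual = [], []
    for i, (g, r) in enumerate(zip(gradient, residual)):
        total = g + r
        if total >= threshold:
            message.append((i, +1))
            total -= threshold
        elif total <= -threshold:
            message.append((i, -1))
            total += threshold
        new_residual.append(total)
    return message, new_residual


def apply_update(params, message, threshold, lr=1.0):
    """Apply a sparse message to a copy of the parameters."""
    updated = list(params)
    for i, sign in message:
        updated[i] -= lr * sign * threshold
    return updated
```

Only the sparse `message` would travel over the network; small gradient entries accumulate in the residual until they cross the threshold, so no information is discarded outright.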
RE-TRAIN AN IMPORTED MODEL
Define the Spark Context
Choose the TrainingMaster implementation
Create the Spark network
Start the training
Get the model configuration
DL4J VISUAL FACILITIES
MEMORY UTILIZATION: SOMETHING TO TAKE CARE
OF
Take Care of the
Off-Heap Memory!
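ND4J keeps array data outside the JVM heap (through JavaCPP), so the executor container has to leave room for it on top of `-Xmx`. The helper below is a minimal sketch that assembles matching `spark-submit` flags. The sizing rule (physical limit = heap + off-heap) and the function name are assumptions for illustration, so check them against your cluster and the DL4J memory documentation.

```python
def spark_submit_memory_args(heap_gb, off_heap_gb):
    """Build spark-submit flags that leave room for off-heap (native) memory."""
    javacpp_opts = (
        f"-Dorg.bytedeco.javacpp.maxbytes={off_heap_gb}G "
        f"-Dorg.bytedeco.javacpp.maxphysicalbytes={heap_gb + off_heap_gb}G"
    )
    return [
        "--executor-memory", f"{heap_gb}g",
        # The container overhead must cover the off-heap allocation,
        # otherwise the resource manager may kill the executor.
        "--conf", f"spark.executor.memoryOverhead={off_heap_gb}g",
        "--conf", f"spark.executor.extraJavaOptions={javacpp_opts}",
    ]
```

For example, `spark_submit_memory_args(8, 6)` requests an 8 GB heap plus 6 GB of overhead and caps JavaCPP's off-heap allocations at 6 GB.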
More on DL with DL4J on Spark in my book
http://tinyurl.com/y9jkvtuy
Thanks!
Any questions?
You can find me at
@GuglielmoIozzia
https://ie.linkedin.com/in/giozzia
googlielmo.blogspot.com


Big Things Conference 2019 - Distributed Deep Learning with Keras/TensorFlow on Apache Spark
