INTRODUCTION TO
MULTI-GPU DEEP LEARNING
WITH DIGITS 2
Mike Wang
NVIDIA
PAPIs.io | Sydney | 6-7 August 2015
AGENDA
1. What is Deep Learning?
2. GPUs and Deep Learning
3. NVIDIA DIGITS
What is Deep Learning?
DEEP LEARNING & AI
Deep Learning has become the most popular approach to developing Artificial Intelligence (AI): machines that perceive and understand the world.
The focus is currently on specific perceptual tasks, and there are many successes.
Today, some of the world’s largest internet companies, as well as the foremost research institutions, are using GPUs for deep learning in research and production.
CUDA for Deep Learning
PRACTICAL DEEP LEARNING EXAMPLES
- Image Classification, Object Detection, Localization, Action Recognition, Scene Understanding
- Speech Recognition, Speech Translation, Natural Language Processing
- Pedestrian Detection, Traffic Sign Recognition
- Breast Cancer Cell Mitosis Detection, Volumetric Brain Image Segmentation
TRADITIONAL MACHINE PERCEPTION - HAND TUNED FEATURES
Pipeline: Raw data → Feature extraction → Classifier/detector → Result
- Classifier/detector: SVM, shallow neural net, …
- Classifier/detector: HMM, shallow neural net, …; Result: Speaker ID, speech transcription, …
- Classifier/detector: Clustering, HMM, LDA, LSA, …; Result: Topic classification, machine translation, sentiment analysis, …
DEEP LEARNING APPROACH
[Figure: training and deployment pipelines]
Train: Dog, Cat, Honey badger; Errors; Dog, Cat, Raccoon
Deploy: Dog
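The train/deploy split can be illustrated with a toy example. The sketch below is not from the talk: it assumes a single logistic unit, a made-up four-sample dataset (logical AND standing in for image classes), and hypothetical function names `train` and `deploy`. During training, prediction errors are fed back to adjust the weights; at deployment, the trained weights are only used to predict.

```python
import math

# Toy dataset (an assumption for illustration): inputs and 0/1 labels for logical AND.
DATA = [((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0), ((1.0, 1.0), 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, lr=1.0):
    """Full-batch gradient descent on a single logistic unit."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x1, x2), label in data:
            # Error = prediction minus label; this is the signal that is fed back.
            error = sigmoid(w[0] * x1 + w[1] * x2 + b) - label
            grad_w[0] += error * x1 / n
            grad_w[1] += error * x2 / n
            grad_b += error / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


def deploy(model, x):
    """Inference only: no labels and no weight updates."""
    w, b = model
    return 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5 else 0


model = train(DATA)
predictions = [deploy(model, x) for x, _ in DATA]
print(predictions)
```

A real image classifier replaces the single unit with a deep network and the four samples with millions of labeled images, but the loop has the same shape.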
SOME DEEP LEARNING USE CASES
Jeff Dean, Google, GTC 2015
REINFORCEMENT LEARNING
ARTIFICIAL NEURAL NETWORK (ANN)
A collection of simple, trainable mathematical units that
collectively learn complex functions
[Figure: biological neuron, from Stanford cs231n lecture notes]
Artificial neuron: inputs $x_1, x_2, x_3$, weights $w_1, w_2, w_3$, output $y$
$y = F(w_1 x_1 + w_2 x_2 + w_3 x_3)$
$F(x) = \max(0, x)$
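The artificial neuron formula can be written out directly. This is a minimal sketch with made-up weights and inputs; the names `relu` and `neuron` are illustrative, not from the talk.

```python
def relu(x):
    """F(x) = max(0, x)."""
    return max(0.0, x)


def neuron(weights, inputs):
    """y = F(w1*x1 + w2*x2 + w3*x3), for any number of inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return relu(weighted_sum)


# Illustrative values only; in practice the weights are learned from data.
y_active = neuron([0.5, -1.0, 2.0], [2.0, 1.0, 1.0])   # weighted sum is 2.0
y_silent = neuron([1.0, 1.0, 1.0], [-1.0, -2.0, 0.0])  # weighted sum is -3.0, clipped to 0
print(y_active, y_silent)
```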
ARTIFICIAL NEURAL NETWORK (ANN)
A collection of simple, trainable mathematical units that
collectively learn complex functions
[Figure: network with an input layer, hidden layers, and an output layer]
Given sufficient training data, an artificial neural network can approximate very complex functions mapping raw data to output decisions.
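Stacking layers of such units gives the input / hidden / output structure. The following is a sketch under stated assumptions: one hidden layer with ReLU units, a linear output unit, and hand-picked parameters that training would normally set. The helper names are hypothetical.

```python
def dense(weights, biases, inputs, activation):
    """One fully connected layer: each row of `weights` feeds one unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def relu(x):
    return max(0.0, x)


def identity(x):
    return x


def forward(inputs):
    # Hand-picked parameters (assumptions); training would normally set these.
    hidden = dense([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], inputs, relu)
    output = dense([[1.0, 1.0]], [-0.5], hidden, identity)
    return output


result = forward([2.0, 1.0])
print(result)
```

Each layer consumes the previous layer's outputs, which is how the network builds a complex function out of simple units.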
DEEP NEURAL NETWORK (DNN)
[Figure: Input to Result; stages labeled Raw data, Low-level features, Mid-level features, High-level features]
Application components:
- Task objective: e.g. identify face
- Training data: 10-100M images
- Network architecture: ~10 layers, 1B parameters
- Learning algorithm: ~30 Exaflops, ~30 GPU days
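As a back-of-the-envelope reading of the training cost above, the two figures can be combined into a sustained throughput. This sketch assumes "~30 Exaflops" means a total of about 30 × 10^18 floating-point operations and that the work is spread evenly over the ~30 GPU days; both are interpretations, and the slide's numbers are rough orders of magnitude.

```python
# Figures from the slide, read as rough orders of magnitude.
TOTAL_OPERATIONS = 30e18        # ~30 exa floating-point operations (assumed total count)
GPU_DAYS = 30                   # ~30 GPU days
SECONDS_PER_DAY = 24 * 60 * 60

gpu_seconds = GPU_DAYS * SECONDS_PER_DAY
sustained_flops = TOTAL_OPERATIONS / gpu_seconds   # operations per second per GPU
print(f"{sustained_flops / 1e12:.1f} TFLOP/s sustained per GPU")
```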
DEEP LEARNING ADVANTAGES
- Robust
  - No need to design the features ahead of time: features are automatically learned to be optimal for the task at hand
  - Robustness to natural variations in the data is automatically learned
- Generalizable
  - The same neural net approach can be used for many different applications and data types
- Scalable
  - Performance improves with more data; the method is massively parallelizable
CONVOLUTIONAL NEURAL NETWORK (CNN)
- Inspired by the human visual cortex
- Learns a hierarchy of visual features
- Local pixel-level features are scale and translation invariant
- Learns the “essence” of visual objects and generalizes well
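The translation property can be seen in a one-dimensional toy. The sketch below is an illustration, not the talk's material: it slides one shared, hand-written edge kernel over two signals that contain the same pattern at different positions. The convolution response moves with the pattern, and taking the maximum over positions (a simple form of pooling) gives the same value for both. A real CNN learns its kernels and works on 2D images.

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over the signal (valid positions only)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


EDGE = [-1.0, 1.0]                          # responds to a rising step
pattern_left = [0, 0, 1, 1, 0, 0, 0, 0]
pattern_right = [0, 0, 0, 0, 0, 1, 1, 0]    # same pattern, shifted 3 positions

resp_left = conv1d(pattern_left, EDGE)
resp_right = conv1d(pattern_right, EDGE)

# The peak response moves with the pattern; its pooled (max) value does not change.
shift = resp_right.index(max(resp_right)) - resp_left.index(max(resp_left))
print(shift, max(resp_left), max(resp_right))
```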
CONVOLUTIONAL NEURAL NETWORK (CNN)
DNNS DOMINATE IN PERCEPTUAL TASKS
Slide credit: Yann LeCun, Facebook & NYU
WHY IS DEEP LEARNING HOT NOW?
Three Driving Factors…
- Big Data Availability
  - 350 million images uploaded per day
  - 2.5 petabytes of customer data hourly
  - 100 hours of video uploaded every minute
- New DL Techniques
- GPU acceleration
GPUs and Deep Learning
GPUs — THE PLATFORM FOR DEEP LEARNING
1.2M training images • 1000 object categories
Hosted by
Image Recognition Challenge
[Chart: GPU Entries by year, 2010 to 2014; vertical axis 0 to 120; data labels 4, 60, 110]
[Example image labels: bird, frog, person, hammer, flower pot, power drill, car, helmet, motorcycle, dog, chair]
GPU-ACCELERATED DEEP LEARNING
GPUS MAKE DEEP LEARNING ACCESSIBLE
“Now You Can Build Google’s $1M Artificial Brain on the Cheap”
GOOGLE DATACENTER
- 1,000 CPU Servers
- 2,000 CPUs • 16,000 cores
- 600 kWatts
- $5,000,000
STANFORD AI LAB
- 3 GPU-Accelerated Servers
- 12 GPUs • 18,432 cores
- 4 kWatts
- $33,000
Deep learning with COTS HPC systems. A. Coates, B. Huval, T. Wang, D. Wu, A. Ng, B. Catanzaro. ICML 2013
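The size of the gap between the two systems is easier to see as ratios. This short sketch only divides the figures quoted on the slide; the dictionary keys are illustrative names.

```python
# Figures as quoted on the slide.
google = {"servers": 1000, "cores": 16000, "kilowatts": 600, "cost_usd": 5_000_000}
stanford = {"servers": 3, "cores": 18432, "kilowatts": 4, "cost_usd": 33_000}

ratios = {key: google[key] / stanford[key] for key in google}
for key, value in ratios.items():
    print(f"{key}: Google / Stanford = {value:.1f}x")
```

On these numbers the GPU system has slightly more cores while using roughly 150 times less power and money.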
WHY ARE GPUs GOOD FOR DEEP LEARNING?
GPUs deliver:
- same or better prediction accuracy
- faster results
- smaller footprint
- lower power
- lower cost

                      Neural Networks   GPUs
Inherently Parallel   ✓                 ✓
Matrix Operations     ✓                 ✓
FLOPS                 ✓                 ✓
Bandwidth             ✓                 ✓
[Lee, Ranganath & Ng, 2007]
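The "Inherently Parallel" and "Matrix Operations" rows go together: applying a layer to a batch of inputs is a matrix multiplication, and every output element can be computed independently of the others. The plain-Python sketch below (illustrative names and values, sequential loops) marks where that independence lies; on a GPU, each output element could be assigned to its own thread.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):          # every (i, j) pair is independent of the others
        for j in range(m):
            out[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return out


batch = [[1.0, 2.0], [3.0, 4.0]]      # 2 samples, 2 features each
weights = [[5.0, 6.0], [7.0, 8.0]]    # 2 inputs -> 2 outputs
print(matmul(batch, weights))
```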
DL software landscape
NVIDIA DIGITS
HOW TO WRITE APPLICATIONS USING DL
- Hardware: which can accelerate DL building blocks
- System Software (Drivers)
- Libraries (key compute-intensive, commonly used building blocks)
- Deep Learning Frameworks (industry-standard or research frameworks)
- END USER APPLICATIONS: Speech Understanding, Image Analysis, Language Processing
HOW NVIDIA IS HELPING DL STACK
- Hardware (which can accelerate DL building blocks): GPU - World’s best DL Hardware
- System Software (Drivers): CUDA - Best Parallel Programming Toolkit
- Libraries (key compute-intensive, commonly used building blocks): Performance libraries (cuDNN, cuBLAS) - Highly optimized
- Deep Learning Frameworks (industry-standard or research frameworks): GPU-accelerated DL Frameworks (Caffe, Torch, Theano)
- END USER APPLICATIONS: DIGITS; Speech Understanding, Image Analysis, Language Processing
CUDNN V2 - PERFORMANCE
v3 RC available to Registered Developers
CPU is 16 core Haswell E5-2698 at 2.3 GHz, with 3.6 GHz Turbo
GPU is NVIDIA Titan X
HOW GPU ACCELERATION WORKS
Application Code
- GPU: Compute-Intensive Functions (5% of code, ~80% of run-time)
- CPU: Rest of Sequential CPU Code
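How much the whole application gains from moving the compute-intensive part to the GPU can be estimated with Amdahl's law. The sketch below takes the ~80% run-time share from the slide; the GPU speedup factors are assumptions chosen for illustration, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: only the accelerated fraction of run-time gets faster."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


# ~80% of run-time is taken from the slide; the kernel speedups are assumptions.
for s in (2, 10, 100):
    print(f"GPU part {s}x faster -> application {overall_speedup(0.8, s):.2f}x faster")
```

With 80% of run-time accelerated, the overall gain approaches but never exceeds 5x, because the remaining sequential CPU code still takes 20% of the original time.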
DIGITS
DEEP GPU TRAINING SYSTEM FOR DATA SCIENTISTS
- Design DNNs
- Visualize activations
- Manage multiple trainings
[Diagram: DIGITS software stack, top to bottom]
USER INTERFACE: Process Data, Configure DNN, Monitor Progress, Visualize Layers
Theano, Torch, Caffe
cuDNN, cuBLAS
CUDA
GPU HW: GPU, Multi-GPU, GPU Cluster, Cloud
DIGITS
Interactive Deep Learning GPU Training System
Data Scientists & Researchers:
- Quickly design the best deep neural network (DNN) for your data
- Visually monitor DNN training quality in real time
- Manage training of many DNNs in parallel on multi-GPU systems
DIGITS 2 - Accelerate training of a single DNN using multiple GPUs
https://developer.nvidia.com/digits
DIGITS WEB INTERFACE & API DRIVEN
Process Data, Configure DNN, Monitor Progress, Visualize Layers, Test Image
NVIDIA DIGITS
Possible speedup with multiple GPUs
[Chart: training speedup achieved with DIGITS on multiple GeForce TITAN X GPUs in a DIGITS DevBox. These results were obtained with the Caffe framework and a batch size of 128.]
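One common way to train a single network on several GPUs is data parallelism: the batch is split into equal shards, each GPU computes a gradient on its shard, and the gradients are averaged. The sketch below illustrates that general idea only; it is not taken from DIGITS or Caffe. It assumes a one-parameter linear model, synthetic data, 4 simulated "GPUs", and the slide's batch size of 128.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


# Synthetic batch of 128 samples drawn from y = 3x (an assumption for illustration).
batch = [(float(i), 3.0 * i) for i in range(128)]
w = 1.0
num_gpus = 4

# Split the batch into equal shards, one per simulated GPU.
shard_size = len(batch) // num_gpus
shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_gpus)]

# Each "GPU" computes a gradient on its shard; the results are then averaged.
shard_grads = [gradient(w, shard) for shard in shards]
averaged = sum(shard_grads) / num_gpus

full = gradient(w, batch)
print(averaged, full)
```

With equal-sized shards the averaged gradient matches the full-batch gradient, so each step does the same update while the per-GPU work shrinks. Real speedups are lower than the GPU count because the gradients have to be exchanged between devices.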
DIGITS deployment
DEEP LEARNING DEPLOYMENT WORKFLOW
DEEP LEARNING LAB SERIES SCHEDULE
- 7/22 Class #1 - Introduction to Deep Learning
- 7/29 Office Hours for Class #1
- 8/5 Class #2 - Getting Started with DIGITS interactive training system for image classification
- 8/12 Office Hours for Class #2
- 8/19 Class #3 - Getting Started with the Caffe Framework
- 8/26 Office Hours for Class #3
- 9/2 Class #4 - Getting Started with the Theano Framework
- 9/9 Office Hours for Class #4
- 9/16 Class #5 - Getting Started with the Torch Framework
- 9/23 Office Hours for Class #5
- More information available at developer.nvidia.com/deep-learning-courses
Recordings online
HANDS-ON LAB
1. Create an account at nvidia.qwiklab.com
2. Go to the “Introduction to Deep Learning” lab at bit.ly/dlnvlab1
3. Start the lab and enjoy!
- Only requires a supported browser, no NVIDIA GPU necessary!
- Lab is free until end of Deep Learning Lab series
USEFUL LINKS
- Deep Learning Lab Course information & recordings: developer.nvidia.com/deep-learning-courses
- Recorded presentations from past conferences: www.gputechconf.com/gtcnew/on-demand-gtc.php
- Parallel Forall (GPU Computing Technical blog): devblogs.nvidia.com/parallelforall
- Become a Registered Developer: developer.nvidia.com/programs/cuda/register