Satoshi Iizuka* Edgar Simo-Serra* Hiroshi Ishikawa
Waseda University
(*equal contribution)
2
Colorization of Black-and-white Pictures
3
Our Goal: Fully-automatic colorization
4
Colorization of Old Films
5
Related Work
• Scribble-based [Levin+ 2004; Yatziv+ 2004; An+ 2009; Xu+ 2013; Endo+ 2016]
  • Specify colors with scribbles
  • Require manual inputs
• Reference image-based [Chia+ 2011; Gupta+ 2012]
  • Transfer colors of reference images
  • Require very similar images
[Figure: examples from [Levin+ 2004] and [Gupta+ 2012]; panels: Input, Reference, Output]
6
Related Work
• Automatic colorization with hand-crafted features [Cheng+ 2015]
  • Uses multiple existing image features
  • Computes chrominance via a shallow neural network
  • Depends on the performance of semantic segmentation
  • Only handles simple outdoor scenes
[Figure: pipeline of [Cheng+ 2015]. Labels: Input, Image features, Neural Network, Chroma, Output]
7
Contributions
• Novel end-to-end network that jointly learns global and local features for automatic image colorization
• New fusion layer that elegantly merges the global and local features
• Exploit classification labels for learning
8
Layers of Our Model
• Fully-connected layer
  • All neurons are connected between layers
• Convolutional layer
  • Takes into account underlying spatial structure
[Figure: a fully-connected layer (label: Neuron) and a convolutional layer (labels: x, y, No. of feature maps)]
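The difference between the two layer types can be illustrated with a minimal pure-Python sketch. The function names and the toy edge kernel are assumptions made for illustration and are not taken from the slides. It shows that a convolution reuses the same small kernel at every position, so it applies to images of any size, whereas a fully-connected layer needs one weight per input value.

```python
def fully_connected(x, weights, bias):
    """Each output neuron sums over every input value."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def conv2d(image, kernel):
    """'Valid' 2D convolution (cross-correlation form, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


# A horizontal-difference kernel responds to vertical edges.
edge_kernel = [[-1, 1]]
small = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
large = [[0, 0, 0, 1, 1, 1],
         [0, 0, 0, 1, 1, 1],
         [0, 0, 0, 1, 1, 1]]

# The same two kernel weights work for both image sizes.
edges_small = conv2d(small, edge_kernel)
edges_large = conv2d(large, edge_kernel)

# A fully-connected layer is tied to a fixed input length.
flat = [v for row in small for v in row]
fc_out = fully_connected(flat, [[1] * len(flat)], [0])
```

Here `edges_small` and `edges_large` each mark the column where the intensity jumps, while `fc_out` depends on all eight input values at once.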
9
Our Model
• Two branches: local features and global features
• Composed of four networks
[Figure: model overview. Labels: Luminance, Scaling, Low-Level Features Network, Mid-Level Features Network, Global Features Network, Fusion Layer, Colorization Network, Upsampling, Chrominance]
10
Low-Level Features Network
• Extract low-level features such as edges and corners
• Lower resolution for efficient processing
[Figure: model overview diagram, as on slide 9, with the label "Shared weights"]
11
Global Features Network
• Compute a global 256-dimensional vector representation of the image
[Figure: model overview diagram, as on slide 9, with the label "Shared weights"]
12
Mid-Level Features Network
• Extract mid-level features such as texture
[Figure: model overview diagram, as on slide 9, with the label "Shared weights"]
13
Fusion Layer
[Figure: model overview diagram, as on slide 9, with the label "Shared weights"]
14
Fusion Layer
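• Combine the global features with the mid-level features
• The fusion works at any image resolution: the same global vector is combined with the mid-level features at every spatial location (u, v)
[Figure: Fusion Layer combining the outputs of the Mid-Level Features Network and the Global Features Network]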
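$$\mathbf{y}^{\text{fusion}}_{u,v} = \sigma\left(\mathbf{b} + W \begin{bmatrix} \mathbf{y}^{\text{global}} \\ \mathbf{y}^{\text{mid}}_{u,v} \end{bmatrix}\right)$$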
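The fusion formula can be sketched in a few lines of pure Python. This is an illustrative sketch, not the authors' implementation: the function name `fuse`, the toy dimensions, and the choice of a sigmoid for σ are assumptions. The same weights are applied at every location (u, v), so the grid of mid-level features can have any size.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse(y_global, y_mid, weights, bias):
    """Compute sigma(b + W [y_global; y_mid[u][v]]) at every location.

    y_global: list of G floats (one vector for the whole image)
    y_mid:    grid (list of rows) of lists of M floats
    weights:  list of rows, each of length G + M
    bias:     one float per row of `weights`
    """
    fused = []
    for row in y_mid:
        fused_row = []
        for mid_vec in row:
            stacked = y_global + mid_vec  # concatenate the two vectors
            fused_row.append([
                sigmoid(b + sum(w * s for w, s in zip(w_row, stacked)))
                for w_row, b in zip(weights, bias)
            ])
        fused.append(fused_row)
    return fused


# Toy example: G = 2 global features, M = 1 mid-level feature.
y_global = [1.0, 0.0]
weights = [[1.0, 0.0, 1.0],
           [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]
small = [[[0.0], [2.0]]]                   # 1 x 2 grid
large = [[[0.0]] * 4 for _ in range(3)]    # 3 x 4 grid

fused_small = fuse(y_global, small, weights, bias)
fused_large = fuse(y_global, large, weights, bias)
```

Both calls use the same `weights` and `bias`; only the size of the output grid changes with the input.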
15
Colorization Network
• Compute chrominance from the fused features
• Restore the image to the input resolution
[Figure: model overview diagram, as on slide 9, with the label "Shared weights"]
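The last step, bringing the predicted chrominance back to the input resolution and pairing it with the input luminance, might look like the sketch below. Nearest-neighbor upsampling, the factor of 2, the two chrominance channels named `a` and `b`, and the function names are all assumptions chosen for illustration; the slides do not specify them.

```python
def upsample_nearest(channel, factor):
    """Repeat every value `factor` times along both axes."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def merge(luminance, chroma_a, chroma_b, factor):
    """Pair full-resolution luminance with upsampled chrominance per pixel."""
    a_up = upsample_nearest(chroma_a, factor)
    b_up = upsample_nearest(chroma_b, factor)
    return [[(luminance[y][x], a_up[y][x], b_up[y][x])
             for x in range(len(luminance[0]))]
            for y in range(len(luminance))]


# 2 x 2 luminance image with a 1 x 1 chrominance prediction.
pixels = merge([[10, 20], [30, 40]], [[5]], [[-5]], factor=2)
```

Each entry of `pixels` keeps the original luminance value and receives the chrominance of the low-resolution cell that covers it.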
16
Training of Colors
• Mean Squared Error (MSE) as loss function
• Optimization using ADADELTA [Zeiler 2012]
  • Adaptively sets a learning rate
[Figure: training loop. Labels: Input, Model, Output, Ground truth, MSE, Forward, Backward]
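To make "adaptively sets a learning rate" concrete, here is a minimal sketch of the ADADELTA update rule from [Zeiler 2012] minimizing an MSE loss. The one-parameter model `y = w * x`, the toy data, and the values of `steps`, `rho`, and `eps` are assumptions for illustration only; the actual network is far larger.

```python
import math


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def adadelta_fit(xs, ys, steps=2000, rho=0.95, eps=1e-6):
    """Fit y = w * x by minimizing the MSE with ADADELTA updates."""
    w = 0.0
    avg_sq_grad = 0.0   # running average of squared gradients
    avg_sq_step = 0.0   # running average of squared updates
    for _ in range(steps):
        # Gradient of the MSE with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * grad ** 2
        # The step size comes from the two running averages,
        # so no learning rate has to be chosen by hand.
        step = -math.sqrt(avg_sq_step + eps) / math.sqrt(avg_sq_grad + eps) * grad
        avg_sq_step = rho * avg_sq_step + (1 - rho) * step ** 2
        w += step
    return w


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w = adadelta_fit(xs, ys)
initial_loss = mse([0.0 for _ in xs], ys)
final_loss = mse([w * x for x in xs], ys)
```

Comparing `initial_loss` with `final_loss` shows the loss dropping as `w` moves toward the slope of the toy data.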
17
Joint Training
• Training for classification jointly with the colorization
• Classification network connected to the global features
[Figure: model overview diagram, as on slide 9, with an added Classification Network that outputs predicted labels]
Predicted labels: 20.60% Formal Garden; 16.13% Arch; 13.50% Abbey; 7.07% Botanical Garden; 6.53% Golf Course
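One common way to train two tasks jointly is to add a weighted classification loss to the colorization loss. The sketch below assumes a softmax cross-entropy for the classification term and a hypothetical weight `alpha`; neither the exact form nor the value is given on the slides.

```python
import math


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy(logits, label):
    """Negative log-probability of the true class under a softmax."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(chroma_pred, chroma_true, class_logits, class_label, alpha=0.01):
    """Colorization MSE plus a weighted classification term (alpha is hypothetical)."""
    return (mse(chroma_pred, chroma_true)
            + alpha * cross_entropy(class_logits, class_label))


# Perfect colors but an undecided two-class prediction.
loss = joint_loss([0.5, 0.5], [0.5, 0.5], [0.0, 0.0], 0, alpha=1.0)
```

With perfect colors the MSE term is zero, so `loss` is only the classification penalty; a smaller `alpha` keeps colorization as the dominant objective.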
18
Dataset
• MIT Places Scene Dataset [Zhou+ 2014]
  • 2.3 million training images with 205 scene labels
  • 256 × 256 pixels
[Figure: example scene categories: Abbey, Airport terminal, Aquarium, Baseball field, Dining room, Forest road, Gas station, Gift shop, ⋯]
[SIGGRAPH 2016] Automatic Image Colorization
20
Computational Time
• Colorize within a few seconds
[Figure label: 80ms]
21
Colorization of MIT Places Dataset
22
Comparisons
[Figure: comparison panels: Input; [Cheng+ 2015]; Ours (w/ global features); Ours (w/o global features)]
23
Effectiveness of Global Features
[Figure: panels: Input; w/ global features; w/o global features]
24
User Study
• 10 users participated
• Each user is shown 500 images of each type, 1,500 images in total
• 90% of our results are considered “natural”
[Figure: results chart. Labels: Natural, Unnatural]
25
Colorization of Historical Photographs
[Figure: Mount Moran, 1941; Scott's Run, 1937; Youngsters, 1912; Burns Basement, 1910]
26
Style Transfer
[Figure: network diagram. Label: Low-Level Features Network]
27
Style Transfer
[Figure: network diagram. Label: Low-Level Features Network]
28
Style Transfer
• Adapting the colorization of one image to the style of another
[Figure: three examples. Labels: Inputs (Local, Global), Output]
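The "Local" and "Global" labels suggest that the local features of one image are fused with the global features of another. A minimal sketch of that idea follows; the function name and the toy stand-ins for the feature extractors and the fusion step are hypothetical, not the networks from the slides.

```python
def colorize_with_style(local_image, global_image,
                        mid_features, global_features, fuse):
    """Fuse per-location features of one image with the global vector of another."""
    y_mid = mid_features(local_image)          # one feature vector per location
    y_global = global_features(global_image)   # one vector for the style image
    return [[fuse(y_global, vec) for vec in row] for row in y_mid]


# Toy stand-ins (hypothetical): pixel values as mid-level features,
# mean intensity as the global feature, concatenation as the fusion.
def toy_mid(image):
    return [[[p] for p in row] for row in image]


def toy_global(image):
    values = [p for row in image for p in row]
    return [sum(values) / len(values)]


def toy_fuse(y_global, y_mid_vec):
    return y_global + y_mid_vec


fused = colorize_with_style([[0.1, 0.9]], [[0.25, 0.75]],
                            toy_mid, toy_global, toy_fuse)
```

Every location of the first image keeps its own local feature but receives the global feature computed from the second image.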
29
Limitations
• Difficult to produce vividly colorful images
• Cannot restore the exact original colors
[Figure: two examples; panels: Input, Ground truth, Output]
30
Conclusion
• Novel approach for image colorization by fusing global and local information
  • Fusion layer
  • Joint training of colorization and classification
  • Style transfer
[Figure: Farm Land, 1933; California National Park, 1936; Homes, 1936; Spinners, 1910; Doffer Boys, 1909]
31
Thank you!
• Project Page http://hi.cs.waseda.ac.jp/~iizuka/projects/colorization
• Code on GitHub! https://github.com/satoshiiizuka/siggraph2016_colorization
[Figure: Norris Dam, 1933; North Dome, 1936; Miner, 1937; Community Center, 1936]
