Image-to-Image Translation
with Pix2Pix GAN
Presenter: S.Shayan Daneshvar
Overview
1. What is Image-to-Image Translation?
2. The Problem of Image-to-Image Translation
3. Pix2Pix GAN for Image-to-Image Translation
4. Pix2Pix Architectural Details
5. Applications of the Pix2Pix GAN
6. Conclusion & References
What is Image-to-Image Translation?
Image Translation
Image-to-image translation is the controlled conversion of a given source
image to a target image.
The Problem of Image-to-Image Translation
Problem Definition
• Image-to-image translation is a challenging problem that typically requires
a specialized model and a hand-crafted loss function for each type of
translation task.
• Classical approaches use per-pixel classification or regression models.
Ideally, a single general technique is required, so that the same model and
loss function can be used for many different image-to-image translation tasks.
Pix2Pix GAN for Image-to-Image Translation
Pix2Pix
Pix2Pix is a Generative Adversarial Network (GAN) model designed for
general-purpose image-to-image translation. The approach was introduced by
Phillip Isola et al. in their 2016 paper, Image-to-Image Translation with
Conditional Adversarial Networks, which was presented at CVPR in 2017.
Pix2Pix
Demo available at https://affinelayer.com/pixsrv/
GAN and cGAN
The GAN architecture is an approach to training a generator model, typically
used for generating images. A discriminator model is trained to classify
images as real or fake, and the generator is trained to fool the
discriminator.
The Conditional GAN (cGAN) is an extension of the GAN architecture that
provides control over the image that is generated, e.g. by allowing an image
of a given class to be generated.
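Written as objectives, the two setups differ only in the conditioning input. The following is a sketch in the notation commonly used for Pix2Pix (an assumption, since the slides introduce no symbols): x is the conditioning input, y a real image, z a noise vector, G the generator and D the discriminator.

```latex
% Unconditional GAN: D sees only the image.
\mathcal{L}_{GAN}(G, D) =
    \mathbb{E}_{y}\left[\log D(y)\right]
  + \mathbb{E}_{z}\left[\log\left(1 - D(G(z))\right)\right]

% Conditional GAN: both G and D also observe the condition x.
\mathcal{L}_{cGAN}(G, D) =
    \mathbb{E}_{x,y}\left[\log D(x, y)\right]
  + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right]
```

In both cases D is trained to maximize the objective and G to minimize it.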
Pix2Pix GAN
The generator model is provided with a given image as input and generates a
translated version of the image.
The discriminator model is given an input image together with a paired image
that is either real (from the dataset) or generated, and must determine
whether the paired image is real or fake.
Finally, the generator model is trained both to fool the discriminator model
and to minimize the loss between the generated image and the expected
target image.
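The pairing seen by the discriminator can be sketched as follows. This is an illustration only: it assumes the input and the candidate image are joined along the channel axis (a common implementation choice, not stated on the slide), and the helper names concat_channels and discriminator_examples are hypothetical. Images are plain nested lists of shape H x W x C.

```python
def concat_channels(source, candidate):
    """Join two H x W x C images pixel by pixel along the channel axis."""
    return [
        [s_px + c_px for s_px, c_px in zip(s_row, c_row)]
        for s_row, c_row in zip(source, candidate)
    ]


def discriminator_examples(source, real_target, generated):
    """Build the two labelled examples the discriminator trains on.

    Label 1 marks the (source, real target) pair and label 0 marks the
    (source, generated) pair. The source image is present in both.
    """
    return [
        (concat_channels(source, real_target), 1),
        (concat_channels(source, generated), 0),
    ]


if __name__ == "__main__":
    # One-pixel, one-channel toy images.
    source = [[[0.1]]]
    real_target = [[[0.9]]]
    generated = [[[0.4]]]
    for pair, label in discriminator_examples(source, real_target, generated):
        print(pair, label)
```

Because the source image appears in both examples, the discriminator can only tell them apart by judging whether the second image is a plausible translation of the first.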
Dataset Required for Pix2Pix
Pix2Pix GAN must be trained on paired image datasets: each example consists of
an input image (before translation) and the corresponding output or target
image (after translation).
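A minimal sketch of loading one such pair, assuming the two images are stored side by side in a single file of width 2W (a layout used by several public Pix2Pix datasets, but an assumption here). The function name split_pair is hypothetical, and which half is the input depends on the dataset.

```python
def split_pair(combined):
    """Split an H x 2W image (a list of rows) into left and right H x W halves."""
    width = len(combined[0])
    if width % 2 != 0:
        raise ValueError("combined image must have an even width")
    half = width // 2
    left = [row[:half] for row in combined]
    right = [row[half:] for row in combined]
    return left, right


if __name__ == "__main__":
    # A 2 x 4 toy image: the left 2 x 2 block and the right 2 x 2 block.
    combined = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
    ]
    left, right = split_pair(combined)
    print(left)   # [[1, 2], [5, 6]]
    print(right)  # [[3, 4], [7, 8]]
```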
Pix2Pix Architectural Details
Pix2Pix Architecture
The Pix2Pix GAN architecture involves the specification of the following:
• The generator model
• The discriminator model
• The model optimization procedure
Both the generator and discriminator models use standard
Convolution-BatchNormalization-ReLU blocks of layers.
U-Net Generator Model
A U-Net model architecture is used for the generator instead of the common
encoder-decoder model. It is very similar to an encoder-decoder model in that
it downsamples the input to a bottleneck and then upsamples again to an output
image, but skip connections link layers of the same size in the encoder and
the decoder.
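The effect of the skip connections on layer shapes can be traced with plain arithmetic. The sketch below assumes a 256 x 256 input, 4 x 4 convolutions with stride 2 and padding 1, and the encoder filter counts 64-128-256-512-512-512-512-512; these values are assumptions for illustration, not taken from the slides, and trace_unet is a hypothetical helper that computes shapes only.

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial size after one strided convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_unet(input_size=256,
               enc_filters=(64, 128, 256, 512, 512, 512, 512, 512)):
    """Return (size, channels) per encoder layer and per decoder layer.

    Decoder channels are counted after concatenating the skip connection
    from the encoder layer with the same spatial size.
    """
    encoder = []
    size = input_size
    for filters in enc_filters:
        size = conv_out(size)
        encoder.append((size, filters))

    # The bottleneck (last encoder layer) has no skip partner.
    skips = list(reversed(encoder[:-1]))
    dec_filters = list(reversed(enc_filters[:-1]))

    decoder = []
    for filters, (skip_size, skip_channels) in zip(dec_filters, skips):
        size *= 2  # each decoder step doubles the spatial size
        assert size == skip_size, "skip connections join equal-sized layers"
        decoder.append((size, filters + skip_channels))
    return encoder, decoder


if __name__ == "__main__":
    encoder, decoder = trace_unet()
    print("encoder:", encoder)
    print("decoder:", decoder)
```

Under these assumptions the encoder shrinks the image to a 1 x 1 bottleneck, and every decoder layer carries twice the channels it would have in a plain encoder-decoder, because the matching encoder features are concatenated to it. A final upsampling layer (not traced here) maps back to the output image.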
PatchGAN Discriminator Model
Unlike the standard GAN model, which uses a deep convolutional neural network
to classify whole images, the Pix2Pix model uses a PatchGAN: a deep
convolutional neural network designed to classify patches of an input image as
real or fake, rather than the entire image. The output of the network is a
single feature map of real/fake predictions that can be averaged to give a
single score.
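How large a patch each prediction covers follows from the convolution settings. The sketch below assumes a layer stack of three 4 x 4 stride-2 convolutions followed by two 4 x 4 stride-1 convolutions with padding 1, which is one common PatchGAN configuration and an assumption here; the function names are hypothetical.

```python
# (kernel, stride) for each convolution, first layer to last. Assumed values.
PATCHGAN_LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]


def receptive_field(layers):
    """Input pixels seen by one output unit, computed from the last layer back."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


def output_size(size, layers, pad=1):
    """Side length of the prediction map for a square input of the given size."""
    for kernel, stride in layers:
        size = (size + 2 * pad - kernel) // stride + 1
    return size


def average_score(patch_map):
    """Collapse the map of per-patch predictions into a single score."""
    values = [p for row in patch_map for p in row]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(receptive_field(PATCHGAN_LAYERS))   # 70
    print(output_size(256, PATCHGAN_LAYERS))  # 30
    print(average_score([[0.2, 0.4], [0.6, 0.8]]))
```

With these settings a 256 x 256 input yields a 30 x 30 map, and each entry judges a 70 x 70 region of the input.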
Composite Adversarial and L1 Loss
The discriminator model is trained in a standalone manner, in the same way as
in a standard GAN model.
The generator model is trained using both the adversarial loss from the
discriminator model and the mean absolute pixel difference (L1 loss) between
the generated image and the expected image.
Generator Loss = Adversarial Loss + λ × L1 Loss
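The composite loss can be sketched with plain Python lists standing in for tensors. The assumptions here are binary cross-entropy as the adversarial loss, a default of λ = 100, and flattened images; the slide fixes none of these, and the function names are hypothetical.

```python
import math

EPS = 1e-12  # keeps log() finite when a prediction is exactly 0 or 1


def bce(preds, target):
    """Mean binary cross-entropy of probabilities against one target label."""
    total = 0.0
    for p in preds:
        total += target * math.log(p + EPS) + (1 - target) * math.log(1 - p + EPS)
    return -total / len(preds)


def l1(generated, expected):
    """Mean absolute pixel difference between two flattened images."""
    return sum(abs(g - e) for g, e in zip(generated, expected)) / len(generated)


def discriminator_loss(real_preds, fake_preds):
    """Standard GAN discriminator loss: real pairs -> 1, generated pairs -> 0."""
    return bce(real_preds, 1.0) + bce(fake_preds, 0.0)


def generator_loss(fake_preds, generated, expected, lam=100.0):
    """Generator Loss = Adversarial Loss + lambda * L1 Loss."""
    adversarial = bce(fake_preds, 1.0)  # the generator wants fakes judged real
    return adversarial + lam * l1(generated, expected)


if __name__ == "__main__":
    fake_preds = [0.5, 0.5]   # discriminator outputs on generated patches
    generated = [0.0, 1.0]    # flattened generated image
    expected = [0.5, 0.5]     # flattened target image
    print(generator_loss(fake_preds, generated, expected))
```

A large λ makes the L1 term dominate, which keeps the output close to the target overall, while the adversarial term pushes the generator toward sharper, more realistic detail.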
Applications of the Pix2Pix GAN
Applications
• Semantic labels ⇔ photo
• Architectural labels ⇒ photo
• Map ⇔ aerial photo
• Black and white ⇒ color photo
• Edges ⇒ photo
• Sketch ⇒ photo
• Day ⇒ night photograph
• Thermal ⇒ color photo
• Photo with missing pixels ⇒ inpainted photo
Conclusion
• Image-to-image translation often requires specialized models and
hand-crafted loss functions.
• Pix2Pix GAN provides a general-purpose model and loss function for
image-to-image translation.
• The Pix2Pix GAN was demonstrated on a wide variety of image generation
tasks, including translating photographs from day to night and product
sketches to photographs.
References
1. P. Isola, J. Zhu, T. Zhou, and A. A. Efros, "Image-to-Image Translation with
Conditional Adversarial Networks," 2017 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2017, pp. 5967-5976.
2. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair,
et al., "Generative Adversarial Nets," NIPS, 2014, pp. 2672-2680.
3. J. Brownlee, Generative Adversarial Networks with Python: Deep Learning
Generative Models for Image Synthesis and Image Translation.
4. A. Solanki, A. Nayyar, and M. Naved, Generative Adversarial Networks for
Image-to-Image Translation.
Thank You!
Any Questions?
