Score-based Generative Models
이상윤
Generative Models
“Learning causal relation”
https://web.eecs.umich.edu/~justincj/slides/eecs498/498_FA2019_lecture19.pdf
Generative Models
GAN: focuses more on sampling than on modeling the likelihood
VAE: maximizes a lower bound on the likelihood via a surrogate loss
Normalizing Flow: Exact likelihood maximization via invertible transformations
https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html
Generative Models - limitations
GAN: High fidelity but hard to train
VAE: Not exact likelihood maximization
Normalizing Flow: Lack of flexibility (must be invertible)
https://lilianweng.github.io/lil-log/2018/10/13/flow-based-deep-generative-models.html
Motivations
Sampling without computing the probability density, through Langevin
dynamics (similar to gradient ascent)
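A minimal sketch of unadjusted Langevin dynamics in PyTorch (names such as `langevin_sample` are illustrative, not from the slides): each step is half a gradient-ascent step on $\log p(x)$ plus injected Gaussian noise, using only the score.

```python
import torch

def langevin_sample(score, x0, step=1e-2, n_steps=1000):
    """Approximately sample from p(x) given only score(x) = grad_x log p(x)."""
    x = x0.clone()
    for _ in range(n_steps):
        z = torch.randn_like(x)
        # gradient-ascent step on log p(x), plus injected Gaussian noise
        x = x + 0.5 * step * score(x) + (step ** 0.5) * z
    return x

# Sanity check: for a standard Gaussian, the score is exactly -x.
sample = langevin_sample(score=lambda x: -x, x0=torch.zeros(2))
```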
Motivations
Why do we need the PDF?
- MLE is equivalent to minimizing the KL divergence to the real data distribution (expanded below).
- In a normalized PDF, all densities compete with each other: raising the density at one point necessarily lowers it elsewhere.
https://web.eecs.umich.edu/~justincj/slides/eecs498/498_FA2019_lecture19.pdf
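Expanding the first bullet makes the equivalence explicit; the data entropy term does not depend on $\theta$:

```latex
\mathrm{KL}\left(p_{\mathrm{data}} \,\|\, p_\theta\right)
  = \underbrace{\mathbb{E}_{p_{\mathrm{data}}}\!\left[\log p_{\mathrm{data}}(x)\right]}_{\text{constant in } \theta}
  \;-\; \mathbb{E}_{p_{\mathrm{data}}}\!\left[\log p_\theta(x)\right]
```

so minimizing the KL divergence over $\theta$ is exactly maximizing the expected log-likelihood.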
Motivations
$p_\theta(x) = \dfrac{e^{-f_\theta(x)}}{Z_\theta}$
$Z_\theta = \int e^{-f_\theta(x)}\,dx$ : normalizing constant
$f_\theta(x)$ : unnormalized (energy) function
- The integral defining the normalizing constant is intractable in general.
- Autoregressive models and normalizing flows handle this problem via special structures that guarantee a unit normalizing constant.
-> It is difficult to define a flexible, high-capacity probability density function.
Motivations
Score-based models naturally bypass the problem: the score of an energy-based model does not involve $Z_\theta$ at all, since $\nabla_x \log p_\theta(x) = -\nabla_x f_\theta(x) - \nabla_x \log Z_\theta = -\nabla_x f_\theta(x)$.
Score matching
The score of a distribution is its log-density gradient, $s(x) = \nabla_x \log p(x)$.
Train a model $s_\theta(x)$ to match the data score by minimizing the Fisher divergence
$\tfrac{1}{2}\,\mathbb{E}_{p(x)}\big[\|s_\theta(x) - \nabla_x \log p(x)\|^2\big]$.
Integration by parts removes the unknown data score, giving the equivalent objective
$\mathbb{E}_{p(x)}\big[\mathrm{tr}(\nabla_x s_\theta(x)) + \tfrac{1}{2}\|s_\theta(x)\|^2\big]$.
Minimizing $\|s_\theta(x)\|^2$ drives the gradient to zero at data points; the trace term regularizes them toward local maxima of the model density.
But computing the Hessian trace $\mathrm{tr}(\nabla_x s_\theta(x))$
needs $O(D)$ backprops in $D$ dimensions!
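A minimal sketch of the exact objective (PyTorch; `score_net` is an assumed stand-in for any network mapping $x$ to $s_\theta(x)$), showing where the $O(D)$ cost comes from: one backward pass per diagonal entry of the Jacobian.

```python
import torch

def exact_score_matching_loss(score_net, x):
    """tr(grad_x s_theta(x)) + 0.5 * ||s_theta(x)||^2, averaged over the batch."""
    x = x.requires_grad_(True)
    s = score_net(x)                                  # model score, shape (B, D)
    norm_term = 0.5 * (s ** 2).sum(dim=1)
    trace = torch.zeros(x.shape[0], device=x.device)
    for i in range(x.shape[1]):                       # O(D) backward passes
        grad_i = torch.autograd.grad(s[:, i].sum(), x, create_graph=True)[0]
        trace = trace + grad_i[:, i]                  # accumulate d s_i / d x_i
    return (trace + norm_term).mean()
```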
Score matching - sliced score matching
Project the score onto random vectors $v \sim p_v$ and minimize
$\mathbb{E}_{p_v}\,\mathbb{E}_{p(x)}\big[v^\top \nabla_x s_\theta(x)\, v + \tfrac{1}{2}\,(v^\top s_\theta(x))^2\big]$.
The term $v^\top \nabla_x s_\theta(x)\, v$ equals a Hessian-vector product (https://en.wikipedia.org/wiki/Hessian_automatic_differentiation)
-> a single backprop
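A minimal sketch of the sliced objective (PyTorch, Gaussian projection vectors; `score_net` is again an assumed stand-in): the Hessian-vector product comes from one backward pass, regardless of $D$.

```python
import torch

def sliced_score_matching_loss(score_net, x):
    x = x.requires_grad_(True)
    v = torch.randn_like(x)                     # random projection directions
    s = score_net(x)                            # model score, shape (B, D)
    sv = (s * v).sum()                          # sum over batch of v^T s_theta(x)
    # One backprop yields (grad_x s_theta) v for every sample in the batch.
    hvp = torch.autograd.grad(sv, x, create_graph=True)[0]
    trace_term = (hvp * v).sum(dim=1)           # v^T (grad_x s_theta) v
    norm_term = 0.5 * (s * v).sum(dim=1) ** 2   # 0.5 * (v^T s_theta)^2
    return (trace_term + norm_term).mean()
```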
Score matching - denoising score matching
Predict the score of the perturbed distribution $q_\sigma(\tilde{x}) = \int q_\sigma(\tilde{x} \mid x)\,p(x)\,dx$ instead of $p(x)$:
$\tfrac{1}{2}\,\mathbb{E}_{q_\sigma(\tilde{x} \mid x)\,p(x)}\big[\|s_\theta(\tilde{x}) - \nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x)\|^2\big]$
Minimizing the objective above gives the optimal score of the perturbed distribution,
but results can be noisy if $\sigma$ is large.
Let $q_\sigma(\tilde{x} \mid x)$ be Gaussian, $\mathcal{N}(\tilde{x};\, x,\, \sigma^2 I)$; then $\nabla_{\tilde{x}} \log q_\sigma(\tilde{x} \mid x) = -\dfrac{\tilde{x} - x}{\sigma^2}$.
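With the Gaussian choice, the objective needs no derivatives of the network at all; a minimal sketch (PyTorch, assumed `score_net`):

```python
import torch

def denoising_score_matching_loss(score_net, x, sigma=0.1):
    z = torch.randn_like(x)
    x_tilde = x + sigma * z              # sample from q_sigma(x_tilde | x)
    target = -z / sigma                  # = -(x_tilde - x) / sigma^2
    s = score_net(x_tilde)
    return 0.5 * ((s - target) ** 2).sum(dim=1).mean()
```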
Pitfalls
Now that we have a score function, let's sample through Langevin dynamics:
$x_{t+1} = x_t + \dfrac{\epsilon}{2}\,\nabla_x \log p(x_t) + \sqrt{\epsilon}\, z_t, \quad z_t \sim \mathcal{N}(0, I)$
Under some conditions, it is proved that $x_T$ is an exact sample from $p(x)$ (as $\epsilon \to 0$ and $T \to \infty$)!
But there are some problems...
Pitfalls - manifold hypothesis
The manifold hypothesis: data lie on a low-dimensional manifold embedded in the high-dimensional ambient space.
Problems
- Score function is inaccurate in the low-density regions.
- It is difficult to recover the relative weights between modes.
Pitfalls - inaccurate score function
$\nabla_x \log p(x)$ is not well-defined in low-density regions.
Since the score matching objective is estimated via Monte Carlo sampling from $p(x)$,
the model observes few samples from low-density regions, so the learned score there is inaccurate.
Score matching - recovering relative weights
Langevin dynamics fails to recover the relative weights between the two modes: where the modes barely overlap, the score of the mixture is nearly independent of the mixture weights.
Perturbed distribution
Low-density regions can be filled in by injecting noise.
But how should the noise strength be chosen? -> Use multiple scales and gradually decrease the variance.
Noise Conditional Score Network (NCSN)
A single network, conditioned on the noise level, estimates the scores of multiple perturbed data distributions.
Annealed Langevin Dynamics
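A minimal sketch of annealed Langevin dynamics following Algorithm 1 of Song & Ermon (2019); `score_net(x, sigma)` is an assumed noise-conditional network, and the step size is rescaled per noise level.

```python
import torch

def annealed_langevin(score_net, shape, sigmas, eps=2e-5, T=100):
    """sigmas: decreasing noise levels sigma_1 > ... > sigma_L."""
    x = torch.rand(shape)                          # initialize from uniform noise
    for sigma in sigmas:                           # anneal from coarse to fine
        alpha = eps * (sigma / sigmas[-1]) ** 2    # alpha_i = eps * sigma_i^2 / sigma_L^2
        for _ in range(T):
            z = torch.randn_like(x)
            x = x + 0.5 * alpha * score_net(x, sigma) + (alpha ** 0.5) * z
    return x
```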
Results
Generative Modeling by Estimating Gradients of the Data Distribution, NeurIPS 2019 (Oral)
Results
Score-Based Generative Modeling through Stochastic Differential Equations, ICLR 2021 (Oral)
Inverse Problem Solving
An inverse problem is a Bayesian inference problem: we want $p(x \mid y)$ when $p(y \mid x)$ is given, e.g.,
super-resolution, colorization, inpainting...
$\nabla_x \log p(x \mid y) = \underbrace{\nabla_x \log p(x)}_{\text{score matching}} + \underbrace{\nabla_x \log p(y \mid x)}_{\text{known forward process}}$
We can sample from $p(x \mid y)$ via Langevin dynamics!
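A minimal sketch of posterior sampling under these assumptions: `score_net` is the unconditional score model and `log_likelihood(x, y)` is a hypothetical differentiable implementation of the known forward process $\log p(y \mid x)$.

```python
import torch

def posterior_langevin(score_net, log_likelihood, y, x0, step=1e-4, n_steps=1000):
    x = x0.clone()
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        # grad_x log p(y|x) from the known, differentiable forward process
        grad_ll = torch.autograd.grad(log_likelihood(x, y).sum(), x)[0]
        posterior_score = score_net(x).detach() + grad_ll   # grad_x log p(x|y)
        z = torch.randn_like(x)
        x = x.detach() + 0.5 * step * posterior_score + (step ** 0.5) * z
    return x
```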
Inverse Problem Solving
Inverse problem solving without retraining!
Conclusion
Very flexible compared to NF or AR: the score network can be any function,
so modern DL architectures (ResNet, U-Net, etc.) can be used.
Sampling targets the exact $p(x)$, unlike VAEs, which optimize a surrogate lower bound.
GAN-level fidelity without a minimax game.
Naturally solves inverse problems.
References
Song, Yang, and Stefano Ermon. "Generative modeling by estimating gradients of the data distribution." arXiv
preprint arXiv:1907.05600 (2019).
Song, Yang, et al. "Score-based generative modeling through stochastic differential equations." arXiv preprint
arXiv:2011.13456 (2020).
Yang Song's blog (https://yang-song.github.io/blog/2021/score/)
Stefano Ermon's seminar (https://www.youtube.com/watch?v=8TcNXi3A5DI&t=562s)
