Generative Models as Distributions of Functions
Understanding with PR12
Jaejun Yoo
(current) Postdoc @EPFL
(from July) Assistant Prof. @UNIST
PR-312, 11th April, 2021
Today’s contents
“For all datasets, we use an MLP with 3
hidden layers of size 128 … and an MLP
with 2 hidden layers of size 256 and 512”
“We performed all training on a single
2080Ti GPU with 11GB of RAM.”
Motivation and Main Problem
“Conventional signal representations are usually discrete.”
However, Mother Nature is continuous!
(well… up to the Planck constant…?)
2D Images · Audio · 3D Shapes
Motivation and Main Problem
Of course, these functions are usually not analytically tractable: it is impossible to "write down" the function that parameterizes a natural image as a mathematical formula.
Continuous representation?
Why hard?
Motivation and Main Problem
Why important?
• independent of spatial resolution (infinite resolution)
• Geometric transformation of images: zoom, rotation, super-resolution.
• Derivatives are well-defined.
Motivation and Main Problem
Why important?
[Figure: comparison of interpolation schemes: piecewise constant, bilinear, cubic spline]
Motivation and Main Problem
Spline: Variational properties
Explicit derivative!
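As a quick illustration of why explicit derivatives matter, here is a minimal sketch using SciPy's CubicSpline; the sampled sine signal is my own toy example, not data from the paper:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Fit a cubic spline to a few discrete samples of a 1D signal.
x = np.linspace(0.0, 1.0, 8)          # sample coordinates
y = np.sin(2 * np.pi * x)             # sampled signal values

spline = CubicSpline(x, y)

# The spline is a continuous representation: we can evaluate it
# (and its analytic derivative) at any coordinate, not just the samples.
x_fine = np.linspace(0.0, 1.0, 1000)
values = spline(x_fine)               # continuous interpolation
derivs = spline(x_fine, 1)            # first derivative, in closed form
```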
Continuous representation?
• DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation (Park et al. 2019)
• Occupancy Networks: Learning 3D Reconstruction in Function Space (Mescheder et al. 2019)
• IM-Net: Learning Implicit Fields for Generative Shape Modeling (Chen et al. 2018)
• … NeRF (PR-302)…
“Implicit Neural Representations approximate this function via a neural network!”
Motivation and Main Problem
Implicit Neural Representation!
Implicit Neural Representation
- Remarkably, the representation f_θ is independent of the number of pixels. The representation f_θ therefore, unlike most image representations, does not depend on the resolution of the image.
- The core property of these representations is that they scale with signal complexity and not with signal resolution.
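To make this concrete, below is a minimal PyTorch sketch of an implicit image representation, loosely following the quoted setup (an MLP with 3 hidden layers of size 128); the coordinate normalization and output parameterization are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ImplicitImage(nn.Module):
    """f_theta: maps a 2D coordinate (x, y) to an RGB value.

    The parameter count is fixed regardless of how many pixels we
    evaluate, so the representation scales with signal complexity,
    not signal resolution.
    """
    def __init__(self, hidden=128, layers=3):
        super().__init__()
        dims = [2] + [hidden] * layers + [3]
        blocks = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            blocks += [nn.Linear(d_in, d_out), nn.ReLU()]
        self.net = nn.Sequential(*blocks[:-1])  # no activation on the output

    def forward(self, coords):                  # coords: (N, 2) in [-1, 1]
        return self.net(coords)                 # features: (N, 3)

# Evaluate the same function on grids of any resolution.
f = ImplicitImage()
for res in (64, 256):
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, res),
                            torch.linspace(-1, 1, res), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    rgb = f(coords).reshape(res, res, 3)
```

The parameters of f are the same whether we evaluate at 64×64 or 256×256; only the evaluation grid changes.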
Learning Distributions of Functions
1. Parameterizing a distribution over neural
networks with a hypernetwork (Ha et al., 2017)
Overall Scheme
“Sample the weights of a neural network”
to obtain a function.
: Learning a distribution over functions f_θ is equivalent to learning a distribution over weights p(θ).
: Then p(θ), where θ = g_φ(z), is referred to as a neural function distribution (NFD).
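A minimal sketch of the hypernetwork idea, with a deliberately tiny 1-hidden-layer coordinate MLP as f_θ; all dimensions and helper names here are hypothetical, not the paper's:

```python
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    """g_phi: maps a latent z to the flattened weights theta of a small
    coordinate MLP f_theta. Sampling z ~ p(z) then yields a sample from
    the neural function distribution p(theta)."""
    def __init__(self, z_dim=64, theta_dim=195, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, theta_dim),
        )

    def forward(self, z):
        return self.net(z)  # flattened weights theta = g_phi(z)

def functional_mlp(coords, theta, in_dim=2, hidden=32, out_dim=3):
    """Evaluate a 1-hidden-layer coordinate MLP using generated weights."""
    i = 0
    def take(n, shape):
        nonlocal i
        w = theta[i:i + n].reshape(shape); i += n
        return w
    w1 = take(in_dim * hidden, (in_dim, hidden))
    b1 = take(hidden, (hidden,))
    w2 = take(hidden * out_dim, (hidden, out_dim))
    b2 = take(out_dim, (out_dim,))
    h = torch.relu(coords @ w1 + b1)
    return h @ w2 + b2

# theta_dim must match the weight count: 2*32 + 32 + 32*3 + 3 = 195
g = HyperNetwork(z_dim=64, theta_dim=195)
theta = g(torch.randn(64))
rgb = functional_mlp(torch.rand(10, 2), theta)  # one sampled function
```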
However! How do we get access to the
ground truth functions to train the network?
“We do have access to input/output
pairs of these functions through the
coordinates and features, allowing us to
learn function distributions without
operating directly on the functions!”
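For example, a single image already provides such input/output samples. A small sketch (the helper name is hypothetical):

```python
import torch

def image_to_pairs(img):
    """Convert an image tensor (H, W, 3) into a set of
    (coordinate, feature) pairs: we never need the underlying
    ground-truth function, only its input/output samples."""
    h, w, _ = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)   # (H*W, 2)
    feats = img.reshape(-1, 3)                              # (H*W, 3)
    return coords, feats
```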
2. Training this distribution with an adversarial
approach (Goodfellow et al., 2014).
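A generic non-saturating GAN step over (coordinate, feature) sets might look like the sketch below; the paper's exact objective and regularizers may differ, and every function passed in here is a placeholder:

```python
import torch
import torch.nn.functional as F

def gan_step(g_hyper, f_eval, discriminator, real_coords, real_feats,
             opt_g, opt_d, z_dim=64):
    """One non-saturating GAN step on sets of (coordinate, feature)
    pairs rather than on pixel grids."""
    theta = g_hyper(torch.randn(z_dim))
    fake_feats = f_eval(real_coords, theta)   # evaluate the sampled function

    # Discriminator: real pairs vs. generated pairs at the same coords.
    d_real = discriminator(real_coords, real_feats)
    d_fake = discriminator(real_coords, fake_feats.detach())
    loss_d = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: fool the discriminator.
    d_fake = discriminator(real_coords, fake_feats)
    loss_g = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```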
The coordinate input is passed through a positional encoding (random Fourier features).
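A sketch of such an encoding in the spirit of random Fourier features (Tancik et al., 2020); the feature count and the scale of B are illustrative choices:

```python
import torch

def fourier_features(coords, B):
    """Random Fourier feature encoding
    gamma(x) = [cos(2*pi*B x), sin(2*pi*B x)].
    B is a random, fixed projection matrix of shape (in_dim, m)."""
    proj = 2 * torch.pi * coords @ B                                  # (N, m)
    return torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1)     # (N, 2m)

B = torch.randn(2, 128) * 10.0  # the scale controls frequency bandwidth
enc = fourier_features(torch.rand(5, 2), B)
```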
Learning Distributions of Functions
Overall Scheme
NFD
Now we know how to design a network to learn continuous functions!
Learning Distributions of Functions
Overall Scheme
Discriminator
But the data we consider may not necessarily lie on a grid, in which case it is not possible to use convolutional discriminators. Our discriminator should be able to distinguish between real and fake sets of coordinate and feature pairs.
Point Cloud Discriminator
Point Convolution
In contrast to regular convolutions,
where the convolution kernels are only
defined at certain grid locations, the
convolution filters in PointConv are
parameterized by an MLP mapping
coordinates to kernel values.
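A simplified sketch of this idea; the neighbor lookup (e.g., kNN) is assumed to be given, and the density re-weighting and efficient factorization of the full PointConv (Wu et al.) are omitted:

```python
import torch
import torch.nn as nn

class PointConv(nn.Module):
    """Point convolution: the kernel is an MLP that maps a relative
    coordinate offset to a (c_in x c_out) weight matrix, so the filter
    is defined at arbitrary (off-grid) locations."""
    def __init__(self, c_in, c_out, hidden=64):
        super().__init__()
        self.c_in, self.c_out = c_in, c_out
        self.kernel = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, c_in * c_out),
        )

    def forward(self, coords, feats, neighbors):
        # coords: (N, 2), feats: (N, c_in), neighbors: (N, K) indices
        offsets = coords[neighbors] - coords[:, None, :]          # (N, K, 2)
        w = self.kernel(offsets).view(-1, neighbors.shape[1],
                                      self.c_in, self.c_out)      # (N, K, c_in, c_out)
        h = feats[neighbors]                                      # (N, K, c_in)
        return torch.einsum("nki,nkio->no", h, w)                 # (N, c_out)
```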
Experiments
“For all datasets, we use an MLP with 3
hidden layers of size 128 … and an MLP
with 2 hidden layers of size 256 and 512”
“We performed all training on a single
2080Ti GPU with 11GB of RAM.”
“Remarkably, such a simple architecture
is sufficient for learning rich distributions
of images and 3D shapes.”
“Use the exact same model for both
images and 3D shapes except for the
input and output dimensions of the
function representation.”
Implementation Setup
Results
2D Image generation
• Samples from our model trained on CelebAHQ.
• 64×64 (top) and 128×128 (bottom)
• Each image corresponds to a function which
was sampled from our model and then
evaluated on the grid.
• To produce this figure we sampled 5 batches
and chose the best batch by visual inspection.
Results
“To infinity and beyond!”
- Buzz Lightyear, Toy Story
Super-resolution
[Figure: NFD sample rendered at 64×64 and re-rendered at 256×256 vs. bicubic upsampling to 256×256; NFD sample at 28×28 re-rendered at 256×256 vs. bicubic 256×256]
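The mechanism is simply denser evaluation: render the same continuous function on a finer coordinate grid. A self-contained sketch with a stand-in function:

```python
import torch

def render(f, res):
    """Evaluate a continuous image function f: (N, 2) -> (N, C) on a
    res x res grid. The same f can be rendered at any resolution."""
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, res),
                            torch.linspace(-1, 1, res), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    return f(coords).reshape(res, res, -1)

# e.g., with a function sampled from the model; here a stand-in:
f = lambda c: torch.sin(10 * c).sum(-1, keepdim=True)
low, high = render(f, 64), render(f, 256)   # same f, two grids
```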
Results
3D shapes
Voxel grids from Choy et al. (2016) representing the chairs category from the ShapeNet (Chang et al., 2015) dataset. The dataset contains 6,778 chairs, each of dimension 32³. For each 3D model, we uniformly subsample K = 4096 of the 32³ = 32,768 points and use them for training.
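A sketch of this subsampling step, assuming a binary 32×32×32 occupancy tensor:

```python
import torch

def subsample_voxels(voxels, k=4096):
    """voxels: (32, 32, 32) occupancy grid. Uniformly subsample k
    (coordinate, occupancy) pairs out of 32^3 = 32,768 for training."""
    r = voxels.shape[0]
    axes = [torch.linspace(-1, 1, r)] * 3
    grid = torch.stack(torch.meshgrid(*axes, indexing="ij"),
                       dim=-1).reshape(-1, 3)       # (r^3, 3) coordinates
    occ = voxels.reshape(-1, 1).float()             # (r^3, 1) occupancies
    idx = torch.randperm(grid.shape[0])[:k]
    return grid[idx], occ[idx]
```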
Conclusion
Summary of Contributions (I think)
• A step towards making implicit neural representation methods genuinely useful for modeling datasets rather than individual data points.
• The first framework to model data of this complexity in an entirely continuous fashion.
• Independence of resolution and the ability to operate off the grid.
• A unique way of using point cloud discriminators.
Things to discuss…
• What kinds of studies could be derived from this?
• Architectural developments (better quality)?
• If so, how? What would be helpful?
• Other applications?
• Again, compute-driven AI vs. human-knowledge-based approaches?
• Big models vs. inductive biases?
• Etc.?