[DL Reading Group] Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies

DEEP LEARNING JP
[DL Papers]
“Life-Long Disentangled Representation Learning with
Cross-Domain Latent Homologies” (NIPS2018)
Yusuke Iwasawa, Matsuo Lab
http://deeplearning.jp/

DEEP LEARNING JP
[DL Papers]
“Unsupervised Disentangled Representation Learning”
Yusuke Iwasawa, Matsuo Lab
http://deeplearning.jp/

Bibliographic information
• Title: “Life-Long Disentangled Representation Learning with
Cross-Domain Latent Homologies”
• Authors:
– Alessandro Achille, Tom Eccles, Loic Matthey, Christopher P Burgess,
Nick Watters, Alexander Lerchner, Irina Higgins
– The first author is at UCLA; the others are at DeepMind
• Why this paper was chosen
– The word "disentangle" stood out at NIPS
– Lifelong learning matters (from the standpoint of research on intelligence)

Disentanglement in NIPS2018
VAE (β-VAE) family
• “Life-Long Disentangled Representation Learning with Cross-Domain Latent
Homologies”
• “Isolating Sources of Disentanglement in Variational Autoencoders”
• “Learning Disentangled Joint Continuous and Discrete Representations”
• “Learning to Decompose and Disentangle Representations for Video Prediction”
Others
• “A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation”
• “Image-to-image translation for cross-domain disentanglement”
• “Learning Deep Disentangled Embeddings with the F-Statistic Loss”

Agenda
• Disentangled Representation Learning
• Methods for Disentangled Representation Learning
– InfoGAN [Chen, NIPS2016]
– β-VAE [Higgins, ICLR2017]
– Advances in β-VAE [Kim, ICML2018; Chen, NIPS2018]
• Disentanglement for Lifelong Learning [Achille, NIPS2018]

What is Disentangled Representation Learning?
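• disentangle = to untangle
• Disentangled representation learning: learning representations in which the underlying factors are not tangled together
• Example: the factors that make up a face image
– Gender
– Face orientation
– Hair length
– Presence or absence of glasses
– etc.
These factors can inherently be controlled independently of one another
=> We would like the representations learned by a NN to have the same property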

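Why is Disentanglement Important?
1. Humans seem to learn representations this way too
– The position of a face and the size of its eyes are probably represented separately
2. Easier to interpret
3. Efficient (can be represented with a minimal number of units)
4. Downstream tasks become easier to solve (or so it seems)
– In particular, when considering transfer, having multiple factors mixed together is troublesome
• Concrete applications
– Concept Learning [Higgins, ICLR2018]
– Reinforcement Learning [Higgins, ICML2017]
– Lifelong Learning [Achille, NIPS2018]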

Difficulty
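1. It must be (or should preferably be) unsupervised
– Deep learning may disentangle representations on its own (especially with supervision)
– Labeling every factor for every image by hand is impractical
2. The method must be predictable
– Not "we tried it and it happened to be disentangled"; we want grounds for expecting disentanglement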

Agenda
• Methods for Unsupervised Disentangled Representation Learning

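Two representative lineages
• InfoGAN [Chen, NIPS2016]
– GAN-based
– Trains so that an image generated from a factorizable latent code retains information about that latent code
• β-VAE [Higgins, ICLR2017]
– VAE-based
– Trains so that the posterior q(z|x) approaches a factorizable prior p(z)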

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative
Adversarial Nets (NIPS2016)
Xi Chen et al.
Objective = standard GAN term + maximization of the mutual information between the latent code c and the generated image
• D: Discriminator
• G: Generator
• z: noise
• c: factorizable latent code (e.g., c ~ Cat(K=10, p=0.1) or c ~ Unif(-1, 1))
• λ: weighting parameter
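
The equation that these annotations label did not survive in this transcript. As a hedged reconstruction from the InfoGAN paper, using the symbols defined above (V(D, G) is the standard GAN value function and I(·;·) is mutual information; in practice the paper replaces I by a variational lower bound computed with an auxiliary network, which is omitted here):

```latex
\min_{G}\max_{D}\; V_I(D, G) \;=\; V(D, G) \;-\; \lambda\, I\bigl(c;\, G(z, c)\bigr),
\qquad
V(D, G) \;=\; \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
          + \mathbb{E}_{z,\, c}\bigl[\log\bigl(1 - D(G(z, c))\bigr)\bigr]
```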

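Problems with InfoGAN
• Training is difficult because it is GAN-based
– This may be mitigated by WGAN and related methods
– Adding the mutual-information constraint reportedly also reduces sample diversity
(according to the β-VAE paper; it may simply depend on the size of the noise z)
• Choosing the prior p(c) is difficult (it uses knowledge about the task)
– Example: 10 categories for MNIST
• No inference distribution (network) because it is GAN-based
– Methods such as ALI exist but have not been widely adopted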

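β-VAE: LEARNING BASIC VISUAL CONCEPTS WITH
A CONSTRAINED VARIATIONAL FRAMEWORK (ICLR2017)
Irina Higgins et al.
• Basic idea: constrain the latent variable z so that its distribution approaches a factorizable distribution
• Applying the method of Lagrange multipliers to this constraint gives a VAE with an extra parameter β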
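
The equation on this slide is missing from the transcript. The following is a minimal sketch of the β-weighted objective, assuming the usual β-VAE form E_q[log p(x|z)] − β·KL(q(z|x)||p(z)) with a diagonal-Gaussian posterior and a standard normal prior. The function names and the default `beta=4.0` are illustrative assumptions, not values taken from the slides.

```python
import math

def gaussian_kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

def beta_vae_loss(recon_nll, mu, logvar, beta=4.0):
    """Loss to minimize: reconstruction NLL + beta * KL (beta = 1 is a plain VAE)."""
    return recon_nll + beta * gaussian_kl_to_standard_normal(mu, logvar)
```

Setting β above 1 strengthens the pull toward the factorized prior, which is the constraint described above.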

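Problem with β-VAE: the trade-off controlled by β
• With β=150, reconstruction does not work well
• Simply pushing toward the Gaussian prior flattens the distribution q(z|x) (the codes z of different inputs come to overlap)
Figure excerpted from “Understanding disentangling in β-VAE”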

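Problem with β-VAE: the trade-off controlled by β
• The KL term decomposes into the mutual information between x and z and the KL between q(z) and p(z)
• Naturally, reconstruction is impossible unless the mutual information is preserved
=> We want to constrain only KL(q(z)||p(z))
Excerpted from “Disentangling by Factorising”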
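
Written out, the decomposition referred to here is the standard identity below, where q(z) denotes the aggregate posterior (a notational assumption; the slide's own equation is not in the transcript):

```latex
\mathbb{E}_{p_{\mathrm{data}}(x)}\Bigl[\mathrm{KL}\bigl(q(z \mid x)\,\|\,p(z)\bigr)\Bigr]
  \;=\; I_q(x; z) \;+\; \mathrm{KL}\bigl(q(z)\,\|\,p(z)\bigr),
\qquad
q(z) \;=\; \mathbb{E}_{p_{\mathrm{data}}(x)}\bigl[q(z \mid x)\bigr]
```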

Follow-up papers addressing this
[Burgess+, NIPS2017] “Understanding disentangling in β-VAE”
[Kim+, ICML2018] “Disentangling by Factorising”
[Chen+, NIPS2018] “Isolating Sources of Disentanglement in VAE”

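Understanding disentangling in β-VAE (NIPS2017)
Christopher P. Burgess et al.
Controlled Capacity Increase β-VAE (CCI-VAE)
Push the KL toward a target C
(relaxing the information bottleneck on z)
• C: target information capacity
• C is gradually increased during training
• (z is allowed to acquire progressively more information)
• In the experiments, C is increased linearly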
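
A minimal sketch of the linear capacity schedule described above, assuming the loss takes the form reconstruction NLL + γ·|KL − C|. The defaults for `gamma`, `c_max`, and `anneal_steps` are illustrative assumptions and should be tuned per dataset.

```python
def linear_capacity(step, c_max=25.0, anneal_steps=100_000):
    """Target capacity C: grows linearly from 0 to c_max, then stays constant."""
    return c_max * min(step, anneal_steps) / anneal_steps

def cci_vae_loss(recon_nll, kl, step, gamma=1000.0, c_max=25.0, anneal_steps=100_000):
    """Penalize the distance between the KL term and the current target C."""
    c = linear_capacity(step, c_max, anneal_steps)
    return recon_nll + gamma * abs(kl - c)
```

Because the penalty is on |KL − C| rather than on KL itself, the latent code is encouraged to carry exactly C nats at each point in training instead of as little as possible.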

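Isolating Sources of Disentanglement in VAE (NIPS2018)
Ricky T. Q. Chen et al.
β-TCVAE
• Mutual information between x and z: 0 when x and z are independent
(bad if this becomes small)
• Independence among the z_i (called the Total Correlation)
=> we want this to be small
Note: q(z) is estimated by importance sampling
Note: set α=γ=1 and increase only β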
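
The annotated equation is not in the transcript. As a hedged reconstruction in the notation used here (x for the data, z_j for the latent dimensions), the KL term splits into three parts, and β-TCVAE weights them with α, β, and γ respectively:

```latex
\mathbb{E}_{p(x)}\Bigl[\mathrm{KL}\bigl(q(z \mid x)\,\|\,p(z)\bigr)\Bigr]
  = \underbrace{I_q(x; z)}_{\text{mutual information}}
  + \underbrace{\mathrm{KL}\Bigl(q(z)\,\Big\|\,\textstyle\prod_j q(z_j)\Bigr)}_{\text{total correlation}}
  + \underbrace{\textstyle\sum_j \mathrm{KL}\bigl(q(z_j)\,\|\,p(z_j)\bigr)}_{\text{dimension-wise KL}}

\mathcal{L}_{\beta\text{-TC}}
  = \mathbb{E}_{q(z \mid x)\,p(x)}\bigl[\log p(x \mid z)\bigr]
  - \alpha\, I_q(x; z)
  - \beta\, \mathrm{KL}\Bigl(q(z)\,\Big\|\,\textstyle\prod_j q(z_j)\Bigr)
  - \gamma\, \textstyle\sum_j \mathrm{KL}\bigl(q(z_j)\,\|\,p(z_j)\bigr)
```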

Disentangling by Factorising (ICML2018)
Hyunjik Kim and Andriy Mnih
https://www.slideshare.net/DeepLearningJP2016/dldisentangling-by-factorising
Objective = ordinary VAE term + Total Correlation term
• How do we obtain q(z)?
• MCMC and the like are tedious (and the distribution is multimodal to begin with)
=> Density Ratio Trick (see figure)
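
The figure for the density ratio trick is not in the transcript, so here is a minimal sketch of its two ingredients under stated assumptions. A discriminator (not shown, and hypothetical here) is trained to tell samples of q(z) from samples of ∏_j q(z_j); the latter are obtained by shuffling each latent dimension independently across the batch. If the discriminator outputs D(z) = sigmoid(logit), then log(D/(1−D)) equals the logit, so the total correlation is approximated by the mean logit over samples from q(z).

```python
import random

def permute_dims(batch, rng):
    """Shuffle each latent dimension independently across the batch.

    Rows of `batch` are samples from q(z); the result approximates
    samples from the product of marginals prod_j q(z_j).
    """
    n, d = len(batch), len(batch[0])
    columns = []
    for j in range(d):
        col = [batch[i][j] for i in range(n)]
        rng.shuffle(col)
        columns.append(col)
    return [[columns[j][i] for j in range(d)] for i in range(n)]

def tc_estimate(logits):
    """Mean discriminator logit on samples from q(z), i.e. mean log(D / (1 - D))."""
    return sum(logits) / len(logits)
```

This avoids evaluating q(z) directly, which is the difficulty raised in the bullets above.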

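Summary so far
• Disentanglement is important
• Representative method 1: InfoGAN
– Difficulties stemming from GANs (optimization, no inference network)
• Representative method 2: β-VAE
– Trade-off between reconstruction and disentanglement
=> Various follow-up studies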

Agenda
• Methods for Disentangled Representation Learning

What is Lifelong Learning (Continual Learning)?
• Aspect1: “The ability to acquire new knowledge from a sequence
of experiences to solve progressively more tasks, while
maintaining performance on previous ones”
• Aspect2: “The ability to sensibly reuse previously learnt
representations in new domains”
• Acquire the knowledge needed to solve tasks that arrive one after another, quickly and without forgetting past information

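Why is Lifelong Learning Important?
• Scientific: humans do it too (again)
– It looks like intelligence
– If anything, it points in the direction of general-purpose AI
• Engineering: unless past knowledge can be reused well, large amounts of data will always be needed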

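Proposal: Disentanglement for Lifelong Learning
Disentanglement Prior
• The sequence of tasks that arises in the real world should share some factors
– e.g., the laws of physics and chemistry stay the same
• Given a disentangled representation that describes each task minimally (and a means of deciding which factors are useful for each task), could we solve a variety of tasks without forgetting?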

Difficulty
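• β-VAE (and ordinary disentanglement methods in general) assumes that the data distribution and generative process do not change
• In lifelong learning this is clearly false (because the task changes)
• => Extend β-VAE to lifelong learning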

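Concrete method: assumptions about the data distribution
• S = {s_1, s_2, s_3, …, s_K}: K environments (tasks)
• Z = {z_1, z_2, z_3, …, z_N}: data-generating factors shared across all environments
• Z^s ⊆ Z: latent factors relevant to environment s
• a^s: binary mask with a^s_n = 1 if z_n ∈ Z^s
• x^s ~ p(·|z^s, s) is the generative process
– That is, the data of environment s are generated from the factors z^s relevant to that environment
(Figure: illustration of the assumptions about the data distribution)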
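
To make these assumptions concrete, here is a toy sketch in which two environments share one factor. The number of factors, the environment names, the masks, and the standard normal prior are all hypothetical choices for illustration.

```python
import random

N_FACTORS = 4  # hypothetical number of shared generative factors

# Hypothetical masks a^s: entry n is 1 if factor n is relevant in environment s.
MASKS = {
    "s1": [1, 1, 0, 0],
    "s2": [0, 1, 1, 1],
}

def sample_relevant_factors(env, rng):
    """Draw all factors from N(0, 1) and keep only those in Z^s (index -> value)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(N_FACTORS)]
    return {n: z[n] for n in range(N_FACTORS) if MASKS[env][n] == 1}

def shared_factors(env_a, env_b):
    """Indices of factors used by both environments (the cross-domain overlap)."""
    return [n for n in range(N_FACTORS) if MASKS[env_a][n] and MASKS[env_b][n]]
```

The overlap returned by `shared_factors` is what a lifelong learner can reuse when it moves from one environment to the next.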

Reparameterization
Variational Autoencoder with Shared Embeddings (VASE)
• Almost the same as CCI-VAE
• However, modeling q(z^s|x^s) and estimating s are non-trivial
Reference: Controlled Capacity Increase β-VAE (CCI-VAE)

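q(z|s)
• For z that are (believed to be) part of the generative process of x: the ordinary VAE posterior
• For z that are believed not to be: simply the prior
Here a^s is set to (1) 1 if the atypicality score defined below is at or below a threshold, and (2) 0 otherwise
• Atypicality score: the KL between the average behavior of a given z in a given s and the prior
(Intuition: if a given z is part of the generative process of x, then as training proceeds it should, on average, approach the prior)
Note: atypicality = any state that is not typical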
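
A minimal sketch of the masking rule described above, under a simplifying assumption of my own: the batch-averaged posterior of each latent dimension is moment-matched to a single Gaussian so that its KL to the N(0, 1) prior has a closed form. The paper's exact estimator may differ, and `threshold` is a hypothetical hyperparameter.

```python
import math

def atypicality(mus, logvars):
    """KL between a Gaussian moment-matched to the batch-averaged posterior
    of one latent dimension and the N(0, 1) prior."""
    n = len(mus)
    mean = sum(mus) / n
    # Variance of an equal-weight Gaussian mixture: E[var_i + mu_i^2] - mean^2
    var = sum(math.exp(lv) + m * m for m, lv in zip(mus, logvars)) / n - mean * mean
    var = max(var, 1e-12)  # guard against floating-point cancellation
    return 0.5 * (var + mean * mean - 1.0 - math.log(var))

def latent_mask(batch_mus, batch_logvars, threshold):
    """Mask entry is 1 if the dimension's atypicality is below the threshold, else 0."""
    n_dims = len(batch_mus[0])
    mask = []
    for j in range(n_dims):
        score = atypicality([m[j] for m in batch_mus],
                            [lv[j] for lv in batch_logvars])
        mask.append(1 if score < threshold else 0)
    return mask
```

For example, a dimension whose posterior means spread around 0 across the batch gets mask 1, while a dimension whose posteriors all sit far from 0 with small variance gets mask 0 and falls back to the prior.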

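Countermeasure against Catastrophic Forgetting: hallucinating
• Forgetting past information is a problem
• Periodically ensure that samples generated from a past snapshot of the model are still modeled correctly by the current version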
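
A toy sketch of this idea, with loud assumptions: the encoders and decoders are plain Python callables on lists of floats, squared Euclidean distance stands in for whatever distances the paper uses, and conditioning on the environment is omitted. All names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def dreaming_loss(old_encode, old_decode, new_encode, new_decode,
                  latent_dim, n_samples, rng):
    """Average disagreement between a snapshot and the current model,
    measured on samples hallucinated by the snapshot."""
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        x_dream = old_decode(z)  # sample imagined by the past snapshot
        # The current encoder should map the dream to the same code as before.
        total += sq_dist(new_encode(x_dream), old_encode(x_dream))
        # The current decoder should still reproduce the dream from z.
        total += sq_dist(new_decode(z), x_dream)
    return total / n_samples
```

Adding such a term to the training loss penalizes drift away from the snapshot without storing any past data.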

Experiment 3. Dealing with ambiguity

Experiment 5. Imagination-driven exploration

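Summary
• Extends β-VAE into a form suited to lifelong learning
– Ordinary β-VAE does not account for changes in the data distribution
– Concretely, training proceeds under the assumption that multiple environments share generative factors
• Catastrophic forgetting is avoided by dreaming
• Effectiveness is confirmed through extensive experiments
– See the paper for details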
