Description
Currently, all the Aesara `RandomVariable` `Op`s are being converted into in-place `Op`s (i.e. when a graph containing one of them is compiled and evaluated, the underlying shared variable RNG state is updated in place). This can be a mild convenience, and a potential efficiency gain, but it's not a good use of the `RandomVariable` API, because it can lead to confusion.
For example,
```python
import numpy as np

import aesara
import aesara.tensor as at

import pymc3 as pm


with pm.Model() as test_model:
    X_rv = pm.Normal("x")
    Y_rv = pm.Normal("y")

test_fn = test_model.fn(Y_rv + 2 * X_rv)

aesara.dprint(test_fn.f)
# Elemwise{Composite{(i0 + (i1 * i0))}}[(0, 0)] [id A] ''   1
#  |normal_rv.1 [id B] 'y'   0
#  | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE160>) [id C]
#  | |TensorConstant{[]} [id D]
#  | |TensorConstant{11} [id E]
#  | |TensorConstant{0} [id F]
#  | |TensorConstant{1.0} [id G]
#  |TensorConstant{2.0} [id H]
# RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE160>) [id C]
```
The compiled graph contains only one `RandomVariable`. The merge optimizations removed the second `RandomVariable`, because it was identical to the first, and we need only produce the same samples once.

If we want to make sure that Aesara knows these two `RandomVariable`s are distinct terms, we can provide a distinct RNG state for each:
```python
with pm.Model() as test_model:
    X_rv = pm.Normal("x")
    rng = aesara.shared(np.random.RandomState(2023532), borrow=True)
    # or
    # rng = X_rv.owner.outputs[0]
    Y_rv = pm.Normal("y", rng=rng)

test_fn = test_model.fn(Y_rv + 2 * X_rv)

aesara.dprint(test_fn.f)
# Elemwise{Composite{(i0 + (i1 * i2))}}[(0, 0)] [id A] ''   2
#  |normal_rv.1 [id B] 'y'   1
#  | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE490>) [id C]
#  | |TensorConstant{[]} [id D]
#  | |TensorConstant{11} [id E]
#  | |TensorConstant{0} [id F]
#  | |TensorConstant{1.0} [id G]
#  |TensorConstant{2.0} [id H]
#  |normal_rv.1 [id I] 'x'   0
#    |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE5A0>) [id J]
#    |TensorConstant{[]} [id D]
#    |TensorConstant{11} [id E]
#    |TensorConstant{0} [id F]
#    |TensorConstant{1.0} [id G]
# RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE5A0>) [id J]
```
In other words, we shouldn't rely on in-place `RandomVariable` `Op`s; instead, we should use explicitly updated `RandomState`s. This doesn't prevent us from using in-place updates for efficiency, especially since there's already an optimization that converts non-in-place `RandomVariable`s to in-place ones.
Finally, we can automatically make sure that each `RandomVariable` created within a `Model` context is distinct by simply updating `Model.default_rng` after each `RandomVariable` is created in `Distribution.__new__` (i.e. after this step).