
Make sure Model RVs are distinct via their RNGs #4728

Closed
@brandonwillard

Description


Currently, all the Aesara RandomVariable Ops are being converted into in-place Ops (i.e. when a graph containing one of them is compiled and evaluated, the underlying shared variable RNG state is updated in place). This can be a mild convenience—and a potential efficiency gain—but it's not a good use of the RandomVariable API, because it can lead to confusion.

For example,

import numpy as np

import aesara
import aesara.tensor as at

import pymc3 as pm


with pm.Model() as test_model:
    X_rv = pm.Normal("x")
    Y_rv = pm.Normal("y")


test_fn = test_model.fn(Y_rv + 2 * X_rv)

aesara.dprint(test_fn.f)
# Elemwise{Composite{(i0 + (i1 * i0))}}[(0, 0)] [id A] ''   1
#  |normal_rv.1 [id B] 'y'   0
#  | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE160>) [id C]
#  | |TensorConstant{[]} [id D]
#  | |TensorConstant{11} [id E]
#  | |TensorConstant{0} [id F]
#  | |TensorConstant{1.0} [id G]
#  |TensorConstant{2.0} [id H]
# RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE160>) [id C]

The compiled graph contains only one RandomVariable.

The merge optimization removed the second RandomVariable because it was structurally identical to the first: from Aesara's point of view, the two terms describe the same computation (same Op, same RNG, same parameters), and the same computation need only be performed once.
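The effect can be mimicked outside of Aesara with a toy sketch (this is not Aesara's actual implementation): if a "random variable" node is just a hashable description of its Op, RNG, and parameters, then two nodes built from the same shared RNG compare equal and collapse into one, while nodes with distinct RNGs stay distinct.

```python
import numpy as np

# Hypothetical stand-in for a RandomVariable application: a hashable
# description (op name, identity of the RNG object, parameters).
def make_rv(op, rng, loc, scale):
    return (op, id(rng), loc, scale)

shared_rng = np.random.RandomState(123)

x = make_rv("normal", shared_rng, 0.0, 1.0)
y = make_rv("normal", shared_rng, 0.0, 1.0)

# Structural equality: a merge pass keeps only one node for both.
merged = {x, y}
assert x == y and len(merged) == 1

# With a distinct RNG for y, the two nodes no longer merge.
other_rng = np.random.RandomState(456)
y2 = make_rv("normal", other_rng, 0.0, 1.0)
assert x != y2 and len({x, y2}) == 2
```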

If we want to make sure that Aesara knows these two RandomVariables are distinct terms, we can provide a distinct RNG state for each:

with pm.Model() as test_model:
    X_rv = pm.Normal("x")

    rng = aesara.shared(np.random.RandomState(2023532), borrow=True)
    # or
    # rng = X_rv.owner.outputs[0]

    Y_rv = pm.Normal("y", rng=rng)

test_fn = test_model.fn(Y_rv + 2 * X_rv)

aesara.dprint(test_fn.f)
# Elemwise{Composite{(i0 + (i1 * i2))}}[(0, 0)] [id A] ''   2
#  |normal_rv.1 [id B] 'y'   1
#  | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE490>) [id C]
#  | |TensorConstant{[]} [id D]
#  | |TensorConstant{11} [id E]
#  | |TensorConstant{0} [id F]
#  | |TensorConstant{1.0} [id G]
#  |TensorConstant{2.0} [id H]
#  |normal_rv.1 [id I] 'x'   0
#    |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE5A0>) [id J]
#    |TensorConstant{[]} [id D]
#    |TensorConstant{11} [id E]
#    |TensorConstant{0} [id F]
#    |TensorConstant{1.0} [id G]
# RandomStateSharedVariable(<RandomState(MT19937) at 0x7F1A343DE5A0>) [id J]

In other words, we shouldn't rely on in-place RandomVariable Ops; we should use explicitly updated RandomStates instead. This doesn't prevent us from using in-place updates for efficiency, especially since there's already an optimization that converts non-in-place RandomVariables to in-place ones.
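The explicit-update style can be sketched in plain NumPy (a sketch of the idea, not the PyMC3/Aesara API): each draw returns the sample together with a *new* RNG object carrying the advanced state, and the caller threads that updated RNG into the next draw instead of relying on hidden in-place mutation.

```python
import numpy as np

def normal_draw(rng, loc=0.0, scale=1.0):
    """Draw a sample without mutating the input RNG; return the sample
    and a fresh RNG holding the advanced state."""
    new_rng = np.random.RandomState()
    new_rng.set_state(rng.get_state())  # copy, leave the input untouched
    sample = new_rng.normal(loc, scale)
    return sample, new_rng

rng0 = np.random.RandomState(2023532)
x, rng1 = normal_draw(rng0)
y, rng2 = normal_draw(rng1)  # y uses the updated RNG, so x and y differ

# rng0 was never mutated: repeating the draw reproduces x exactly.
x_again, _ = normal_draw(rng0)
assert x_again == x and x != y
```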

Finally, we can automatically make sure that each RandomVariable created within a Model context is distinct by simply updating Model.default_rng after each RandomVariable is created in Distribution.__new__ (i.e. after this step).
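A hypothetical sketch of that proposal (the class, method, and seeding scheme below are illustrative assumptions, not PyMC3 internals): the model context hands out its current default RNG to each new distribution and immediately replaces it, so no two RandomVariables ever share a state.

```python
import numpy as np

class ToyModel:
    """Illustrative stand-in for a Model that rotates default_rng."""

    def __init__(self, seed=0):
        self._seed = seed
        self.default_rng = np.random.RandomState(self._seed)

    def register_rv(self):
        # Hand the current default RNG to the new RandomVariable...
        rng = self.default_rng
        # ...then replace default_rng so the *next* RV sees a fresh,
        # distinct state and cannot be merged with this one.
        self._seed += 1
        self.default_rng = np.random.RandomState(self._seed)
        return rng

model = ToyModel()
rng_x = model.register_rv()
rng_y = model.register_rv()
assert rng_x is not rng_y  # each RV gets its own shared RNG
```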
