
Rewrite core of PyMC3 (upcoming v4) to rely on aesara (among many other improvements) #4500

Merged
merged 226 commits on Jun 5, 2021

Conversation

@twiecki (Member) commented Mar 3, 2021

No description provided.

@brandonwillard self-assigned this Mar 4, 2021
@michaelosthege added this to the vNext (4.0.0) milestone Mar 8, 2021
@ricardoV94 (Member) commented Mar 16, 2021

In case this helps:

  1. Since a80cf9a, prior sampling returns the same value for every draw.
import pymc3 as pm

with pm.Model() as m:
    x = pm.Normal('x', 0, 1)
    prior = pm.sample_prior_predictive(10)
    print(prior)
    
{'x': array([0.019066, 0.019066, 0.019066, 0.019066, 0.019066, 0.019066,
       0.019066, 0.019066, 0.019066, 0.019066])}
  2. For the HalfCauchy (and maybe other distributions) there may be a disconnect between the logp and the random arguments. Our logp considers only the scale argument, but the random method (from scipy) takes loc as well, and that's the first optional argument. I don't think there is yet any logic to connect the arguments between the random and the logp, right?

@brandonwillard (Contributor):

> 1. Since a80cf9a, prior sampling returns the same value for every draw.

Yes, that's because it's using the same RNG seed every time; the reason is that RandomVariable.inplace == False. This code was supposed to take care of that.
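
For context, here is a minimal sketch of the mechanism using aesara directly (the variable names are illustrative, and the behavior described is an assumption about aesara's RandomVariable/shared-RNG machinery): if the RNG shared variable feeding a RandomVariable is never updated, every call to the compiled function replays the same state and returns the same draw.

import numpy as np
import aesara
from aesara.tensor.random.basic import normal

# Shared RNG state consumed by the RandomVariable Op
rng = aesara.shared(np.random.RandomState(123))
x = normal(0, 1, rng=rng)

# Without an update for `rng`, the compiled function replays the same
# RNG state on every call and returns identical draws.
f_stale = aesara.function([], x)
print(f_stale(), f_stale())  # same value twice

# Explicitly updating the shared RNG (the first output of the RandomVariable
# node is the advanced RNG state) restores fresh draws, which is what the
# in-place optimization is meant to achieve automatically.
f_fresh = aesara.function([], x, updates={rng: x.owner.outputs[0]})
print(f_fresh(), f_fresh())  # different values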

@brandonwillard (Contributor) commented Mar 16, 2021

> 2. For the HalfCauchy (and maybe other distributions) there may be a disconnect between the logp and the random arguments. Our logp considers only the scale argument, but the random method (from scipy) takes loc as well, and that's the first optional argument. I don't think there is yet any logic to connect the arguments between the random and the logp, right?

Is there an example of this, or a test that's failing?

@ricardoV94 (Member) commented Mar 16, 2021

> 2. For the HalfCauchy (and maybe other distributions) there may be a disconnect between the logp and the random arguments. Our logp considers only the scale argument, but the random method (from scipy) takes loc as well, and that's the first optional argument. I don't think there is yet any logic to connect the arguments between the random and the logp, right?

> Is there an example of this, or a test that's failing?

The random tests are still disabled, so this wouldn't show up, right? I wanted to check with the prior predictive sampling, but right now that's also unusable.

There was an issue in the logp of the HalfCauchy that led to the addition of an unused parameter (not sure if related):
https://github.com/pymc-devs/pymc3/pull/4508/files/3db730e1b632e0fa023cd181c60d19de4aab6e48#r594550615

@ricardoV94 (Member) commented Mar 16, 2021

Managed to do it with .dist

import pymc3 as pm
import scipy.stats as st

pymc_samples = pm.HalfCauchy.dist(beta=.01, size=10_000).eval()
print(pymc_samples.mean(), pymc_samples.std())
# (6.647935041418325, 81.64505597965584)

# Should match this
scipy_samples = st.halfcauchy(loc=0, scale=.01).rvs(10_000)
print(scipy_samples.mean(), scipy_samples.std())
# (0.049328030173446065, 0.3650001369075893)

# But matches this
scipy_samples = st.halfcauchy(loc=0.01, scale=1).rvs(10_000)
print(scipy_samples.mean(), scipy_samples.std())
# (5.537612320455185, 73.96359661595363)

# V3
pymc_samples = pm.HalfCauchy.dist(beta=.01, shape=10_000).random()      
print(pymc_samples.mean(), pymc_samples.std())                          
# (0.05829908302484347, 1.122931711761449)

The HalfCauchy is notoriously hard to evaluate via sample moments, but with these parameters the comparison seems reliable.

@brandonwillard (Contributor):

> The random tests are still disabled, so this wouldn't show up, right?

The tests in pymc3.tests.test_distributions perform some sampling and those are running for all the converted Distributions; otherwise, I believe you're talking about pymc3.tests.test_distributions_random. Those tests are not enabled because they're redundant (for the RandomVariables that already exist in Aesara, at least).

> I wanted to check with the prior predictive sampling, but right now that's also unusable.

It's not unusable; you only need to set/change the seed between calls to pm.sample_*_predictive. Regardless, I've pushed a commit that correctly sets RandomVariable.inplace = True, so you shouldn't need to do that anymore.
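
As a concrete sketch of the seed-changing workaround (assuming sample_prior_predictive still accepts a random_seed argument on this branch, as it does in v3):

import pymc3 as pm

with pm.Model() as m:
    x = pm.Normal('x', 0, 1)
    # A different random_seed per call avoids replaying the same RNG state
    prior_a = pm.sample_prior_predictive(10, random_seed=1)
    prior_b = pm.sample_prior_predictive(10, random_seed=2)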

@ricardoV94 (Member):

Thanks Brandon.

I didn't mean it's unusable. I just didn't know how to make use of it :b

I confirmed my suspicion about the HalfCauchy, though. There is a mismatch between the old logp and the new random Op: the latter expects two arguments, while the former expects only one.

That's why the logp test would fail if the unused alpha argument was not added here: https://github.com/pymc-devs/pymc3/blob/a80cf9ac7dec48cc71a2924362884a4d8cc6e30d/pymc3/distributions/continuous.py#L2276

It's not too difficult to add this loc parameter to the logp, but I just wanted to make sure we want to do that. Do we want to adapt to the number of arguments expected by aesara, or customize our own random Ops in terms of the arguments we want to use?

@brandonwillard (Contributor):

> It's not too difficult to add this loc parameter to the logp, but I just wanted to make sure we want to do that. Do we want to adapt to the number of arguments expected by aesara, or customize our own random Ops in terms of the arguments we want to use?

If an adaptation is something as simple as inverting a parameter, then we'll always want to do that, but this case sounds like we'll need to do the latter (i.e. customize a RandomVariable). There's now an example of such a customization for Multinomial.
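
For illustration, a hypothetical sketch of that kind of customization (not the actual Multinomial example in the PR), assuming aesara's RandomVariable subclassing API: a HalfCauchy RandomVariable that takes only a scale parameter, with loc pinned to 0 so the draws agree with the scale-only logp.

import scipy.stats as st
from aesara.tensor.random.op import RandomVariable


class ScaleOnlyHalfCauchyRV(RandomVariable):
    """HalfCauchy parametrized by scale only, so random and logp agree."""

    name = "halfcauchy"
    ndim_supp = 0        # each draw is a scalar
    ndims_params = [0]   # a single scalar parameter: the scale (beta)
    dtype = "floatX"
    _print_name = ("HalfCauchy", "\\operatorname{HalfCauchy}")

    @classmethod
    def rng_fn(cls, rng, scale, size):
        # loc is fixed at 0, matching a logp that only uses the scale
        return st.halfcauchy.rvs(loc=0, scale=scale, size=size, random_state=rng)


halfcauchy_scale_only = ScaleOnlyHalfCauchyRV()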

@ricardoV94 (Member):

On another front:

import pymc3 as pm

with pm.Model() as m:
    x = pm.Uniform('x', 0, 1)
    y = pm.Uniform('y', 0, x)

# Evaluating the model logp at the same point should be deterministic,
# but each call returns a different value:
m.logp({'x_interval': 0, 'y_interval': 0})
# array(-2.83990034)
m.logp({'x_interval': 0, 'y_interval': 0})
# array(-2.09721045)
m.logp({'x_interval': 0, 'y_interval': 0})
# array(-2.23865389)

Also, transform=None is ignored: the interval-transformed variables are created anyway.

@michaelosthege (Member):

@ricardoV94 were the trailing __ on the transformed variables removed?

On master:

>>> m.logp({'x_interval': 0, 'y_interval': 0})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\osthege\Repos\pymc3-dev\pymc3\model.py", line 1565, in __call__
    return self.f(**point)
  File "C:\Users\osthege\AppData\Local\Continuum\miniconda3\envs\pm3-dev\lib\site-packages\aesara\compile\function\types.py", line 956, in __call__
    raise TypeError(
TypeError: Missing required input: x_interval__ ~ TransformedDistribution
>>> m.logp({'x_interval__': 0, 'y_interval__': 0})
array(-2.77258872)
>>> m.logp({'x_interval__': 0, 'y_interval__': 0})
array(-2.77258872)
>>> m.logp({'x_interval__': 0, 'y_interval__': 0})
array(-2.77258872)

@ricardoV94 (Member):

Yes, they were removed for the time being.

@brandonwillard (Contributor) commented Mar 16, 2021

> Also, transform=None is ignored: the interval-transformed variables are created anyway.

I'll add fixes for those in a minute.

@brandonwillard force-pushed the v4 branch 4 times, most recently from e6d7ae8 to af23c2f on March 27, 2021 08:55
@michaelosthege (Member) left a comment

Made these comments a while back, but forgot to hit submit.

brandonwillard and others added 9 commits March 29, 2021 11:32
This value was not representative of its name.
…d basic dists

These changes can be summarized as follows:
- `Model` objects now track fully functional Theano graphs that represent all
relationships between random and "deterministic" variables.  These graphs are
called "sample-space" graphs.  `Model.unobserved_RVs`, `Model.basic_RVs`,
`Model.free_RVs`, and `Model.observed_RVs` contain these
graphs (i.e. `TensorVariable`s), which are generated by `RandomVariable` `Op`s.
- For each random variable, there is now a corresponding "measure-space"
variable (i.e. a `TensorVariable` that corresponds to said variable in a
log-likelihood graph).  These variables are available as `rv_var.tag.value_var`,
for each random variable `rv_var`, or via `Model.vars`.
- Log-likelihood (i.e. measure-space) graphs are now created for individual
random variables by way of the generic functions `logpt`, `logcdf`,
`logp_nojac`, and `logpt_sum` in `pymc3.distributions`.
- Numerous uses of concrete shape information stemming from `Model`
objects (e.g. `Model.size`) have been removed/refactored.
- Use of `FreeRV`, `ObservedRV`, `MultiObservedRV`, and `TransformedRV` has been
deprecated.  The information previously stored in these classes is now tracked
using `TensorVariable.tag`, and log-likelihoods are generated using the
aforementioned `log*` generic functions.
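
A short sketch of how the API described above is meant to be used; the names follow this commit message, but the exact logpt call signature is an assumption:

import pymc3 as pm
from pymc3.distributions import logpt

with pm.Model() as model:
    x = pm.Normal("x", 0.0, 1.0)

rv = model.free_RVs[0]             # sample-space TensorVariable (a RandomVariable output)
value_var = rv.tag.value_var       # corresponding measure-space variable
logp_graph = logpt(rv, value_var)  # log-likelihood graph for this variable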
This commit changes `DictToArrayBijection` so that it returns a `RaveledVars`
datatype that contains the original raveled and concatenated vector along with
the information needed to revert it back to dictionary/variables form.

Simply put, the variables-to-single-vector mapping steps have been pushed away
from the model object and its symbolic terms and closer to the (sampling)
processes that produce and work with `ndarray` values for said terms.  In doing
so, we can operate under fewer unnecessarily strong assumptions (e.g. that the
shapes of each term are static and equal to the initial test points), and let
the sampling processes that require vector-only steps deal with any changes in
the mappings.
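
A minimal sketch of the described mapping (illustrative only; the map/rmap classmethods on DictToArrayBijection are an assumption about this branch's API):

import numpy as np
from pymc3.blocking import DictToArrayBijection

point = {"x": np.array(0.5), "y": np.array([1.0, 2.0])}
raveled = DictToArrayBijection.map(point)      # one flat vector plus the info to undo it
restored = DictToArrayBijection.rmap(raveled)  # back to the dictionary/variables form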
The approach currently being used is rather inefficient.  Instead, we should
change the `size` parameters for `RandomVariable` terms in the sample-space
graph(s) so that they match the arrays of inputs in the trace and the desired
number of output samples.  This would allow the compiled graph to vectorize
operations (when it can) and sample variables more efficiently in large batches.
Classes and functions removed:
- PyMC3Variable
- ObservedRV
- FreeRV
- MultiObservedRV
- TransformedRV
- ArrayOrdering
- VarMap
- DataMap
- _DrawValuesContext
- _DrawValuesContextBlocker
- is_fast_drawable
- _compile_theano_function
- vectorize_theano_function
- get_vectorize_signature
- _draw_value
- draw_values
- generate_samples
- fast_sample_posterior_predictive

Modules removed:
- pymc3.distributions.posterior_predictive
- pymc3.tests.test_random
themrzmaster and others added 22 commits May 17, 2021 18:33
* Refactor Flat and HalfFlat distributions

* Re-enable Gumbel logp test

* Remove redundant test
Co-authored-by: Ricardo <[email protected]>
Co-authored-by: Thomas Wiecki <[email protected]>
@twiecki marked this pull request as ready for review June 5, 2021 09:03
@twiecki changed the title V4 tracking PR Rewrite core of PyMC3 (upcoming v4) to rely on aesara (among many other improvements) Jun 5, 2021