20x Performance degradation when using theano shared variables #3818

Closed
@jvans1

Description of your problem

Hi,

I noticed a significant performance hit (roughly 20x) in pm.sample_posterior_predictive when the model uses Theano shared variables via pm.Data. Please correct me if I'm doing something wrong. If this is a bug, I'm happy to dig into it further if someone can point me in the right direction.

Please provide a minimal, self-contained, and reproducible example.

import pymc3 as pm
import numpy as np
Y = 95
N = 100
with pm.Model() as binomial_model1:
    pct = pm.Beta("pct", alpha=2, beta=2)
    pm.Binomial("obs", n=N, p=pct, observed=Y)
    binomial_traces1 = pm.sample(2000, tune=500, cores=2)
%%timeit
pm.sample_posterior_predictive(binomial_traces1, samples=5000, model=binomial_model1, progressbar=False)

This returns:
1.66 s ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

When I do the same thing but use Theano shared variables (via pm.Data), I see the performance hit:

Y = 95
N = 100
with pm.Model() as binomial_model2:
    Ys = pm.Data('Ys', Y)
    ns = pm.Data('Ns', N)
    pct = pm.Beta("pct", alpha=2, beta=2)
    pm.Binomial("obs", n=ns, p=pct, observed=Ys)
    binomial_traces2 = pm.sample(2000, tune=500, cores=2)
%%timeit
pm.sample_posterior_predictive(binomial_traces2, samples=5000, model=binomial_model2, progressbar=False)

This results in:

31.7 s ± 498 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
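
For anyone reproducing this outside a notebook, here is a minimal sketch of the same comparison using only the standard library. The helper name `time_call` is hypothetical (it is not part of PyMC3 or Theano), and the two constants are simply the %%timeit means reported above, not new measurements.

```python
import statistics
import time


def time_call(fn, repeats=7):
    """Run fn() `repeats` times; return (mean, stdev) of wall-clock seconds.

    Hypothetical stand-in for %%timeit when running outside a notebook.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


# Slowdown implied by the two %%timeit means reported above.
plain_seconds = 1.66   # binomial_model1: plain Python ints
shared_seconds = 31.7  # binomial_model2: pm.Data (Theano shared variables)
slowdown = shared_seconds / plain_seconds
print(f"slowdown: {slowdown:.1f}x")
```

With the models above already sampled, each posterior predictive call could be wrapped in a lambda and passed to `time_call`, for example `time_call(lambda: pm.sample_posterior_predictive(binomial_traces2, samples=5000, model=binomial_model2, progressbar=False))`.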

My notebook for this is here

Versions and main components

  • PyMC3 Version: 3.8
  • Theano Version: 1.0.4
  • Python Version: 3.7.4
  • Operating system: Ubuntu 18.04
  • How did you install PyMC3: conda
