
LaTeX and str representations for v4 #4692


Closed
wants to merge 212 commits

Conversation

Spaak
Member

@Spaak Spaak commented May 13, 2021

This PR adds LaTeX and string representations for distributions and Models for v4, fixing #4494. Specifically, it adds the following functionality:

import numpy as np
import pymc3 as pm

# toy observed data (not shown in the original snippet)
obs_x = np.linspace(0, 1, 100)
obs_y = 2.0 + 3.0 * obs_x + np.random.normal(scale=0.5, size=100)

model = pm.Model()
with model:
    b_mu = pm.Normal('b_mu', mu=0, sigma=10)
    b_sigma = pm.HalfNormal('b_sigma', sigma=10)

    a_mu = pm.Normal('a_mu', mu=0, sigma=10)
    a_sigma = pm.HalfNormal('a_sigma', sigma=10)

    b = pm.Normal('b', mu=b_mu, sigma=b_sigma)
    a = pm.Normal('a', mu=a_mu, sigma=a_sigma)

    y_sigma = pm.HalfNormal('y_sigma', sigma=10)
    y = pm.Normal('y', mu=a + b * obs_x, sigma=y_sigma, observed=obs_y)

after which (in an interactive console session) we can do:

>>> y
y ~ N(f(a, b), y_sigma)

>>> model
   b_mu ~ N(0, 10)
b_sigma ~ N**+(0, 10)
   a_mu ~ N(0, 10)
a_sigma ~ N**+(0, 10)
      b ~ N(b_mu, b_sigma)
      a ~ N(a_mu, a_sigma)
y_sigma ~ N**+(0, 10)
      y ~ N(f(a, b), y_sigma)

Additionally, in Jupyter notebooks, the model now renders as LaTeX (MathJax, to be precise):

(screenshot of the LaTeX/MathJax rendering; image omitted here)

There are some changes w.r.t. how this functionality was implemented in v3:

  • The distributions' names are now provided by Aesara's RandomVariable._print_name (see the sketch after this list).
  • Distribution parameters are printed without names (e.g. N(0, 1) instead of N(mu=0, sigma=1)). This is a consequence of RandomVariable's interface. I considered using inspection to extract the parameter names (like before). However, Aesara follows numpy's naming (e.g. "loc" and "scale" for NormalRV), while PyMC3 uses its own ("mu" and "sigma"). For now I decided not to include parameter names to avoid this mismatch, but I'm open to implementing them later after all. In general I would actually prefer to have the parameter names.
  • Previously, in more complex models, parameter strings would often become very long, taking the form f(f(f(f(f(a)), f(f(f(b))), .... I've now changed this so that the entire graph feeding into a parameter is flattened and only f(a1, a2, ..., aN) remains, for every RandomVariable ai occurring in that graph. This should make the strings a lot easier to read.
  • I've grouped all printing-related code in a new module printing.
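For reference, a minimal sketch of where that print name can be inspected (assuming the Aesara/v4 API at the time of this PR; the exact tuple contents are illustrative):

import pymc3 as pm

with pm.Model():
    x = pm.Normal('x', mu=0, sigma=10)

# the Aesara RandomVariable Op behind the PyMC3 variable carries a
# (plain-text, LaTeX) pair of print names
print(x.owner.op._print_name)  # e.g. ("N", "\\operatorname{N}")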

I've not yet updated the tests (I think all the str/latex-related ones are failing now on v4); once we agree this is a good way forward, I will.

brandonwillard and others added 30 commits March 29, 2021 11:32
This value was not representative of its name.
…d basic dists

These changes can be summarized as follows:
- `Model` objects now track fully functional Theano graphs that represent all
relationships between random and "deterministic" variables.  These graphs are
called "sample-space" graphs.  `Model.unobserved_RVs`, `Model.basic_RVs`,
`Model.free_RVs`, and `Model.observed_RVs` contain these
graphs (i.e. `TensorVariable`s), which are generated by `RandomVariable` `Op`s.
- For each random variable, there is now a corresponding "measure-space"
variable (i.e. a `TensorVariable` that corresponds to said variable in a
log-likelihood graph).  These variables are available as `rv_var.tag.value_var`,
for each random variable `rv_var`, or via `Model.vars`.
- Log-likelihood (i.e. measure-space) graphs are now created for individual
random variables by way of the generic functions `logpt`, `logcdf`,
`logp_nojac`, and `logpt_sum` in `pymc3.distributions`.
- Numerous uses of concrete shape information stemming from `Model`
objects (e.g. `Model.size`) have been removed/refactored.
- Use of `FreeRV`, `ObservedRV`, `MultiObservedRV`, and `TransformedRV` has been
deprecated.  The information previously stored in these classes is now tracked
using `TensorVariable.tag`, and log-likelihoods are generated using the
aforementioned `log*` generic functions.
This commit changes `DictToArrayBijection` so that it returns a `RaveledVars`
datatype that contains the original raveled and concatenated vector along with
the information needed to revert it back to dictionary/variables form.

Simply put, the variables-to-single-vector mapping steps have been pushed away
from the model object and its symbolic terms and closer to the (sampling)
processes that produce and work with `ndarray` values for said terms.  In doing
so, we can operate under fewer unnecessarily strong assumptions (e.g. that the
shapes of each term are static and equal to the initial test points), and let
the sampling processes that require vector-only steps deal with any changes in
the mappings.
The approach currently being used is rather inefficient.  Instead, we should
change the `size` parameters for `RandomVariable` terms in the sample-space
graph(s) so that they match arrays of the inputs in the trace and the desired
number of output samples.  This would allow the compiled graph to vectorize
operations (when it can) and sample variables more efficiently in large batches.
Classes and functions removed:
- PyMC3Variable
- ObservedRV
- FreeRV
- MultiObservedRV
- TransformedRV
- ArrayOrdering
- VarMap
- DataMap
- _DrawValuesContext
- _DrawValuesContextBlocker
- is_fast_drawable
- _compile_theano_function
- vectorize_theano_function
- get_vectorize_signature
- _draw_value
- draw_values
- generate_samples
- fast_sample_posterior_predictive

Modules removed:
- pymc3.distributions.posterior_predictive
- pymc3.tests.test_random
* refactor Skewnormal and Rice distribution
* add test for rice b and skewnorm tau params
* change default parameter to b
* Add float32 xfail to `test_rice`

Co-authored-by: Farhan Reynaldo <[email protected]>
Co-authored-by: Ricardo <[email protected]>
@twiecki
Member

twiecki commented May 17, 2021

I agree with @Spaak that this code seems a bit leaner for what we're trying to accomplish here. And if I understand correctly, the linked code is not part of Aesara. @brandonwillard, or is there existing functionality you're referring to that we can leverage here?

I really like the refactoring into a separate file as well as the new output which looks much better.

twiecki
twiecki previously approved these changes May 17, 2021
@ricardoV94
Member

ricardoV94 commented May 17, 2021

Seems like this needs to be rebased. Curious whether the failing NUTS test will re-emerge... The other failing test was introduced by the temporarily reverted size PR, so it shouldn't be an issue for now.

Edit: Parameter names would indeed be nice to have

@Spaak
Member Author

Spaak commented May 17, 2021

Seems like this needs to be rebased. Curious whether the failing NUTS test will re-emerge... The other failing test was introduced by the temporarily reverted size PR, so it shouldn't be an issue for now.

I just rebased, curious to see what will happen with the tests indeed.

Edit: Parameter names would indeed be nice to have

Yep, I agree. I think that if all checks are ✔️ this could be merged, and I'll work on adding the parameter names after that. I may have to touch some Aesara code for that, too.

@brandonwillard
Contributor

brandonwillard commented May 18, 2021

It seemed to me that the bulk of the actual printing was done inside symbolic_pymc.printing

Yes, that's why I provided a link to it in that comment.

i.e. in novel code that would have to be added to PyMC3, anyway. (So "without adding much code" doesn't hold, as far as I understood it.)

The code in symbolic_pymc.theano.printing is doing considerably more—feature-wise—than this code, so a superficial glance at the number of lines is no way to reason about it. Instead, for a reasonable comparison with this PR, start by taking a look at how it provides custom printing to Theano graphs.

To clarify, RandomVariablePrinter.process provides the kind of random variable-only printing that this PR does. It does so using class-level fields that already exist in the RandomVariables implemented in Aesara: RandomVariable._print_name. That's essentially all that's needed to produce custom LaTeX output for RandomVariables. If you want a much more minimal example, see NormalRVPrinter.

Now, within RandomVariablePrinter.process, you'll notice that there's also logic for preambles and shapes. That logic, along with the corresponding VariableWithShapePrinter and PreamblePPrinter classes, takes up the vast majority of the symbolic_pymc.theano.printing module, but—again—they're additional features that go well beyond the scope of this PR.

You can see some of those features in this old example, where their dtypes are in mathematical format (e.g. multi-dimensional real-value spaces are formatted), the dimensions are labeled distinctly (e.g. Y.shape[0] is displayed as n^{Y_0} wherever it appears throughout the printed graph), etc. Also, the entire graph is printed; you can see the "deterministic" parts both separately and inside the random variables that depend on them. This isn't feasible outside of this graph printing framework—at least not without effectively reproducing it.

More importantly, the [theano|aesara].printing approach doesn't involve invasively changing a class type. That's a very extreme change with serious repercussions and limitations. For example, clones of these dynamically changed variables will not retain their new classes, so, any time a model's graph is manipulated in any way (e.g. when it's optimized, when the size of a random variable is changed, etc.), all these printing changes are gone. One would need to overwrite class types in numerous places to make these changes stick.

Also, what happens when this dynamic class change is applied to something that's already a subclass of TensorVariable? Will isinstance still work against the original subclass, or does this approach remove the subclass information? If it's the latter, this code will break things in very difficult to debug ways.

These concerns—along with a few others (e.g. lack of extensibility, costly graph walking within the built-in __str__ method, etc.)—make this approach very challenging.

@brandonwillard
Contributor

By the way, by "...get everything we want/need without adding much/any code to PyMC3..." I mean that we can add the generic LaTeX printing features from symbolic-pymc to Aesara and simply use them in PyMC3. That has been the plan for a while, but I haven't had the time to set up that PR—along with numerous other things from symbolic-pymc that should be in Aesara.

@twiecki
Member

twiecki commented May 18, 2021

@brandonwillard How do you mean "invasively changing a class type"? Are you referring to https://github.com/pymc-devs/pymc3/pull/4692/files#diff-2cba9a36ecb19a6894133cbc03512d4e0dd70c9f681250bf45dcd72f75a56a3dR165?

If this introduces unwanted side-effects we should talk about them. However, I think this PR will make future refactorings (once aesara has that functionality) much simpler because now everything is contained in one place. So in the interest of moving us forward I think we should merge.

@Spaak
Member Author

Spaak commented May 18, 2021

@brandonwillard:

It does so using class-level fields that already exist in the RandomVariables implemented in Aesara: RandomVariable._print_name.

This PR uses the same Aesara RandomVariable._print_name.

Also, the entire graph is printed

My point about this PR being leaner was specifically that it does not provide support for entire computational graphs, but only a quick overview of PyMC3 variables and probabilistic models (I wasn't making a point about the number of lines of code).

By the way, by "...get everything we want/need without adding much/any code to PyMC3..." I mean that we can add the generic LaTeX printing features from symbolic-pymc to Aesara and simply use them in PyMC3. That has been the plan for a while, but I haven't had the time to set up that PR—along with numerous other things from symbolic-pymc that should be in Aesara.

Ah, that clarifies things, you meant to add the code from symbolic-pymc to Aesara, rather than to PyMC3 (thereby leading to less code needed in PyMC3). Once the support is there in Aesara, we can always drop the functionality in this PR. Especially now that it's refactored into one module that should be easy to do, as @twiecki points out.

@twiecki & @brandonwillard:

How do you mean "invasively changing a class type"? Are you referring to https://github.com/pymc-devs/pymc3/pull/4692/files#diff-2cba9a36ecb19a6894133cbc03512d4e0dd70c9f681250bf45dcd72f75a56a3dR165?

Yes, that's what's being referred to, and this is worth more discussion. I agree that this may have side effects that are so far not showing up. The (only) reason I used this dynamic type patching is that Python's built-in str() will always call the __str__ that is defined on the object's class (see this answer on SO). If we were to do myvar.__str__ = somefun, str(myvar) would still not call somefun (unlike an ordinary method assignment). The reason that overriding built-in str and repr (which behaves the same way) is desirable is console support; only that way can we get behaviour like:

>>> y
y ~ N(f(a, b), y_sigma)

otherwise the user would have to type >>> pretty_print(y) or so to get the same.
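A minimal, self-contained illustration of that lookup behaviour (plain Python, not PyMC3 code):

class Foo:
    def __str__(self):
        return "class-level __str__"

f = Foo()
f.__str__ = lambda: "instance-level __str__"  # patch the instance attribute

print(str(f))       # "class-level __str__": str() looks __str__ up on the type
print(f.__str__())  # "instance-level __str__": explicit attribute access sees the patch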

In v3, I used the same technique, but only for Deterministic here. There, all Distributions were built up in an object-oriented hierarchy, so it was straightforward to override __str__ in such a way that str would use it. Now with v4, the objects in the user's interactive namespace are no longer instances of a PyMC3 Distribution but Aesara TensorVariables (produced by RandomVariable Ops). So in order to get custom PyMC3 behaviour in str, we need to patch the type.
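A toy illustration of the kind of type patch this refers to (plain Python, not the PR's actual code):

class TensorVariable:
    pass

class PrettyTensorVariable(TensorVariable):
    def __str__(self):
        return "y ~ N(f(a, b), y_sigma)"

y = TensorVariable()
y.__class__ = PrettyTensorVariable  # in-place class swap, so plain str()/repr() now find __str__
print(str(y))                       # "y ~ N(f(a, b), y_sigma)"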

I agree that this type patching is not the cleanest solution. (And actually, Python's refusal to use a patched var.__str__ while this works for ordinary methods seems unpythonic to me, but it's beyond our control.) One solution I can think of to ditch this is to implement something like the following in Aesara's Variable:

def _str_repr(self):
    # move everything currently in Variable.__str__ here
    ...

def __str__(self):
    return self._str_repr()

(Plus an analogous change for __repr__, and optionally a kwarg in _str_repr to switch between a plain-string and a LaTeX interface.) Then descendants/instances can simply patch var._str_repr to whatever is needed without having to change the type.
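A toy sketch of why that hook would avoid the type patching (again plain Python, not actual Aesara code):

class Variable:
    def _str_repr(self):
        return "default Variable repr"

    def __str__(self):
        return self._str_repr()

v = Variable()
v._str_repr = lambda: "b ~ N(b_mu, b_sigma)"  # ordinary attribute, so instance-level patching works
print(str(v))                                 # "b ~ N(b_mu, b_sigma)"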

If we want to go ahead with the general (simple) functionality in this PR (for now), but with the dynamic type patching taken out, I'll file an Aesara PR with the above change (see pymc-devs/pytensor#417). If that one can be merged, I can update this PR.

@twiecki
Member

twiecki commented May 18, 2021

Ah, that clarifies things, you meant to add the code from symbolic-pymc to Aesara, rather than to PyMC3 (thereby leading to less code needed in PyMC3).

Is that a path you think you can help with, just from looking at the code so far? I'm wondering how much work that would be and whether we should just get it right from the beginning.

Alternatively, the current solution with pymc-devs/pytensor#417 sounds like it would reduce the risk of side effects and improve the status quo, so I'm comfortable with that path too.

@michaelosthege michaelosthege changed the base branch from v4 to main June 19, 2021 16:28
@michaelosthege
Member

@Spaak I changed the base to main so we can delete the v4 branch.
Please check if it makes sense to rebase & revive this PR or close it.

I'm not sure what the state of LaTeX representations (and also pm.model_to_graphviz) is right now, but I think there are some things that we should continue to solve at the PyMC3 level. The registration of dims with the model object, for example, might be useful for pm.model_to_graphviz.

@michaelosthege michaelosthege mentioned this pull request Jun 29, 2021
@Spaak
Member Author

Spaak commented Jul 7, 2021

I'm closing this PR (rebasing is too much of a headache) and will file a new one in a few days. I intend to take the following approach:

  • Avoid any type patching or overriding of __str__ or __repr__, to allay @brandonwillard's valid worries about maintainability. This means we'll only get "nice" string representations on IPython consoles (or Jupyter etc.), but not in plain Python shells (see the sketch after this list).
  • Focus on the functionality/refactoring already in this PR (i.e. add relatively limited functionality to pymc3 itself), rather than immediately moving the code from symbolic_pymc.theano.printing over to Aesara, in the interest of quickly getting the UX back to what it was in v3. We can easily refactor later anyway, after the follow-up PR is merged.
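A rough sketch of the IPython-only route (a toy class, not the planned PyMC3 implementation): IPython and Jupyter look for rich-display hooks such as _repr_latex_, which can live on PyMC3's own objects without touching __str__ or the underlying Aesara types.

class Demo:
    def _repr_latex_(self):
        # Jupyter/IPython call this hook when displaying the object;
        # plain Python's str()/repr() never use it
        return r"$y \sim \operatorname{N}(f(a, b),\,y\_sigma)$"

Demo()  # renders as LaTeX in a notebook cell; falls back to the default repr in a plain shell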

@Spaak Spaak closed this Jul 7, 2021
@twiecki
Member

twiecki commented Jul 7, 2021

@Spaak Sounds great. CC @brandonwillard about this approach.

@Spaak Spaak deleted the pretty_printing_v4 branch July 13, 2021 09:45