
BUG: Issue with Ordered Transform in Ordered Logistic API docs example #6610

Closed
@NathanielF

Describe the issue:

The API docs for the OrderedLogistic class recommend using the ordered transform to constrain the cutpoints for ordinal regression. However, the provided example fails with an error reporting that the random variable for the cutpoints lacks a shape attribute.

image

On the latest version:
image

Reproducible code example:

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import pytensor as pt

# Generate data for a simple 1 dimensional example problem
n1_c = 300; n2_c = 300; n3_c = 300
cluster1 = np.random.randn(n1_c) + -1
cluster2 = np.random.randn(n2_c) + 0
cluster3 = np.random.randn(n3_c) + 2

x = np.concatenate((cluster1, cluster2, cluster3))
y = np.concatenate((1*np.ones(n1_c),
                    2*np.ones(n2_c),
                    3*np.ones(n3_c))) - 1

# Ordered logistic regression
with pm.Model() as model:
    cutpoints = pm.Normal("cutpoints", mu=[-1,1], sigma=10, shape=2, 
                          transform=pm.distributions.transforms.Ordered)
    y_ = pm.OrderedLogistic("y", cutpoints=cutpoints, eta=x, observed=y)
    idata = pm.sample()
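
For readers less familiar with the model, here is a minimal NumPy sketch of what the cutpoints are used for, assuming the standard ordered logistic parameterisation in which the cumulative probability of category k is sigmoid(c_k - eta). The function name ordered_logistic_probs is made up for illustration and is not part of PyMC.

```python
import numpy as np


def ordered_logistic_probs(eta, cutpoints):
    """Category probabilities under an ordered logistic model (illustrative sketch)."""
    eta = np.asarray(eta, dtype=float)[..., None]    # shape (..., 1)
    cutpoints = np.asarray(cutpoints, dtype=float)   # shape (K - 1,), assumed increasing
    # Cumulative probabilities P(y <= k) = sigmoid(c_k - eta)
    cdf = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))   # shape (..., K - 1)
    # Pad with 0 on the left and 1 on the right, then take differences
    zeros = np.zeros(cdf.shape[:-1] + (1,))
    ones = np.ones(cdf.shape[:-1] + (1,))
    padded = np.concatenate([zeros, cdf, ones], axis=-1)
    return np.diff(padded, axis=-1)                  # shape (..., K)


# Two cutpoints give three ordered categories, as in the example above
probs = ordered_logistic_probs(eta=[-1.0, 0.0, 2.0], cutpoints=[-1.0, 1.0])
```

Each row of probs sums to one. The differences are only guaranteed to be positive when the cutpoints are increasing, which is why the example constrains them with an ordered transform.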

Error message:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[10], line 14
     12 # Ordered logistic regression
     13 with pm.Model() as model:
---> 14     cutpoints = pm.Normal("cutpoints", mu=[-1,1], sigma=10, shape=2, 
     15                           transform=pm.distributions.transforms.Ordered)
     16     y_ = pm.OrderedLogistic("y", cutpoints=cutpoints, eta=x, observed=y)
     17     idata = pm.sample()

File ~/mambaforge/envs/pymc_examples_new/lib/python3.11/site-packages/pymc/distributions/distribution.py:312, in Distribution.__new__(cls, name, rng, dims, initval, observed, total_size, transform, *args, **kwargs)
    308         kwargs["shape"] = tuple(observed.shape)
    310 rv_out = cls.dist(*args, **kwargs)
--> 312 rv_out = model.register_rv(
    313     rv_out,
    314     name,
    315     observed,
    316     total_size,
    317     dims=dims,
    318     transform=transform,
    319     initval=initval,
    320 )
    322 # add in pretty-printing support
    323 rv_out.str_repr = types.MethodType(str_for_dist, rv_out)
...
---> 95     y = at.zeros(value.shape)
     96     y = at.inc_subtensor(y[..., 0], value[..., 0])
     97     y = at.inc_subtensor(y[..., 1:], at.log(value[..., 1:] - value[..., :-1]))

AttributeError: 'RandomGeneratorSharedVariable' object has no attribute 'shape'
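
The three source lines at the bottom of the traceback are the forward pass of the ordered transform. As a rough NumPy sketch of the same computation (not the PyMC implementation itself): the first element is kept as-is and each later element is stored as the log of its gap to the previous one. The backward function here is an assumption, written as the natural inverse of that forward pass.

```python
import numpy as np


def ordered_forward(value):
    """Map an increasing vector to unconstrained space (mirrors the traceback lines)."""
    value = np.asarray(value, dtype=float)
    y = np.zeros(value.shape)
    y[..., 0] = value[..., 0]
    y[..., 1:] = np.log(value[..., 1:] - value[..., :-1])
    return y


def ordered_backward(y):
    """Assumed inverse: exponentiate the gaps and accumulate them."""
    y = np.asarray(y, dtype=float)
    x = np.zeros(y.shape)
    x[..., 0] = y[..., 0]
    x[..., 1:] = np.exp(y[..., 1:])
    return np.cumsum(x, axis=-1)


cut = np.array([-1.0, 1.0])
unconstrained = ordered_forward(cut)          # first element plus log of the gap
recovered = ordered_backward(unconstrained)   # round-trips to the original cutpoints
```

Because the gaps are exponentiated on the way back, any unconstrained vector maps to a strictly increasing one. Note that the forward pass needs value to be an array-like object with a shape, which is exactly what the failing call did not receive.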

PyMC version information:

Last updated: Fri Mar 17 2023

Python implementation: CPython
Python version : 3.11.0
IPython version : 8.11.0

pytensor: 2.10.1

numpy : 1.24.2
arviz : 0.15.1
matplotlib: 3.7.1
pandas : 1.5.3
pymc : 5.1.1
pytensor : 2.10.1

Watermark: 2.3.1

Context for the issue:

I was going to try to write up docs on the technique of ordinal regression, but I think the failure of the ordered transform makes the entire class of models less straightforward to implement. I'm pretty sure it's related to this line:

x = pt.zeros(value.shape)

But I don't know enough about the random variable implementation to know why the shape attribute is not available now.
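
One possible explanation, which this report does not confirm: the traceback is consistent with the transform being passed as a class (Ordered) rather than as an instance. In that case forward would receive its arguments shifted by one position and end up reading .shape from the random generator input instead of from the value. The toy sketch below reproduces that mechanism without PyMC. ToyOrdered, FakeValue, FakeRng and apply_transform are all hypothetical names invented for illustration.

```python
class ToyOrdered:
    """Hypothetical stand-in for a transform class; not PyMC code."""

    def forward(self, value, *inputs):
        # Fails if `value` is not array-like
        return value.shape


class FakeValue:
    """Stands in for the value variable, which has a shape."""
    shape = (2,)


class FakeRng:
    """Stands in for the random generator input, which has no shape."""


def apply_transform(transform, value, *inputs):
    # Roughly how a caller might invoke the transform on a value and its inputs
    return transform.forward(value, *inputs)


# Passing an instance: `self` is the transform and `value` is the value
ok_shape = apply_transform(ToyOrdered(), FakeValue(), FakeRng())

# Passing the class: `self` becomes the value and `value` becomes the RNG
try:
    apply_transform(ToyOrdered, FakeValue(), FakeRng())
    message = None
except AttributeError as err:
    message = str(err)
```

If that is what is happening here, passing the lowercase instance pm.distributions.transforms.ordered instead of the class might avoid the error, but that is an assumption to check against the installed PyMC version.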
