Skip to content

Local Pytest and docker TravisCI testing #49

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Jan 19, 2019

Conversation

canyon289
Copy link
Member

Adding the following in this PR. Will unmark WIP when asking for a merge

@eigenfoo eigenfoo added the WIP Work in progress label Jan 14, 2019
@canyon289
Copy link
Member Author

canyon289 commented Jan 14, 2019

This is the issue we were having on ArviZ with tensorflow, which is why we ended up moving to docker testing. Seems like its a problem here too

Tests run fine locally but not in TravisCI for some reason

https://travis-ci.org/pymc-devs/pymc4/builds/479554525#L609

@canyon289 canyon289 changed the title [WIP] Pylint pytest [WIP] Local Pytest and docker TravisCI testing Jan 16, 2019
Copy link
Contributor

@springcoil springcoil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOOKS COOL

@canyon289
Copy link
Member Author

canyon289 commented Jan 17, 2019

Well at this point tests and lint are working if anyone wants to take a look.

Please don't merge yet though. I need to do a few more things to button up this pull request, along the lines of add documentation and propose a couple of things. Please feel free to review in the meanwhile though

Copy link
Contributor

@springcoil springcoil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -0,0 +1,38 @@
#!/usr/bin/env bash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required if the docker image pulls from the miniconda image?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, but I left it there in case anyone was using it to install miniconda locally.

Since you called me out on it though I'm ok with deleting it

def tf_supported_args():
"""Provide arugments for each supported Tensorflow distribution"""
_tfp_supported_args = {
"Bernoulli": {"probs": 0.5},
Copy link
Member

@twiecki twiecki Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not directly map the RV classes to their parameters? Saves the getattr lookup below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds way better. Will do

Copy link
Member Author

@canyon289 canyon289 Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember why I didn't do this.

I wanted a couple things

  1. An ordered object of distribution strings so I can pass them to the id argument function
  2. A mapping of parameters to distributions
  3. A guarantee that when testing a method, the id string matches the distribution being tested
  4. Only one place where order is defined. If order is defined in two places, then coordination of order is human reliant, which means it'll definitely get messed up.

If I mapped directly from rv.classes to their parameters the tests would still work, but I wouldn't have a way to get the right distribution name string to the id argument.

The other reason I felt this might be better is that when testing the distribution name is visible, even in pdb. If I mapped RV Classes, in a debugger they would show up as memory addresses since we haven't defined a __repr__ for RandomVariable yet

There also is the possibility I'm wrong or explaining things poorly so let me know if this doesn't make sense.

Copy link
Member

@twiecki twiecki Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow that reasoning.

An ordered object of distribution strings so I can pass them to the id argument function

This isn't ordered once you created it as it's a dict. You could sort it but I don't see why you would want to do that.

A mapping of parameters to distributions

This is a mapping of parameters to strings of distributions.

A guarantee that when testing a method, the id string matches the distribution being tested

I don't see the benefit of that. You only increase the chance of having typos in the strings.

Only one place where order is defined. If order is defined in two places, then coordination of order is human reliant, which means it'll definitely get messed up.

Where is order defined here in the first place? And why would that be important?

If I mapped directly from rv.classes to their parameters the tests would still work, but I wouldn't have a way to get the right distribution name string to the id argument.

That's true, but do we need that?

The other reason I felt this might be better is that when testing the distribution name is visible, even in pdb.
We do have the class name that will get printed.

In sum, I don't really see the benefit, other than easier lookup when trying to get parameters manually but I don't see the use-case as it's just a test. But I do think it's more brittle to have that level of indirection.

Copy link
Member Author

@canyon289 canyon289 Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair! The additional layer of indirection is more brittle. The benefit I saw was allowing tests to be named by pytest making it simpler to figure out which test is failing in a log, and select tests when using the -k command line flag with for pytest

Here's an example of current implementation output
namedtests

Here's an example of what I think the proposed implementation is asking for. In this case I don't like how the output log doesn't tell me which distribution is being tested. tf_distribution[2] is a lot less meaningful than 'Categorical'

*But I may have implemented incorrectly. Let me know if this is what you had in mind!
https://github.com/canyon289/pymc4/blob/pytest_alternative/pymc4/tests/test_random_variables.py
nonnamedtests

Copy link
Member

@twiecki twiecki Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that looks way nicer. I wonder if we should just add a__repr__ instead though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I reformulated this to meet the ask. Let me know what you think!


@pytest.fixture(scope="function")
def tf_session():
sess = tf.Session()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

with tf.Graph().as_default():
    with tf.Session() as sess:
        yield sess

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to open and close session each time? I figured one session was fine if we cleared the graph, and opening and closing them would be a waste of computational resources, but I could be wrong

If we do implement this I'd prefer sess.close() in the finalizer since it'll make it easier to avoid stacked context managers, which then makes debugging easier and mocking easier imo

Copy link
Member

@ferrine ferrine Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote for the current solution with graph reset. There is no difference between them. I see no computational resources waste... (If changes are already commited that is no problem)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's quite easy to run into memory problems here if we keep adding to graph or create multiple sessions in these tests.

Copy link
Member

@eigenfoo eigenfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

One comment: there's a binom_test.py in the top-level directory: is that supposed to be there?

@canyon289
Copy link
Member Author

Looks good!

One comment: there's a binom_test.py in the top-level directory: is that supposed to be there?

Whoops no. I'll delete it, thanks for catching that

@ferrine
Copy link
Member

ferrine commented Jan 18, 2019

What conda channels do we plan on using? It appears the conda docker is only pulling from defaults, but we may want to grab things (tensorflow?) from conda-forge.

Fair point, right now we had some discussions about using tf-nightly and tfp-nightly and then switching to tf2 when tfp becomes compatible.

Besides those to requirements though nothing is installed by conda specifically, other than the base env packages. Everything else is done by pip and pypi

Should we then develop on top of eager mode?

@canyon289
Copy link
Member Author

@eigenfoo referenced this ticket somehow but I can't figure it out so adding this reference as well

tensorflow/probability#81

@twiecki twiecki merged commit 58f2fbb into pymc-devs:master Jan 19, 2019
@twiecki
Copy link
Member

twiecki commented Jan 19, 2019

Thanks!

@eigenfoo
Copy link
Member

btw @canyon289 I just noticed this but fun fact, sed doesn't need to have a / follow the s: it can be any character so long as you're consistent. So

's/\/opt\/pymc4/\/home\/travis\/build\/pymc-devs\/pymc4/g'

could also just be

's_/opt/pymc4_/home/travis/build/pymc-devs/pymc4_g'

which is more readable!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants