Local Pytest and docker TravisCI testing #49
Conversation
Force-pushed: 8060579 → af08d56
This is the issue we were having on ArviZ with tensorflow, which is why we ended up moving to docker testing. Seems like it's a problem here too: tests run fine locally but not in TravisCI for some reason. https://travis-ci.org/pymc-devs/pymc4/builds/479554525#L609
Force-pushed: 7e0126b → aa3bacc
LOOKS COOL
Well, at this point tests and lint are working if anyone wants to take a look. Please don't merge yet though; I need to do a few more things to button up this pull request, along the lines of adding documentation and proposing a couple of things. Please feel free to review in the meanwhile though.
LGTM
scripts/install_miniconda.sh (Outdated)

    @@ -0,0 +1,38 @@
    #!/usr/bin/env bash
Why is this required if the docker image pulls from the miniconda image?
It is not, but I left it there in case anyone was using it to install miniconda locally.
Since you called me out on it though, I'm OK with deleting it.
pymc4/tests/test_random_variables.py (Outdated)

    def tf_supported_args():
        """Provide arguments for each supported TensorFlow distribution"""
        _tfp_supported_args = {
            "Bernoulli": {"probs": 0.5},
Why not directly map the RV classes to their parameters? Saves the getattr lookup below.
This sounds way better. Will do
I remember why I didn't do this.
I wanted a couple of things:
- An ordered object of distribution strings so I can pass them to the id argument function
- A mapping of parameters to distributions
- A guarantee that when testing a method, the id string matches the distribution being tested
- Only one place where order is defined. If order is defined in two places, then coordination of order is human reliant, which means it'll definitely get messed up.
If I mapped directly from RV classes to their parameters the tests would still work, but I wouldn't have a way to get the right distribution name string to the id argument.
The other reason I felt this might be better is that when testing, the distribution name is visible, even in pdb. If I mapped RV classes, in a debugger they would show up as memory addresses, since we haven't defined a __repr__ for RandomVariable yet.
There also is the possibility I'm wrong or explaining things poorly, so let me know if this doesn't make sense.
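In rough outline, the pattern being described might look like this (a sketch only; the dict contents, the test body, and the use of tfp.distributions here are assumptions, not the PR's actual test file):

```python
import pytest
import tensorflow_probability as tfp

# Single source of truth: distribution-name strings mapped to valid args.
# The same keys drive both the getattr lookup and the pytest test ids.
_tfp_supported_args = {
    "Bernoulli": {"probs": 0.5},
    "Normal": {"loc": 0.0, "scale": 1.0},
}


@pytest.mark.parametrize(
    "dist_name,dist_kwargs",
    _tfp_supported_args.items(),
    ids=list(_tfp_supported_args),
)
def test_distribution(dist_name, dist_kwargs):
    # The id string pytest reports and the class under test both come
    # from the same dict key, so they can never drift out of sync.
    dist_class = getattr(tfp.distributions, dist_name)
    dist = dist_class(**dist_kwargs)
    assert isinstance(dist, tfp.distributions.Distribution)
```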
I don't follow that reasoning.

> An ordered object of distribution strings so I can pass them to the id argument function

This isn't ordered once you've created it, as it's a dict. You could sort it, but I don't see why you would want to do that.

> A mapping of parameters to distributions

This is a mapping of parameters to strings of distributions.

> A guarantee that when testing a method, the id string matches the distribution being tested

I don't see the benefit of that. You only increase the chance of having typos in the strings.

> Only one place where order is defined. If order is defined in two places, then coordination of order is human reliant, which means it'll definitely get messed up.

Where is order defined here in the first place? And why would that be important?

> If I mapped directly from RV classes to their parameters the tests would still work, but I wouldn't have a way to get the right distribution name string to the id argument.

That's true, but do we need that?

> The other reason I felt this might be better is that when testing the distribution name is visible, even in pdb.

We do have the class name that will get printed.

In sum, I don't really see the benefit, other than easier lookup when trying to get parameters manually, but I don't see the use case as it's just a test. But I do think it's more brittle to have that level of indirection.
That's fair! The additional layer of indirection is more brittle. The benefit I saw was letting pytest name the tests, making it simpler to figure out which test is failing in a log and to select tests with pytest's -k command line flag.
Here's an example of the current implementation's output.
Here's an example of what I think the proposed implementation is asking for. In this case I don't like how the output log doesn't tell me which distribution is being tested; tf_distribution[2] is a lot less meaningful than 'Categorical'.
*But I may have implemented it incorrectly. Let me know if this is what you had in mind!
https://github.com/canyon289/pymc4/blob/pytest_alternative/pymc4/tests/test_random_variables.py
OK, that looks way nicer. I wonder if we should just add a __repr__ instead though.
I think I reformulated this to meet the ask. Let me know what you think!
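One shape the reformulation could take (again a sketch under the same assumptions as above, not the actual file): parametrize over the classes directly but derive each test id from the class name, so logs stay readable without the string indirection.

```python
import pytest
import tensorflow_probability as tfp

_supported_distributions = [
    (tfp.distributions.Bernoulli, {"probs": 0.5}),
    (tfp.distributions.Normal, {"loc": 0.0, "scale": 1.0}),
]


@pytest.mark.parametrize(
    "dist_class,dist_kwargs",
    _supported_distributions,
    # Take the id from the class itself, so the log shows "Bernoulli"
    # rather than an opaque "tf_distribution[2]".
    ids=[cls.__name__ for cls, _ in _supported_distributions],
)
def test_distribution(dist_class, dist_kwargs):
    dist = dist_class(**dist_kwargs)
    assert isinstance(dist, tfp.distributions.Distribution)
```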
pymc4/tests/test_random_variables.py

    @pytest.fixture(scope="function")
    def tf_session():
        sess = tf.Session()
How about:

    with tf.Graph().as_default():
        with tf.Session() as sess:
            yield sess
Do we need to open and close a session each time? I figured one session was fine if we cleared the graph, and opening and closing them would be a waste of computational resources, but I could be wrong.
If we do implement this I'd prefer sess.close() in the finalizer, since it avoids stacked context managers, which makes debugging and mocking easier imo.
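A minimal sketch of that finalizer variant (TF1-style API as used in this PR; the fixture body beyond sess.close() is an assumption):

```python
import pytest
import tensorflow as tf


@pytest.fixture(scope="function")
def tf_session(request):
    sess = tf.Session()
    # Close the session via a finalizer rather than nested `with`
    # blocks, keeping the fixture flat for debugging and mocking.
    request.addfinalizer(sess.close)
    return sess
```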
I'd vote for the current solution with graph reset. There is no difference between them; I see no computational resource waste... (If the changes are already committed, that's no problem.)
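For comparison, a sketch of the graph-reset alternative being voted for here (TF1 API; the exact fixture body is assumed, not quoted from the PR):

```python
import pytest
import tensorflow as tf


@pytest.fixture(scope="function")
def tf_session():
    # Start each test from a clean default graph so ops from earlier
    # tests don't accumulate, then hand out one short-lived session.
    tf.reset_default_graph()
    sess = tf.Session()
    yield sess
    sess.close()
```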
@ferrine this suggests they are different: https://stackoverflow.com/questions/42706761/closing-session-in-tensorflow-doesnt-reset-graph
I think it's quite easy to run into memory problems here if we keep adding to the graph or create multiple sessions in these tests.
Looks good!
One comment: there's a binom_test.py in the top-level directory. Is that supposed to be there?
Whoops, no. I'll delete it, thanks for catching that.
Force-pushed: eef4623 → 6b0c5f4
Should we then develop on top of eager mode?
@eigenfoo referenced this ticket somehow but I can't figure it out, so adding this reference as well.
Thanks!
btw @canyon289 I just noticed this but fun fact, [...] could also just be [...], which is more readable!
Adding the following in this PR. Will unmark WIP when asking for a merge.