Add benchmarks #1829
Conversation
Two benchmarks so far: one is a parametrized example just testing the overhead of sampling from a normal distribution with different samplers; the other recreates this example: https://colcarroll.github.io/pymc3-benchmarks/
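For context, a parametrized asv benchmark of that kind might look roughly like the sketch below. The class name, parameter choices, and sampler list are illustrative assumptions on my part, not necessarily the exact file in this PR; it relies on asv's time_*/params conventions and the pymc3 step-method API.

import pymc3 as pm


class OverheadSuite:
    # asv runs each time_* method once per entry in params
    params = [pm.NUTS, pm.Metropolis, pm.Slice]
    param_names = ['step']

    def setup(self, step):
        self.n_steps = 10000
        with pm.Model() as self.model:
            pm.Normal('x', mu=0, sd=1)

    def time_overhead_sample(self, step):
        with self.model:
            pm.sample(self.n_steps, step=step(), random_seed=1,
                      progressbar=False)

asv then times time_overhead_sample once per step method and tracks the results across commits.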
This looks incredible. Can't wait to see where the performance regressions and speed-ups happened.
While it's not completely true that this is beginner friendly, it's a task that someone with experience adding benchmarks could help on. It's OK to remove the beginner-friendly label if you think it's wrong, though, @ColCarroll.
Spent much of the weekend on this and couldn't quite get things to work. My plan was to run this script once daily on as small a machine as possible, so the $5/month machine is on all the time, and spawns the $120/month box for ~1hr/day. I'm going to take some time away from this, but happy to share the work I've already done if someone is excited or has better AWS knowledge. One thing is that I think I will try to use a dedicated instance, since bidding on a spot instance was a bit... spotty.
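For what it's worth, the spawning half of that plan could be sketched with boto3 roughly as below. The AMI, instance type, and user-data script are placeholders I made up, not the actual script, and a real setup would still need IAM credentials, result uploading, and error handling.

import boto3

# Bash run on boot by the big box; it benchmarks new commits, publishes the
# static HTML, and then shuts itself down so it only exists for ~1hr/day.
USER_DATA = """#!/bin/bash
cd /home/ubuntu/pymc3/benchmarks
asv run NEW && asv publish
# (upload the generated HTML somewhere here)
shutdown -h now
"""

def spawn_benchmark_box():
    ec2 = boto3.client('ec2')
    ec2.run_instances(
        ImageId='ami-xxxxxxxx',     # placeholder AMI with the environment baked in
        InstanceType='c4.4xlarge',  # placeholder for the "$120/month" box
        MinCount=1, MaxCount=1,
        UserData=USER_DATA,
        # terminate (rather than stop) the instance when it shuts itself down
        InstanceInitiatedShutdownBehavior='terminate',
    )

if __name__ == '__main__':
    spawn_benchmark_box()

The cheap always-on machine would just call this from cron once a day.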
Removed "beginner friendly", if only for my pride. |
We could just do things the old-fashioned way and have a command-line benchmark suite that users run and post to the GitHub wiki or an issue. This would have the benefit of allowing people to benchmark various hardware configurations, and the extra boilerplate for CI might not be necessary.
@ColCarroll Sounds like at least some progress. Could you add your code to this PR in case someone else wants to help with this? As @kyleabeauchamp suggests, we could always run this manually from time to time if AWS is too much trouble.
That could be a good incremental step. I'll tidy up and comment the script.
FYI, I'm working to make the benchmark running and publishing easier. The basic idea is that NumFOCUS will pay for a machine, and projects will share time on it. The benchmarks will be run daily and the results pushed to GitHub Pages. I'll have more information in the next couple of weeks (once it's built 😄 ). For now I'd say write a basic set of benchmarks you would like to have run, and we'll get the running and publishing stuff sorted out later.
Hey @TomAugspurger what's the best place to continue this conversation? Benchmarking will likely be more important for us while figuring out the right path with respect to theano...
@ColCarroll once they're ready let me know. Then I'll add pymc3 to https://github.com/TomAugspurger/asv-runner/blob/master/tests/full.yml and get them running on our machine.
Sounds great! We'll probably land something similar to this soon, so that others can contribute test cases. It looks as though our measurements will be somewhat grosser than others (…).
The machine is still idle for the majority of the day (pandas is the longest). I'll check in on it the first few days and see how things are going.
benchmarks/asv.conf.json (outdated diff)
// If missing or the empty string, the tool will be automatically
// determined by looking for tools on the PATH environment
// variable.
"environment_type": "virtualenv",
Do the PyMC devs have a strong preference here? I haven't set up the benchmark server with virtualenv, so conda would be slightly easier for me. Though it probably isn't much work to get virtualenv working too.
I think virtualenv is my preference, but everyone else uses conda, and I'm happy to go there as well!
@TomAugspurger I think this is ready with at least a "starter" set of benchmarks -- let me know if it looks good and I can merge. I included a screenshot of the current set of benchmarks: two "case study" examples, and one parametrized benchmark just sampling from a normal distribution.
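To make the "case study" side concrete, such a benchmark is essentially a time_* method that builds and samples a small but realistic model. A sketch of what one could look like (the model and names are invented for illustration, not copied from the PR):

import numpy as np
import pymc3 as pm


class LinearRegressionSuite:
    # Time a full model build + fit, in contrast to the pure sampling-overhead
    # micro-benchmark; asv's per-benchmark timeout is set generously here.
    timeout = 360

    def setup(self):
        np.random.seed(0)
        self.x = np.linspace(0, 1, 100)
        self.y = 2 * self.x + np.random.normal(scale=0.5, size=100)

    def time_linear_regression(self):
        with pm.Model():
            intercept = pm.Normal('intercept', mu=0, sd=10)
            slope = pm.Normal('slope', mu=0, sd=10)
            sigma = pm.HalfNormal('sigma', sd=1)
            pm.Normal('y', mu=intercept + slope * self.x, sd=sigma,
                      observed=self.y)
            pm.sample(1000, progressbar=False, random_seed=1)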
@ColCarroll cool. Feel free to merge whenever. FYI, the runner is a bit backlogged at the moment, since there was a power outage last week and my auto-restart script had a bug. But I can still get it added to the queue whenever, and I'll let you know when things start running (probably tomorrow or Wednesday).
Thanks -- let me know if I can help with this further (or when we can start seeing benchmarks!)
Great. I added it to the runner repo in asv-runner/asv-runner@9ce6c1a and am redeploying now. I'll let you know when they're running. The results will be published to http://pandas.pydata.org/speed/pymc3. The HTML files generated from asv are uploaded to https://github.com/tomaugspurger/asv-collection, if you want to clone that and host them yourselves somewhere.
They should be available on http://pandas.pydata.org/speed/pymc3/ now.
* Add benchmark skeleton
* Add another benchmark, switch to conda environment
@TomAugspurger It looks like the site hasn't updated for any project since mid-February, and most of our benchmark history is gone. Do you know the right place to talk about this, and perhaps volunteer to help fix it? It has been a super useful service!
Sorry about that, it was my fault. I was attempting to add email notifications on failures and (ironically) broke the runner, and didn't notice until earlier this week. I thought I fixed it, but apparently not. Looking into it now. More generally, I'm not really sure what the best way to manage this is, given everyone's time constraints and how poorly documented the setup is. I'm hoping that with the email notifications working properly, I'll be able to respond more quickly. It'd be good to get a more formal setup / documentation / access for the projects using the build machine.
No problem at all! It seems like you're donating compute and devops time for no personal gain. Please follow up if there's some way you can offload some of that work.
Hopefully back up and running now.
Benchmarks would be useful for a number of reasons, the two most prominent being detecting performance regressions, and as a sort of end-to-end test. I have been mucking about with airspeed velocity this weekend, and have the start of a benchmark suite written for pymc3. There's not great documentation on actually deploying this. The configuration file is largely copied from numpy's benchmarks. It looks like it is somewhat standard to keep the benchmarks with the project, and push results to a separate repo to host the static site.
From what I have read, we would still need, at least, a machine to run the benchmarks regularly and a repo to host the results (pymc-devs/pymc3-benchmarks would make sense); asv output is a nice static html page.
The most helpful GitHub discussions I found were here, about deploying sympy benchmarks on a raspberry pi (!), and this, on getting a machine to run benchmarks for dask.
I'm running these particular benchmarks on the last 10 commits right now, and will try to get a sample site up soon.
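As a rough picture of how the "push results to a separate repo" step could work, here is a sketch of a small runner script. The repo URL, the html directory, and the git handling are assumptions for illustration, not part of this PR; asv's own gh-pages support may be the simpler route.

import subprocess

RESULTS_REPO = '[email protected]:pymc-devs/pymc3-benchmarks.git'  # hypothetical hosting repo
HTML_DIR = '.asv/html'  # asv publish writes here by default (configurable in asv.conf.json)

def run_and_publish():
    subprocess.check_call(['asv', 'run', 'NEW'])  # benchmark commits not yet measured
    subprocess.check_call(['asv', 'publish'])     # render results to a static HTML site
    # Push the rendered site to the hosting repo; a real script would handle
    # existing history, branches, and credentials more carefully.
    subprocess.check_call(['git', '-C', HTML_DIR, 'init'])
    subprocess.check_call(['git', '-C', HTML_DIR, 'add', '-A'])
    subprocess.check_call(['git', '-C', HTML_DIR, 'commit', '-m', 'Update benchmark results'])
    subprocess.check_call(['git', '-C', HTML_DIR, 'push', '--force', RESULTS_REPO, 'master'])

if __name__ == '__main__':
    run_and_publish()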