Add benchmarks #1829


Merged
merged 2 commits from the add_benchmarks branch into pymc-devs:master on Oct 16, 2017

Conversation

ColCarroll
Member

Benchmarks would be useful for a number of reasons, the two most prominent being detecting performance regressions and serving as a sort of end-to-end test. I have been mucking about with airspeed velocity (asv) this weekend, and have the start of a benchmark suite written for pymc3.

There isn't great documentation on actually deploying this. The configuration file is largely copied from numpy's benchmarks. It looks like it is somewhat standard to keep the benchmarks with the project and push the results to a separate repo that hosts the static site.

From what I have read, we would still need:

  1. A separate repo to push the benchmark results to (pymc-devs/pymc3-benchmarks would make sense). asv's output is a nice static HTML page.
  2. A dedicated(ish) machine to run benchmarks. See the discussions below. We (I?) could certainly run things manually for a while, but you don't want too many other tasks running at the same time.
  3. A separate script to run a daily cron task, benchmarking the latest commits.

The most helpful GitHub discussions I found were one about deploying SymPy benchmarks on a Raspberry Pi (!), and another about getting a machine to run benchmarks for dask.

I'm running these particular benchmarks on the last 10 commits right now, and will try to get a sample site up soon.
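
For orientation, here is a rough sketch of what such an asv.conf.json can look like -- the field names are real asv config keys, but the values are illustrative rather than the exact file in this PR:

    {
        // asv.conf.json accepts JavaScript-style comments.
        "version": 1,
        "project": "pymc3",
        "project_url": "https://github.com/pymc-devs/pymc3",
        "repo": "..",
        "branches": ["master"],
        "environment_type": "virtualenv",
        // Where the benchmark modules live, relative to this file.
        "benchmark_dir": "benchmarks",
        // Results and the generated HTML would be pushed to a separate repo
        // (e.g. the suggested pymc-devs/pymc3-benchmarks) to host the static site.
        "results_dir": "results",
        "html_dir": "html"
    }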

@ColCarroll
Member Author

Two benchmarks so far: one is a parametrized example just testing the overhead of sampling from a normal distribution with different samplers; the other recreates this example.

https://colcarroll.github.io/pymc3-benchmarks/
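
Roughly, the parametrized overhead benchmark looks something like this in asv's conventions (a sketch -- class, method, and argument names may differ from the file in the PR):

    import pymc3 as pm


    class OverheadSuite:
        # Parametrize over step methods to measure sampling overhead
        # on a trivial model (a standard normal).
        params = [pm.NUTS, pm.HamiltonianMC, pm.Metropolis, pm.Slice]
        param_names = ['step']

        def setup(self, step):
            self.n_steps = 10000
            with pm.Model() as self.model:
                pm.Normal('x', mu=0, sd=1)

        def time_overhead_sample(self, step):
            with self.model:
                pm.sample(self.n_steps, step=step(),
                          random_seed=1, progressbar=False)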

@twiecki
Member

twiecki commented Feb 27, 2017

This looks incredible. Can't wait to see where the performance regressions and speed-ups happened.

@springcoil
Contributor

While it's not completely true that this is beginner friendly, it's a task that someone with experience adding benchmarks could help with. It's OK to remove the "beginner friendly" label if you think it's wrong, though, @ColCarroll.

@ColCarroll
Member Author

Spent much of the weekend on this and couldn't quite get things to work. I had a bash script that would:

  • spin up a sizeable spot instance on AWS,
  • set up a GitHub deploy key for the benchmarks repo (which is separate from the main pymc3 repo),
  • clone and sync the benchmarks that have been run in the past,
  • clone and install pymc3,
  • run all new benchmarks,
  • push the new benchmark results to the benchmarks repo, and
  • shut down the spot instance.

My plan was to run this script once daily on as small a machine as possible, so the $5/month machine is on all the time and spawns the $120/month box for about an hour a day.

I'm going to take some time away from this, but happy to share the work I've already done if someone is excited or has better AWS knowledge. One thing is that I think I will try to use a dedicated instance, since bidding on a spot instance was a bit... spotty.
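
In case someone does pick this up: a rough sketch of the on-instance part of that script, assuming the suggested pymc-devs/pymc3-benchmarks results repo exists, a deploy key for it is installed, and asv is on the PATH (repo names, paths, and layout are illustrative):

    #!/usr/bin/env bash
    set -euo pipefail

    # Clone the (hypothetical) results repo and the project itself.
    git clone [email protected]:pymc-devs/pymc3-benchmarks.git results
    git clone https://github.com/pymc-devs/pymc3.git
    cd pymc3/benchmarks

    # Sync previously published results so asv only has to run new commits.
    cp -r ../../results/results . 2>/dev/null || true

    asv machine --yes   # record machine details non-interactively
    asv run NEW         # benchmark commits that have not been run yet
    asv publish         # regenerate the static HTML report

    # Push the new results back to the benchmarks repo, then shut the box down.
    cp -r results ../../results/
    cd ../../results
    git add -A && git commit -m "benchmark run $(date -u +%F)" && git push
    sudo shutdown -h now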

@ColCarroll
Member Author

Removed "beginner friendly", if only for my pride.

@kyleabeauchamp
Contributor

We could just do things the old-fashioned way and write a command-line benchmark suite that users run and post to the GitHub wiki or an issue. This would have the benefit of letting people benchmark various hardware configurations, and the extra boilerplate for CI might not be necessary.
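
For what it's worth, asv already supports that kind of manual, local workflow; roughly (asv CLI commands, run from the directory containing asv.conf.json):

    asv machine --yes            # record this machine's hardware details
    asv run                      # run the suite against the configured branches
    asv publish && asv preview   # build the static HTML report and serve it locally
    asv compare master HEAD      # or print a comparison of two revisions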

@twiecki
Member

twiecki commented Mar 22, 2017

@ColCarroll Sounds like at least some progress. Could you add your code to this PR if someone else wants to help with this?

As @kyleabeauchamp says, we could always run this manually from time to time if AWS is too much trouble.

@ColCarroll
Member Author

ColCarroll commented Mar 22, 2017 via email

@TomAugspurger

FYI, I'm working to make the benchmark running and publishing easier.

The basic idea is that NumFOCUS will pay for a machine, and projects will share time on it. The benchmarks will be run daily and the results pushed to GitHub Pages. I'll have more information in the next couple of weeks (once it's built 😄 ). For now I'd say write the basic set of benchmarks you would like to have run, and we'll sort out the running and publishing later.

@ColCarroll
Member Author

Hey @TomAugspurger what's the best place to continue this conversation? Benchmarking will likely be more important for us while figuring out the right path with respect to theano...

@TomAugspurger

@ColCarroll once they're ready let me know. Then I'll add pymc3 to https://github.com/TomAugspurger/asv-runner/blob/master/tests/full.yml and get them running on our machine.

@ColCarroll
Member Author

Sounds great! We'll probably land something similar to this soon, so that others can contribute test cases.

It looks as though our measurements will be somewhat coarser-grained than other projects' (O(60 s) instead of O(1 ms)) -- I will try to keep the total suite time reasonable, but any thoughts you have would be welcome.
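
One knob for that, for what it's worth: asv benchmarks can set per-benchmark attributes to bound how much work gets done. The attribute names below are real asv knobs; the values and the class are illustrative, not what this PR uses:

    class CaseStudySuite:
        timeout = 360   # seconds before asv gives up on the benchmark
        repeat = 1      # timed samples per benchmark
        number = 1      # calls per sample -- 1 avoids re-running a long model fit

        def time_fit_model(self):
            ...  # e.g. build a model and call pm.sample(...) here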

@TomAugspurger

The machine is still idle for the majority of the day (pandas has the longest suite). I'll check in on it for the first few days and see how things are going.

// If missing or the empty string, the tool will be automatically
// determined by looking for tools on the PATH environment
// variable.
"environment_type": "virtualenv",

@TomAugspurger

Do the PyMC devs have a strong preference here? I haven't set up the benchmark server with virtualenv, so conda would be slightly easier for me, though it probably isn't much work to get virtualenv working too.

@ColCarroll
Member Author

I think virtualenv is my preference, but everyone else uses conda, and I'm happy to go there as well!
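
(The second commit in this PR does switch the config to conda, i.e. the line quoted above becomes:)

    "environment_type": "conda",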

@ColCarroll
Member Author

@TomAugspurger I think this is ready with at least a "starter" set of benchmarks -- let me know if it looks good and I can merge. I included a screenshot of the current set of benchmarks: two "case study" examples, and one just sampling from a normal distribution, parametrized across NUTS, HMC, Slice, and Metropolis. I only ran on one commit, so the line chart isn't much to see!

[screenshot: initial benchmark results, 2017-10-15]

@TomAugspurger

@ColCarroll cool. Feel free to merge whenever.

FYI, the runner is a bit backlogged at the moment, since there was a power outage last week and my auto-restart script had a bug. But I can still get it added to the queue whenever, and I'll let you know when things start running (probably tomorrow or Wednesday).

@ColCarroll ColCarroll merged commit c97a0b5 into pymc-devs:master Oct 16, 2017
@ColCarroll ColCarroll deleted the add_benchmarks branch October 16, 2017 13:41
@ColCarroll
Member Author

Thanks -- let me know if I can help with this further (or when we can start seeing benchmarks!)

@TomAugspurger

Great. I added it to the runner repo in asv-runner/asv-runner@9ce6c1a and am redeploying now.

I'll let you know when they're running. The results will be published to http://pandas.pydata.org/speed/pymc3. The HTML files generated by asv are uploaded to https://github.com/tomaugspurger/asv-collection, if you want to clone that and host them yourselves somewhere.

@TomAugspurger

TomAugspurger commented Oct 17, 2017 via email

@junpenglao junpenglao mentioned this pull request Oct 18, 2017
ColCarroll added a commit that referenced this pull request Nov 9, 2017
* Add benchmark skeleton

* Add another benchmark, switch to conda environment
@ColCarroll
Member Author

@TomAugspurger it looks like the site hasn't updated for any project since mid-February, and most of our benchmark history is gone. Do you know the right place to talk about this, and perhaps to volunteer to help fix it? It has been a super useful service!

@TomAugspurger

Sorry about that, it was my fault. I was attempting to add email notifications on failures and (ironically) broke the runner, and didn't notice until earlier this week. I thought I had fixed it, but apparently not. Looking into it now.

More generally, I'm not really sure what the best way to manage this is, given everyone's time constraints and how poorly documented the setup is. I'm hoping that with the email notifications working properly, I'll be able to respond more quickly. It'd be good to get a more formal setup / documentation / access for the projects using the build machine.

@ColCarroll
Member Author

No problem at all! It seems like you're donating compute and devops time for no personal gain. Please follow up if there's some way you can offload some of that work.

@TomAugspurger

Hopefully back up and running now.
