
faster rand(::MersenneTwister, UnitRange{<:BitInteger}) (fix #22600) #22612


Merged
1 commit merged on Aug 12, 2017

Conversation

rfourquet
Member

@rfourquet rfourquet commented Jun 29, 2017

This is a solution similar to #14772: generation with MersenneTwister is fast, so it's cheaper to waste entropy bits than to spend divisions saving as many of those bits as we can (the trade-off may have been different when MersenneTwister couldn't use the fast SIMD-optimized array functions).

I intend to extend this work (when applicable) before the next release, but this first step will probably be the most visible.

A few timings I get (using in fact @btime from BenchmarkTools, with m being a MersenneTwister object), shown as master/PR, or master/randi/PR when applicable (randi is from StatsBase):

julia> @time rand(m, false:true)
21ns / 9ns

julia> @time rand(m, 1:2)
33ns / 21ns / 8.5ns

julia> @time rand(m, 1:2^30)
33ns / 29ns / 8.5ns

julia> @time rand(m, 1:2^30+1)
37ns / 29ns / 19ns
 
julia> @time rand(m, 1:2^52)
38ns / 31ns / 8.5ns

julia> @time rand(m, 1:2^52+1)
38ns / 31ns / 28ns

julia> @time rand(m, 1:2^53)
38ns / 31ns / 13ns

julia> @time rand(m, 1:Int128(2)^104)
102ns / 19ns

julia> @time rand(m, 1:Int128(2)^104+1)
101ns / 39ns

julia> @time rand(m, 1:Int128(2)^105)
102ns / 24ns

Basically, with the rejection sampling in this PR, when the length of the interval is just above a power of 2, about one sample in two is rejected, which explains the change in performance around powers of 2. This also takes advantage of the fact that MersenneTwister natively generates 52 bits of entropy, hence the change in performance for lengths above 2^52 and 2^104.
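For reference, the power-of-two rejection idea can be sketched as below. This is only a hypothetical illustration using a full-width `UInt64` draw (the function name and structure are mine, not the PR's actual implementation, which works on MersenneTwister's native 52-bit output):

```julia
using Random

# Minimal sketch of power-of-two rejection sampling for rand(rng, r).
# The mask is the smallest all-ones pattern covering length(r) - 1, so
# when length(r) is just above a power of 2, about half the draws are
# rejected -- which matches the timing jumps in the table above.
function rand_in_range(rng::AbstractRNG, r::UnitRange{Int})
    n = length(r) % UInt64                    # number of possible values
    n <= 1 && return first(r)
    mask = typemax(UInt64) >> leading_zeros(n - 1)
    while true
        x = rand(rng, UInt64) & mask          # uniform on 0:mask, cheap
        x < n && return first(r) + Int(x)     # accept iff x lands in 0:n-1
    end
end
```

No division is needed on the sampling path; the only per-draw costs are the masked draw and the comparison.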

@rfourquet rfourquet added the randomness Random number generation and the Random stdlib label Jun 29, 2017
@rfourquet
Member Author

cc @nalimilan @mschauer

@rfourquet rfourquet added the performance Must go faster label Jun 29, 2017
@nalimilan
Member

Sorry, I'm really not qualified to comment on the RNG code.

@GregPlowman
Contributor

Would this function also benefit from using the new method for UnitRange, rather than using RangeGenerator?

function rand!(rng::AbstractRNG, A::AbstractArray, r::AbstractArray)
    g = RangeGenerator(1:length(r))
    for i in eachindex(A)
        @inbounds A[i] = r[rand(rng, g)]
    end
    return A
end

Or perhaps a new RangeGenerator appropriate for the new method?

@rfourquet
Member Author

Would this function also benefit from using the new method for UnitRange, rather than using RangeGenerator

According to my preliminary tests, mostly, but with less speed-up and even a slow-down in some cases, e.g. for the range 1:2^52+1 (when calling rand!(m, A, 1:2^52+1)). So I'm not yet sure what the best strategy is. Concerning modifying RangeGenerator, it would also require checking that the new version is always faster for all (or most of) the RNG types we can test, or introducing a trait for different RNGs and calling e.g. RangeGenerator(1:2, rng) to choose the best implementation depending on the rng type. That will take a bit more time, so I preferred to propose the simplest fix first (which is unambiguously faster).

@ViralBShah
Member

Perhaps @simonbyrne can also take a look.

@rfourquet rfourquet changed the title faster rand(::MersenneTwister, BitInteger) (fix #22600) faster rand(::MersenneTwister, UnitRange{<:BitInteger}) (fix #22600) Jul 4, 2017
@mschauer
Contributor

mschauer commented Jul 6, 2017

We probably only get one shot at changing this, so I think it is worthwhile to think a bit about the problem space.

Basically we have to keep track of three sources of cost:

  1. computation of the generator (the cost of mm(n,k) = div(n,k)*k)
  2. resampling (e.g. rand_ui52_raw)
  3. modulo operation

In your approach all costs come from source 2; previously the dominant costs were 1 and 3, and your way might well be the right solution.

On this machine, the costs of mm, rand_ui52_raw and % k are roughly in the proportion 5 : 1 : 1/4.

There is not much that can be done about 3, but (on this machine) it is also only 1/4 of the sampling cost, so it is worthwhile if it decreases the number of rejections by a large enough factor.

mm(n,k) is expensive, so it is worthwhile if more than, say, a handful of random numbers are needed. As a basic use case is sampling random integers from all ranges from 1:1 to 1:n, your pull request might be the right thing to do. I just wanted to point out that in this case the mm(n,k) computation also becomes cheaper. The basic observation is that mm(n,k) can be computed recursively.

I have a "proof of concept" implementation here: https://gist.github.com/mschauer/d6a7a091bada2a828e1ac74e7baa106b

It defines an iterator which counts downwards from i = k, k-1, k-2, ... and gives a pair (i, rand(1:i)) in each iteration. It is cheap if k is small and i is close to k.

So in practice one might divide k by a factor and restart the iterator when i becomes relatively small. I have not fully understood this yet but wanted to invite you to think about it; in the meanwhile I continue reading here.
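For concreteness, the three cost sources can be sketched with a plain div-based rejection sampler. This is an illustration only; the function names are mine, and the gist's recursive `mm` update and the real `rand_ui52_raw` are not reproduced here:

```julia
using Random

mm(n, k) = div(n, k) * k       # cost 1: computing the rejection threshold

# Unbiased sampling from 1:k out of a raw draw on 0:n-1, with resampling
# (cost 2) above the threshold and a final modulo (cost 3).
function sample_1_to_k(rng::AbstractRNG, k::Int; n::Int = 2^52)
    thresh = mm(n, k)          # largest multiple of k not exceeding n
    x = rand(rng, 0:n-1)       # stand-in for the raw generator output
    while x >= thresh          # reject to avoid modulo bias
        x = rand(rng, 0:n-1)
    end
    return 1 + x % k
end
```

Since `thresh` depends only on n and k, it is the part that could be updated cheaply across the iterator's downward steps instead of paying the division every time.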

@ggggggggg
Contributor

If this is PR/master, doesn't this show that master is faster than your PR?

julia> @time rand(m, false:true)
21ns / 9ns

@rfourquet
Member Author

Probably we get only one shot at changing this

What do you mean?

The basic observation is that the mm(n,k) can be computed recursivley.

Very interesting! I will take time to read your proof of concept, and compare efficiency. That said, I guess the simplicity of this PR may be preferable for the general case.

If this is PR/master, doesn't this show that master is faster than your PR?

My mistake, thanks for reporting (editing OP).

@mschauer
Contributor

mschauer commented Jul 6, 2017

one shot

There are some reasons not to change the random number generators too liberally: old versions (at least from, say, Julia 1.0 onwards) should stay available for reproducibility of numerical experiments. R still ships a normal variate generator called "Buggy Kinderman-Ramage" (kept for reproducibility...)

@rfourquet
Member Author

old versions (at least say from julia 1.0 onwards) should be available for reproducibility of numerical experiments

I was vaguely thinking of this reason, but didn't remember having seen it stated clearly in the forums. Do you think the generators could be changed (i.e. the reproducibility broken) between 1.0 and 2.0?

@StefanKarpinski
Member

I would really like the RNG to be explicitly set when srand is called, so that we can change RNGs and other algorithms in the future without breaking code. As a second best, if we could change them without breaking code silently, that would also be acceptable. The main problem with this is that we don't generally pass RNGs around explicitly – we often use the global default RNG, whose type cannot be changed. This seems like something we very much need a solution to.

@rfourquet
Member Author

@StefanKarpinski I'm afraid I don't understand your concern here: in which case code will be broken?

@StefanKarpinski
Member

What I was trying to explain (poorly) is that if everything one needs to know about how random values are generated is specified when seeding the RNG, then we're free to change anything we want. In this case, we could have one RNG type for using resampling and one RNG type for using a pre-computed generator. This not only allows the user to decide what they want based on their use case, but also means that if we change the default in the future, user code won't break since it will keep using whatever they chose when they seed their RNG.

@rfourquet
Member Author

I think I understand a bit better (this also relates to point 3 of #17918 (comment)), but not fully yet. By "breaking code", do you mean that the stream of random numbers is modified when we change the implementation (i.e. non-reproducibility)? Would you like to prevent the use of srand([seed]) and force something like srand(MersenneTwister, [seed]) to explicitly tell which RNG type to use as the implicit one? Then, if the implementation changes, the old one is kept with the old name, and the new updated RNG is named e.g. MersenneTwister2. We could still allow srand(), with proper documentation, meaning whatever best current RNG is available as a default, for users who don't care about this problem.
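A hypothetical sketch of that scheme (all names below are invented for illustration and are not Base API): the RNG type is pinned at seeding time, so a future change of the default cannot silently alter existing streams:

```julia
# Invented types standing in for "current" and "future" generators.
abstract type SketchRNG end
struct MT  <: SketchRNG; seed::UInt; end    # stands in for MersenneTwister
struct MT2 <: SketchRNG; seed::UInt; end    # a hypothetical successor

# Seeding through an explicit type pins the algorithm; the no-type form
# delegates to whatever the current default is.
seedrng(::Type{T}, seed) where {T<:SketchRNG} = T(UInt(seed))
seedrng(seed) = seedrng(MT2, seed)          # default = best current RNG

pinned  = seedrng(MT, 1234)   # explicit type: stream reproducible forever
default = seedrng(1234)       # follows whatever the default becomes
```

Under this scheme, changing the default only rebinds the one-argument method; code that named its RNG type keeps its stream.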

@rfourquet
Member Author

I feel inclined to merge this soon, as it's the best we have for now, while still exploring @mschauer's idea.

@tkelman
Contributor

tkelman commented Jul 12, 2017

are there benchmarks that would track this? @nanosoldier runbenchmarks(ALL, vs = ":master")

@rfourquet
Member Author

are there benchmarks that would track this?

under review, see JuliaCI/BaseBenchmarks.jl#82.

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@rfourquet
Member Author

@nanosoldier runbenchmarks("random", vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @ararslan

@rfourquet rfourquet merged commit 143c9e2 into master Aug 12, 2017
@fredrikekre fredrikekre deleted the rf/MT-randrange branch August 12, 2017 13:26
Labels
performance Must go faster randomness Random number generation and the Random stdlib
9 participants