Julia outperforms Rust in generating a vector of random numbers

Hi,

I'm new to Rust, coming from Julia and Python, and I'd love some help comparing Julia and Rust code for generating vectors of normally distributed random variables. Currently, on an M1 Pro, Julia is about 5 times faster: ~109 µs versus 546 µs for Rust.

The Rust code I'm using is:

use rand::distributions::Standard;
use rand::prelude::*;

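// Note: `Standard` samples f64 uniformly from [0, 1); normally distributed
// values would need a normal distribution such as `rand_distr::StandardNormal`.
pub fn randn_vec(n: usize) -> Vec<f64> {
    thread_rng().sample_iter(&Standard).take(n).collect()
}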

I'm using the below Criterion code to benchmark:

use criterion::{black_box, criterion_group, criterion_main, Criterion};
use returns::randn;
// use lib::euler1; // function to profile

fn criterion_benchmark(c: &mut Criterion) {
    let n = 50000;
    c.bench_function("randn", |b| b.iter(|| randn(black_box(n))));
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

The Julia code I'm using is:

julia> using BenchmarkTools
julia> @benchmark randn(50000)

Any suggestions for improving the Rust code, or for comparing the two fairly, would be greatly appreciated!

No experience with Julia, but one thing comes to mind: what is the relative performance of the PRNGs being used? I don't know what Julia uses, but I believe rand uses a cryptographically secure PRNG for thread_rng, rather than one focused on performance. 1/5th the speed wouldn't be surprising if Julia is using a performance-focused PRNG.

Julia uses the Xoshiro256++ PRNG by default, which is indeed very fast and not cryptographically secure. That makes sense: there's no need to worry about malicious users in scientific computing.

Try a less secure random number generator:

or enable and use rand's SmallRng.

You could also try creating the vector with Vec::with_capacity(50000) and then filling it with vec.extend(...) from the random iterator, to make sure it allocates only once.
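
The two suggestions above (a fast non-cryptographic generator and a single up-front allocation) can be sketched without any external crates. This is only an illustration under stated assumptions: the generator is a hand-written xoshiro256++ seeded through SplitMix64, the normal samples come from the Box-Muller transform (not necessarily what Julia's randn or rand's distributions do internally), and the names Xoshiro256PlusPlus and randn_vec are hypothetical. In a real project, rand's SmallRng would be the more natural choice.

```rust
/// SplitMix64 step, used here only to expand one u64 seed into a 256-bit state.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Minimal xoshiro256++ generator: fast, but not cryptographically secure.
pub struct Xoshiro256PlusPlus {
    s: [u64; 4],
}

impl Xoshiro256PlusPlus {
    /// Builds the four state words from a single seed via SplitMix64.
    pub fn seed_from_u64(seed: u64) -> Self {
        let mut sm = seed;
        let s = [
            splitmix64(&mut sm),
            splitmix64(&mut sm),
            splitmix64(&mut sm),
            splitmix64(&mut sm),
        ];
        Self { s }
    }

    /// One xoshiro256++ step.
    pub fn next_u64(&mut self) -> u64 {
        let result = self.s[0]
            .wrapping_add(self.s[3])
            .rotate_left(23)
            .wrapping_add(self.s[0]);
        let t = self.s[1] << 17;
        self.s[2] ^= self.s[0];
        self.s[3] ^= self.s[1];
        self.s[1] ^= self.s[2];
        self.s[0] ^= self.s[3];
        self.s[2] ^= t;
        self.s[3] = self.s[3].rotate_left(45);
        result
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 * (1.0 / (1u64 << 53) as f64)
    }
}

/// Returns `n` standard-normal samples generated with the Box-Muller transform.
pub fn randn_vec(rng: &mut Xoshiro256PlusPlus, n: usize) -> Vec<f64> {
    // Reserve the full capacity once so the loop never reallocates.
    let mut out = Vec::with_capacity(n);
    while out.len() < n {
        let u1 = 1.0 - rng.next_f64(); // in (0, 1], so ln(u1) is finite
        let u2 = rng.next_f64();
        let r = (-2.0 * u1.ln()).sqrt();
        let theta = 2.0 * std::f64::consts::PI * u2;
        out.push(r * theta.cos());
        if out.len() < n {
            out.push(r * theta.sin());
        }
    }
    out
}
```

Calling randn_vec(&mut Xoshiro256PlusPlus::seed_from_u64(42), 50_000) gives a vector comparable in size to the one benchmarked above. The same seed always reproduces the same values.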

Thanks for the excellent answers and great to know! I’ll retest the code and share the results tomorrow!

I disagree; I think a cryptographically secure PRNG is a much better default for scientific computing.

Performance of random number generators is hardly ever going to be the bottleneck. Here, the slower numbers are showing about 0.7 gigabytes per second (50,000 f64 values at 8 bytes each in 546 µs) -- do you really need more? That's orders of magnitude more than you typically need for anything in scientific computing.

Malicious users are not the only problem with weak generators. What you want is the lack of any detectable deviations from uniformity and independence -- especially in science!

And that is precisely what cryptographically strong PRNGs provide. By definition of CSPRNG, either the random numbers will be indistinguishable from truly random, or you will have broken the security of the algorithm. So either the results of your scientific computing will be the same as if you had used true randomness, or you will have broken some cryptographic primitive. Either way a win!

If you don't have that property, then you can't really rely on any probabilistic analysis of what the algorithm will do, or on the scientific conclusions of your Monte Carlo simulation. You are risking all kinds of statistical anomalies.

I think of non-crypto PRNGs as wanna-be simplistic crypto algorithms that don't quite get there -- for instance, xoshiro256++ uses operations very similar to those in ChaCha20 (xor, rotate, etc). Papers analyzing them look at very similar things (bit-mixing, etc). It's kind of like a poor man's roll-your-own mini-crypto.

I am surprised they are still so popular, given that they only give a 5x speed-up.

I can certainly appreciate the arguments for an RNG that better approximates the desired distribution. For those who are curious, Rust outperformed Julia at 59 µs. Xoshiro256++ required 6,000 simulations to meet the standard error threshold of 0.001, while thread_rng required 1,000. Anecdotally, it could be said you get your speed back through smaller simulation requirements, but I'm sure that's been tested rigorously elsewhere.

Note that ThreadRng also periodically re-seeds from the OS randomness source, which adds its own overhead. For scientific simulations you should use your own PRNG (e.g. from rand_chacha), likely initialized with a fixed seed for reproducibility.
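
The pattern of owning a generator and seeding it with a fixed value can be sketched as follows. To keep the example free of dependencies, it uses SplitMix64 as a stand-in generator, which is an assumption of this sketch and not a ChaCha implementation. With the rand ecosystem, a seeded generator from rand_chacha would take its place. The names SeededRng and estimate_pi are hypothetical.

```rust
/// Stand-in generator (SplitMix64). It is NOT ChaCha; it only illustrates
/// owning a generator whose whole stream is determined by a fixed seed.
pub struct SeededRng {
    state: u64,
}

impl SeededRng {
    /// Creates the generator from an explicit seed; it never re-seeds itself.
    pub fn from_seed(seed: u64) -> Self {
        Self { state: seed }
    }

    /// One SplitMix64 step.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 * (1.0 / (1u64 << 53) as f64)
    }
}

/// Monte Carlo estimate of pi. The generator is passed in explicitly, so the
/// caller controls the seed and the result is reproducible.
pub fn estimate_pi(rng: &mut SeededRng, samples: u32) -> f64 {
    let mut inside = 0u32;
    for _ in 0..samples {
        let x = rng.next_f64();
        let y = rng.next_f64();
        // Count points that fall inside the quarter unit circle.
        if x * x + y * y <= 1.0 {
            inside += 1;
        }
    }
    4.0 * f64::from(inside) / f64::from(samples)
}
```

Two runs that both start from SeededRng::from_seed(2024) return exactly the same estimate, which is what makes a simulation result reproducible later.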

Of course, for Monte Carlo sometimes it can be fine to use something even less random, like https://extremelearning.com.au/unreasonable-effectiveness-of-quasirandom-sequences/
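
As a rough illustration of what "less random" can mean, here is a minimal sketch of one simple quasirandom construction, the golden-ratio additive recurrence in one dimension, used to estimate an integral. The function names quasirandom and integrate are hypothetical, and the starting offset of 0.5 is a choice made for this sketch.

```rust
/// n-th point of the golden-ratio additive recurrence: frac(0.5 + n / phi).
/// The points are deterministic and spread evenly over [0, 1).
pub fn quasirandom(n: u64) -> f64 {
    const INV_PHI: f64 = 0.618_033_988_749_894_8; // 1 / golden ratio
    (0.5 + n as f64 * INV_PHI).fract()
}

/// Estimates the integral of `f` over [0, 1] by averaging `f` over the first
/// `samples` points of the sequence.
pub fn integrate<F: Fn(f64) -> f64>(f: F, samples: u64) -> f64 {
    let sum: f64 = (1..=samples).map(|n| f(quasirandom(n))).sum();
    sum / samples as f64
}
```

For example, integrate(|x| x * x, 1000) approximates the integral of x^2 over [0, 1], whose exact value is 1/3.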
