This should be a single memory load from a known address, because the compiler should know that pointer_from_objref(global_xorostate) cannot change, and it also should know that pointer(global_xorostate) cannot change, nor the size (because it is not one-dimensional).
So, my question: How do I tell julia 1.4/master that it is perfectly fine to chase these pointers at compile time instead of runtime? How do I get rid of the invoke?
Or should I try something else?
My problem with struct is that I don't know how to force 64 byte alignment. A secondary problem is that I don't know how to avoid an additional indirection through Threads.threadid() with structs. Threads.threadid() should only be used as an offset for loads of payload, not as an offset to load a pointer to payload. (I mostly know the desired assembly code, and my issue is how to coax julia into emitting that.)
and a global_rng.ptr = ccall(:posix_memalign, ...) in the __init__ function, plus a definition default_rng() = convert(Ptr{rng_instance}, global_rng.ptr + (Threads.threadid() - 1%Int16)*320)?
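For concreteness, here is a rough sketch of what that posix_memalign variant might look like. The names RngInstance, GlobalRng and init_global_rng! are made up for illustration, the 320-byte payload is only inferred from the *320 stride above, and posix_memalign is assumed to be available (so this would not work as-is on Windows). It is meant as a starting point, not a tested solution.

```julia
# Hypothetical stand-in for rng_instance: 320 bytes of payload per thread.
struct RngInstance
    state::NTuple{40,UInt64}   # 40 * 8 = 320 bytes (size assumed from the *320 stride)
end

# Mutable holder so the pointer can be filled in at load time.
mutable struct GlobalRng
    ptr::Ptr{Cvoid}
end

const global_rng = GlobalRng(C_NULL)

# Intended to be called from the module's __init__.
function init_global_rng!()
    # Threads.maxthreadid only exists on newer Julia versions; fall back to nthreads.
    n = isdefined(Threads, :maxthreadid) ? Threads.maxthreadid() : Threads.nthreads()
    out = Ref{Ptr{Cvoid}}(C_NULL)
    # int posix_memalign(void **memptr, size_t alignment, size_t size)
    rc = ccall(:posix_memalign, Cint, (Ptr{Ptr{Cvoid}}, Csize_t, Csize_t),
               out, 64, 320 * n)
    rc == 0 || error("posix_memalign failed with code $rc")
    global_rng.ptr = out[]
    return nothing
end

# Per-thread slot: base pointer plus (threadid - 1) * 320 bytes.
default_rng() = convert(Ptr{RngInstance},
                        global_rng.ptr + (Threads.threadid() - 1) * 320)
```

Since 320 is a multiple of 64, every per-thread slot stays 64-byte aligned. Note that global_rng.ptr is still read from a mutable field at runtime, so this version keeps one dependent load before the payload is reached.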
Master hasn't solved this issue either: There is a pesky invoke in Base.default_rng() as well (probably why rand() is so slow with multithreading).