NumPy Random Seed: How It Works and Why to Stop Using It

Q: What does np.random.seed() do in NumPy?

In Python, np.random.seed() sets a global seed so that random number generation is reproducible across runs. However, that global state is shared by every np.random call, so other scripts or packages can unintentionally alter it.

Q: What’s the recommended way to generate reproducible random numbers in NumPy?

To generate reproducible random numbers in NumPy, use np.random.default_rng(). This function creates a local generator, which isolates random number generation, improves reproducibility and avoids interference from other code.

Summary: A NumPy random seed is a numerical value in Python that sets the starting state for generating random numbers, ensuring reproducible results. Using np.random.seed() affects global state, while using np.random.default_rng() creates isolated generators for more reliable, modular code.

Using np.random.seed(number) has been a best practice when using NumPy in Python to create reproducible work. Setting the random seed means that your work is reproducible to others who use your code. But now when you look at the docs for np.random.seed, the description reads:

This is a convenience, legacy function.

The best practice is to not reseed a BitGenerator, but rather to recreate a new one. This method is here for legacy reasons only.

So what’s changed? I’ll explain the old method and the issues with it. Then I’ll demonstrate the new best practice and its benefits.

Why to Stop Using NumPy’s Global Random Seed

Using np.random.seed(number) sets what NumPy calls the global random seed, which affects all uses of the np.random.* module. If imported code or other scripts explicitly call np.random.seed(), they overwrite the global random state, potentially breaking reproducibility.


How NumPy Random Seed Works in Python

If you look up tutorials using np.random, you'll see many of them call np.random.seed to set the seed for reproducible work. We can see how this works:

>>> import numpy as np
>>> np.random.rand(4)
array([0.96176779, 0.7088082 , 0.06416725, 0.82679036])

>>> np.random.rand(4)
array([0.15051909, 0.77788803, 0.67073372, 0.32134285])

As you can see, two calls to the function lead to two completely different answers. If you want somebody to be able to reproduce your projects, you can set the seed with the following code snippet:

>>> np.random.seed(2021)
>>> np.random.rand(4)
array([0.60597828, 0.73336936, 0.13894716, 0.31267308])


>>> np.random.seed(2021)
>>> np.random.rand(4)
array([0.60597828, 0.73336936, 0.13894716, 0.31267308])

You see the results are the same. If you need to prove this to yourself, you can enter the above code on your Python setup.

Setting the seed does more than fix the next random call. It fixes the whole sequence of random numbers, so any code that produces or uses random numbers (with NumPy) will now produce the same sequence. For example, look at the following:

>>> np.random.seed(2021)
>>> np.random.rand(4)
array([0.60597828, 0.73336936, 0.13894716, 0.31267308])
>>> np.random.rand(4)
array([0.99724328, 0.12816238, 0.17899311, 0.75292543])
>>> np.random.rand(4)
array([0.66216051, 0.78431013, 0.0968944 , 0.05857129])
>>> np.random.rand(4)
array([0.96239599, 0.61655744, 0.08662996, 0.56127236])
>>> np.random.seed(2021)
>>> np.random.rand(4)
array([0.60597828, 0.73336936, 0.13894716, 0.31267308])
>>> np.random.rand(4)
array([0.99724328, 0.12816238, 0.17899311, 0.75292543])
>>> np.random.rand(4)
array([0.66216051, 0.78431013, 0.0968944 , 0.05857129])
>>> np.random.rand(4)
array([0.96239599, 0.61655744, 0.08662996, 0.56127236])

The Problem With NumPy’s Global Random Seed

A global seed works well in an isolated script: all random numbers generated after setting the seed will be the same on any machine. For the most part, this holds, and for many projects you may not need to worry about it. Global seeds fall short in modular, multi-script workflows.

The problem comes in larger projects, or in projects whose imports could also set the seed. Using np.random.seed(number) sets what NumPy calls the global random seed, which affects all uses of the np.random.* module. An imported package or another script could reset the global seed with np.random.seed(another_number), changing your output and making your results unreproducible. In practice, you usually only need the same random numbers in specific parts of your code (like tests or functions).
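The failure mode is easy to reproduce. Here is a minimal sketch, assuming a hypothetical function named third_party_helper that stands in for any imported code that reseeds the global state:

```python
import numpy as np


def third_party_helper():
    # Hypothetical stand-in for imported code that reseeds the global state.
    np.random.seed(0)
    return np.random.rand(2)


# Baseline: seed once, then draw two batches.
np.random.seed(2021)
first = np.random.rand(4)
expected_second = np.random.rand(4)

# Same seed, but the helper runs between the two draws.
np.random.seed(2021)
first_again = np.random.rand(4)
third_party_helper()  # silently replaces the global state
actual_second = np.random.rand(4)

# The first batch still matches the baseline; the second batch does not.
print(np.array_equal(first, first_again))
print(np.array_equal(expected_second, actual_second))
```

Nothing in the calling code changed between the two runs, which is what makes this kind of bug hard to track down.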

np.random.default_rng(): The Solution to NumPy Random Seed

This is one of the reasons NumPy has moved toward advising users to create a random number generator for specific tasks (or to even pass around when you need parts to be reproducible).

“The preferred best practice for getting reproducible pseudorandom numbers is to instantiate a generator object with a seed and pass it around.” — Robert Kern, NEP19.

Using this new best practice looks like this:

>>> import numpy as np
>>> rng = np.random.default_rng(2021)
>>> rng.random(4)
array([0.75694783, 0.94138187, 0.59246304, 0.31884171])

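As you can see, these numbers differ from the earlier example even though the seed is the same. NumPy introduced default_rng() in version 1.17 as the preferred interface. It returns a Generator that uses the PCG64 bit generator by default, whereas the np.random.* functions still use the legacy RandomState (based on the Mersenne Twister, MT19937) for backward compatibility. To reproduce the earlier numbers, you can create a RandomState instance explicitly: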

>>> rng = np.random.RandomState(2021)
>>> rng.rand(4)
array([0.60597828, 0.73336936, 0.13894716, 0.31267308])

Use RandomState only when maintaining legacy code. Its random streams are frozen for backward compatibility, so it does not receive the improvements made to the newer Generator API.
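The relationship between the legacy and new interfaces can be checked directly. This is a minimal sketch with illustrative variable names. The bit generator name it prints reflects NumPy's default at the time of writing and could change in a future release.

```python
import numpy as np

# The legacy global functions and a RandomState instance use the same algorithm,
# so the same seed gives the same stream.
np.random.seed(2021)
from_global = np.random.rand(4)
from_legacy = np.random.RandomState(2021).rand(4)
print(np.array_equal(from_global, from_legacy))

# default_rng() returns a Generator backed by a different bit generator,
# so the same seed gives a different stream.
rng = np.random.default_rng(2021)
print(type(rng).__name__, type(rng.bit_generator).__name__)
from_new = rng.random(4)
print(np.array_equal(from_new, from_legacy))
```

If you need to migrate legacy code, expect the numbers to change even when you keep the same seed value.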

The Benefits of Using np.random.default_rng() vs. NumPy Random Seed

You can pass random number generators around between functions and classes, meaning each individual or function could have its own random state without resetting the global seed. In addition, each script could pass a random number generator to functions that need to be reproducible. The benefit is you know exactly what random number generator is used in each part of your project.

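import numpy as np

def f(x, rng):
    # x stands in for the function's real input; the draw comes from the supplied rng
    return rng.random(1)

# Initialise a random number generator
rng = np.random.default_rng(2021)

# Pass the rng to the functions that should use it
x = 1.0  # placeholder input
random_number = f(x, rng)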
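To see the isolation in action, here is a short sketch. The function noisy_mean and its parameters are hypothetical and used only for illustration. A call to np.random.seed() elsewhere in the program has no effect on a generator created with default_rng().

```python
import numpy as np


def noisy_mean(values, rng):
    # Hypothetical function: adds Gaussian noise drawn from the caller's generator.
    noise = rng.normal(loc=0.0, scale=0.1, size=len(values))
    return float(np.mean(np.asarray(values) + noise))


values = [1.0, 2.0, 3.0]

result_a = noisy_mean(values, np.random.default_rng(2021))

# Reseeding the global state in between does not touch a local generator.
np.random.seed(12345)
result_b = noisy_mean(values, np.random.default_rng(2021))

print(result_a == result_b)
```

Because the generator is an explicit argument, a test can construct one with a fixed seed and get the same result regardless of what other code does to the global state.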

Other benefits arise with parallel processing, as Albert Thomas shows us.
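One pattern for parallel work is to derive a separate generator for each worker from a single root seed using np.random.SeedSequence. This is a rough sketch, not necessarily the approach referenced above. The functions simulate and run are hypothetical names, and a thread pool is assumed purely to keep the example self-contained.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def simulate(rng):
    # Hypothetical worker: each task draws only from its own generator.
    return rng.random(3)


def run(seed, n_workers=4):
    # spawn() derives one child seed sequence per worker from the root seed.
    children = np.random.SeedSequence(seed).spawn(n_workers)
    rngs = [np.random.default_rng(child) for child in children]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in the order of the inputs.
        return list(pool.map(simulate, rngs))


results_1 = run(2021)
results_2 = run(2021)

# Same root seed: every worker reproduces its own stream across runs.
print(all(np.array_equal(a, b) for a, b in zip(results_1, results_2)))
```

Each worker gets its own stream, so results do not depend on the order in which tasks happen to execute.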

Using independent random number generators can help improve the reproducibility of your results. You do this by not relying on the global random state, which can be reset or used without your knowledge. Passing around a random number generator means you can keep track of when and how it was used and ensure your results are the same.
