The Correct Way to Generate Random Numbers in Python with NumPy
Random number generation is a common programming task that is required for many different programs and applications. In Python, the most common way to generate random numbers is arguably the NumPy module. NumPy is fast, reliable, easy to install, and relied on by many programs.
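As of 2022 (numpy version 1.22), the recommended way to generate random numbers with NumPy is the Generator interface created by np.random.default_rng(), which was introduced in numpy version 1.17 and supersedes the older np.random functions. This tutorial will demonstrate the basics of using NumPy to generate random decimals, integers, and distributions for your Python program or application.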
Follow along with the Jupyter Notebook below to begin learning all about random number generation.
How to Generate Random Numbers in Python
To start, import numpy.
import numpy as np
The first thing we need to do to generate random numbers in Python with numpy is to initialize a random Generator. This Generator will allow us to generate random numbers using many different methods. For these examples we are going to use np.random.default_rng(). You can check out the numpy documentation for more information about other methods for generating random numbers.
Let’s start by initializing a np.random.default_rng() as the rng variable.
rng = np.random.default_rng()
Random decimal (floating point, or float) numbers
Now, we can use rng to generate some random numbers. Let’s generate a random float (decimal) as an example. To generate a random float we use the random method. This will generate a decimal number greater than or equal to zero and less than one.
rng.random()
0.07931932739439529
To generate an array of random floats we can pass a size argument to random. The next example generates an array that contains 10 random floats.
rng.random(10)
array([0.91394253, 0.80736524, 0.21402387, 0.86334362, 0.75628216, 0.18238176, 0.37567793, 0.36385296, 0.39890072, 0.78487536])
To produce a multidimensional array of random values, simply pass a shape tuple instead of an integer for the size parameter. This example produces a 2-dimensional array with 3 rows and 2 columns of random floats.
rng.random((3, 2))
array([[0.46270073, 0.21953817], [0.57626831, 0.72370476], [0.02004972, 0.65277897]])
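The random method always samples between zero and one, so a different range needs a scale and a shift. Here is a minimal sketch of that idea; the bounds low and high are hypothetical values picked for illustration, and the uniform method is shown as the built-in way to get the same kind of result.

```python
import numpy as np

rng = np.random.default_rng()

# Hypothetical bounds chosen only for this illustration
low, high = 5.0, 10.0

# Scale and shift samples from [0.0, 1.0) into the range from low to high
scaled = low + (high - low) * rng.random(5)

# The uniform method draws from the same kind of interval directly
direct = rng.uniform(low, high, size=5)

print(scaled)
print(direct)
```

Both arrays hold five floats between 5.0 and 10.0. The uniform call is usually the more readable choice, while the manual version shows what is happening underneath.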
Seeding for reproducibility
There are times you may want to generate a random number for a piece of code, but keep that number the same so the code reproduces the same output. This is a common need when testing code and algorithms. If you run the code above again (by running your script or restarting your notebook) you will notice that the code still produces random numbers, but they are different numbers than were produced the last time you ran it.
If you need the number to stay the same for testing purposes, this can be problematic. But don’t worry, there is a way to get your code to generate the same random number every time.
The way to get your Python code to produce the same random number(s) each time it is run is to seed the random number generator when it is created. To seed the generator, simply pass an integer (numpy examples use a 5-digit integer, but it doesn’t have to be 5 digits) when you create the generator, like so.
rng_seed = np.random.default_rng(12345)
rng_seed.random()
0.22733602246716966
Notice that if you run this script again (or restart the notebook), the same random number will be produced. It might seem a little counter-intuitive to lock in a random number like this (why not just define a variable with a set number?), but it makes code transparent and shows that a random number was generated.
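You can see the effect of seeding without restarting anything by creating two generators from the same seed. This is a small sketch; the names rng_a and rng_b are just illustrative.

```python
import numpy as np

seed = 12345  # any integer works; this one matches the example above

# Two independent generators created from the same seed
rng_a = np.random.default_rng(seed)
rng_b = np.random.default_rng(seed)

first = rng_a.random(3)
second = rng_b.random(3)

# Same seed, same sequence of numbers
print(np.array_equal(first, second))  # True

# A generator keeps advancing, so a later call returns new values
third = rng_a.random(3)
print(third)
```

The seed fixes the whole sequence, not a single number. Each call moves further along that sequence, which is why the numbers only repeat when the generator is created again.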
Random integers
We can also use np.random.default_rng() to generate random integers with Python. Notice that I’m using the same instance of np.random.default_rng() that was created at the beginning of this notebook.
Now, I’ll use the integers method to create a random integer. With the integers method you’ll need to specify a low (inclusive) and high (exclusive) value between which to generate the integer. Below, I generate a random integer greater than or equal to 0 and less than 10.
rng.integers(0, 10)
9
To generate an array of random integers in Python, use the integers method and specify the size argument. The code below generates an array of 10 random integers from 0 (inclusive) up to 10 (exclusive).
rng.integers(low=0, high=10, size=10)
array([8, 0, 7, 3, 5, 7, 2, 7, 8, 6], dtype=int64)
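Because high is exclusive by default, something like a die roll needs either high=7 or the endpoint argument of the integers method. The sketch below shows both; the dice framing is only an illustration.

```python
import numpy as np

rng = np.random.default_rng()

# endpoint=True makes the upper bound inclusive: values from 1 through 6
rolls = rng.integers(1, 6, size=10, endpoint=True)

# The same range written with the default exclusive upper bound
rolls_half_open = rng.integers(1, 7, size=10)

print(rolls)
print(rolls_half_open)
```

Both calls sample from the integers 1 to 6. Which form reads better is a matter of taste, but being explicit about the upper bound helps avoid off-by-one mistakes.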
Random numbers from statistical distributions
With np.random.default_rng() we can also use Python to generate random numbers from several different statistical distributions. This functionality is very useful and saves you from writing a lot of extra code for random number generation. Some of the statistical distributions that you can generate random numbers from with numpy are beta, binomial, chisquare, exponential, gamma, logistic, lognormal, poisson, power, uniform, wald, weibull, and normal (of course). You can view more available distributions in the documentation.
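Each of these distributions is a method on the generator that takes the distribution's own parameters plus an optional size. Here is a brief sketch with three of them; the parameter values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng()

# Poisson: event counts with an average rate of 3.0 (non-negative integers)
counts = rng.poisson(lam=3.0, size=5)

# Binomial: number of successes in 10 trials with probability 0.5 each
successes = rng.binomial(n=10, p=0.5, size=5)

# Uniform: floats between -1.0 and 1.0
values = rng.uniform(low=-1.0, high=1.0, size=5)

print(counts)
print(successes)
print(values)
```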
Let’s start out with an example using the standard normal distribution. The standard normal distribution is simply a normal distribution with mean of 0.0 and standard deviation of 1.0.
We can generate a single random value from the standard normal distribution as follows.
rng.standard_normal()
-3.5685995085116997
Now, let’s plot the distribution of a larger sample to see that it follows the distribution. First, import matplotlib.
import matplotlib.pyplot as plt
Let’s generate 1,000 random numbers from the standard normal distribution and plot a histogram.
plt.hist(rng.standard_normal(1000))
(array([ 2., 32., 104., 220., 248., 209., 136., 36., 12., 1.]), array([-3.16226945, -2.4803122 , -1.79835495, -1.11639769, -0.43444044, 0.24751682, 0.92947407, 1.61143132, 2.29338858, 2.97534583, 3.65730309]), <BarContainer object of 10 artists>)
As you can see, we get a histogram with a normal distribution centered at 0.0. To generate random numbers from other distributions, you just need to specify the distribution parameters (usually a location/center and a scale/spread). I’ll demonstrate with the normal distribution.
The previous example demonstrated the standard normal distribution, which is widely used. If we wanted to pull random numbers from a normal distribution with a different mean and spread we just need to use the normal method. Its second argument, scale, is the standard deviation, not the variance. Let’s generate and plot 1,000 random numbers from a normal distribution with mean 12.0 and standard deviation 3.5. Notice that size, how many random numbers to generate (1,000), is the last argument passed.
plt.hist(rng.normal(12.0, 3.5, 1000))
(array([ 18., 49., 113., 213., 225., 201., 123., 50., 6., 2.]), array([ 2.157391 , 4.38762617, 6.61786135, 8.84809652, 11.07833169, 13.30856686, 15.53880203, 17.76903721, 19.99927238, 22.22950755, 24.45974272]), <BarContainer object of 10 artists>)
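A quick way to confirm what the arguments of normal mean is to compare summary statistics of a large sample with the values passed in. This is a rough check, assuming a sample of 100,000 is large enough for the estimates to settle; the seed is only there to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(12345)

# loc is the mean and scale is the standard deviation
samples = rng.normal(loc=12.0, scale=3.5, size=100_000)

print(samples.mean())  # close to 12.0
print(samples.std())   # close to 3.5
print(samples.var())   # close to 3.5 ** 2 = 12.25
```

The sample standard deviation lands near 3.5 and the variance near 12.25. If you have a variance in hand, pass its square root as scale.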
Let’s do one more demonstration using the exponential distribution. The exponential distribution has a single scale parameter. We’ll specify the scale parameter and generate 1,000 random values.
plt.hist(rng.exponential(100.0, 1000))
(array([651., 221., 83., 29., 10., 3., 2., 0., 0., 1.]), array([8.32129326e-03, 1.01972594e+02, 2.03936867e+02, 3.05901140e+02, 4.07865413e+02, 5.09829686e+02, 6.11793959e+02, 7.13758231e+02, 8.15722504e+02, 9.17686777e+02, 1.01965105e+03]), <BarContainer object of 10 artists>)
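For the exponential distribution, the scale parameter is also its mean, which is easy to check numerically. The sketch below assumes a large sample (100,000 values) and uses a seed only for repeatability.

```python
import numpy as np

rng = np.random.default_rng(12345)

scale = 100.0
samples = rng.exponential(scale, size=100_000)

print(samples.mean())      # close to scale
print(np.median(samples))  # close to scale * ln(2), about 69.3
print(samples.min() >= 0)  # exponential values are never negative
```

The median sits well below the mean because the distribution has a long right tail, which matches the shape of the histogram above.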
Conclusion
NumPy offers a lot of functionality for generating random numbers in Python. Additionally, it is fast and easy to use. This tutorial has explored and demonstrated the basics of random number generation in Python. Continue learning and exploring these ideas by developing your own applications and projects!