How to Use numpy.where() in Python with Examples
The where function from numpy is a powerful way to vectorize if/else statements across entire arrays. There are two primary ways to use numpy.where. First, numpy.where can be used to identify the array indices where a condition is true. Second, it can be used to select or replace values where a condition is met.
Multiple applications of numpy.where are explained and demonstrated in this article for both 1-dimensional and multi-dimensional arrays.
numpy.where Basic Syntax and Usage
The general usage of numpy.where is as follows: numpy.where(condition, value if true (optional), value if false (optional)). The condition is applied to a numpy array and must evaluate to a boolean array, for example a > 5 where a is a numpy array. The two values are optional, but they must be supplied together; passing only one of them raises a ValueError.
If the true value and false value are not specified, the result is a tuple of index arrays, one per dimension of the conditional array, marking where the condition evaluates to true. This holds even for a 1D array, which yields a tuple containing a single array. If true and false values are specified, the result is an array of the same shape as the conditional array, holding the true value where the condition is met and the false value everywhere else.
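The two call forms described above can be sketched side by side. This is a minimal illustration, not part of the original walkthrough; the `temps` array and the "hot"/"mild" labels are made-up sample data.

```python
import numpy as np

# Hypothetical sample data used only for this illustration
temps = np.array([12, 25, 31, 8, 19])

# Form 1: condition only. The result is a tuple of index arrays,
# one per dimension (a single array here because temps is 1D).
hot_idx = np.where(temps > 20)
# hot_idx[0].tolist() -> [1, 2]

# Form 2: condition plus both values. The result is a new array with
# the same shape as the condition.
labels = np.where(temps > 20, "hot", "mild")
# labels.tolist() -> ['mild', 'hot', 'hot', 'mild', 'mild']

print(hot_idx)
print(labels)
```

Note that neither form modifies `temps`; the second form always builds a new array.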
Let’s go through some examples to demonstrate this in different scenarios.
Import numpy
The first step is to import Python’s numpy module, as shown below. If you are not sure if you have numpy installed, follow the directions in this article to find out.
import numpy as np
numpy.where with 1D Arrays
Let’s create a simple 1-dimensional array. This array will be the square of sequential integers. I’ve squared the integers so that the values in the array do not correspond directly to the values of the array indices (that would be a little confusing).
a1d = np.square(np.arange(10))
a1d
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81], dtype=int32)
Now we can use np.where to identify the array indices where a1d is greater than 5. You’ll notice the result is a tuple with a single array that contains index values 3 and greater. The first 3 elements in the array (a1d), with values of 0, 1, and 4, are not returned because the values of those elements are less than 5.
np.where(a1d > 5)
(array([3, 4, 5, 6, 7, 8, 9], dtype=int64),)
Because the above call to np.where returns array indices, we can use this call to index an array. This allows us to get the actual values of a1d where the condition is true. Below you will notice that instead of returning index values, the actual values of a1d are returned.
a1d[np.where(a1d > 5)]
array([ 9, 16, 25, 36, 49, 64, 81], dtype=int32)
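As a side note, indexing with the result of np.where is not the only way to pull out the matching values. The sketch below, which rebuilds the same a1d array, suggests that a plain boolean mask and np.nonzero give the same result here; the variable names are only for illustration.

```python
import numpy as np

a1d = np.square(np.arange(10))

# Three ways to select the values greater than 5
via_where = a1d[np.where(a1d > 5)]      # index arrays from np.where
via_mask = a1d[a1d > 5]                 # boolean mask used directly
via_nonzero = a1d[np.nonzero(a1d > 5)]  # np.nonzero on the boolean array

# All three selections hold the same values
print(np.array_equal(via_where, via_mask))
print(np.array_equal(via_where, via_nonzero))
```

The boolean mask is usually the shortest option when only the values are needed; np.where is more useful when the index positions themselves matter.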
Now let’s specify the true and false values. When we do this, np.where will return an array of values instead of a tuple of index arrays. Let’s give it a try.
Here we use the same conditional expression (a1d > 5) but specify the result array should have a value of 0 where the condition is true and a value of 1 where the condition is false. The result is an array with a value of 1 where a1d is less than or equal to 5 and a value of 0 everywhere else.
np.where(a1d > 5, 0, 1)
array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
We can also keep the value in the original array for one of the results. Here, we keep the value of a1d if the condition resolves to false.
np.where(a1d > 5, 0, a1d)
array([0, 1, 4, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
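The condition does not have to be a single comparison. As a hedged sketch that goes slightly beyond the examples above, comparisons can be combined with the element-wise operators `&` (and) and `|` (or); the thresholds 5 and 50 and the fill value -1 are arbitrary choices for illustration.

```python
import numpy as np

a1d = np.square(np.arange(10))

# Each comparison needs its own parentheses because & and | bind
# more tightly than > and <.
in_range = np.where((a1d > 5) & (a1d < 50), a1d, -1)
# in_range.tolist() -> [-1, -1, -1, 9, 16, 25, 36, 49, -1, -1]

# Condition-only form with an "or": indices outside the range
outside_idx = np.where((a1d <= 5) | (a1d >= 50))
# outside_idx[0].tolist() -> [0, 1, 2, 8, 9]

print(in_range)
print(outside_idx)
```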
numpy.where with 2D Arrays
Application of numpy’s where function to multidimensional arrays is very similar to the 1D array applications presented above. Nevertheless, I’ll go through a few examples with 2D arrays to illustrate some minor differences.
To start, let’s create a 2D array that is similar to the 1D array we’ve already been working with. The following code creates a numpy array with 5 rows and 3 columns where the value of each element is equal to the square of its position in the flattened (row-by-row) array.
a2d = np.square(np.arange(15)).reshape((5, 3))
a2d
array([[  0,   1,   4],
       [  9,  16,  25],
       [ 36,  49,  64],
       [ 81, 100, 121],
       [144, 169, 196]], dtype=int32)
As with a 1D array, the basic call of np.where on a 2D array returns the indices where the condition evaluates to true. Notice here that a tuple containing two arrays is returned because the conditional array has two dimensions. The first array contains the row index of each element where the condition evaluates as true, and the second contains the corresponding column index for each of those elements.
np.where(a2d > 65)
(array([3, 3, 3, 4, 4, 4], dtype=int64), array([0, 1, 2, 0, 1, 2], dtype=int64))
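To make the pairing of the two index arrays concrete, the sketch below unpacks them and zips them into (row, column) coordinates. The names `rows`, `cols`, and `coords` are illustrative; np.argwhere is shown as one alternative that returns the same coordinates as a single 2D array.

```python
import numpy as np

a2d = np.square(np.arange(15)).reshape((5, 3))

# Unpack the tuple: one index array per dimension
rows, cols = np.where(a2d > 65)

# Pair each row index with its matching column index
coords = list(zip(rows.tolist(), cols.tolist()))
# coords -> [(3, 0), (3, 1), (3, 2), (4, 0), (4, 1), (4, 2)]

# np.argwhere returns the same coordinates, one [row, col] pair per line
stacked = np.argwhere(a2d > 65)

print(coords)
print(stacked.shape)  # (6, 2): six matches, two dimensions
```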
Below we use the result of the np.where call to retrieve the values of a2d where the condition evaluates to true. In this case a 1D array is returned.
a2d[np.where(a2d > 65)]
array([ 81, 100, 121, 144, 169, 196], dtype=int32)
Now let’s add the value-if-true and value-if-false parameters to the np.where call. This returns a 2D array with the same shape as the conditional array (a2d). As with the 1D array example, the value-if-true and value-if-false parameters can also be arrays instead of single values.
np.where(a2d > 65, 1, 0)
array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [1, 1, 1],
       [1, 1, 1]])
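Passing arrays rather than single values for the two parameters can be sketched as follows. The fill value -1 and the `col_fill` array are hypothetical choices made only for this example; the second call relies on NumPy broadcasting a length-3 array across the three columns of each row.

```python
import numpy as np

a2d = np.square(np.arange(15)).reshape((5, 3))

# Keep the original value where the condition is true, -1 elsewhere
kept = np.where(a2d > 65, a2d, -1)
# kept.tolist() -> [[-1, -1, -1], [-1, -1, -1], [-1, -1, -1],
#                   [81, 100, 121], [144, 169, 196]]

# Hypothetical per-column fill values, broadcast across every row
col_fill = np.array([10, 20, 30])
filled = np.where(a2d > 65, a2d, col_fill)
# filled.tolist() -> [[10, 20, 30], [10, 20, 30], [10, 20, 30],
#                     [81, 100, 121], [144, 169, 196]]

print(kept)
print(filled)
```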
Conclusion
The numpy.where function is a concise, vectorized way to apply if/else and conditional statements across numpy arrays. As you can see, it is quite simple to use. Once you get the hang of it you will be using it all over the place in no time.