The where function from numpy is a powerful way to vectorize if/else statements across entire arrays. There are two primary ways to use numpy.where. First, it can be used to identify the array indices where a condition is true (or, by negating the condition, false). Second, it can be used to build an array whose values change wherever a condition is met.
Multiple applications of numpy.where are explained and demonstrated in this article for both 1-dimensional and multi-dimensional arrays.
numpy.where Basic Syntax and Usage
The general usage of numpy.where is numpy.where(condition, x, y), where x is the value to use where the condition is true and y is the value to use where it is false. Both values are optional, but they must be supplied together: pass both or neither. The condition is typically a comparison applied to a numpy array, for example a > 5 where a is a numpy array, and it must evaluate to a boolean array.
If the true value and false value are not specified, numpy.where returns a tuple of index arrays, one per dimension of the conditional array, identifying where the condition evaluates to true. If the true and false values are specified, the result is a single array with the same shape as the conditional array, holding the true value wherever the condition is met and the false value everywhere else.
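As a quick illustration of the two call forms described above, here is a minimal sketch. The array values and the "big"/"small" labels are made up for this example and do not come from the rest of the article.

```python
import numpy as np

values = np.array([2, 7, 4, 9])  # hypothetical sample data

# Form 1: condition only -> a tuple of index arrays (one per dimension).
indices = np.where(values > 5)
print(indices)

# Form 2: condition plus both values -> an array shaped like the condition.
labels = np.where(values > 5, "big", "small")
print(labels)

# Supplying only one of the two values is rejected by numpy.
try:
    np.where(values > 5, 1)
except ValueError as err:
    print("ValueError:", err)
```

The last call is there only to show that the true and false values go together; numpy raises a ValueError if one is given without the other.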
Let’s go through some examples to demonstrate this in different scenarios.
The first step is to import Python’s
numpy module, as shown below. If you are not sure if you have
numpy installed, follow the directions in this article to find out.
import numpy as np
numpy.where with 1D Arrays
Let’s create a simple 1-dimensional array. This array will be the square of sequential integers. I’ve squared the integers so that the values in the array do not correspond directly to the values of the array indices (that would be a little confusing).
a1d = np.square(np.arange(10))
a1d
array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81], dtype=int32)
Now we can use np.where to identify the array indices where a1d is greater than 5. You’ll notice the result is a tuple with a single array that contains index values 3 and greater. The first 3 elements of a1d, with values of 0, 1, and 4, are not returned because those values are not greater than 5.
np.where(a1d > 5)
(array([3, 4, 5, 6, 7, 8, 9], dtype=int64),)
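When only a condition is passed, np.where behaves like np.nonzero applied to the condition, so the two calls below should return the same tuple. The sketch also shows one way to pull the plain index array out of the tuple for a 1D input; the variable names are chosen just for this example.

```python
import numpy as np

a1d = np.square(np.arange(10))

idx_where = np.where(a1d > 5)      # tuple holding one index array
idx_nonzero = np.nonzero(a1d > 5)  # same indices via np.nonzero

# For a 1D array, element 0 of the tuple is the plain index array.
idx = idx_where[0]
first_match = idx[0]  # index of the first element greater than 5

print(idx)
print(first_match)
```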
Because the above call to
np.where returns array indices we can use this call to index an array. This allows us to get the actual values of
a1d where the condition is true. Below you will notice that instead of returning index values the actual values of
a1d are returned.
a1d[np.where(a1d > 5)]
array([ 9, 16, 25, 36, 49, 64, 81], dtype=int32)
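For simply selecting values, indexing with the boolean condition itself gives the same result as indexing with the np.where output. The sketch below compares the two and also uses the mask to change values in place on a copy; the names mask and changed are illustrative only.

```python
import numpy as np

a1d = np.square(np.arange(10))
mask = a1d > 5  # boolean array, True where the condition holds

via_where = a1d[np.where(mask)]  # index with the tuple of index arrays
via_mask = a1d[mask]             # index with the boolean mask directly

# The same mask can be used to assign new values (done on a copy here
# so that a1d itself is left unchanged).
changed = a1d.copy()
changed[mask] = 0

print(via_mask)
print(changed)
```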
Now let’s specify the true and false values. When we do this, np.where returns an array of values instead of a tuple of index arrays. Let’s give it a try.
Here we use the same conditional expression (a1d > 5) but specify that the result array should have a value of 0 where the condition is true and a value of 1 where the condition is false. The result is an array with a value of 1 where a1d is 5 or less and a value of 0 everywhere else.
np.where(a1d > 5, 0, 1)
array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
We can also keep the value in the original array for one of the results. Here, we keep the value of
a1d if the condition resolves to false.
np.where(a1d > 5, 0, a1d)
array([0, 1, 4, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
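Both the true and false values can be arrays at the same time. The sketch below negates the values above 5 and keeps the rest, then checks the result against the equivalent element-by-element if/else written as a plain Python loop. The negation rule is just an illustrative choice.

```python
import numpy as np

a1d = np.square(np.arange(10))

# Vectorized if/else: negate values above 5, keep everything else.
result = np.where(a1d > 5, -a1d, a1d)

# The same logic written one element at a time, for comparison.
looped = np.array([-v if v > 5 else v for v in a1d])

print(result)
print(np.array_equal(result, looped))
```

Note that both value arrays (-a1d and a1d here) are computed in full before np.where picks between them, which is worth keeping in mind if one branch is expensive or only valid for some elements.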
numpy.where with 2D Arrays
Applying numpy’s where function to multidimensional arrays is very similar to the 1D array applications presented above. Nevertheless, I’ll go through a few examples with 2D arrays to illustrate some minor differences.
To start, let’s create a 2D array that is similar to the 1D array we’ve already been working with. The following code creates a numpy array with 5 rows and 3 columns where the value of each element is equal to the square of its position when the elements are counted row by row, starting from 0.
a2d = np.square(np.arange(15)).reshape((5, 3))
a2d
array([[  0,   1,   4],
       [  9,  16,  25],
       [ 36,  49,  64],
       [ 81, 100, 121],
       [144, 169, 196]], dtype=int32)
As with a 1D array, the basic call of np.where on a 2D array returns the indices where the condition evaluates to true. Notice here that a tuple containing two arrays is returned because the conditional array has two dimensions. The first array contains the row index of each element where the condition evaluates as true, and the second contains the corresponding column index for each row index in the first array.
np.where(a2d > 65)
(array([3, 3, 3, 4, 4, 4], dtype=int64), array([0, 1, 2, 0, 1, 2], dtype=int64))
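Since the row and column indices come back as two separate arrays, it can help to pair them up into (row, column) coordinates. The sketch below shows two ways to do that, one with zip and one with np.argwhere; the variable names are chosen for this example.

```python
import numpy as np

a2d = np.square(np.arange(15)).reshape((5, 3))

rows, cols = np.where(a2d > 65)

# Pair each row index with its matching column index.
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)

# np.argwhere returns the same coordinates as one (n_matches, 2) array.
coords = np.argwhere(a2d > 65)
print(coords.shape)
```

The tuple form from np.where is the one to use for indexing back into the array, while the paired form tends to be easier to read or loop over.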
Below we use the result of the
np.where call to retrieve the values of
a2d where the condition evaluates to true. In this case a 1D array is returned.
a2d[np.where(a2d > 65)]
array([ 81, 100, 121, 144, 169, 196], dtype=int32)
Now let’s add the value-if-true and value-if-false parameters to the
np.where call. This returns a 2D array with the same shape as the conditional array (
a2d). As we did with the 1D array example, the value-if-true and value-if-false parameters can also be arrays instead of single values.
np.where(a2d > 65, 1, 0)
array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [1, 1, 1],
       [1, 1, 1]])
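To illustrate passing arrays as the true and false values in the 2D case, here is a minimal sketch. The first call keeps the original values above 65 and fills the rest with -1. The second relies on numpy broadcasting to apply a per-column fill value; the col_fill array is a made-up example.

```python
import numpy as np

a2d = np.square(np.arange(15)).reshape((5, 3))

# Keep the values above 65 and replace everything else with -1.
kept = np.where(a2d > 65, a2d, -1)

# A shape (3,) array is broadcast across all 5 rows, so each column
# gets its own fill value where the condition is false.
col_fill = np.array([10, 20, 30])  # hypothetical per-column fill values
filled = np.where(a2d > 65, a2d, col_fill)

print(kept)
print(filled)
```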
The numpy.where function is very powerful and is a good fit whenever you need to apply if/else logic and conditional statements across numpy arrays. As you can see, it is quite simple to use. Once you get the hang of it you will be using it all over the place in no time.