9. Numpy: Boolean Indexing
By Bernd Klein. Last modified: 24 Mar 2022.
import numpy as np
A = np.array([4, 7, 3, 4, 2, 8])
print(A == 4)
OUTPUT:
[ True False False True False False]
Each element of the array A is tested for equality with 4. The results of these tests are the Boolean elements of the result array.
The other comparison operators "<", "<=", ">", ">=" and "!=" work in the same way.
print(A < 5)
OUTPUT:
[ True False True True True False]
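Several comparisons can be combined into one mask, which the solutions to the exercises further down rely on. The following is a minimal sketch using the element-wise operators "&" (and), "|" (or) and "~" (not). The variable names between, four_or_large and not_four are chosen for illustration only. The parentheses are needed because "&" and "|" bind more tightly than the comparison operators.

```python
import numpy as np

A = np.array([4, 7, 3, 4, 2, 8])

# Element-wise AND: values strictly between 2 and 8
between = (A > 2) & (A < 8)
# Element-wise OR: values equal to 4 or greater than 7
four_or_large = (A == 4) | (A > 7)
# Element-wise NOT: values that are not equal to 4
not_four = ~(A == 4)

print(between)
print(four_or_large)
print(not_four)
```

Python's keywords "and", "or" and "not" cannot be used here, because they try to evaluate the whole array as a single truth value.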
This also works for arrays with more than one dimension:
B = np.array([[42,56,89,65],
              [99,88,42,12],
              [55,42,17,18]])
print(B>=42)
OUTPUT:
[[ True  True  True  True]
 [ True  True  True False]
 [ True  True False False]]
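Boolean comparisons are a convenient way to threshold images: every pixel value is compared with a threshold, and the result is a two-valued mask.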
import numpy as np
A = np.array([
[12, 13, 14, 12, 16, 14, 11, 10, 9],
[11, 14, 12, 15, 15, 16, 10, 12, 11],
[10, 12, 12, 15, 14, 16, 10, 12, 12],
[ 9, 11, 16, 15, 14, 16, 15, 12, 10],
[12, 11, 16, 14, 10, 12, 16, 12, 13],
[10, 15, 16, 14, 14, 14, 16, 15, 12],
[13, 17, 14, 10, 14, 11, 14, 15, 10],
[10, 16, 12, 14, 11, 12, 14, 18, 11],
[10, 19, 12, 14, 11, 12, 14, 18, 10],
[14, 22, 17, 19, 16, 17, 18, 17, 13],
[10, 16, 12, 14, 11, 12, 14, 18, 11],
[10, 16, 12, 14, 11, 12, 14, 18, 11],
[10, 19, 12, 14, 11, 12, 14, 18, 10],
[14, 22, 12, 14, 11, 12, 14, 17, 13],
[10, 16, 12, 14, 11, 12, 14, 18, 11]])
B = A < 15
B.astype(np.int8)
OUTPUT:
array([[1, 1, 1, 1, 0, 1, 1, 1, 1],
       [1, 1, 1, 0, 0, 0, 1, 1, 1],
       [1, 1, 1, 0, 1, 0, 1, 1, 1],
       [1, 1, 0, 0, 1, 0, 0, 1, 1],
       [1, 1, 0, 1, 1, 1, 0, 1, 1],
       [1, 0, 0, 1, 1, 1, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 0, 0, 0, 0, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1]], dtype=int8)
If you have a close look at the previous output, you will see that the upper case letter 'A' is hidden in the array B: it is drawn by the zeros.
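The same idea can be applied to pixel data. The following is only a sketch: the array img is a made-up 3x4 grayscale "image", and the threshold of 128 is an assumption chosen for illustration. Pixels at or above the threshold become white (255), all others black (0).

```python
import numpy as np

# Hypothetical 3x4 grayscale "image" with values in the range 0..255
img = np.array([[ 12, 200,  90, 255],
                [130,  40, 128,  60],
                [  0, 129, 250,  15]], dtype=np.uint8)

threshold = 128  # assumed cut-off, chosen for illustration only

# Boolean array, True where the pixel is at least as bright as the threshold
mask = img >= threshold
# White for bright pixels, black for all others
binary = np.where(mask, 255, 0).astype(np.uint8)

print(binary)
```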
Fancy Indexing
In the following example, we will index an array C by using a Boolean mask. Indexing an array with a Boolean or an integer array is called fancy indexing. The result is a copy and not a view.
We use the Boolean mask computed from one array, A, to select the corresponding elements of another array, C. The new array R contains all the elements of C for which the corresponding value of (A<=5) is True.
C = np.array([123,188,190,99,77,88,100])
A = np.array([4,7,2,8,6,9,5])
R = C[A<=5]
print(R)
OUTPUT:
[123 190 100]
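The difference between a copy and a view can be checked directly. The following sketch compares fancy indexing with ordinary slicing. The name S for the slice is introduced here for illustration only.

```python
import numpy as np

C = np.array([123, 188, 190, 99, 77, 88, 100])
A = np.array([4, 7, 2, 8, 6, 9, 5])

R = C[A <= 5]      # fancy indexing: R is a new array (a copy)
R[0] = -1          # changing R ...
print(C[0])        # ... leaves C untouched

S = C[:3]          # basic slicing: S is a view on C
S[0] = -1          # changing S ...
print(C[0])        # ... changes C as well

# np.shares_memory reports whether two arrays use the same data
print(np.shares_memory(C, R), np.shares_memory(C, S))
```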
Indexing with an Integer Array
In the following example, we will index with an integer array:
C[[0, 2, 3, 1, 4, 1]]
OUTPUT:
array([123, 190, 99, 188, 77, 188])
Indices can appear in any order and more than once.
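One typical source of such an integer array is np.argsort, which returns the indices that would sort an array. The following short sketch uses it to reorder C. Negative indices are allowed as well and count from the end.

```python
import numpy as np

C = np.array([123, 188, 190, 99, 77, 88, 100])

order = np.argsort(C)    # integer indices that would sort C
print(order)
print(C[order])          # the elements of C in ascending order

last_and_first = C[[-1, 0]]   # negative indices count from the end
print(last_and_first)
```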
Exercises
Use Boolean masking to extract from the array np.array([3,4,6,10,24,89,45,43,46,99,100]) all the numbers
- which are not divisible by 3
- which are divisible by 5
- which are divisible by 3 and 5
- which are divisible by 3, and set them to 42
Solutions
import numpy as np
A = np.array([3,4,6,10,24,89,45,43,46,99,100])
not_div3 = A[A%3 != 0]
print("Elements of A not divisible by 3:")
print(not_div3)
div5 = A[A%5 == 0]
print("Elements of A divisible by 5:")
print(div5)
print("Elements of A, which are divisible by 3 and 5:")
print(A[(A%3 == 0) & (A%5 == 0)])
print("------------------")
# A Boolean mask can also be used on the left-hand side of an assignment:
A[A%3 == 0] = 42
print("""New values of A after setting the elements of A,
which are divisible by 3, to 42:""")
print(A)
OUTPUT:
Elements of A not divisible by 3:
[  4  10  89  43  46 100]
Elements of A divisible by 5:
[ 10  45 100]
Elements of A, which are divisible by 3 and 5:
[45]
------------------
New values of A after setting the elements of A,
which are divisible by 3, to 42:
[ 42   4  42  10  42  89  42  43  46  42 100]
nonzero and where
nonzero exists both as an ndarray method and as a NumPy function with the same name. The two are equivalent.
For an ndarray a, both numpy.nonzero(a) and a.nonzero() return the indices of the elements of a that are non-zero. The indices are returned as a tuple of arrays, one for each dimension of a. The corresponding non-zero values can be obtained with:
a[numpy.nonzero(a)]
import numpy as np
a = np.array([[0, 2, 3, 0, 1],
              [1, 0, 0, 7, 0],
              [5, 0, 0, 1, 0]])
print(a.nonzero())
OUTPUT:
(array([0, 0, 0, 1, 1, 2, 2]), array([1, 2, 4, 0, 3, 0, 3]))
If you want to group the indices by element, you can use transpose:
np.transpose(np.nonzero(a))
A two-dimensional array is returned, in which every row holds the indices of one non-zero element.
np.transpose(a.nonzero())
OUTPUT:
array([[0, 1],
       [0, 2],
       [0, 4],
       [1, 0],
       [1, 3],
       [2, 0],
       [2, 3]])
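NumPy also provides the function np.argwhere, which returns the indices grouped by element in a single call. It is not used elsewhere in this chapter and is shown here only as an alternative. The following sketch checks that it gives the same result as the transpose approach for our array a:

```python
import numpy as np

a = np.array([[0, 2, 3, 0, 1],
              [1, 0, 0, 7, 0],
              [5, 0, 0, 1, 0]])

# One (row, column) pair per non-zero element
pairs = np.argwhere(a)
# Compare with the transposed result of nonzero
same = np.array_equal(pairs, np.transpose(a.nonzero()))
print(same)
```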
The corresponding non-zero values can be retrieved with:
a[a.nonzero()]
OUTPUT:
array([2, 3, 1, 1, 7, 5, 1])
The function nonzero can also be used to obtain the indices at which a condition is True, because False counts as zero and True as non-zero. In the following script, we create the Boolean array B >= 42:
B = np.array([[42,56,89,65],
              [99,88,42,12],
              [55,42,17,18]])
print(B >= 42)
OUTPUT:
[[ True  True  True  True]
 [ True  True  True False]
 [ True  True False False]]
np.nonzero(B >= 42) yields the indices of B at which the condition is true.
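This can be sketched as follows. The names rows, cols, rows_w and cols_w are chosen for illustration. The example also uses np.where, which returns the same tuple of index arrays as np.nonzero when it is called with only a condition.

```python
import numpy as np

B = np.array([[42, 56, 89, 65],
              [99, 88, 42, 12],
              [55, 42, 17, 18]])

rows, cols = np.nonzero(B >= 42)
print(rows)   # row index of every element that fulfils the condition
print(cols)   # column index of every element that fulfils the condition

# np.where with a condition as its only argument gives the same indices
rows_w, cols_w = np.where(B >= 42)

# The index arrays select the same values as the Boolean mask itself
print(B[rows, cols])
```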
Exercise
Calculate the prime numbers between 0 and 100 by using a Boolean array.
Solution:
import numpy as np
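# is_prime[n] stays True as long as n has not been crossed out (n = 0 ... 99)
is_prime = np.ones((100,), dtype=bool)
# Cross out 0 and 1, which are not primes:
is_prime[:2] = False
# For every i up to the square root of the upper limit,
# cross out its higher multiples (sieve of Eratosthenes):
nmax = int(np.sqrt(len(is_prime)))
for i in range(2, nmax + 1):
    is_prime[2*i::i] = False
print(np.nonzero(is_prime))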
OUTPUT:
(array([ 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]),)
flatnonzero and count_nonzero
Two similar functions are:
- flatnonzero: returns the indices of the non-zero elements in the flattened version of the input array.
- count_nonzero: counts the number of non-zero elements in the input array.
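Both functions can be tried on the array a from the nonzero section. The following is a short sketch. The names flat, total, per_column and greater_than_2 are chosen for illustration.

```python
import numpy as np

a = np.array([[0, 2, 3, 0, 1],
              [1, 0, 0, 7, 0],
              [5, 0, 0, 1, 0]])

# Indices of the non-zero elements in the flattened array a.ravel()
flat = np.flatnonzero(a)
print(flat)

# Number of non-zero elements in the whole array
total = np.count_nonzero(a)
print(total)

# Number of non-zero elements in each column
per_column = np.count_nonzero(a, axis=0)
print(per_column)

# Applied to a Boolean array, count_nonzero counts the True values
greater_than_2 = np.count_nonzero(a > 2)
print(greater_than_2)
```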