bundles / numpy 2.4.4 / numpy / isin

_ArrayFunctionDispatcher

numpy:isin

source: /numpy/lib/_arraysetops_impl.py :958

Signature

def isin(element, test_elements, assume_unique=False, invert=False, *, kind=None)

Summary

Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.

Parameters

element : array_like

Input array.

test_elements : array_like

The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.
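As a small illustration of the flattening described above, the sketch below passes a 2-D test_elements and checks that the result matches the flattened version and keeps the shape of element. The array values are arbitrary and chosen only for the example.

```python
import numpy as np

element = np.arange(6).reshape(2, 3)         # [[0, 1, 2], [3, 4, 5]]
test_elements = np.array([[0, 5], [9, 3]])   # 2-D; only its values matter

mask = np.isin(element, test_elements)
flat_mask = np.isin(element, test_elements.ravel())

# The result follows the shape of element, not of test_elements.
print(mask.shape)
print(np.array_equal(mask, flat_mask))
```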

assume_unique : bool, optional

If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.

invert : bool, optional

If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False. np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).
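The equivalence stated above can be checked on a small input. This is only a sanity-check sketch with made-up values; it does not measure the speed difference.

```python
import numpy as np

a = np.array([0, 1, 2, 5, 0])
b = [0, 2]

direct = np.isin(a, b, invert=True)    # "not in" computed in one call
negated = np.invert(np.isin(a, b))     # same result in two steps

print(direct)
print(np.array_equal(direct, negated))
```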

kind : {None, 'sort', 'table'}, optional

The algorithm to use. This will not affect the final result, but will affect the speed and memory use. The default, None, will select automatically based on memory considerations.

  • If 'sort', will use a mergesort-based approach. This will have a memory usage of roughly 6 times the sum of the sizes of element and test_elements, not accounting for size of dtypes.

  • If 'table', will use a lookup table approach similar to a counting sort. This is only available for boolean and integer arrays. This will have a memory usage of roughly the size of element plus the range (maximum minus minimum) of the values in test_elements. assume_unique has no effect when the 'table' option is used.

  • If None, will automatically choose 'table' if the required memory allocation is less than or equal to 6 times the sum of the sizes of element and test_elements, otherwise will use 'sort'. This avoids a large memory allocation by default, even though 'table' may be faster in most cases. If 'table' is chosen, assume_unique will have no effect.

Returns

isin : ndarray, bool

Has the same shape as element. The values element[isin] are in test_elements.

Notes

isin is an element-wise function version of the Python keyword in. For 1-D sequences a and b, isin(a, b) is roughly equivalent to np.array([item in b for item in a]).
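The rough equivalence for 1-D sequences can be seen side by side in the following sketch, which uses arbitrary example lists:

```python
import numpy as np

a = [3, 7, 1, 9]
b = [1, 3, 5]

vectorized = np.isin(a, b)
looped = np.array([item in b for item in a])  # plain Python membership test per item

print(vectorized)
print(np.array_equal(vectorized, looped))
```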

element and test_elements are converted to arrays if they are not already. If test_elements is a set (or other non-sequence collection) it will be converted to an object array with one element, rather than an array of the values contained in test_elements. This is a consequence of the array constructor's way of handling non-sequence collections. Converting the set to a list usually gives the desired behavior.
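To see why a set behaves this way, it can help to inspect what the array constructor produces from it. The sketch below uses an arbitrary example set and compares it with the list conversion.

```python
import numpy as np

test_set = {1, 2, 4, 8}

as_array = np.asarray(test_set)         # the whole set becomes a single object
from_list = np.asarray(list(test_set))  # one array entry per value

print(as_array.dtype, as_array.shape)   # object dtype, zero-dimensional
print(from_list.shape)

element = np.array([0, 2, 4, 6])
wrong = np.isin(element, test_set)        # compares against the set itself
right = np.isin(element, list(test_set))  # compares against the contained values
```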

Using kind='table' tends to be faster than kind='sort' if the following relationship is true: log10(len(test_elements)) > (log10(max(test_elements)-min(test_elements)) - 2.27) / 0.927, but may use greater memory. The default value for kind will be automatically selected based only on memory usage, so one may manually set kind='table' if memory constraints can be relaxed.
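The rule of thumb above can be wrapped in a small helper for deciding whether to request kind='table' explicitly. The function name table_likely_faster is hypothetical and not part of NumPy, and its handling of empty input and a zero value range is an assumption made for this sketch.

```python
import math
import numpy as np

def table_likely_faster(test_elements):
    """Evaluate the documented heuristic (hypothetical helper, not a NumPy API)."""
    values = np.asarray(test_elements)
    if values.size == 0:
        raise ValueError("test_elements must not be empty")
    value_range = int(values.max()) - int(values.min())
    if value_range == 0:
        # Assumption: log10(0) is -inf, so the inequality holds trivially.
        return True
    lhs = math.log10(values.size)
    rhs = (math.log10(value_range) - 2.27) / 0.927
    return lhs > rhs

dense = np.arange(1000)          # many values packed into a small range
sparse = np.array([0, 10**9])    # few values spread over a huge range

print(table_likely_faster(dense))
print(table_likely_faster(sparse))
```

The heuristic only speaks to speed; a large value range also means a large lookup table, so memory should be checked separately before forcing kind='table'.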

Examples

import numpy as np
element = 2*np.arange(4).reshape((2, 2))
element
test_elements = [1, 2, 4, 8]
mask = np.isin(element, test_elements)
mask
element[mask]

The indices of the matched values can be obtained with `nonzero`:

np.nonzero(mask)

The test can also be inverted:

mask = np.isin(element, test_elements, invert=True)
mask
element[mask]

Because of how `array` handles sets, the following does not work as expected:

test_set = {1, 2, 4, 8}
np.isin(element, test_set)

Casting the set to a list gives the expected result:

np.isin(element, list(test_set))

Aliases

  • numpy.isin