Doc

Array iterator API

docs/reference:c-api:iterator

Array iterator

The array iterator encapsulates many of the key features in ufuncs, allowing user code to support features like output parameters, preservation of memory layouts, and buffering of data with the wrong alignment or type, without requiring difficult coding.

This page documents the API for the iterator. The iterator is named NpyIter and functions are named NpyIter_*.

There is an introductory guide to array iteration which may be of interest for those using this C API. In many instances, testing out ideas by creating the iterator in Python is a good idea before writing the C iteration code.

Iteration example

The best way to become familiar with the iterator is to look at its usage within the NumPy codebase itself. For example, here is a slightly tweaked version of the code for PyArray_CountNonzero, which counts the number of non-zero elements in an array.

npy_intp PyArray_CountNonzero(PyArrayObject* self)
{
    /* Nonzero boolean function */
    PyArray_NonzeroFunc* nonzero = PyArray_DESCR(self)->f->nonzero;

    NpyIter* iter;
    NpyIter_IterNextFunc *iternext;
    char** dataptr;
    npy_intp nonzero_count;
    npy_intp* strideptr,* innersizeptr;

    /* Handle zero-sized arrays specially */
    if (PyArray_SIZE(self) == 0) {
        return 0;
    }

    /*
     * Create and use an iterator to count the nonzeros.
     *   flag NPY_ITER_READONLY
     *     - The array is never written to.
     *   flag NPY_ITER_EXTERNAL_LOOP
     *     - Inner loop is done outside the iterator for efficiency.
     *   flag NPY_ITER_NPY_ITER_REFS_OK
     *     - Reference types are acceptable.
     *   order NPY_KEEPORDER
     *     - Visit elements in memory order, regardless of strides.
     *       This is good for performance when the specific order
     *       elements are visited is unimportant.
     *   casting NPY_NO_CASTING
     *     - No casting is required for this operation.
     */
    iter = NpyIter_New(self, NPY_ITER_READONLY|
                             NPY_ITER_EXTERNAL_LOOP|
                             NPY_ITER_REFS_OK,
                        NPY_KEEPORDER, NPY_NO_CASTING,
                        NULL);
    if (iter == NULL) {
        return -1;
    }

    /*
     * The iternext function gets stored in a local variable
     * so it can be called repeatedly in an efficient manner.
     */
    iternext = NpyIter_GetIterNext(iter, NULL);
    if (iternext == NULL) {
        NpyIter_Deallocate(iter);
        return -1;
    }
    /* The location of the data pointer which the iterator may update */
    dataptr = NpyIter_GetDataPtrArray(iter);
    /* The location of the stride which the iterator may update */
    strideptr = NpyIter_GetInnerStrideArray(iter);
    /* The location of the inner loop size which the iterator may update */
    innersizeptr = NpyIter_GetInnerLoopSizePtr(iter);

    nonzero_count = 0;
    do {
        /* Get the inner loop data/stride/count values */
        char* data = *dataptr;
        npy_intp stride = *strideptr;
        npy_intp count = *innersizeptr;

        /* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */
        while (count--) {
            if (nonzero(data, self)) {
                ++nonzero_count;
            }
            data += stride;
        }

        /* Increment the iterator to the next inner loop */
    } while(iternext(iter));

    NpyIter_Deallocate(iter);

    return nonzero_count;
}

Multi-iteration example

Here is a copy function using the iterator. The order parameter is used to control the memory layout of the allocated result, typically NPY_KEEPORDER is desired.

PyObject *CopyArray(PyObject *arr, NPY_ORDER order)
{
    NpyIter *iter;
    NpyIter_IterNextFunc *iternext;
    PyObject *op[2], *ret;
    npy_uint32 flags;
    npy_uint32 op_flags[2];
    npy_intp itemsize, *innersizeptr, innerstride;
    char **dataptrarray;

    /*
     * No inner iteration - inner loop is handled by CopyArray code
     */
    flags = NPY_ITER_EXTERNAL_LOOP;
    /*
     * Tell the constructor to automatically allocate the output.
     * The data type of the output will match that of the input.
     */
    op[0] = arr;
    op[1] = NULL;
    op_flags[0] = NPY_ITER_READONLY;
    op_flags[1] = NPY_ITER_WRITEONLY | NPY_ITER_ALLOCATE;

    /* Construct the iterator */
    iter = NpyIter_MultiNew(2, op, flags, order, NPY_NO_CASTING,
                            op_flags, NULL);
    if (iter == NULL) {
        return NULL;
    }

    /*
     * Make a copy of the iternext function pointer and
     * a few other variables the inner loop needs.
     */
    iternext = NpyIter_GetIterNext(iter, NULL);
    innerstride = NpyIter_GetInnerStrideArray(iter)[0];
    itemsize = NpyIter_GetDescrArray(iter)[0]->elsize;
    /*
     * The inner loop size and data pointers may change during the
     * loop, so just cache the addresses.
     */
    innersizeptr = NpyIter_GetInnerLoopSizePtr(iter);
    dataptrarray = NpyIter_GetDataPtrArray(iter);

    /*
     * Note that because the iterator allocated the output,
     * it matches the iteration order and is packed tightly,
     * so we don't need to check it like the input.
     */
    if (innerstride == itemsize) {
        do {
            memcpy(dataptrarray[1], dataptrarray[0],
                                    itemsize * (*innersizeptr));
        } while (iternext(iter));
    } else {
        /* For efficiency, should specialize this based on item size... */
        npy_intp i;
        do {
            npy_intp size = *innersizeptr;
            char *src = dataptrarray[0], *dst = dataptrarray[1];
            for(i = 0; i < size; i++, src += innerstride, dst += itemsize) {
                memcpy(dst, src, itemsize);
            }
        } while (iternext(iter));
    }

    /* Get the result from the iterator object array */
    ret = NpyIter_GetOperandArray(iter)[1];
    Py_INCREF(ret);

    if (NpyIter_Deallocate(iter) != NPY_SUCCEED) {
        Py_DECREF(ret);
        return NULL;
    }

    return ret;
}

Multi index tracking example

This example shows you how to work with the NPY_ITER_MULTI_INDEX flag. For simplicity, we assume the argument is a two-dimensional array.

int PrintMultiIndex(PyArrayObject *arr) {
    NpyIter *iter;
    NpyIter_IterNextFunc *iternext;
    npy_intp multi_index[2];

    iter = NpyIter_New(
        arr, NPY_ITER_READONLY | NPY_ITER_MULTI_INDEX | NPY_ITER_REFS_OK,
        NPY_KEEPORDER, NPY_NO_CASTING, NULL);
    if (iter == NULL) {
        return -1;
    }
    if (NpyIter_GetNDim(iter) != 2) {
        NpyIter_Deallocate(iter);
        PyErr_SetString(PyExc_ValueError, "Array must be 2-D");
        return -1;
    }
    if (NpyIter_GetIterSize(iter) != 0) {
        iternext = NpyIter_GetIterNext(iter, NULL);
        if (iternext == NULL) {
            NpyIter_Deallocate(iter);
            return -1;
        }
        NpyIter_GetMultiIndexFunc *get_multi_index =
            NpyIter_GetGetMultiIndex(iter, NULL);
        if (get_multi_index == NULL) {
            NpyIter_Deallocate(iter);
            return -1;
        }

        do {
            get_multi_index(iter, multi_index);
            printf("multi_index is [%" NPY_INTP_FMT ", %" NPY_INTP_FMT "]\n",
                   multi_index[0], multi_index[1]);
        } while (iternext(iter));
    }
    if (!NpyIter_Deallocate(iter)) {
        return -1;
    }
    return 0;
}

When called with a 2x3 array, the above example prints:

multi_index is [0, 0]
multi_index is [0, 1]
multi_index is [0, 2]
multi_index is [1, 0]
multi_index is [1, 1]
multi_index is [1, 2]

Iterator data types

The iterator layout is an internal detail, and user code only sees an incomplete struct.

Construction and destruction

Creates an iterator for the given numpy array object op.
Flags that may be passed in flags are any combination of the global and per-operand flags documented in NpyIter_MultiNew, except for NPY_ITER_ALLOCATE.
Any of the NPY_ORDER enum values may be passed to order. For efficient iteration, NPY_KEEPORDER is the best option, and the other orders enforce the particular iteration pattern.
Any of the NPY_CASTING enum values may be passed to casting. The values include NPY_NO_CASTING, NPY_EQUIV_CASTING, NPY_SAFE_CASTING, NPY_SAME_KIND_CASTING, and NPY_UNSAFE_CASTING. To allow the casts to occur, copying or buffering must also be enabled.
If dtype isn't NULL, then it requires that data type. If copying is allowed, it will make a temporary copy if the data is castable. If NPY_ITER_UPDATEIFCOPY is enabled, it will also copy the data back with another cast upon iterator destruction.
Returns NULL if there is an error, otherwise returns the allocated iterator.
To make an iterator similar to the old iterator, this should work.
iter = NpyIter_New(op, NPY_ITER_READWRITE,
                        NPY_CORDER, NPY_NO_CASTING, NULL);
If you want to edit an array with aligned double code, but the order doesn't matter, you would use this.
dtype = PyArray_DescrFromType(NPY_DOUBLE);
    iter = NpyIter_New(op, NPY_ITER_READWRITE|
                        NPY_ITER_BUFFERED|
                        NPY_ITER_NBO|
                        NPY_ITER_ALIGNED,
                        NPY_KEEPORDER,
                        NPY_SAME_KIND_CASTING,
                        dtype);
    Py_DECREF(dtype);

Creates an iterator for broadcasting the nop array objects provided in op, using regular NumPy broadcasting rules.
Any of the NPY_ORDER enum values may be passed to order. For efficient iteration, NPY_KEEPORDER is the best option, and the other orders enforce the particular iteration pattern. When using NPY_KEEPORDER, if you also want to ensure that the iteration is not reversed along an axis, you should pass the flag NPY_ITER_DONT_NEGATE_STRIDES.
Any of the NPY_CASTING enum values may be passed to casting. The values include NPY_NO_CASTING, NPY_EQUIV_CASTING, NPY_SAFE_CASTING, NPY_SAME_KIND_CASTING, and NPY_UNSAFE_CASTING. To allow the casts to occur, copying or buffering must also be enabled.
If op_dtypes isn't NULL, it specifies a data type or NULL for each op[i].
Returns NULL if there is an error, otherwise returns the allocated iterator.
Flags that may be passed in flags, applying to the whole iterator, are:

Extends NpyIter_MultiNew with several advanced options providing more control over broadcasting and buffering.
If -1/NULL values are passed to oa_ndim, op_axes, itershape, and buffersize, it is equivalent to NpyIter_MultiNew.
The parameter oa_ndim, when not zero or -1, specifies the number of dimensions that will be iterated with customized broadcasting. If it is provided, op_axes must and itershape can also be provided. The op_axes parameter let you control in detail how the axes of the operand arrays get matched together and iterated. In op_axes, you must provide an array of nop pointers to oa_ndim-sized arrays of type npy_intp. If an entry in op_axes is NULL, normal broadcasting rules will apply. In op_axes[j][i] is stored either a valid axis of op[j], or -1 which means newaxis. Within each op_axes[j] array, axes may not be repeated. The following example is how normal broadcasting applies to a 3-D array, a 2-D array, a 1-D array and a scalar.
Note: Before NumPy 1.8 oa_ndim == 0 was used for signalling that op_axes and itershape are unused. This is deprecated and should be replaced with -1. Better backward compatibility may be achieved by using NpyIter_MultiNew for this case.
int oa_ndim = 3;               /* # iteration axes */
    int op0_axes[] = {0, 1, 2};    /* 3-D operand */
    int op1_axes[] = {-1, 0, 1};   /* 2-D operand */
    int op2_axes[] = {-1, -1, 0};  /* 1-D operand */
    int op3_axes[] = {-1, -1, -1}  /* 0-D (scalar) operand */
    int* op_axes[] = {op0_axes, op1_axes, op2_axes, op3_axes};
The itershape parameter allows you to force the iterator to have a specific iteration shape. It is an array of length oa_ndim. When an entry is negative, its value is determined from the operands. This parameter allows automatically allocated outputs to get additional dimensions which don't match up with any dimension of an input.
If buffersize is zero, a default buffer size is used, otherwise it specifies how big of a buffer to use. Buffers which are powers of 2 such as 4096 or 8192 are recommended.
Returns NULL if there is an error, otherwise returns the allocated iterator.

Resets the iterator and restricts it to the iterindex range [istart, iend). See NpyIter_Copy for an explanation of how to use this for multi-threaded iteration. This requires that the flag NPY_ITER_RANGED was passed to the iterator constructor.
If you want to reset both the iterindex range and the base pointers at the same time, you can do the following to avoid extra buffer copying (be sure to add the return code error checks when you copy this code).
/* Set to a trivial empty range */
    NpyIter_ResetToIterIndexRange(iter, 0, 0);
    /* Set the base pointers */
    NpyIter_ResetBasePointers(iter, baseptrs);
    /* Set to the desired range */
    NpyIter_ResetToIterIndexRange(iter, istart, iend);
Returns NPY_SUCCEED or NPY_FAIL. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.

Resets the iterator back to its initial state, but using the values in baseptrs for the data instead of the pointers from the arrays being iterated. This functions is intended to be used, together with the op_axes parameter, by nested iteration code with two or more iterators.
Returns NPY_SUCCEED or NPY_FAIL. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.
TODO: Move the following into a special section on nested iterators.
Creating iterators for nested iteration requires some care. All the iterator operands must match exactly, or the calls to NpyIter_ResetBasePointers will be invalid. This means that automatic copies and output allocation should not be used haphazardly. It is possible to still use the automatic data conversion and casting features of the iterator by creating one of the iterators with all the conversion parameters enabled, then grabbing the allocated operands with the NpyIter_GetOperandArray function and passing them into the constructors for the rest of the iterators.
WARNING: When creating iterators for nested iteration, the code must not use a dimension more than once in the different iterators. If this is done, nested iteration will produce out-of-bounds pointers during iteration.
WARNING: When creating iterators for nested iteration, buffering can only be applied to the innermost iterator. If a buffered iterator is used as the source for baseptrs, it will point into a small buffer instead of the array and the inner iteration will be invalid.
The pattern for using nested iterators is as follows.
NpyIter *iter1, *iter1;
    NpyIter_IterNextFunc *iternext1, *iternext2;
    char **dataptrs1;

    /*
     * With the exact same operands, no copies allowed, and
     * no axis in op_axes used both in iter1 and iter2.
     * Buffering may be enabled for iter2, but not for iter1.
     */
    iter1 = ...; iter2 = ...;

    iternext1 = NpyIter_GetIterNext(iter1);
    iternext2 = NpyIter_GetIterNext(iter2);
    dataptrs1 = NpyIter_GetDataPtrArray(iter1);

    do {
        NpyIter_ResetBasePointers(iter2, dataptrs1);
        do {
            /* Use the iter2 values */
        } while (iternext2(iter2));
    } while (iternext1(iter1));

Gets the iterindex sub-range that is being iterated. If NPY_ITER_RANGED was not specified, this always returns the range [0, NpyIter_IterSize(iter)).

Builds a set of strides which are the same as the strides of an output array created using the NPY_ITER_ALLOCATE flag, where NULL was passed for op_axes. This is for data packed contiguously, but not necessarily in C or Fortran order. This should be used together with NpyIter_GetShape and NpyIter_GetNDim with the flag NPY_ITER_MULTI_INDEX passed into the constructor.
A use case for this function is to match the shape and layout of the iterator and tack on one or more dimensions. For example, in order to generate a vector per input value for a numerical gradient, you pass in ndim*itemsize for itemsize, then add another dimension to the end with size ndim and stride itemsize. To do the Hessian matrix, you do the same thing but add two dimensions, or take advantage of the symmetry and pack it into 1 dimension with a particular encoding.
This function may only be called if the iterator is tracking a multi-index and if NPY_ITER_DONT_NEGATE_STRIDES was used to prevent an axis from being iterated in reverse order.
If an array is created with this method, simply adding 'itemsize' for each iteration will traverse the new array matching the iterator.
Returns NPY_SUCCEED or NPY_FAIL.

Functions for iteration

Returns a function pointer for iteration. A specialized version of the function pointer may be calculated by this function instead of being stored in the iterator structure. Thus, to get good performance, it is required that the function pointer be saved in a variable rather than retrieved for each loop iteration.
Returns NULL if there is an error. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.
The typical looping construct is as follows.
NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
    char** dataptr = NpyIter_GetDataPtrArray(iter);

    do {
        /* use the addresses dataptr[0], ... dataptr[nop-1] */
    } while(iternext(iter));
When NPY_ITER_EXTERNAL_LOOP is specified, the typical inner loop construct is as follows.
NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
    char** dataptr = NpyIter_GetDataPtrArray(iter);
    npy_intp* stride = NpyIter_GetInnerStrideArray(iter);
    npy_intp* size_ptr = NpyIter_GetInnerLoopSizePtr(iter), size;
    npy_intp iop, nop = NpyIter_GetNOp(iter);

    do {
        size = *size_ptr;
        while (size--) {
            /* use the addresses dataptr[0], ... dataptr[nop-1] */
            for (iop = 0; iop < nop; ++iop) {
                dataptr[iop] += stride[iop];
            }
        }
    } while (iternext());
Observe that we are using the dataptr array inside the iterator, not copying the values to a local temporary. This is possible because when iternext() is called, these pointers will be overwritten with fresh values, not incrementally updated.
If a compile-time fixed buffer is being used (both flags NPY_ITER_BUFFERED and NPY_ITER_EXTERNAL_LOOP), the inner size may be used as a signal as well. The size is guaranteed to become zero when iternext() returns false, enabling the following loop construct. Note that if you use this construct, you should not pass NPY_ITER_GROWINNER as a flag, because it will cause larger sizes under some circumstances.
/* The constructor should have buffersize passed as this value */
    #define FIXED_BUFFER_SIZE 1024

    NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
    char **dataptr = NpyIter_GetDataPtrArray(iter);
    npy_intp *stride = NpyIter_GetInnerStrideArray(iter);
    npy_intp *size_ptr = NpyIter_GetInnerLoopSizePtr(iter), size;
    npy_intp i, iop, nop = NpyIter_GetNOp(iter);

    /* One loop with a fixed inner size */
    size = *size_ptr;
    while (size == FIXED_BUFFER_SIZE) {
        /*
         * This loop could be manually unrolled by a factor
         * which divides into FIXED_BUFFER_SIZE
         */
        for (i = 0; i < FIXED_BUFFER_SIZE; ++i) {
            /* use the addresses dataptr[0], ... dataptr[nop-1] */
            for (iop = 0; iop < nop; ++iop) {
                dataptr[iop] += stride[iop];
            }
        }
        iternext();
        size = *size_ptr;
    }

    /* Finish-up loop with variable inner size */
    if (size > 0) do {
        size = *size_ptr;
        while (size--) {
            /* use the addresses dataptr[0], ... dataptr[nop-1] */
            for (iop = 0; iop < nop; ++iop) {
                dataptr[iop] += stride[iop];
            }
        }
    } while (iternext());

Returns a function pointer for getting the current multi-index of the iterator. Returns NULL if the iterator is not tracking a multi-index. It is recommended that this function pointer be cached in a local variable before the iteration loop.
Returns NULL if there is an error. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.

When the flag NPY_ITER_EXTERNAL_LOOP is used, the code needs to know the parameters for doing the inner loop. These functions provide that information.

Gets an array of strides which are fixed, or will not change during the entire iteration. For strides that may change, the value NPY_MAX_INTP is placed in the stride.
Once the iterator is prepared for iteration (after a reset if NPY_ITER_DELAY_BUFALLOC was used), call this to get the strides which may be used to select a fast inner loop function. For example, if the stride is 0, that means the inner loop can always load its value into a variable once, then use the variable throughout the loop, or if the stride equals the itemsize, a contiguous version for that operand may be used.
This function may be safely called without holding the Python GIL.

Converting from previous NumPy iterators

The old iterator API includes functions like PyArrayIter_Check, PyArray_Iter* and PyArray_ITER_*. The multi-iterator array includes PyArray_MultiIter*, PyArray_Broadcast, and PyArray_RemoveSmallest. The new iterator design replaces all of this functionality with a single object and associated API. One goal of the new API is that all uses of the existing iterator should be replaceable with the new iterator without significant effort. In 1.6, the major exception to this is the neighborhood iterator, which does not have corresponding features in this iterator.

Here is a conversion table for which functions to use with the new iterator:

=====================================  ===================================================
*Iterator Functions*
:c:func:`PyArray_IterNew`              :c:func:`NpyIter_New`
:c:func:`PyArray_IterAllButAxis`       :c:func:`NpyIter_New` + ``axes`` parameter **or**
                                       Iterator flag :c:data:`NPY_ITER_EXTERNAL_LOOP`
:c:func:`PyArray_BroadcastToShape`     **NOT SUPPORTED** (Use the support for
                                       multiple operands instead.)
:c:func:`PyArrayIter_Check`            Will need to add this in Python exposure
:c:func:`PyArray_ITER_RESET`           :c:func:`NpyIter_Reset`
:c:func:`PyArray_ITER_NEXT`            Function pointer from :c:func:`NpyIter_GetIterNext`
:c:func:`PyArray_ITER_DATA`            :c:func:`NpyIter_GetDataPtrArray`
:c:func:`PyArray_ITER_GOTO`            :c:func:`NpyIter_GotoMultiIndex`
:c:func:`PyArray_ITER_GOTO1D`          :c:func:`NpyIter_GotoIndex` or
                                       :c:func:`NpyIter_GotoIterIndex`
:c:func:`PyArray_ITER_NOTDONE`         Return value of ``iternext`` function pointer
*Multi-iterator Functions*
:c:func:`PyArray_MultiIterNew`         :c:func:`NpyIter_MultiNew`
:c:func:`PyArray_MultiIter_RESET`      :c:func:`NpyIter_Reset`
:c:func:`PyArray_MultiIter_NEXT`       Function pointer from :c:func:`NpyIter_GetIterNext`
:c:func:`PyArray_MultiIter_DATA`       :c:func:`NpyIter_GetDataPtrArray`
:c:func:`PyArray_MultiIter_NEXTi`      **NOT SUPPORTED** (always lock-step iteration)
:c:func:`PyArray_MultiIter_GOTO`       :c:func:`NpyIter_GotoMultiIndex`
:c:func:`PyArray_MultiIter_GOTO1D`     :c:func:`NpyIter_GotoIndex` or
                                       :c:func:`NpyIter_GotoIterIndex`
:c:func:`PyArray_MultiIter_NOTDONE`    Return value of ``iternext`` function pointer
:c:func:`PyArray_Broadcast`            Handled by :c:func:`NpyIter_MultiNew`
:c:func:`PyArray_RemoveSmallest`       Iterator flag :c:data:`NPY_ITER_EXTERNAL_LOOP`
*Other Functions*
:c:func:`PyArray_ConvertToCommonType`  Iterator flag :c:data:`NPY_ITER_COMMON_DTYPE`
=====================================  ===================================================