Python types and C-structures

Several new types are defined in the C-code. Most of these are accessible from Python, but a few are not exposed due to their limited use. Every new Python type has an associated PyObject * with an internal structure that includes a pointer to a "method table" that defines how the new object behaves in Python. When you receive a Python object into C code, you always get a pointer to a PyObject structure. Because a PyObject structure is very generic and defines only PyObject_HEAD, by itself it is not very interesting. However, different objects contain more details after the PyObject_HEAD (but you have to cast to the correct type to access them, or use accessor functions or macros).

New Python types defined

Python types are the functional equivalent in C of classes in Python. By constructing a new Python type you make available a new object for Python. The ndarray object is an example of a new type defined in C. New types are defined in C by two basic steps:

  • creating a C-structure (usually named Py{Name}Object) that is binary-compatible with the PyObject structure itself but holds the additional information needed for that particular object;

  • populating the PyTypeObject table (pointed to by the ob_type member of the PyObject structure) with pointers to functions that implement the desired behavior for the type.

Instead of the special method names that define behavior for Python classes, there are "function tables" that point to functions implementing the desired results. The PyTypeObject itself is dynamic, which allows C types to be "sub-typed" from other C types in C and subclassed in Python. Child types inherit the attributes and methods of their parent(s).
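The two steps above can be sketched in plain C without the Python headers. Everything in this block (MockObject, MockTypeObject, MockTextObject, generic_length) is a hypothetical stand-in chosen for illustration. It is not CPython's PyObject or PyTypeObject, which carry many more slots, but it shows the same layout idea: a common header placed first, extra fields after it, and behavior looked up through a table reached from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock stand-ins for illustration only; real extensions use Python.h. */
typedef struct MockObject MockObject;

typedef struct MockTypeObject {
    const char *name;
    /* "Function table" slot: how to compute the object's length. */
    size_t (*length)(MockObject *self);
} MockTypeObject;

/* Plays the role of PyObject: it holds only the common header. */
struct MockObject {
    size_t refcnt;
    const MockTypeObject *type;
};

/* Step 1: a struct that starts with the common header, then adds fields. */
typedef struct MockTextObject {
    MockObject head;      /* must come first for binary compatibility */
    const char *text;
} MockTextObject;

/* Step 2: implement the behavior and register it in the type table. */
static size_t text_length(MockObject *self)
{
    /* Cast from the generic pointer to the concrete layout. */
    MockTextObject *t = (MockTextObject *)self;
    return strlen(t->text);
}

static const MockTypeObject MockText_Type = { "text", text_length };

/* Generic code sees only MockObject * and dispatches through the table. */
static size_t generic_length(MockObject *obj)
{
    assert(obj != NULL && obj->type != NULL);
    return obj->type->length(obj);
}
```

Because the header is the first member, a pointer to a MockTextObject can be passed wherever a MockObject * is expected, which is the property the text calls binary compatibility.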

There are two major new types: the ndarray (PyArray_Type) and the ufunc (PyUFunc_Type). Additional types play a supportive role: the PyArrayIter_Type, the PyArrayMultiIter_Type, and the PyArrayDescr_Type. The PyArrayIter_Type is the type for a flat iterator for an ndarray (the object that is returned when getting the flat attribute). The PyArrayMultiIter_Type is the type of the object returned when calling broadcast. It handles iteration and broadcasting over a collection of nested sequences. Also, the PyArrayDescr_Type is the data-type-descriptor type whose instances describe the data and PyArray_DTypeMeta is the metaclass for data-type descriptors. There are also new scalar-array types which are new Python scalars corresponding to each of the fundamental data types available for arrays. Additional types are placeholders that allow the array scalars to fit into a hierarchy of actual Python types. Finally, the PyArray_DTypeMeta instances corresponding to the NumPy built-in data types are also publicly visible.

PyArray_Type and PyArrayObject

The PyArrayObject C-structure contains all of the required information for an array. All instances of an ndarray (and its subclasses) will have this structure. For future compatibility, these structure members should normally be accessed using the provided macros. If you need a shorter name, you can use NPY_AO (deprecated), which is defined to be equivalent to PyArrayObject. Direct access to the struct fields is deprecated; use the PyArray_*(arr) form instead. As of NumPy 1.20, the size of this struct is not considered part of the NumPy ABI (see note at the end of the member list).

typedef struct PyArrayObject {
    PyObject_HEAD
    char *data;
    int nd;
    npy_intp *dimensions;
    npy_intp *strides;
    PyObject *base;
    PyArray_Descr *descr;
    int flags;
    PyObject *weakreflist;
    /* version dependent private members */
} PyArrayObject;

PyObject_HEAD

This is needed by all Python objects. It consists of (at least) a reference count member (ob_refcnt) and a pointer to the typeobject (ob_type). Other elements may also be present if Python was compiled with special options; see Include/object.h in the Python source tree for more information. The ob_type member points to a Python type object.

The PyArray_Type typeobject implements many of the features of Python objects (PyTypeObject), including the tp_as_number, tp_as_sequence, tp_as_mapping, and tp_as_buffer interfaces. Rich comparison (richcmpfunc) is also used, along with new-style attribute lookup for members (tp_members) and properties (tp_getset). The PyArray_Type can also be sub-typed.
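As an illustration of how the data, nd, dimensions, and strides members of the structure above relate to one another, the following sketch locates an element from a multi-index by adding index times stride (in bytes) for each dimension. MockArray, mock_getptr, and mock_get_double are hypothetical names, and ptrdiff_t stands in for npy_intp so that the block compiles without the NumPy headers. In real extension code the values would be read through accessor macros such as PyArray_DATA, PyArray_NDIM, PyArray_DIMS, and PyArray_STRIDES rather than from the struct fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock holding only the layout-related members.
   It is not NumPy's PyArrayObject; ptrdiff_t stands in for npy_intp. */
typedef struct MockArray {
    char *data;              /* first byte of the array data */
    int nd;                  /* number of dimensions */
    ptrdiff_t *dimensions;   /* extent of each dimension */
    ptrdiff_t *strides;      /* bytes to step per dimension */
} MockArray;

/* Return a pointer to the element at the given multi-index. */
static void *mock_getptr(const MockArray *arr, const ptrdiff_t *index)
{
    char *p = arr->data;
    for (int i = 0; i < arr->nd; i++) {
        assert(index[i] >= 0 && index[i] < arr->dimensions[i]);
        p += index[i] * arr->strides[i];
    }
    return p;
}

/* Convenience wrapper for arrays whose elements are C doubles. */
static double mock_get_double(const MockArray *arr, const ptrdiff_t *index)
{
    return *(double *)mock_getptr(arr, index);
}
```

The same buffer can be viewed as its transpose simply by swapping the entries of dimensions and strides, with no data copied.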

PyGenericArrType_Type

PyArrayDescr_Type and PyArray_Descr

PyArray_ArrFuncs

PyArrayMethod_Context and PyArrayMethod_Spec

PyArray_DTypeMeta and PyArrayDTypeMeta_Spec

Exposed DTypes classes (PyArray_DTypeMeta objects)

For use with promoters, NumPy exposes a number of DTypes following the pattern PyArray_<Name>DType corresponding to those found in np.dtypes.

Additionally, the three DTypes PyArray_PyLongDType, PyArray_PyFloatDType, and PyArray_PyComplexDType correspond to the Python scalar values. These cannot be used in all places, but they do allow, for example, the common dtype operation, and implementing promotion with them may be necessary.

Further, the following abstract DTypes are defined, which cover both the built-in NumPy ones and the Python ones. Users can in principle subclass from them (this does not inherit any DType-specific functionality):

  • PyArray_IntAbstractDType

  • PyArray_FloatAbstractDType

  • PyArray_ComplexAbstractDType

PyUFunc_Type and PyUFuncObject

PyArrayIter_Type and PyArrayIterObject

How to use an array iterator at the C level is explained more fully in later sections. Typically, you do not need to concern yourself with the internal structure of the iterator object, and merely interact with it through the macros PyArray_ITER_NEXT(it), PyArray_ITER_GOTO(it, dest), or PyArray_ITER_GOTO1D(it, index). All of these macros require the argument it to be a PyArrayIterObject *.
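To give a feel for the kind of bookkeeping a flat iterator has to do, here is a conceptual sketch in plain C. It is not NumPy's PyArrayIterObject and makes no claim about how the macros above are implemented. MockIter, mock_iter_init, mock_iter_next, mock_iter_goto1d, and mock_sum are hypothetical names, and ptrdiff_t stands in for npy_intp. The sketch steps through a strided buffer in C order and can jump to a flat index, which is loosely the role the text assigns to PyArray_ITER_NEXT and PyArray_ITER_GOTO1D.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAXDIMS 8

/* Hypothetical flat iterator over a strided buffer. */
typedef struct MockIter {
    char *base;                      /* first element of the array */
    char *dataptr;                   /* element currently pointed at */
    int nd;
    ptrdiff_t dims[MOCK_MAXDIMS];
    ptrdiff_t strides[MOCK_MAXDIMS]; /* in bytes */
    ptrdiff_t coords[MOCK_MAXDIMS];  /* current multi-index */
    ptrdiff_t index;                 /* current flat (C-order) index */
    ptrdiff_t size;                  /* total number of elements */
} MockIter;

static void mock_iter_init(MockIter *it, char *data, int nd,
                           const ptrdiff_t *dims, const ptrdiff_t *strides)
{
    assert(nd >= 0 && nd <= MOCK_MAXDIMS);
    it->base = it->dataptr = data;
    it->nd = nd;
    it->index = 0;
    it->size = 1;
    for (int i = 0; i < nd; i++) {
        it->dims[i] = dims[i];
        it->strides[i] = strides[i];
        it->coords[i] = 0;
        it->size *= dims[i];
    }
}

/* Advance one element in C order (last dimension varies fastest). */
static void mock_iter_next(MockIter *it)
{
    it->index++;
    for (int i = it->nd - 1; i >= 0; i--) {
        if (it->coords[i] < it->dims[i] - 1) {
            it->coords[i]++;
            it->dataptr += it->strides[i];
            return;
        }
        /* Wrap this dimension and carry into the next one. */
        it->dataptr -= it->coords[i] * it->strides[i];
        it->coords[i] = 0;
    }
}

/* Jump directly to a flat C-order index. */
static void mock_iter_goto1d(MockIter *it, ptrdiff_t flat)
{
    assert(flat >= 0 && flat < it->size);
    it->index = flat;
    it->dataptr = it->base;
    for (int i = it->nd - 1; i >= 0; i--) {
        it->coords[i] = flat % it->dims[i];
        flat /= it->dims[i];
        it->dataptr += it->coords[i] * it->strides[i];
    }
}

/* Sum the remaining elements of an array of doubles via the iterator. */
static double mock_sum(MockIter *it)
{
    double total = 0.0;
    while (it->index < it->size) {
        total += *(double *)it->dataptr;
        mock_iter_next(it);
    }
    return total;
}
```

The caller never computes offsets itself: it only asks the iterator to advance or jump and then reads through dataptr, which is why the text says the internal structure rarely matters.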

PyArrayMultiIter_Type and PyArrayMultiIterObject

PyArrayNeighborhoodIter_Type and PyArrayNeighborhoodIterObject

ScalarArrayTypes

There is a Python type for each of the different built-in data types that can be present in the array. Most of these are simple wrappers around the corresponding data type in C. The C-names for these types are Py{TYPE}ArrType_Type where {TYPE} can be:

Bool, Byte, Short, Int, Long, LongLong, UByte, UShort, UInt, ULong, ULongLong, Half, Float, Double, LongDouble, CFloat, CDouble, CLongDouble, String, Unicode, Void, Datetime, Timedelta, and Object.

These type names are part of the C-API and can therefore be created in extension C-code. There is also a PyIntpArrType_Type and a PyUIntpArrType_Type that are simple substitutes for one of the integer types that can hold a pointer on the platform. The structure of these scalar objects is not exposed to C-code. The function PyArray_ScalarAsCtype(...) can be used to extract the C-type value from the array scalar, and the function PyArray_Scalar(...) can be used to construct an array scalar from a C-value.

Other C-structures

A few new C-structures were found to be useful in the development of NumPy. These C-structures are used in at least one C-API call and are therefore documented here. The main reason these structures were defined is to make it easy to use the Python PyArg_ParseTuple C-API to convert from Python objects to a useful C-Object.

PyArray_Dims

PyArray_Chunk

PyArrayInterface

Internally used structures

Internally, the code uses some additional Python objects primarily for memory management. These types are not accessible directly from Python, and are not exposed to the C-API. They are included here only for completeness and assistance in understanding the code.

NumPy C-API and C complex

When you use the NumPy C-API, you have access to the complex type declarations npy_cdouble and npy_cfloat, which are declared in terms of the C standard types from complex.h. Unfortunately, complex.h contains #define I ... (where the actual definition depends on the compiler), which means that any downstream code that does #include <numpy/arrayobject.h> could get I defined, and something like declaring double I; in that code will result in an obscure compiler error.

This error can be avoided by adding

#undef I

to your code.
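A minimal sketch of the workaround in standard C follows. To keep the block compilable without NumPy installed, it includes complex.h directly as a stand-in for the NumPy header; that substitution, and the function names power_dissipated and make_complex, are assumptions made for illustration. After the #undef, I is an ordinary identifier again, while the imaginary unit is still reachable through the standard macro _Complex_I.

```c
#include <assert.h>
#include <complex.h>   /* stands in for a header that pulls in complex.h */

/* Remove the macro so that I is an ordinary identifier again. */
#undef I

/* Uses I as a plain parameter name (here: a current, in amperes). */
static double power_dissipated(double I, double resistance)
{
    assert(resistance >= 0.0);
    return I * I * resistance;
}

/* The imaginary unit stays available under its standard name. */
static double complex make_complex(double re, double im)
{
    return re + im * _Complex_I;
}
```

Placing the #undef immediately after the include keeps the macro from leaking into the rest of the translation unit.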