`scipy.sparse._csr:csr_array`

source: /scipy/sparse/_csr.py:324

Signature

   def csr_array(arg1, shape=None, dtype=None, copy=False, *, maxprint=None)

Summary

Compressed Sparse Row array.

Extended Summary

This can be instantiated in several ways:

csr_array(D): where D is a 2-D ndarray
csr_array(S): with another sparse array or matrix S (equivalent to S.tocsr())
csr_array((M, N), [dtype]): to construct an empty array with shape (M, N); dtype is optional, defaulting to dtype='d'.
csr_array((data, (row_ind, col_ind)), [shape=(M, N)]): where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].
csr_array((data, indices, indptr), [shape=(M, N)]): is the standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the array dimensions are inferred from the index arrays.
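
The indptr slicing rule above can be sketched as follows. The helper name row_entries is hypothetical (it is not part of SciPy); it only reads the data, indices and indptr attributes of an existing array.

```python
import numpy as np
from scipy.sparse import csr_array

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
A = csr_array((data, indices, indptr), shape=(3, 3))


def row_entries(A, i):
    """Hypothetical helper: column indices and values stored for row i."""
    start, stop = A.indptr[i], A.indptr[i + 1]
    return A.indices[start:stop], A.data[start:stop]


# One (columns, values) pair per row, read straight from the CSR buffers.
rows = [row_entries(A, i) for i in range(A.shape[0])]
for i, (cols, vals) in enumerate(rows):
    print(i, cols.tolist(), vals.tolist())
```

Row i therefore holds indptr[i+1] - indptr[i] stored entries, which is why reading a single row is cheap in this format.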

Attributes

dtype : dtype: Data type of the array
shape : 2-tuple: Shape of the array
ndim : int: Number of dimensions (this is always 2)
nnz: Number of stored values, including explicit zeros
size: Number of stored values
data: CSR format data array of the array
indices: CSR format index array of the array
indptr: CSR format index pointer array of the array
has_sorted_indices: Whether the indices are sorted
has_canonical_format: Whether the array has sorted indices and no duplicates
T: Transpose
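
A short sketch of how some of these attributes behave, assuming a small array built with one explicitly stored zero. The variable names are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_array

# Row 0 stores the values 1 and 0 (an explicit zero); row 1 stores 2.
data = np.array([1, 0, 2])
indices = np.array([0, 1, 2])
indptr = np.array([0, 2, 3])
A = csr_array((data, indices, indptr), shape=(2, 3))

stored = A.nnz                 # counts stored entries, explicit zero included
nonzero = A.count_nonzero()    # counts only entries whose value is nonzero
At = A.T                       # transpose, shape (3, 2)

print(A.shape, A.ndim, A.dtype, stored, nonzero, At.shape)
```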

Notes

Sparse arrays can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.
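
A minimal sketch of these operations on a small array. Note that sparse arrays follow NumPy array semantics, so the matrix product is written with @ and the element-wise product is obtained here with the multiply method. The variable names are illustrative.

```python
import numpy as np
from scipy.sparse import csr_array

dense = np.array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
A = csr_array(dense)

total = A + A                  # sparse + sparse
diff = A - A                   # sparse - sparse
scaled = A * 2                 # multiplication by a scalar
halved = A / 2                 # division by a scalar (float result)
elementwise = A.multiply(A)    # element-wise product
matprod = A @ A                # matrix product

print(matprod.toarray())
```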

Advantages of the CSR format

efficient arithmetic operations (CSR + CSR, CSR * CSR, etc.)
efficient row slicing
fast matrix-vector products
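
Row slicing and a matrix-vector product can be sketched as follows, using a small 3x3 array chosen for illustration:

```python
import numpy as np
from scipy.sparse import csr_array

A = csr_array(np.array([[1, 0, 2], [0, 0, 3], [4, 5, 6]]))

# Row slicing keeps the result sparse; rows 1 and 2 are selected here.
tail = A[1:3, :]

# Product with a dense 1-D vector returns a dense 1-D ndarray.
x = np.array([1, 2, 3])
y = A @ x

print(tail.toarray())
print(y)
```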

Disadvantages of the CSR format

slow column slicing operations (consider CSC)
changes to the sparsity structure are expensive (consider LIL or DOK)
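
One common way to work around both points, sketched under the assumption that the array is first assembled and then only read: build the structure in LIL format, convert to CSR once, and convert to CSC when columns need to be sliced.

```python
import numpy as np
from scipy.sparse import lil_array

# Assemble entry by entry in LIL, where structural changes are cheap.
L = lil_array((3, 3), dtype=int)
L[0, 0] = 1
L[2, 1] = 5
L[1, 2] = 3

# Convert once for arithmetic and row access.
A = L.tocsr()

# Convert to CSC before slicing columns.
C = A.tocsc()
col1 = C[:, 1:2]

print(A.format, C.format, col1.toarray().ravel())
```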

Canonical Format

Within each row, indices are sorted by column.
There are no duplicate entries.
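
As an illustration, the sketch below builds an array whose first row has unsorted column indices and a duplicate entry, then brings it into canonical format in place with sum_duplicates. The input values are made up for the example.

```python
import numpy as np
from scipy.sparse import csr_array

# Row 0 stores columns [2, 0, 2]: unsorted, and column 2 appears twice.
data = np.array([1, 2, 3, 4])
indices = np.array([2, 0, 2, 1])
indptr = np.array([0, 3, 4])
A = csr_array((data, indices, indptr), shape=(2, 3), copy=True)

before = A.has_canonical_format   # False for this input

# Sorts indices within each row and sums duplicate entries, in place.
A.sum_duplicates()

after = A.has_canonical_format
print(before, after, A.indices.tolist(), A.data.tolist())
```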

Examples

import numpy as np
from scipy.sparse import csr_array
csr_array((3, 4), dtype=np.int8).toarray()


row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_array((data, (row, col)), shape=(3, 3)).toarray()


indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_array((data, indices, indptr), shape=(3, 3)).toarray()


Duplicate entries are summed together:

row = np.array([0, 1, 2, 0])
col = np.array([0, 1, 1, 0])
data = np.array([1, 2, 4, 8])
csr_array((data, (row, col)), shape=(3, 3)).toarray()


As an example of how to construct a CSR array incrementally, the following snippet builds a term-document array from texts:

docs = [["hello", "world", "hello"], ["goodbye", "cruel", "world"]]
indptr = [0]
indices = []
data = []
vocabulary = {}
for d in docs:
    for term in d:
        index = vocabulary.setdefault(term, len(vocabulary))
        indices.append(index)
        data.append(1)
    indptr.append(len(indices))
csr_array((data, indices, indptr), dtype=int).toarray()


Aliases

scipy.sparse.csr_array

Referenced by

This package

tutorial:sparse

Other packages

skimage skimage.graph._graph:pixel_graph
skimage skimage.metrics._contingency_table:contingency_table