
bundles / scipy 1.17.1 / scipy / sparse / _csr / csr_matrix

ABCMeta

scipy.sparse._csr:csr_matrix

source: /scipy/sparse/_csr.py :447

Signature

def csr_matrix(arg1, shape=None, dtype=None, copy=False, *, maxprint=None)

Summary

Compressed Sparse Row matrix.

Extended Summary

This can be instantiated in several ways:

csr_matrix(D)

where D is a 2-D ndarray

csr_matrix(S)

with another sparse array or matrix S (equivalent to S.tocsr())

csr_matrix((M, N), [dtype])

to construct an empty matrix with shape (M, N); dtype is optional, defaulting to dtype='d'.

csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])

where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].

csr_matrix((data, indices, indptr), [shape=(M, N)])

is the standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.
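
The slicing rule above can be checked by walking the three arrays by hand. The following is a minimal sketch, not part of the original docstring; the variable name `rows` is introduced here purely for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
A = csr_matrix((data, indices, indptr), shape=(3, 3))

# For each row i, the stored entries live in the half-open slice
# indptr[i]:indptr[i+1] of both `indices` (columns) and `data` (values).
rows = []  # illustrative name: one list of (column, value) pairs per row
for i in range(A.shape[0]):
    start, end = A.indptr[i], A.indptr[i + 1]
    cols = A.indices[start:end].tolist()
    vals = A.data[start:end].tolist()
    rows.append(list(zip(cols, vals)))

print(rows)
print(A.toarray())
```

Row 1 holds a single stored entry because indptr[1] and indptr[2] differ by one.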

Attributes

dtype : dtype

Data type of the matrix

shape : 2-tuple

Shape of the matrix

ndim : int

Number of dimensions (this is always 2)

nnz

Number of stored values, including explicit zeros

size

Number of stored values

data

CSR format data array of the matrix

indices

CSR format index array of the matrix

indptr

CSR format index pointer array of the matrix

has_sorted_indices

Whether the indices are sorted

has_canonical_format

Whether the matrix has sorted indices and no duplicates

T

Transpose

Notes

Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.
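
As a rough illustration of these operations, the sketch below applies each one to a small square matrix. The variable names are illustrative only. It assumes the matrix (not array) semantics of csr_matrix, under which `**` is a matrix power rather than an element-wise power.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
A = csr_matrix(dense)

total = A + A      # element-wise addition, result stays sparse
diff = A - A       # element-wise subtraction
product = A @ A    # matrix product
halved = A / 2     # division by a scalar (result is floating point)
squared = A ** 2   # matrix power for csr_matrix, assumed equal to A @ A

print(product.toarray())
print(squared.toarray())
```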

Advantages of the CSR format

  • efficient arithmetic operations CSR + CSR, CSR * CSR, etc.

  • efficient row slicing

  • fast matrix vector products
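
A short sketch of the two access patterns listed above, row slicing and a matrix-vector product. The names `middle_rows`, `v` and `y` are chosen here for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[1, 0, 2], [0, 0, 3], [4, 5, 6]]))

# Row slicing only needs a contiguous range of indptr, so it is cheap
# and the result is still a CSR matrix.
middle_rows = A[1:3]

# Matrix-vector product with a dense 1-D vector gives a dense 1-D result.
v = np.array([1, 1, 1])
y = A @ v

print(middle_rows.toarray())
print(y)
```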

Disadvantages of the CSR format

  • slow column slicing operations (consider CSC)

  • changes to the sparsity structure are expensive (consider LIL or DOK)
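
One common way to work around both drawbacks, sketched here under the assumption that the matrix is small enough to convert freely: assemble entries in LIL format, convert to CSR once the structure is final, and switch to CSC when column slices are needed.

```python
import numpy as np
from scipy.sparse import lil_matrix

# Build incrementally in LIL, where inserting new entries is cheap.
L = lil_matrix((3, 3), dtype=np.int64)
L[0, 0] = 1
L[0, 2] = 2
L[2, 1] = 5

# Convert once the sparsity structure is final.
A = L.tocsr()

# For column access, convert to CSC first and slice there.
last_col = A.tocsc()[:, 2]

print(A.toarray())
print(last_col.toarray().ravel())
```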

Canonical Format

  • Within each row, indices are sorted by column.

  • There are no duplicate entries.
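
The sketch below builds a matrix that deliberately violates both conditions and then normalizes it in place with `sum_duplicates()`. The input arrays are made up for illustration. It relies on the (data, indices, indptr) constructor storing the arrays as given, without canonicalizing them.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Row 0 stores columns 2, 0, 2: unsorted, and column 2 appears twice.
indptr = np.array([0, 3, 4])
indices = np.array([2, 0, 2, 1])
data = np.array([1, 2, 3, 4])
A = csr_matrix((data, indices, indptr), shape=(2, 3))

sorted_before = A.has_sorted_indices
canonical_before = A.has_canonical_format
nnz_before = A.nnz

# Sorts indices within each row and merges duplicate entries by summing.
A.sum_duplicates()

print(canonical_before, A.has_canonical_format)
print(A.indices, A.data, A.indptr)
```

The dense result is unchanged by the call; only the stored representation becomes canonical.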

Examples

import numpy as np
from scipy.sparse import csr_matrix
csr_matrix((3, 4), dtype=np.int8).toarray()
row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_matrix((data, (row, col)), shape=(3, 3)).toarray()
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()

Duplicate entries are summed together:

row = np.array([0, 1, 2, 0])
col = np.array([0, 1, 1, 0])
data = np.array([1, 2, 4, 8])
csr_matrix((data, (row, col)), shape=(3, 3)).toarray()

As an example of how to construct a CSR matrix incrementally, the following snippet builds a term-document matrix from texts:

docs = [["hello", "world", "hello"], ["goodbye", "cruel", "world"]]
indptr = [0]
indices = []
data = []
vocabulary = {}
for d in docs:
    for term in d:
        index = vocabulary.setdefault(term, len(vocabulary))
        indices.append(index)
        data.append(1)
    indptr.append(len(indices))
csr_matrix((data, indices, indptr), dtype=int).toarray()

Aliases

  • scipy.sparse.csr_matrix

Referenced by

This package