Matrices that mostly contain zeroes are said to be sparse.
Sparse matrices are common in applied machine learning, for example in data whose encodings map categories to counts, and in entire subfields such as natural language processing (NLP).
Because a sparse matrix contains only a few non-zero values, storing it in a dense two-dimensional array wastes space: memory is spent on every zero. It is also computationally expensive to work with a sparse matrix as though it were dense, since operations spend time on the zero entries. Representations and operations that specifically handle sparsity can improve performance significantly.
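To make the space argument concrete, here is a minimal sketch that compares the memory used by a dense array with the memory used by its compressed sparse row (CSR) form. The 1000 x 1000 identity matrix and the variable names are assumptions chosen for illustration, and the exact byte count of the sparse form depends on the index type SciPy picks, so no specific figure is claimed for it.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative matrix (an assumption for this sketch): 1000 x 1000 with
# only 1000 non-zero entries, all on the diagonal.
dense = np.eye(1000, dtype=np.float64)
sparse = csr_matrix(dense)

# A dense array stores every entry, zero or not.
dense_bytes = dense.nbytes

# CSR stores only the non-zero values plus two index arrays.
sparse_bytes = (sparse.data.nbytes
                + sparse.indices.nbytes
                + sparse.indptr.nbytes)

print("Non-zero entries:", sparse.nnz)
print("Dense bytes:     ", dense_bytes)
print("Sparse bytes:    ", sparse_bytes)
```

The dense array needs 8 bytes for each of its one million entries, while the CSR form only pays for the 1000 stored values and their indices.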
import numpy as np
from scipy.sparse import csr_matrix

# create a 2-D representation of the matrix
A = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])
print("Dense matrix representation: \n", A)

# convert to sparse matrix representation
S = csr_matrix(A)
print("Sparse matrix: \n", S)

# convert back to 2-D representation of the matrix
B = S.todense()
print("Dense matrix: \n", B)
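As a rough illustration of what the sparse representation holds, the sketch below rebuilds the same 3 x 6 matrix and prints the three arrays that SciPy's CSR format keeps internally. It then runs one operation, a matrix-vector product, directly on the sparse form. The vector `x` of ones is an assumption added for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same matrix as in the example above
A = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])
S = csr_matrix(A)

# Non-zero values, stored row by row: 1, 2, 1, 2
print("data:   ", S.data)
# Column index of each stored value: 0, 2, 5, 3
print("indices:", S.indices)
# Row i occupies data[indptr[i]:indptr[i + 1]]: 0, 1, 3, 4
print("indptr: ", S.indptr)

# Matrix-vector product computed on the sparse form; only the four
# stored values take part in the arithmetic.
x = np.ones(6)  # illustrative vector, not part of the original example
y = S @ x
print("S @ x:  ", y)
```

With a vector of ones, each entry of the result is the sum of the corresponding row, so the product here comes out as 1, 3, and 2.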