Inconsistent behavior of dot products with sparse matrices #881

Open

rlouf opened this issue Apr 1, 2022 · 3 comments

Labels: bug, sparse tensors

Comments

rlouf (Member) commented Apr 1, 2022

This issue was discovered while testing horseshoe_nbinom_gibbs in aemcmc (code) with a sparse matrix input for the variable X. Let's consider the product of a sparse tensor with a dense tensor:

import numpy as np
import scipy.sparse

import aesara
import aesara.sparse
import aesara.tensor as at

X = scipy.sparse.random(10, 7, density=0.1, format="csr")  # default "coo" format not supported by aesara
b = np.random.random((7, 3))

X_csr = aesara.sparse.as_sparse(X)
b_tt = at.as_tensor(b)

aesara.tensor.dot is configured to work with both TensorType and SparseType. Indeed:

at.dot(X_csr, b_tt).eval()
"""
| 0.08221337 |  0.0144887 | 0.03528356 |
| 0.03397027 | 0.00598668 | 0.01457904 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
| 0.26292134 | 0.26791936 | 0.06541766 |
|  0.3189743 | 0.05621378 |  0.1368944 |
| 0.72413837 | 0.64081756 | 0.58794318 |
| 0.00118287 | 0.00174689 | 0.00191919 |
"""

Using the dot method of the sparse tensor also works:

X_csr.dot(b_tt).eval()
"""
| 0.08221337 |  0.0144887 | 0.03528356 |
| 0.03397027 | 0.00598668 | 0.01457904 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
| 0.26292134 | 0.26791936 | 0.06541766 |
|  0.3189743 | 0.05621378 |  0.1368944 |
| 0.72413837 | 0.64081756 | 0.58794318 |
| 0.00118287 | 0.00174689 | 0.00191919 |
"""

The operator @, however, raises a TypeError:

X_csr @ b_tt
# TypeError: Tensor type field must be a TensorType; found <class 'aesara.sparse.type.SparseType'>.

For some reason that is not yet clear to me, the __dot__ method of the _tensor_py_operators class is called:

def __dot__(left, right):
    return at.math.dense_dot(left, right)

instead of the one from _sparse_py_operators.
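
As a workaround (not part of the original report), the sparse-aware dot can be called explicitly instead of @; a minimal sketch, assuming the X_csr and b_tt defined above:

# Workaround sketch: bypass `@` and call the sparse-aware dot directly.
# aesara.sparse.dot accepts SparseVariable operands, unlike the dense
# __matmul__ inherited from _tensor_py_operators.
Z = aesara.sparse.dot(X_csr, b_tt)
Z.eval()  # same dense result as at.dot(X_csr, b_tt) above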

rlouf (Member, Author) commented Apr 5, 2022

I noticed what appears to me to be another inconsistency.

import numpy as np
import scipy.sparse

import aesara.sparse
import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

A sparse matrix multiplied by a matrix

Let us define X_csr, a SparseConstant, and y_tt, a TensorConstant, both matrices:

p = 5
srng = RandomStream(seed=12345)

X = scipy.sparse.random(p, p, density=.33, format="csr")
X_csr = aesara.sparse.as_sparse(X)

y = np.ones((p, p))
y_tt = at.as_tensor(y)

We saw in the previous message that the aesara.tensor.dot product works between a sparse matrix and a dense matrix:

at.dot(X_csr, y_tt).eval()
"""
| 1.33790965 | 1.33790965 | 1.33790965 | 1.33790965 | 1.33790965 |
| 0.47078111 | 0.47078111 | 0.47078111 | 0.47078111 | 0.47078111 |
| 0.98612278 | 0.98612278 | 0.98612278 | 0.98612278 | 0.98612278 |
| 0.72805556 | 0.72805556 | 0.72805556 | 0.72805556 | 0.72805556 |
| 0.22092613 | 0.22092613 | 0.22092613 | 0.22092613 | 0.22092613 |
"""

But this does not work with the @ operator:

X_csr @ y_tt
# TypeError: The dense dot product is only supported for dense types

A sparse matrix multiplied by a vector

Now let the dense operand be a vector, y_vec_tt:

y_vec = np.ones(p)
y_vec_tt = at.as_tensor(y_vec)

This does not work:

at.dot(X_csr, y_vec_tt).eval()
# TypeError: The dense dot product is only supported for dense types

But this does:

at.dot(X_csr, at.shape_padright(y_vec_tt)).eval()
"""
| 1.33790965 |
| 0.47078111 |
| 0.98612278 |
| 0.72805556 |
| 0.22092613 |
"""

For reference, let us make sure that the dot product does work between a dense matrix and a dense vector:

X_tt = at.as_tensor(np.random.random((p, p)))
at.dot(X_tt, y_vec_tt).eval()
# | 1.35070533 | 2.56274912 | 1.70920864 | 2.87279029 | 1.94839187 |

A possible explanation

Scipy sparse matrices are converted by aesara to a SparseConstant. This class inherits from _sparse_py_operators, which uses structured_dot for the dot product. However, SparseConstant also inherits from TensorConstant, which inherits from TensorVariable, which in turn inherits from _tensor_py_operators, which uses dense_dot.

Maybe someone who knows the internals better can correct me, but there seems to be an assumption throughout the library that we should read DenseTensorX every time we see TensorX. A lot of confusion ensues when you introduce SparseTensor: Tensor sounds more general, so it seems to make sense to make SparseConstant inherit from TensorConstant... until you realize that the latter's operators are dense tensor operators. The library should probably be more explicit here.
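
(An illustration, not in the original report: the method resolution order can be printed directly to check this chain, assuming SparseConstant is exposed by aesara.sparse as in aesara's source layout.)

import aesara.sparse

# Print the MRO: _sparse_py_operators comes first but defines no __matmul__,
# so the `@` lookup falls through to the dense version from
# _tensor_py_operators further down the chain.
for klass in aesara.sparse.SparseConstant.__mro__:
    print(klass.__name__)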

brandonwillard (Member) commented Apr 5, 2022

For some reason that is not yet clear to me, the __dot__ method of the _tensor_py_operators class is called:

def __dot__(left, right):
    return at.math.dense_dot(left, right)

instead of the one from _sparse_py_operators.

The reason is that _sparse_py_operators hasn't overridden _tensor_py_operators.__[r]matmul__, which uses at.math.dense_dot. This is what needs to be done in order to fix this issue.
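
A minimal sketch of the override described above (an illustration, not an actual patch; it assumes _sparse_py_operators and structured_dot as found in aesara.sparse.basic):

from aesara.sparse.basic import _sparse_py_operators, structured_dot

# Sketch of the fix: give _sparse_py_operators its own __matmul__/__rmatmul__
# so that `@` dispatches to the sparse-aware structured_dot instead of
# falling through to the dense __matmul__ from _tensor_py_operators.
def __matmul__(left, right):
    return structured_dot(left, right)

def __rmatmul__(right, left):
    return structured_dot(left, right)

_sparse_py_operators.__matmul__ = __matmul__
_sparse_py_operators.__rmatmul__ = __rmatmul__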

Maybe someone who knows the internals better can correct me, but there seems to be an assumption throughout the library that we should read DenseTensorX every time we see TensorX. A lot of confusion ensues when you introduce SparseTensor: Tensor sounds more general, so it seems to make sense to make SparseConstant inherit from TensorConstant... until you realize that the latter's operators are dense tensor operators. The library should probably be more explicit here.

Yes, this is one of the compromises we made along the way toward combining/merging the tensor types and interfaces. The reason we chose to make all original/base tensor types dense is that it required fewer changes and wouldn't break dependent libraries.

Here's a statement to that effect in the relevant PR. In summary, we need a follow-up PR that introduces something like an AbstractTensorType, renames TensorType to DenseTensorType, and deprecates the old use of TensorType. It could work a few other ways, as well, but that's the basic idea.
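
(Illustrative only: a sketch of the type reorganization described above; apart from TensorType, none of these names existed in aesara at the time, and SparseTensorType is a hypothetical stand-in for the sparse side.)

# Hypothetical sketch of the proposed hierarchy; the names illustrate the
# comment above and are not an actual aesara API.
class AbstractTensorType:
    """Shared interface for all tensor types, dense or sparse."""

class DenseTensorType(AbstractTensorType):
    """What aesara currently calls TensorType."""

class SparseTensorType(AbstractTensorType):
    """A sparse counterpart (today's SparseType)."""

# The old name would survive as a deprecated alias during a transition:
TensorType = DenseTensorType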

brandonwillard (Member) commented

Here's an issue for those type changes: #886.

@rlouf rlouf linked a pull request Sep 17, 2022 that will close this issue
@rlouf rlouf removed a link to a pull request Sep 17, 2022