Inconsistent behavior of dot products with sparse matrices #881

Open

rlouf opened this issue Apr 1, 2022 · 3 comments

Labels: bug, sparse tensors

Comments

rlouf (Member) commented Apr 1, 2022

This issue was discovered while testing horseshoe_nbinom_gibbs in aemcmc (code) with a sparse matrix input for the variable X. Let's consider the product of a sparse tensor with a dense tensor:

import numpy as np
import scipy.sparse

import aesara
import aesara.sparse
import aesara.tensor as at

X = scipy.sparse.random(10, 7, density=0.1, format="csr")  # default "coo" format not supported by aesara
b = np.random.random((7, 3))

X_csr = aesara.sparse.as_sparse(X)
b_tt = at.as_tensor(b)

aesara.tensor.dot is configured to work with both TensorType and SparseType. Indeed:

at.dot(X_csr, b_tt).eval()
"""
| 0.08221337 |  0.0144887 | 0.03528356 |
| 0.03397027 | 0.00598668 | 0.01457904 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
| 0.26292134 | 0.26791936 | 0.06541766 |
|  0.3189743 | 0.05621378 |  0.1368944 |
| 0.72413837 | 0.64081756 | 0.58794318 |
| 0.00118287 | 0.00174689 | 0.00191919 |
"""

Using the dot method of the sparse tensor also works:

X_csr.dot(b_tt).eval()
"""
| 0.08221337 |  0.0144887 | 0.03528356 |
| 0.03397027 | 0.00598668 | 0.01457904 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
|          0 |          0 |          0 |
| 0.26292134 | 0.26791936 | 0.06541766 |
|  0.3189743 | 0.05621378 |  0.1368944 |
| 0.72413837 | 0.64081756 | 0.58794318 |
| 0.00118287 | 0.00174689 | 0.00191919 |
"""

The operator @, however, raises a TypeError:

X_csr @ b_tt
# TypeError: Tensor type field must be a TensorType; found <class 'aesara.sparse.type.SparseType'>.

For some reason that is not yet clear to me, the __dot__ method of the _tensor_py_operators class is called:

def __dot__(left, right):
    return at.math.dense_dot(left, right)

instead of the one from _sparse_py_operators.
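
As a workaround (not part of the original report), the sparse-aware dot can be called explicitly instead of @; a minimal sketch, assuming the X_csr and b_tt defined above:

# Workaround sketch: bypass `@` and call the sparse-aware dot directly.
# aesara.sparse.dot accepts SparseVariable operands, unlike the dense
# __matmul__ inherited from _tensor_py_operators.
Z = aesara.sparse.dot(X_csr, b_tt)
Z.eval()  # same dense result as at.dot(X_csr, b_tt) above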

rlouf (Member, Author) commented Apr 5, 2022

I noticed what appears to me to be another inconsistency.

import numpy as np
import scipy.sparse

import aesara.sparse
import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

A sparse matrix multiplied by a matrix

Let us define X_csr, a SparseConstant, and y_tt, a TensorConstant, both matrices:

p = 5
srng = RandomStream(seed=12345)

X = scipy.sparse.random(p, p, density=.33, format="csr")
X_csr = aesara.sparse.as_sparse(X)

y = np.ones((p, p))
y_tt = at.as_tensor(y)

We saw in the previous message that the aesara.tensor.dot product works between a sparse matrix and a dense matrix:

at.dot(X_csr, y_tt).eval()
"""
| 1.33790965 | 1.33790965 | 1.33790965 | 1.33790965 | 1.33790965 |
| 0.47078111 | 0.47078111 | 0.47078111 | 0.47078111 | 0.47078111 |
| 0.98612278 | 0.98612278 | 0.98612278 | 0.98612278 | 0.98612278 |
| 0.72805556 | 0.72805556 | 0.72805556 | 0.72805556 | 0.72805556 |
| 0.22092613 | 0.22092613 | 0.22092613 | 0.22092613 | 0.22092613 |
"""

But this does not work with the @ operator:

X_csr @ y_tt
# TypeError: The dense dot product is only supported for dense types

A sparse matrix multiplied by a vector

Now let the dense operand be a vector, y_vec_tt:

y_vec = np.ones(p)
y_vec_tt = at.as_tensor(y_vec)

This does not work:

at.dot(X_csr, y_vec_tt).eval()
# TypeError: The dense dot product is only supported for dense types

But this does:

at.dot(X_csr, at.shape_padright(y_vec_tt)).eval()
"""
| 1.33790965 |
| 0.47078111 |
| 0.98612278 |
| 0.72805556 |
| 0.22092613 |
"""

For reference, let us make sure that the dot product does work between a dense matrix and a dense vector:

X_tt = at.as_tensor(np.random.random((p, p)))
at.dot(X_tt, y_vec_tt).eval()
# | 1.35070533 | 2.56274912 | 1.70920864 | 2.87279029 | 1.94839187 |

A possible explanation

Scipy sparse matrices are converted by aesara to a SparseConstant. This class inherits from _sparse_py_operators, which uses structured_dot for the dot product. However, SparseConstant also inherits from TensorConstant, which inherits from TensorVariable, which in turn inherits from _tensor_py_operators, which uses dense_dot.

Maybe someone who knows the internals better can correct me, but there seems to be an assumption throughout the library that we should read DenseTensorX every time we see TensorX. A lot of confusion ensues when you introduce SparseTensor: Tensor sounds more general, so it seems to make sense to make SparseConstant inherit from TensorConstant... until you realize that the latter's operators are dense tensor operators. The library should probably be more explicit here.
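
(An illustration, not in the original report: the method resolution order can be printed directly to check this chain, assuming SparseConstant is exposed by aesara.sparse as in aesara's source layout.)

import aesara.sparse

# Print the MRO: _sparse_py_operators comes first but defines no __matmul__,
# so the `@` lookup falls through to the dense version from
# _tensor_py_operators further down the chain.
for klass in aesara.sparse.SparseConstant.__mro__:
    print(klass.__name__)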

brandonwillard (Member) commented Apr 5, 2022

For some reason that is not yet clear to me, the __dot__ method of the _tensor_py_operators class is called:

def __dot__(left, right):
    return at.math.dense_dot(left, right)

instead of the one from _sparse_py_operators.

The reason is that _sparse_py_operators hasn't overridden _tensor_py_operators.__[r]matmul__, which uses at.math.dense_dot. This is what needs to be done in order to fix this issue.
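
A minimal sketch of the override described above (an illustration, not an actual patch; it assumes _sparse_py_operators and structured_dot as found in aesara.sparse.basic):

from aesara.sparse.basic import _sparse_py_operators, structured_dot

# Sketch of the fix: give _sparse_py_operators its own __matmul__/__rmatmul__
# so that `@` dispatches to the sparse-aware structured_dot instead of
# falling through to the dense __matmul__ from _tensor_py_operators.
def __matmul__(left, right):
    return structured_dot(left, right)

def __rmatmul__(right, left):
    return structured_dot(left, right)

_sparse_py_operators.__matmul__ = __matmul__
_sparse_py_operators.__rmatmul__ = __rmatmul__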

Maybe someone who knows the internals better can correct me, but there seems to be an assumption throughout the library that we should read DenseTensorX every time we see TensorX. A lot of confusion ensues when you introduce SparseTensor: Tensor sounds more general, so it seems to make sense to make SparseConstant inherit from TensorConstant... until you realize that the latter's operators are dense tensor operators. The library should probably be more explicit here.

Yes, this is one of the compromises we made along the way toward combining/merging the tensor types and interfaces. The reason we chose to make all original/base tensor types dense is that it required fewer changes and wouldn't break dependent libraries.

Here's a statement to that effect in the relevant PR. In summary, we need a follow-up PR that introduces something like an AbstractTensorType, renames TensorType to DenseTensorType, and deprecates the old use of TensorType. It could work a few other ways, as well, but that's the basic idea.
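
(Illustrative only: a sketch of the type reorganization described above; apart from TensorType, none of these names existed in aesara at the time, and SparseTensorType is a hypothetical stand-in for the sparse side.)

# Hypothetical sketch of the proposed hierarchy; the names illustrate the
# comment above and are not an actual aesara API.
class AbstractTensorType:
    """Shared interface for all tensor types, dense or sparse."""

class DenseTensorType(AbstractTensorType):
    """What aesara currently calls TensorType."""

class SparseTensorType(AbstractTensorType):
    """A sparse counterpart (today's SparseType)."""

# The old name would survive as a deprecated alias during a transition:
TensorType = DenseTensorType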

brandonwillard (Member) commented

Here's an issue for those type changes: #886.

@rlouf rlouf linked a pull request Sep 17, 2022 that will close this issue
@rlouf rlouf removed a link to a pull request Sep 17, 2022