Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Sparse tensor Types extend TensorType #766

Merged
merged 3 commits into from
Mar 16, 2022

Conversation

aerubanov
Copy link
Contributor

This pull request is a continuation of the work in PR #303 on refactors the sparse tensor type structure so that sparse types extend the existing tensor types.

Closes #142.

@aerubanov
Copy link
Contributor Author

I did a rebase of the branch to the master, which resulted in some of the tests being broken. Now I'm fixing the bugs that occurred, but I decided to open this PR to be able to track progress and discuss changes.

@brandonwillard brandonwillard marked this pull request as draft January 19, 2022 18:43
@brandonwillard brandonwillard added enhancement New feature or request refactor This issue involves refactoring labels Jan 19, 2022
@brandonwillard brandonwillard changed the title [WIP] Make sparse tensor types extend the existing tensor types Make Sparse tensor Types extend TensorType Jan 19, 2022
@aerubanov aerubanov force-pushed the generalize-sparse-api branch 4 times, most recently from d482291 to 2c405c5 Compare January 24, 2022 14:09
@codecov
Copy link

codecov bot commented Jan 24, 2022

Codecov Report

Merging #766 (d7179fc) into main (31ab8fd) will increase coverage by 0.02%.
The diff coverage is 84.10%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #766      +/-   ##
==========================================
+ Coverage   78.48%   78.50%   +0.02%     
==========================================
  Files         154      154              
  Lines       47773    47857      +84     
  Branches    10853    10879      +26     
==========================================
+ Hits        37494    37572      +78     
  Misses       7743     7743              
- Partials     2536     2542       +6     
Impacted Files Coverage Δ
aesara/link/c/type.py 75.66% <ø> (ø)
aesara/sparse/sandbox/sp2.py 82.79% <ø> (ø)
aesara/tensor/blas.py 79.61% <26.31%> (-1.06%) ⬇️
aesara/sparse/type.py 72.11% <81.08%> (+0.85%) ⬆️
aesara/sparse/basic.py 82.47% <94.44%> (+0.39%) ⬆️
aesara/misc/may_share_memory.py 71.42% <100.00%> (ø)
aesara/sparse/opt.py 75.60% <100.00%> (+0.14%) ⬆️
aesara/sparse/sharedvar.py 94.11% <100.00%> (ø)
aesara/tensor/basic.py 87.63% <100.00%> (+0.02%) ⬆️
aesara/tensor/basic_opt.py 85.03% <100.00%> (ø)
... and 8 more

@aerubanov
Copy link
Contributor Author

@brandonwillard , Looks like I fixed tests that were broken after rebase. In your initial PR #303 you noticed that we should "Convert sparse to dense when used with some Ops (e.g. Elemwise, CAReduce)". Could you provide more details about this?

@brandonwillard
Copy link
Member

brandonwillard commented Jan 26, 2022

@brandonwillard , Looks like I fixed tests that were broken after rebase. In your initial PR #303 you noticed that we should "Convert sparse to dense when used with some Ops (e.g. Elemwise, CAReduce)". Could you provide more details about this?

In order to make the old TensorVariable interface, _tensor_py_operators, applicable to sparse matrices, we need to start adding support for currently unsupported sparse operations, like __get__, which constructs all sorts of Subtensor Ops. Right now, sparse matrices implement their own indexing Ops, which is a huge interface problem. We can't reasonably add sparse matrix-specific logic to Subtensor, though.

We can get around this by dispatching to the appropriate sparse Ops at the _sparse_py_operators interface level, but then there are still some related operations that aren't covered, like advanced indexing on sparse matrices. If we simply convert sparse matrices to dense matrices at that interface, we can—at the very least—make things work (i.e. not return graphs that will err when evaluated).

Really, that's all we need to do in order to merge this: i.e. make sure the new interface doesn't generate graphs that necessarily produces errors, and converting to a dense matrix is a quick and dirty work-around in most cases.

@aerubanov aerubanov force-pushed the generalize-sparse-api branch from 5759d72 to 71bc74c Compare January 27, 2022 13:12
@aerubanov
Copy link
Contributor Author

@brandonwillard , I added tests to check that no errors occurred when using _tensor_py_operators with sparse tensors and added conversion sparse to dense to make them pass. Also, when the conversion happens the warning is generated to inform users about that. Please look at the code and let me know what additional improvements are needed.

Copy link
Member

@brandonwillard brandonwillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you have all the necessary logic in there. I left a few comments about where certain steps take place (e.g. in the sparse module vs. others). The basic idea is that we want to keep all sparse object related handling within the sparse module. In other words, we don't want the concerns of sparse types to leak into code that doesn't explicitly need to concern itself with sparse types.

My guess is that those lines of code can be moved into the sparse Variable interface before the external code is called.

Aside from that, we'll need to rebase some of those commits (e.g. combine test-adding commits with their respective code changes, merge/remove commits that undo things performed by earlier commits in this branch, etc.) and update their formatting to match these guidelines (just the summary lines for now). Since this is such a large set of changes, these kinds of things are even more important—both for the reviewing overhead and basic organizational reasons.

@aerubanov aerubanov force-pushed the generalize-sparse-api branch from 737e1fe to 29fe431 Compare February 7, 2022 15:31
@aerubanov
Copy link
Contributor Author

Hi @brandonwillard ! I made the changes according to your comments. Now I override in _sparse_py_operators unimplemented operations for sparse variables, converting them to dense. So this logic stays inside the sparse module. And I cleaned up the commit history.

@brandonwillard
Copy link
Member

Hi @brandonwillard ! I made the changes according to your comments. Now I override in _sparse_py_operators unimplemented operations for sparse variables, converting them to dense. So this logic stays inside the sparse module. And I cleaned up the commit history.

Great, thanks! I'm still a bit backlogged, but I'll review soon.

@aerubanov
Copy link
Contributor Author

Hi @brandonwillard! I just wanted to remind you of this PR just in case. Of course, if you have time for that.

@aerubanov aerubanov marked this pull request as ready for review February 23, 2022 12:12
Copy link
Member

@brandonwillard brandonwillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there were some problems rebasing, because there are a few reversions of upstream changes and unmanaged conflict diffs.

@@ -559,7 +559,7 @@ def replace_all_validate(

for r, new_r in replacements:
try:
fgraph.replace(r, new_r, reason=reason, verbose=False, **kwargs)
fgraph.replace(r, new_r, reason=reason, verbose=verbose, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fgraph.replace(r, new_r, reason=reason, verbose=verbose, **kwargs)
fgraph.replace(r, new_r, reason=reason, verbose=False, **kwargs)

verbose is set to False here because this method handles that functionality itself.

@brandonwillard
Copy link
Member

I'll fix the rebase issues and push in a minute.

@brandonwillard brandonwillard force-pushed the generalize-sparse-api branch 3 times, most recently from 85a9ad1 to d90e52d Compare February 27, 2022 01:51
brandonwillard
brandonwillard previously approved these changes Feb 27, 2022
Copy link
Member

@brandonwillard brandonwillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's still more to be done to get the sparse interface into shape, but these changes look like a good step toward that, if only because they explicitly state that we want a more correct type representation.

The biggest issue right now is that TensorTypes aren't an abstract class, and direct instances of TensorTypes represent dense tensors, alongside instances of the new DenseTensor type. It might not take much to make such a change and remove all the meta classes, though.

Anyway, I'm fine moving forward with these changes and opening another set of issues to follow up, but let's get a few more eyes on this first.

@kc611
Copy link
Contributor

kc611 commented Feb 28, 2022

LGTM.

The biggest issue right now is that TensorTypes aren't an abstract class, and direct instances of TensorTypes represent dense tensors, alongside instances of the new DenseTensor type. It might not take much to make such a change and remove all the meta classes, though.

We should make an issue page for this. Since, this can be tackled separately from the additional logic that we add towards supporting SparseTypes.

kc611
kc611 previously approved these changes Mar 8, 2022
@aerubanov aerubanov dismissed stale reviews from kc611 and brandonwillard via c82295a March 11, 2022 15:36
@aerubanov aerubanov force-pushed the generalize-sparse-api branch 3 times, most recently from b3f64a8 to cc36b11 Compare March 14, 2022 10:37
@aerubanov
Copy link
Contributor Author

Hi @brandonwillard! I rebased on top of the master and resolved the conflicts.

@brandonwillard
Copy link
Member

The location and name of the new tests.test_tensor_interface needs to be reconsidered, because it's explicitly testing interfaces that already have corresponding test sub-packages/modules (e.g. tests.sparse and tests.tensor.test_var). If the is_sparse == False case isn't needed (see here), then this module could be relocated/renamed to tests.sparse.test_var.

@brandonwillard brandonwillard force-pushed the generalize-sparse-api branch from 660fb4c to e824666 Compare March 15, 2022 22:33
@brandonwillard
Copy link
Member

brandonwillard commented Mar 15, 2022

I've updated the new tests and added a commit that renames SparseType to SparseTensorType (to match the newly introduced DenseTensorType).

@brandonwillard brandonwillard merged commit 94f5ddf into aesara-devs:main Mar 16, 2022
@aerubanov
Copy link
Contributor Author

@brandonwillard, Thanks for the review and the corrections!

@aerubanov aerubanov deleted the generalize-sparse-api branch March 16, 2022 13:41
@brandonwillard
Copy link
Member

@aerubanov, thanks for the PR! This was an important one that required a lot of persistence. It's this kind of help that really moves things forward, and we truly appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request important refactor This issue involves refactoring sparse tensors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unify sparse and dense tensor class design
3 participants