We need to clean up the handling of fixed-shape tensors, and creating a fixed-size Type (TensorType, really) is perhaps the best way to do that.
Such a TensorType would allow us to replace/remove and generalize most—if not all—of the distinct Shape handling logic and the haphazard use of aesara.tensor.get_vector_length, among other things. Likewise, it provides a good solution to #42/#93.
For anyone not intimately familiar with these parts of Aesara's internals, users deal with Variable objects (e.g. aesara.tensor.vector() returns a TensorVariable instance, which is a subclass of Variable), and each Variable has a Variable.type value that extends Type. Type instances are exactly what you'd guess they are: objects that emulate Python types and provide static domain-specific type information. For example, TensorTypes hold the data type (i.e. TensorType.dtype) and the number of dimensions/which dimensions are broadcastable (i.e. TensorType.broadcastable).
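For a concrete picture of that relationship (the exact output values depend on the Aesara version and the floatX setting):

```python
import aesara.tensor as at

# Build a symbolic 1-D tensor; `x` is a TensorVariable (a Variable subclass).
x = at.vector("x")

print(type(x).__name__)        # TensorVariable
print(x.type)                  # the TensorType instance attached to x
print(x.type.dtype)            # e.g. "float64", depending on floatX
print(x.type.broadcastable)    # e.g. (False,): one non-broadcastable dimension
```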
See #134 for more information about Aesara's types and how we might be able to simplify them by using actual Python types instead.
Background
If we were to implement a fixed-shape TensorType, and corresponding TensorVariable, the first question would be: "How do we change the results of every Op.make_node so that they produce outputs that are fixed-shape TensorVariables when their inputs are fixed-shape?". This is the real challenge underlying the idea.
Without addressing this question, we could have fixed-shape TensorTypes/Variables, but they would only be useful to the first Ops they encounter. That's not necessarily a bad thing, because there are definitely a few Ops that could make good use of this (e.g. size and shape-based Op parameters). Even so, we can definitely do better.
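To make the idea concrete, a minimal sketch of such a type might look like the following. The name FixedShapeTensorType is purely hypothetical, and the sketch assumes the current TensorType(dtype, broadcastable) constructor:

```python
from aesara.tensor.type import TensorType


class FixedShapeTensorType(TensorType):
    """Hypothetical TensorType that also carries a fully static shape."""

    def __init__(self, dtype, shape):
        # Follow the existing convention: dimensions of extent 1 are the
        # broadcastable ones.
        super().__init__(dtype, tuple(s == 1 for s in shape))
        self.shape = tuple(int(s) for s in shape)

    def __eq__(self, other):
        return (
            type(self) is type(other)
            and self.dtype == other.dtype
            and self.shape == other.shape
        )

    def __hash__(self):
        return hash((type(self), self.dtype, self.shape))
```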
The Shape Op, ShapeFeature, ShapeOptimizer, and their associated Op.infer_shape methods are Aesara's current means of doing all this, and we might only need to cleverly repurpose the latter to get what we want.
Proposed Changes
More specifically, Op.infer_shape is the means through which an Op propagates the "fixed-shape-edness" of its inputs through to its outputs. We could use these existing—albeit voluntary—implementations to determine whether or not an Op's output should be a fixed-shape TensorVariable or not, and, if so, propagate the sizes and construct one.
In general, Aesara is designed in a way that makes Python-based typing very difficult to use constructively, because each individual Op is left to determine the exact types of the outputs it generates. In nearly every instance, an Op will simply construct TensorVariables from scratch and use those as its outputs, making it all but impossible to utilize TensorVariable subclasses—like our proposed fixed-shape TensorVariable—without changing all existing Op.make_node implementations.
Instead, we could actually make Apply something functional and utilize it for these purposes (and perhaps more). At present, Apply is essentially just a container that performs no significant actions. For instance, when an Op.make_node returns the Apply nodes it constructs, almost nothing is done by Apply other than trivial input validation. We could make Apply a point of dispatch for iterative, type-based computations; that way, each Op isn't responsible for reasoning about things that are arguably outside of its scope.
In other words, when an Op is called with some inputs, Op.make_node constructs its outputs and creates an Apply node with said inputs and outputs; then the Apply node constructor (or Apply.__new__) calls the repurposed Op.infer_shape, which computes the concrete shapes of the outputs given the inputs and creates updated fixed-shape outputs.
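A very rough sketch of that flow is below. Everything here is an assumption about how the pieces could fit together, not existing Aesara behavior: the helper names are made up, it reuses the FixedShapeTensorType sketched above, and it assumes an Op.infer_shape(fgraph, node, input_shapes) signature that is fed static per-dimension sizes rather than the symbolic shape graphs it receives today.

```python
from aesara.graph.basic import Constant


def static_dim(entry):
    """Return a concrete int for a shape entry, or None if it isn't static."""
    if isinstance(entry, int):
        return entry
    if isinstance(entry, Constant):
        return int(entry.data)
    return None


def refine_outputs(node):
    """Upgrade a freshly built Apply node's outputs to fixed-shape types
    whenever its Op's infer_shape can pin down every output dimension."""
    # Only proceed when every input already carries a static shape
    # (i.e. its type is a FixedShapeTensorType from the sketch above).
    input_shapes = [getattr(inp.type, "shape", None) for inp in node.inputs]
    if any(s is None for s in input_shapes):
        return node

    try:
        output_shapes = node.op.infer_shape(None, node, input_shapes)
    except (NotImplementedError, AttributeError):
        # Ops without a usable infer_shape simply keep their original types.
        return node

    for out, shape in zip(node.outputs, output_shapes):
        dims = [static_dim(d) for d in shape]
        if all(d is not None for d in dims):
            out.type = FixedShapeTensorType(out.type.dtype, dims)
    return node
```

This is the kind of call that the Apply constructor itself could make, so that individual Ops never have to reason about fixed-shape types directly.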
Issues
Computation
Such a change would move the somewhat elective shape inference process, which currently happens during optimization, into the model construction process. Since shape inference isn't a particularly computationally intensive thing, this isn't a real concern. Plus, we can always make shape propagation configurable, so there's no real need for a complete trade-off.
Alternatively, we might be able to make shape propagation a lazily computed process. This could be facilitated by changes to TensorVariable itself. For instance, instead of TensorVariable.shape returning the output of a Shape Op, it could return a fixed-length Sequence that, when accessed, would lazily compute the shape values.
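One possible form for such a lazily evaluated shape is sketched below; the LazyShape name and its caching strategy are purely illustrative and not part of Aesara:

```python
from collections.abc import Sequence


class LazyShape(Sequence):
    """Hypothetical fixed-length shape whose entries are only materialized
    (as ints or symbolic expressions) when they are actually accessed."""

    def __init__(self, var):
        self.var = var
        self._entries = [None] * var.ndim

    def __len__(self):
        return self.var.ndim

    def __getitem__(self, i):
        if self._entries[i] is None:
            if self.var.type.broadcastable[i]:
                # Broadcastable dimensions are statically known to be 1.
                self._entries[i] = 1
            else:
                # Otherwise build the symbolic shape entry on first access;
                # under the proposal this is where a Shape_i-style graph
                # would be constructed lazily.
                self._entries[i] = self.var.shape[i]
        return self._entries[i]
```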
This could be a huge simplification relative to the current approach, because shape entries would be distinct objects. Under the current approach, graphs can contain numerous distinct Shape Ops, and *Subtensor* Ops on those Shapes, that all refer to the same shapes. Such a change would reduce the amount of merge-like work needed during the shape optimization process. It might also remove the need for some/all of the convoluted *Subtensor* considerations in ShapeFeature.
Regardless, these considerations are somewhat tangential to the present issue.
Unusual Ops
One issue with this approach is that an Op may for some reason want to hold on to the outputs it generates, and, if such an Op assumes that said outputs will be used down the line, it would be mistaken under these changes. This is almost exclusively an Op-specific implementation issue; one that can—however—be easily remedied by first creating the Apply node and then using its refined outputs.
In-place Apply updates
FunctionGraph.replace still operates by changing values in Apply.inputs. This would circumvent the proposed changes.
Under certain assumptions, this might not be a problem, though. For instance, if fixed-shape TensorVariables have already been produced, and their shapes have already been computed, then moving them around shouldn't matter, because, by their very type alone, no shape computations are needed (i.e. they carry their static shape information with them at all times).
Otherwise, if there's a shape conflict that arises from a rewrite, it's really the fault of the rewrite operation.
The real concern arises when a fixed-shape TensorVariable replaces a non-fixed-shape TensorVariable input and vice versa, because we might need to update something downstream. The type information itself would imply that such a substitution isn't valid, at least not without assuming that the non-fixed-shape variables actually have the same fixed shape (e.g. the fixed-shape types are refinement types and we're talking about something like substitutability).
There might not be many/any scenarios in which this is a real concern, especially when/if all of the fixed-shape information has already been determined within the graph containing the in-place updated Apply node. Regardless, this is perhaps the main concern right now.