[Draft][WIP] Relax language specification #273
relax_spec.md
Outdated
**Scope of Shape Variables**

Shape variables can be introduced in two places in a Relax program: In a function signature, where they may be included with the argument shapes and return shape annotations, or in `MatchShape` bindings. Shape variables used in the function signature are scoped to the entire function in which they appear. Shape variables used in `MatchShape` bindings are scoped only to the `SeqExpr` in which they appear.
Dealing with shape variables has been one of the trickiest parts of the specification, in my opinion. Scoping them the same way as Relax variables seemed like a simple-enough rule but it leads to some problematic situations like having the value returned by `SeqExpr` whose shape expression contains shape variables defined inside the `SeqExpr`. Per this scoping rule, the shape of the `SeqExpr` would have to be `RuntimeDepShape`, which would lose information.

One alternative might be to scope shape variables to the entire function, but there are some potentially problematic cases, like initializing a var in one branch of an `If` and not the other. Allowing them to scope more broadly may be helpful for checking the `shape_` of a `SeqExpr`, but creates complexity.
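The information loss described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: dimensions are either integer constants or shape-variable names (real `PrimExpr`s can be arbitrary expressions), and `RUNTIME_DEP_SHAPE` and `seq_expr_shape` are hypothetical names, not part of TVM or the spec.

```python
from typing import List, Set, Union

# Hypothetical stand-in for the spec's RuntimeDepShape; not the TVM class.
RUNTIME_DEP_SHAPE = "RuntimeDepShape"

# Assumption: a dimension is an integer constant or a shape-variable name.
Dim = Union[int, str]


def seq_expr_shape(
    body_shape: List[Dim], vars_bound_inside: Set[str]
) -> Union[List[Dim], str]:
    """Shape visible outside a SeqExpr under SeqExpr-local scoping.

    If any dimension mentions a shape variable bound by a MatchShape inside
    the SeqExpr, that variable is out of scope outside, so the whole shape
    is erased to RUNTIME_DEP_SHAPE.
    """
    for dim in body_shape:
        if isinstance(dim, str) and dim in vars_bound_inside:
            return RUNTIME_DEP_SHAPE
    return body_shape


# `n` comes from the function signature, `m` from a MatchShape in the SeqExpr.
outer_ok = seq_expr_shape(["n", 4], vars_bound_inside={"m"})
erased = seq_expr_shape(["n", "m"], vars_bound_inside={"m"})
```

Here `outer_ok` keeps its shape because it only mentions the function-scoped `n`, while `erased` loses the still-valid `n` dimension along with `m`, which is the loss of information the comment is concerned about.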
There can be some complexity involved in checking whether two shapes match during shape inference. A very simple, conservative method for determining equality is simply using alpha-equivalence: If the two shapes have the same structure, then they are equivalent. However, this method is conservative and can overlook numerical properties in `PrimExpr`s. We leave it up to compiler implementations as to whether to use more advanced methods for proving equivalence, such as attempting to use algebraic rewrite rules. (As a consequence, portability requires inserting dynamic checks wherever there needs to be a comparison of shapes.)

Note that optimizations like function inlining or constant folding could allow for simplifying many shape annotations and expressions and make it possible to conclude at compile time that shapes in more cases are equivalent. In general, developing compiler infrastructure for partial evaluation and reasoning about common situations with shape annotations may eliminate many dynamic checks.
Again, I emphasize that as in the case for getting the shape of a call, it may be advantageous to do these transformations before filling in shapes, though it may be harder for users to reason about when they need a dynamic check in that case. Do we want to consider the possibility of doing canonicalization before shape inference?
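To make the conservatism of a purely structural comparison concrete, here is a minimal Python sketch. The `Var`, `Add`, and `Mul` classes are hypothetical stand-ins for `PrimExpr` nodes, and the check compares variables by name without renaming, so it is simpler than full alpha-equivalence.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


# Hypothetical miniature of PrimExpr: constants, variables, sums, products.
Expr = Union[int, Var, Add, Mul]


def shapes_structurally_equal(s1: List[Expr], s2: List[Expr]) -> bool:
    """Same rank and identical expression trees in every dimension.

    Dataclasses compare field by field, so == is a structural check.
    """
    return len(s1) == len(s2) and all(a == b for a, b in zip(s1, s2))


n = Var("n")
same = shapes_structurally_equal([n, Add(n, 1)], [n, Add(n, 1)])
# n + n and 2 * n are numerically equal but have different structure.
missed = shapes_structurally_equal([Add(n, n)], [Mul(2, n)])
```

In this sketch `same` is `True` and `missed` is `False`: the second pair would need an algebraic rewrite (or a dynamic check) to be recognized as matching, which is the gap the spec text leaves to implementations.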
relax_spec.md
Outdated
    2. If `r` is true, evaluate the `true_branch` and return its result.
    3. If `r` is false, evaluate the `false_branch` and return its result.
9. The node `Call(op, [arg1, arg2, ..., argn])` is evaluated as follows:
    1. If `op` is an `ExternFunc` node, then evaluate `arg1`, `arg2`, …, `argn` in that order and call the results `a1`, `a2`, …, `an`. Next, look up the `PackedFunc` registered under the global symbol name. If it exists (it is an error at run time if it does not), call the `PackedFunc` using the given arguments and return the result. Note that if a TIR `PrimFunc` in the `IRModule` has a global symbol attribute registered, it can be called as an `ExternFunc` using that global symbol as well. `PackedFunc`s may have arbitrary side effects and are responsible for whether the result is a newly allocated value or aliases another value.
> return the result

We may want to make a guarantee about wrapping `None` results into unit (empty tuples). TVM has a runtime object corresponding to "none" values that is distinct from an empty tuple, but in Relax, we may want to treat these as one case. Alternatively, we can represent empty tuples as `None` values.
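As a rough illustration of the `ExternFunc` rule quoted above, together with the `None`-to-unit wrapping proposed in this comment, consider the following Python sketch. The `PACKED_FUNCS` dictionary and `call_extern` are hypothetical stand-ins for the global `PackedFunc` registry and the evaluator; the wrapping step is a proposal under discussion, not settled behavior.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for the global PackedFunc registry.
PACKED_FUNCS: Dict[str, Callable[..., Any]] = {}


def call_extern(global_symbol: str, arg_thunks: List[Callable[[], Any]]) -> Any:
    """Evaluate a call to an ExternFunc identified by its global symbol."""
    # Evaluate the arguments left to right before the lookup, as the rule states.
    args = [thunk() for thunk in arg_thunks]
    func = PACKED_FUNCS.get(global_symbol)
    if func is None:
        # A missing registration is a run-time error.
        raise RuntimeError(f"no PackedFunc registered under {global_symbol!r}")
    result = func(*args)
    # Proposed in the comment above (not settled): treat None as unit.
    return () if result is None else result


trace: List[int] = []


def arg(value: int) -> Callable[[], int]:
    """Build an argument thunk that records when it is evaluated."""
    def thunk() -> int:
        trace.append(value)
        return value
    return thunk


PACKED_FUNCS["my_add"] = lambda a, b: a + b
PACKED_FUNCS["my_log"] = lambda *args: None

total = call_extern("my_add", [arg(1), arg(2)])
unit = call_extern("my_log", [arg(3)])
```

The `trace` list records the evaluation order of the arguments, and `unit` shows a `None` result being normalized to an empty tuple. The opposite convention (representing empty tuples as `None`) would only change the last line of `call_extern`.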
See this earlier discussion for thoughts on whether we should use |
Another issue that has come to my attention thanks to my colleagues @YuchenJin, @sunggg, and @psrivas2 is that while it may be reasonable to leave certain kinds of data structures as opaque "Relax objects" that we can define later because they don't directly interact with many other existing language features, there are some types that we might reasonably want to add into the language that would interact with a lot of existing features. Namely, there might be tensor variants that we would want to consider: dense vs sparse tensors, as well as ragged tensors. Adding these into the language design later could complicate how we deal with tensors, so it is likely to be worthwhile to consider these topics at this early stage in the language design process and possibly include these directly in the specification. That would be far preferable to later making changes that could potentially break backwards compatibility or require radically revising compiler passes. I encourage any community members interested in tensor variants like these to comment on those issues as well. |
I highly recommend taking a look at MLIR's current approach to sparse tensors: https://mlir.llvm.org/docs/Dialects/SparseTensorOps/ (which is inspired by the TACO project). I'm not sure how hard it would be to design something similar for Relax; MLIR has a lot more machinery around "attributes" and just general metaprogramming stuff like tablegen. At a high level though, I think the MLIR approach is to keep high level IR focused on "abstract" tensors with no concern for data layout/representation, and use attributes at some stage to hijack code generation for specialized backend representations. I think @yzh119 has done a ton of work on the TIR side for sparse stuff, so his thoughts here would definitely be valuable! |
Thanks for the links. Leaving density/sparsity for later would be an option in that case, though ragged tensors would require rules for handling their shapes (perhaps the per-dimension |
@MasterJH5574 and @VertexC are working on native ragged tensor support in Relax, and they are designing the shape for it. |
@slyubomirsky, thank you for your great efforts to make the language more formal!
Overall, it looks good to me and I left some questions below.
I noticed some discrepancies between this doc and our current implementation. Although they convey very similar messages, I would suggest matching them as closely as we can, for better readability and an easier ramp-up experience for newcomers.
relax_spec.md
Outdated
    # bindings (analogous to statements)
    Binding ::=
        VarBinding(var: Var|DataflowVar, value: Expr)
      | MatchShape(var: (Var|DataflowVar)?, pattern: [PrimExpr], value: Expr)
Minor comment: it might be more readable if the rule matched our function declarations, with optional arguments moved to the end.
e.g.,
    @tvm._ffi.register_object("relax.expr.MatchShape")
    class MatchShape(Binding):
        """Symbolic shape match, binds the variable of the lhs with the rhs."""

        value: Expr
        pattern: List[PrimExpr]
        var: Var

        def __init__(self, value: Expr, pattern: List[PrimExpr], var: Var, span: Span = None) -> None:
            self.__init_handle_by_constructor__(_ffi_api.MatchShape, value, pattern, var, span)
Maybe. I think it makes more logical sense for the var to come first in a binding
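For comparison, the grammar's field order (variable first, even though it is optional for `MatchShape`) could be mirrored as in the sketch below. These dataclasses are hypothetical and use plain strings in place of `Var`, `PrimExpr`, and `Expr` nodes; they are not the TVM classes quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VarBinding:
    var: str    # stands in for Var or DataflowVar
    value: str  # stands in for an Expr


@dataclass
class MatchShape:
    var: Optional[str]  # optional in the grammar, yet listed first
    pattern: List[str]  # stands in for [PrimExpr]
    value: str          # stands in for an Expr


# With var first, a binding without a variable must pass None explicitly.
with_var = MatchShape("y", ["n", "m"], "x")
without_var = MatchShape(None, ["n", "m"], "x")
```

The trade-off under discussion is visible here: keeping `var` first reads like a binding (`y = match_shape(x, ...)`), but callers that omit the variable must spell out `None`, whereas placing it last would allow a default value.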
77507a3
to
30329b0
Compare
In today's community meeting, @tqchen raised the idea that our shape analysis should be better thought of as "best-effort" (making static guarantees when it can but not making claims of completeness), where we treat user shape annotations as assumptions, so I'm wondering if that perspective might allow us to simplify the shape inference areas of the spec. One reason to go with this interpretation is (as this section of the spec discusses) that different passes and transformations might uncover more information that might allow for drawing conclusions in more cases. Provisionally, I am imagining that (as we discussed at the community meeting), the compiler would be required to insert dynamic checks of shapes at function and operator boundaries unless it can statically prove that shapes match. From that perspective, would it then make sense not to raise errors or warnings if the compiler cannot conclude whether two shapes are definitely not equal (if it concludes two shapes are definitely not equal, it should certainly raise an error)? I'm curious for more thoughts on this, since shape inference is certainly the trickiest area of the specification. |
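One way to picture the "best-effort" interpretation is a three-valued comparison, sketched below in Python. Everything here is an assumption for illustration: dimensions are integer constants or shape-variable names, and `Verdict`, `compare_dims`, and `check_boundary` are hypothetical names rather than anything in the compiler.

```python
from enum import Enum
from typing import List, Union

# Assumption: a dimension is an integer constant or a shape-variable name.
Dim = Union[int, str]


class Verdict(Enum):
    EQUAL = "equal"
    NOT_EQUAL = "not equal"
    UNKNOWN = "unknown"


def compare_dims(a: Dim, b: Dim) -> Verdict:
    """Best-effort comparison: only claim what can be shown statically."""
    if a == b:
        return Verdict.EQUAL
    if isinstance(a, int) and isinstance(b, int):
        return Verdict.NOT_EQUAL  # two different constants can never match
    return Verdict.UNKNOWN  # at least one side is symbolic


def check_boundary(actual: List[Dim], expected: List[Dim]) -> str:
    """Decide what to do at a function or operator boundary."""
    if len(actual) != len(expected):
        return "compile-time error"  # differing ranks definitely mismatch
    verdicts = [compare_dims(a, b) for a, b in zip(actual, expected)]
    if Verdict.NOT_EQUAL in verdicts:
        return "compile-time error"
    if Verdict.UNKNOWN in verdicts:
        return "insert dynamic check"  # no error or warning, just a check
    return "no check needed"


proven = check_boundary(["n", 4], ["n", 4])
refuted = check_boundary(["n", 3], ["n", 4])
undecided = check_boundary(["n", "m"], ["n", 4])
```

Under this reading, only the `refuted` case is reported to the user; the `undecided` case silently becomes a dynamic check, and later passes that learn more about `m` could upgrade it to `proven`.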
The shape inference issue may be best decided in #293, which could replace |
I've added Relax's normal form to the specification, since in the Dec. 13 community meeting, we noted that |
I've overhauled the spec to deal with the newly implemented structural information system, which was a substantial challenge to specify. It is much more powerful than |
bafb622
to
7ebc082
Compare
apache/tvm#14148 Reposted in the unity branch, withdrawing this PR now |
Rendered view.
This document is a draft language specification for Relax. The purpose of the language specification is to serve as a technical reference describing the language's behavior in sufficient detail as to clarify the intended behavior of the compiler, hence it is by design a very detailed document rather than an accessible tutorial for the language. Its focus is the "what" and "how" of Relax, but not always the "why," though we can add more sections giving design reasons if that is desired.
Note that «double caret marks» (guillemets) are used to denote parts of the specification discussing functionality that the present prototype doesn't yet support. This notation is somewhat cumbersome, but I wasn't sure how else to proceed because Github Markdown does not support changing the text color (which was how my initial document indicated these areas). The caret marks may look strange in the text, but they have the benefit of being easy to find by text search.
Out of scope, for now, in this document is the subject of parsing: We should eventually document how we intend to parse Python into Relax, but the parser itself is being greatly reworked. We can revisit the issue of documenting its behavior once that work has been completed. Additionally, this specification is intended (for now) to focus on the user-visible behavior of Relax rather than specifying lower-level interfaces or the precise mechanisms of Relax's implementation.
Aspects Requiring Review or Still to be Determined
Since this document is a draft, any part of it is up for review and open to revision, but certain parts of the document have proven particularly challenging to describe and could benefit the most from community discussions.
- The `StructInfo` system has been a great challenge to specify and there are many questions as to how it should work. One question is whether "strong shaping" might lead to too many error messages for something that could be checked dynamically. Additionally, many potential shape mismatches could be eliminated using constant propagation or other transformations: If we check shapes without applying transformations, we would force users to add lots of redundant shape checks. On the other hand, if we require these transformations first, that might make the code harder for users to reason about.
- [...] `PackedFunc`s can interact with Relax values. However, embedded targets do not support the TVM object system, so describing values in terms of the TVM object system directly may not work for all settings. Additionally, it should be determined how much detail about the representations should be included in the specification.
- [...] (`call_tir`) should be described in the specification. I am not certain the descriptions presently in the last section are entirely correct, so more review of them would be appreciated. Additionally, are there operators that should be there but are presently missing?

There are also some more minor TODOs throughout the document.
The Future of This Document
Eventually, we will want this document to be part of the Relax documentation, in which case it will be placed into a different location in the repo and probably be formatted as an `rst` file. Before that, we will officially RFC the spec into TVM to allow for maximum public discussion as to the design decisions underlying the specification. Hence, I am going to leave this document as a "WIP PR" until it is officially RFC'd.