This repository has been archived by the owner on May 22, 2023. It is now read-only.

[Draft][WIP] Relax language specification #273

Closed
wants to merge 30 commits

Conversation

@slyubomirsky (Collaborator) commented Oct 20, 2022

Rendered view.

This document is a draft language specification for Relax. The purpose of the language specification is to serve as a technical reference that describes the language's behavior in sufficient detail to clarify the intended behavior of the compiler; hence it is by design a very detailed document rather than an accessible tutorial for the language. Its focus is the "what" and "how" of Relax, but not always the "why," though we can add more sections giving design rationale if that is desired.

Note that «double angle quotation marks» (guillemets) are used to denote parts of the specification discussing functionality that the present prototype doesn't yet support. This notation is somewhat cumbersome, but I wasn't sure how else to proceed because GitHub Markdown does not support changing the text color (which was how my initial document indicated these areas). The marks may look strange in the text, but they have the benefit of being easy to find by text search.

Parsing is out of scope for this document, for now: we should eventually document how we intend to parse Python into Relax, but the parser itself is being greatly reworked, so we can revisit documenting its behavior once that work has been completed. Additionally, this specification is intended (for now) to focus on the user-visible behavior of Relax rather than specifying lower-level interfaces or the precise mechanisms of Relax's implementation.

Aspects Requiring Review or Still to be Determined

Since this document is a draft, any part of it is up for review and open to revision, but certain parts of the document have proven particularly challenging to describe and could benefit the most from community discussions.

  1. The StructInfo system has been a great challenge to specify and there are many questions as to how it should work. One question is whether "strong shaping" might lead to too many error messages for something that could be checked dynamically. Additionally, many potential shape mismatches could be eliminated using constant propagation or other transformations: If we check shapes without applying transformations, we would force users to add lots of redundant shape checks. On the other hand, if we require these transformations first, that might make the code harder for users to reason about.
  2. The run-time representations of values in the language will be important for determining how PackedFuncs can interact with Relax values. However, embedded targets do not support the TVM object system, so describing values in terms of the TVM object system directly may not work for all settings. Additionally, it should be determined how much detail about the representations should be included in the specification.
  3. Operators used for core language functions (like `call_tir`) should be described in the specification. I am not certain the descriptions presently in the last section are entirely correct, so further review of them would be appreciated. Additionally, are there operators that should be there but are presently missing?
  4. Finally, there is the question of process: How will we permit the specification to be revised? Does any change require a fresh RFC? Is there a threshold for changes that can be done as a direct PR? I have not considered this question directly, but the language specification is an important document for the community and changes to the language specification should not be taken lightly.

There are also some more minor TODOs throughout the document.

The Future of This Document

Eventually, we will want this document to be part of the Relax documentation, in which case it will be placed into a different location in the repo and probably be formatted as an rst file. Before that, we will officially RFC the spec into TVM to allow for maximum public discussion as to the design decisions underlying the specification. Hence, I am going to leave this document as a "WIP PR" until it is officially RFC'd.

@slyubomirsky marked this pull request as draft October 20, 2022 23:35
relax_spec.md (excerpt):

**Scope of Shape Variables**

Shape variables can be introduced in two places in a Relax program: In a function signature, where they may be included with the argument shapes and return shape annotations, or in `MatchShape` bindings. Shape variables used in the function signature are scoped to the entire function in which they appear. Shape variables used in `MatchShape` bindings are scoped only to the `SeqExpr` in which they appear.
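For illustration, here is a minimal sketch of the first case (signature-introduced shape variables), written in a later TVMScript spelling that may not match the prototype parser exactly; the `MatchShape` case is omitted since its surface syntax is still in flux:

```python
from tvm.script import relax as R

# "n" and "m" introduce shape variables in the signature; they are in scope
# for every argument annotation, the return annotation, and the function body.
@R.function
def identity(x: R.Tensor(("n", "m"), "float32")) -> R.Tensor(("n", "m"), "float32"):
    return x
```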
@slyubomirsky (Collaborator, Author):

Dealing with shape variables has been one of the trickiest parts of the specification, in my opinion. Scoping them the same way as Relax variables seemed like a simple enough rule, but it leads to some problematic situations, like having a SeqExpr return a value whose shape expression contains shape variables defined inside that SeqExpr. Per this scoping rule, the shape of the SeqExpr would have to be RuntimeDepShape, which loses information.

One alternative might be to scope shape variables to the entire function, but there are some potentially problematic cases, like initializing a var in one branch of an If and not the other. Allowing them to scope more broadly may be helpful for checking the shape_ of a SeqExpr, but creates complexity.

relax_spec.md (excerpt):

There can be some complexity involved in checking whether two shapes match during shape inference. A very simple, conservative method for determining equality is alpha-equivalence: if the two shapes have the same structure, then they are equivalent. However, this method is conservative and can overlook numerical properties of `PrimExpr`s. We leave it up to compiler implementations whether to use more advanced methods for proving equivalence, such as attempting to use algebraic rewrite rules. (As a consequence, portable code requires inserting dynamic checks wherever shapes need to be compared.)

Note that optimizations like function inlining or constant folding could allow for simplifying many shape annotations and expressions and make it possible to conclude at compile time that shapes in more cases are equivalent. In general, developing compiler infrastructure for partial evaluation and reasoning about common situations with shape annotations may eliminate many dynamic checks.
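To make the distinction concrete, here is a small sketch of the two approaches using TVM's arithmetic analyzer (assuming the standard `tvm.ir.structural_equal` and `tvm.arith.Analyzer` Python APIs; the shapes are made up for the example):

```python
import tvm
from tvm import tir

n = tir.Var("n", "int64")
# Two shapes that are numerically identical but spelled differently
# in their first dimension.
shape_a = [n + n, tir.IntImm("int64", 4)]
shape_b = [n * 2, tir.IntImm("int64", 4)]

# Purely structural (alpha-equivalence-style) comparison is conservative:
# it treats `n + n` and `n * 2` as different expressions.
print([bool(tvm.ir.structural_equal(a, b)) for a, b in zip(shape_a, shape_b)])

# An algebraic rewriter can prove the dimensions equal.
analyzer = tvm.arith.Analyzer()
print([analyzer.can_prove(tir.EQ(a, b)) for a, b in zip(shape_a, shape_b)])
```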
@slyubomirsky (Collaborator, Author):

Again, I emphasize that as in the case for getting the shape of a call, it may be advantageous to do these transformations before filling in shapes, though it may be harder for users to reason about when they need a dynamic check in that case. Do we want to consider the possibility of doing canonicalization before shape inference?

relax_spec.md (excerpt):
2. If `r` is true, evaluate the `true_branch` and return its result.
3. If `r` is false, evaluate the `false_branch` and return its result.
9. The node `Call(op, [arg1, arg2, ..., argn])` is evaluated as follows:
    1. If `op` is an `ExternFunc` node, then evaluate `arg1`, `arg2`, …, `argn` in that order and call the results `a1`, `a2`, …, `an`. Next, look up the `PackedFunc` registered under the global symbol name. If it exists (it is an error at run time if it does not), call the `PackedFunc` with the given arguments and return the result. Note that if a TIR `PrimFunc` in the `IRModule` has a global symbol attribute registered, it can be called as an `ExternFunc` using that global symbol as well. `PackedFunc`s may have arbitrary side effects and are responsible for whether the result is a newly allocated value or aliases another value.
@slyubomirsky (Collaborator, Author):

> return the result

We may want to make a guarantee about wrapping None results into unit (empty tuples). TVM has a runtime object corresponding to "none" values that is distinct from an empty tuple, but in Relax, we may want to treat these as one case. Alternatively, we can represent empty tuples as None values.
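As a sketch of the global-symbol lookup described in the excerpt, and of the none-versus-empty-tuple distinction, here is a small example against TVM's global function registry (the function names are made up for illustration):

```python
import tvm

# A PackedFunc registered under a global symbol; a Relax Call on an ExternFunc
# would resolve the callee by looking up this name at run time.
@tvm.register_func("demo.returns_nothing")
def returns_nothing():
    return None

f = tvm.get_global_func("demo.returns_nothing")
print(f())  # the caller sees None

# Looking up an unregistered symbol is the run-time error case; with
# allow_missing=True the lookup returns None instead of raising.
print(tvm.get_global_func("demo.no_such_symbol", allow_missing=True))

# An empty tuple converts to a zero-length TVM container object, which is
# distinct from the runtime "none" object discussed above.
empty = tvm.runtime.convert(())
print(type(empty), len(empty))
```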

@slyubomirsky (Collaborator, Author) commented Oct 21, 2022

See this earlier discussion for thoughts on whether we should use tir::Any to represent a wildcard dimension. For simplicity, the current proposal does not use a per-dimension wildcard and instead requires the entire shape to be relaxed to RuntimeDepShape, but this comes at the cost of partial shape information. It is worth considering whether such partial shape information is worth it for compilation. (As the draft details in the "Possible Extensions to the Shape Expression System" section, we can potentially add such a feature in a later revision without breaking old code.)
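For reference, this is roughly what a per-dimension wildcard looks like on the TIR side (as used for dynamic dimensions in Relay); whether Relax should adopt something similar is exactly the open question, and the names below are only illustrative:

```python
from tvm import tir

n = tir.Var("n", "int64")
# A partially known shape: the first dimension is the symbolic variable n,
# while the second is a wildcard matching any extent.
partial_shape = [n, tir.Any()]
print(partial_shape)
# Fully relaxing the shape instead (the current proposal) would also discard
# the information that the first dimension is n.
```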

@slyubomirsky (Collaborator, Author):

Another issue that has come to my attention thanks to my colleagues @YuchenJin, @sunggg, and @psrivas2 is that while it may be reasonable to leave certain kinds of data structures as opaque "Relax objects" that we can define later because they don't directly interact with many other existing language features, there are some types that we might reasonably want to add into the language that would interact with a lot of existing features.

Namely, there might be tensor variants that we would want to consider: dense vs sparse tensors, as well as ragged tensors. Adding these into the language design later could complicate how we deal with tensors, so it is likely to be worthwhile to consider these topics at this early stage in the language design process and possibly include these directly in the specification. That would be far preferable to later making changes that could potentially break backwards compatibility or require radically revising compiler passes. I encourage any community members interested in tensor variants like these to comment on those issues as well.

@altanh (Collaborator) commented Oct 24, 2022

> […] there are some types that we might reasonably want to add into the language that would interact with a lot of existing features. Namely, there might be tensor variants that we would want to consider: dense vs sparse tensors, as well as ragged tensors. […]

I highly recommend taking a look at MLIR's current approach to sparse tensors: https://mlir.llvm.org/docs/Dialects/SparseTensorOps/ (which is inspired by the TACO project). I'm not sure how hard it would be to design something similar for Relax; MLIR has a lot more machinery around "attributes" and just general metaprogramming stuff like tablegen. At a high level though, I think the MLIR approach is to keep high level IR focused on "abstract" tensors with no concern for data layout/representation, and use attributes at some stage to hijack code generation for specialized backend representations.

I think @yzh119 has done a ton of work on the TIR side for sparse stuff, so his thoughts here would definitely be valuable!

@slyubomirsky (Collaborator, Author) commented Oct 24, 2022

Thanks for the links. Leaving density/sparsity for later would be an option in that case, though ragged tensors would require rules for handling their shapes (perhaps the per-dimension Any would be a viable option there? This is actually not equivalent to using a dummy variable for `_`, since different tensors could each have a different value for that dimension).

@YuchenJin (Collaborator):

> […] ragged tensors would require rules for handling their shapes (perhaps the per-dimension Any would be a viable option there? […])

@MasterJH5574 and @VertexC are working on native ragged tensor support in Relax, and they are designing its shape representation.

@sunggg (Collaborator) left a comment:

@slyubomirsky, thank you for your great efforts to make the language more formal!
Overall, it looks good to me and I left some questions below.
I observed some discrepancies between this doc and our current implementation. Although they convey a very similar message, I would suggest matching them as closely as we can for better readability and an easier ramp-up experience for newcomers.

relax_spec.md (excerpt):
```
# bindings (analogous to statements)
Binding ::= VarBinding(var: Var|DataflowVar, value: Expr)
          | MatchShape(var: (Var|DataflowVar)?, pattern: [PrimExpr], value: Expr)
```
@sunggg (Collaborator):

Minor comment: it might be more readable if we matched the rule to our function declarations by moving optional arguments to the end, e.g.:

```python
@tvm._ffi.register_object("relax.expr.MatchShape")
class MatchShape(Binding):
    """Symbolic shape match, binds the variable of the lhs with the rhs."""

    value: Expr
    pattern: List[PrimExpr]
    var: Var

    def __init__(self, value: Expr, pattern: List[PrimExpr], var: Var, span: Span = None) -> None:
        self.__init_handle_by_constructor__(_ffi_api.MatchShape, value, pattern, var, span)
```

@slyubomirsky (Collaborator, Author):

Maybe. I think it makes more logical sense for the var to come first in a binding.

@slyubomirsky (Collaborator, Author) commented Nov 30, 2022

In today's community meeting, @tqchen raised the idea that our shape analysis should be better thought of as "best-effort" (making static guarantees when it can, but not claiming completeness), where we treat user shape annotations as assumptions, so I'm wondering if that perspective might allow us to simplify the shape inference areas of the spec. One reason to go with this interpretation is (as this section of the spec discusses) that different passes and transformations might uncover more information and allow conclusions to be drawn in more cases.

Provisionally, I am imagining that (as we discussed at the community meeting) the compiler would be required to insert dynamic checks of shapes at function and operator boundaries unless it can statically prove that the shapes match.

From that perspective, would it then make sense not to raise errors or warnings if the compiler cannot conclude whether two shapes are definitely not equal (if it concludes two shapes are definitely not equal, it should certainly raise an error)?

I'm curious for more thoughts on this, since shape inference is certainly the trickiest area of the specification.

@slyubomirsky (Collaborator, Author):

The shape inference issue may be best decided in #293, which could replace shape_ with something a little better-defined.

@slyubomirsky (Collaborator, Author):

I've added Relax's normal form to the specification, since in the Dec. 13 community meeting we noted that StructInfo (which will be specified once we know more about how it should work) will likely rely on it.

@slyubomirsky (Collaborator, Author):

I've overhauled the spec to deal with the newly implemented structural information system, which was a substantial challenge to specify. It is much more powerful than shape_ was previously, but requires quite a bit more machinery in the specification. I invite close review from those working on the structural information system: @tqchen @Hzfengsy @MasterJH5574

@slyubomirsky (Collaborator, Author):

Reposted in the unity branch as apache/tvm#14148; withdrawing this PR now.
