This repository has been archived by the owner on May 22, 2023. It is now read-only.

Relax Development Guidelines

Steven S. Lyubomirsky edited this page Sep 29, 2022 · 11 revisions

This doc provides guidelines for Relax developers and reviewers, along with a quick checklist for reviewing Relax-related PRs.

Rationales

Relax development follows these rationales:

R0: Solid foundation of common IR and infra

Our goal is to build a solid foundation that enables more transformations following the IRModule=>IRModule pattern. To do that, we need a solid and stable IR foundation and a minimal but robust build pipeline that maps IRModule=>runtime.Module. This means we need to be extra careful about documenting the common foundations, building common infrastructure for them, and covering them with tests. We should also continuously improve the foundation as development continues by revisiting the implementations, adding regression tests, and holding several rounds of review on key parts.

R1: Passes: tighten scope for focus, but always ensure correctness and clear boundary

To make sure we stay focused and land the key items, we encourage tightening the scope during development (e.g. not aiming for a broad set of high-level operators). However, it is important that everything we check in is correct and can handle the general case. As an example, it is OK to check in a pass that only optimizes a dataflow block when the shape is a known constant. However, the same pass must not crash when given a function that contains symbolic shapes. Instead, it should simply be a no-op (no transformation) in such cases. In certain cases where restrictions apply (e.g. EmitTE), we define a clear scope for what is supported (e.g. symbolic shapes are required, or only specific known kinds of arguments are accepted) and raise a clear error message with hints for all other cases.
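A minimal sketch of the "tight scope, but never crash" rule, written against a toy IR rather than the Relax API. The names `Flatten`, `FlattenTo`, and `fold_static_flatten` are hypothetical and exist only for illustration; the point is that the pass rewrites the static-shape case and returns its input unchanged for everything else.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical stand-ins for IR concepts; this is not the Relax API.
Dim = Union[int, str]  # int = static extent, str = symbolic variable name


@dataclass(frozen=True)
class Flatten:
    """Toy node: flatten a tensor of shape `in_shape` into one dimension."""
    in_shape: Tuple[Dim, ...]


@dataclass(frozen=True)
class FlattenTo:
    """Toy node: flatten to a fully known 1-D extent."""
    extent: int


def is_static(shape: Tuple[Dim, ...]) -> bool:
    """True when every dimension is a known integer constant."""
    return all(isinstance(d, int) for d in shape)


def fold_static_flatten(node):
    """Rewrite Flatten into FlattenTo only when the whole shape is static."""
    if not isinstance(node, Flatten):
        return node  # outside the scope of this pass: no-op
    if not is_static(node.in_shape):
        return node  # symbolic shape: no-op instead of crashing
    extent = 1
    for d in node.in_shape:
        extent *= d
    return FlattenTo(extent)


static_result = fold_static_flatten(Flatten((2, 3, 4)))
symbolic_result = fold_static_flatten(Flatten(("n", 4)))
```

Here the static input is folded to `FlattenTo(24)`, while the input with the symbolic dimension `"n"` comes back untouched.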

R2: Fallback to safety net when necessary

Fixed-rank, dynamic symbolic shape relations cover most use cases, but inevitably we also need to cover general cases that do not fall into this category. It is therefore important to establish a "safety net". In Relax, the safety net is the general case in which no shape computation is carried out; supporting it gives us something we can always fall back to. Specifically, our safety net bypasses destination-passing style and directly invokes a packed function that takes in a list of inputs and returns output objects. When advanced symbolic deduction is impossible, we should always leverage and support the safety net instead of just erroring out. This is closely related to R1.
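The fallback idea can be sketched with plain Python lists standing in for tensors. Everything below (`call_with_fallback`, `add_dps`, `unique_packed`, and the deduction helpers) is hypothetical and is not how Relax implements it; it only illustrates choosing destination-passing style when the output size can be deduced, and otherwise calling a function that allocates and returns its own output.

```python
from typing import Callable, List, Optional

Tensor = List[int]  # toy stand-in for a tensor


def add_dps(a: Tensor, b: Tensor, out: Tensor) -> None:
    """Destination-passing style: the caller allocates `out`."""
    for i in range(len(out)):
        out[i] = a[i] + b[i]


def add_packed(args: List[Tensor]) -> Tensor:
    """Safety-net form: takes the inputs, returns a freshly built output."""
    a, b = args
    return [x + y for x, y in zip(a, b)]


def unique_packed(args: List[Tensor]) -> Tensor:
    """Output length depends on the data, so it cannot be deduced up front."""
    (data,) = args
    return sorted(set(data))


def same_len_as_first(args: List[Tensor]) -> Optional[int]:
    return len(args[0])


def no_deduction(args: List[Tensor]) -> Optional[int]:
    return None  # models "shape deduction is impossible here"


def call_with_fallback(
    args: List[Tensor],
    deduce_len: Callable[[List[Tensor]], Optional[int]],
    dps_fn: Optional[Callable[..., None]],
    packed_fn: Callable[[List[Tensor]], Tensor],
) -> Tensor:
    """Use DPS when the output size is known, else fall back to the safety net."""
    n = deduce_len(args)
    if n is not None and dps_fn is not None:
        out = [0] * n  # caller-side allocation
        dps_fn(*args, out)
        return out
    return packed_fn(args)  # safety net: no shape computation needed


summed = call_with_fallback([[1, 2], [3, 4]], same_len_as_first, add_dps, add_packed)
deduped = call_with_fallback([[3, 1, 3, 2]], no_deduction, None, unique_packed)
```

Both paths produce a result; the second call never errors out even though no output size is available.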

Robustness and Coverage

During development of features, there are two important related concepts:

  • Robustness: the code correctly handles the cases it supports and either falls back gracefully (no-op) or raises a clear error message for the cases it cannot handle.
  • Coverage: the code can handle many different cases in the IR (e.g. it supports the symbolic dynamic shape case).

We should always aim to develop robust code, but in practice it is totally OK to start with a limited scope (e.g. only optimize static shapes and fall back to a no-op for other cases) and then gradually improve coverage. It is also good practice to plan ahead for coverage when possible while keeping execution focused.

Checklist

C0. Example docs: For APIs that are user-facing, always consider adding a code block example that shows input and output.
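As one possible shape for such documentation, the sketch below puts the input and output directly in the docstring as a doctest. The helper `format_shape` is hypothetical and exists only to show the layout.

```python
def format_shape(shape):
    """Render a shape as text, keeping symbolic dimensions by name.

    Example
    -------
    >>> format_shape((1, "n", 224))
    '(1, n, 224)'
    >>> format_shape(())
    '()'
    """
    return "(" + ", ".join(str(d) for d in shape) + ")"


if __name__ == "__main__":
    import doctest

    # Runs the examples in the docstring so the docs cannot drift from the code.
    doctest.testmod()
```

Keeping the example executable means a stale example shows up as a test failure rather than as misleading documentation.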

C1. Ensure test coverage: write test cases that cover the aspects below (not all may apply, but it is worth going through the list). Per R1, we do not expect a pass to make meaningful optimizations in all cases, but in cases it does not handle it should at least be a no-op and must not crash:

  • Normal static shape usage
  • Mix of symbolic and static shape
  • Opaque shape expression
  • Data-dependent shape case
  • Regression test: every time we fail on an uncovered case, we add it to our unit tests

C2. Write unit tests: While it is tempting to always compile and run code, it is more efficient and effective to write unit tests for a pass in isolation. For passes and analyses written in C++, this requires exposing their APIs in Python, which is generally good practice anyway. To implement unit tests, use TVMScript (or other means) to build the IR before your pass and the expected IR after your pass. You can then use a structural equality assertion to check that the IR after your pass is structurally equivalent to the expected IR. When that is impossible, build the IR before the pass, run the transformation, and assert key invariants. To standardize the nomenclature, please use the following names when writing unit tests:

  • before: the IRModule before the pass
  • after: the IRModule after applying the pass
  • expected: the expected output IRModule of the pass
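The naming convention can be sketched as follows, using a toy expression IR built from frozen dataclasses, whose generated `==` compares structure field by field. The node types and the `simplify_add_zero` pass are hypothetical. In a real Relax test the modules would typically be written in TVMScript and compared with TVM's structural equality assertion instead of `==`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Var, Const, Add]


def simplify_add_zero(expr: Expr) -> Expr:
    """Toy pass: rewrite `x + 0` and `0 + x` to `x`, bottom-up."""
    if isinstance(expr, Add):
        lhs = simplify_add_zero(expr.lhs)
        rhs = simplify_add_zero(expr.rhs)
        if rhs == Const(0):
            return lhs
        if lhs == Const(0):
            return rhs
        return Add(lhs, rhs)
    return expr


def test_simplify_add_zero():
    before = Add(Add(Var("x"), Const(0)), Var("y"))
    expected = Add(Var("x"), Var("y"))
    after = simplify_add_zero(before)
    assert after == expected  # structural comparison


def test_simplify_add_zero_is_noop_without_zero():
    before = Add(Var("x"), Var("y"))
    expected = before  # nothing to simplify, so the pass must not change it
    after = simplify_add_zero(before)
    assert after == expected


test_simplify_add_zero()
test_simplify_add_zero_is_noop_without_zero()
```

The second test covers the no-op path, which is as much a part of the pass's contract as the rewrite itself.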

C3. Format the code: run ./tests/lint/git-clang-format.sh and ./tests/lint/git-black.sh

C4. Clear error message and TODO: It is OK to check in passes that implement partial support and improve them later. In this case, validate the pre-conditions explicitly, raise clear error messages when they are not met, and leave a TODO so we can come back and improve later. When raising an error, add an ErrorType prefix (e.g. ValueError) to your error message to indicate the specific type of error (for more details, please refer to the Error Handling Guide).
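A toy model of this convention is sketched below. The helpers `make_error_message`, `raise_from_message`, and `check_static_shape` are hypothetical and are not TVM's error-handling code; they only show a pre-condition check that leaves a TODO and produces a message carrying an ErrorType prefix and a hint.

```python
import builtins


def make_error_message(error_type: str, detail: str, hint: str) -> str:
    """Build a message of the form '<ErrorType>: <detail> Hint: <hint>'."""
    return f"{error_type}: {detail} Hint: {hint}"


def raise_from_message(message: str) -> None:
    """Raise the built-in exception named by the message prefix (toy model)."""
    prefix, _, _ = message.partition(":")
    exc_type = getattr(builtins, prefix, None)
    if not (isinstance(exc_type, type) and issubclass(exc_type, Exception)):
        exc_type = RuntimeError  # unknown prefix: use a generic error
    raise exc_type(message)


def check_static_shape(shape, pass_name: str) -> None:
    """Validate the pre-condition of a pass that only supports static shapes."""
    # TODO: lift this restriction once the pass handles symbolic dimensions.
    symbolic = [d for d in shape if not isinstance(d, int)]
    if symbolic:
        raise_from_message(
            make_error_message(
                "ValueError",
                f"{pass_name} requires a fully static shape, got {tuple(shape)} "
                f"with symbolic dimension(s) {symbolic}.",
                "specialize the symbolic dimensions before running this pass.",
            )
        )


check_static_shape((1, 224, 224), "ToyPass")  # passes silently

try:
    check_static_shape((1, "n"), "ToyPass")
except ValueError as err:
    caught_message = str(err)
```

The caller sees both which pre-condition failed and what to do about it, and the TODO marks where partial support can later be widened.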

C5. Summarize possible improvements on the foundation: While it is hard to be perfect at the very beginning, it is important to summarize, articulate, and discuss solutions when we find places in the foundation (R0) that need improvement or fixes.

C6. Global packed function registration: set_body_method<>, set_body_typed, and set_body can all be used for global function registration. Use whichever is shortest and most succinct. Typically, use set_body only when the function accepts variable-length arguments, because set_body is hard to reason about and document and requires extra code in both Python and C++.

C7. Find a common pattern (lesson) and add it to this list: If we find other common patterns or lessons that we can learn from, add them to this list.

Development Phases

This doc outlines recommended development phases. Note that some of the phases can happen concurrently as long as they are clearly communicated. Given that we are still exploring, we also want to be as flexible as possible while driving toward common anchor points. We rely on the issue tracker (discussion tracking), wiki pages (design docs), and Discord (community discussions).

Discuss: During this phase, a proposal prefixed with [DISCUSS] can be submitted as an issue on the issues page. The author can send the issue link to the Discord channel or add it to the online development meeting agenda to engage the community in discussion.

Design doc: After broad discussion and consensus, the issue can be closed, and developers are encouraged to write a design doc on the wiki page. Everyone who contributed to the design is added as a co-author (in alphabetical order).

Tracking: Open issues with milestones for focused task items. Work items can be more fine-grained to encourage focused deliverables.

Develop: After or concurrently with the previous phases, submit PRs to implement the proposed items.