Releases: mirage-project/mirage

v0.2.2

30 Oct 23:03
104b8fb

What's Changed

  • [Search] Support input/output strides specification in #108
  • [Docs] Update Documentation to build C++ library in #109
  • [Transpiler] Parallelize transpilation to speed up superoptimization by 10x by @GuangyaoZhang in #119
  • [Transpiler] Add a mechanism to skip invalid transpiled kernels in #117
  • [Visualizer] Add functionality to visualize mugraphs by @NorthmanPKU in #113
  • [Transpiler] Include shared memory usage in the cost when determining the layouts for stensors in #130

Full Changelog: v0.2.1...v0.2.2

v0.2.1

14 Oct 13:43
68075ea

What's Changed

  • [Docs] Add doc files by @jiazhihao in #90
  • Fix SiLU by @xinhaoc in #100
  • [Layout] Add initial support for user-defined input/output strides for kernel graphs by @jiazhihao in #98
  • Set default strides for outputs by @wmdi in #105

Full Changelog: v0.2.0...v0.2.1

v0.2.0

01 Oct 18:38
8edf81c

Major release with changes to the Python interface, search implementation, transpiler, and documentation.

What's Changed

  • [Triton CodeGen] Fix an issue when generating Triton programs from mugraphs
  • [LoRA demo] Add the checkpoint file for the lora demo
  • [DeviceMemoryManager] Use offsets instead of pointers to locate tensors and fingerprints in device memory
  • [Graph Generator] Parallelize the generation algorithm
  • Improve parallel search performance
  • [Accumulator] Decouple the accumulator from the output saver in threadblock graphs
  • Update the setup workflow for packaging
  • Add more element_unary & element_binary operators at the kernel and threadblock levels
  • [CUDA Transpiler] Support JIT transpilation and compilation
  • [Search] Range-based pruning
  • Fix some existing issues by @xinhaoc in #63
  • [Transpiler] Support threadblock matmul using cute when the input/output stensors have more than 2 dimensions
  • Include header files for JIT compilation. MIRAGE_ROOT is no longer required.
  • [Python] Update the Python interface to support search
  • [Search] Adjust the expansion phase of search
  • [Search] Improve the display of search statistics
  • Set default max_num_threadblock_graphs to 1

Full Changelog: https://github.com/mirage-project/mirage/commits/v0.2.0