This repository aims to implement state-of-the-art (SOTA) efficient token and channel mixers. Contributions covering any technique that goes beyond the vanilla Transformer are welcome. If you are interested in this repository, please join our Discord.
- Token Mixers
  - Linear Attention
  - Linear RNN
  - Long Convolution
- Channel Mixers
- GPT
  - Doreamonzzz/xmixers_gpt_120m_50b
- LLaMA
  - Doreamonzzz/xmixers_llama_120m_50b
- Add special init.
  - LLaMA.
  - GPT.
- Add data type for class and function.
  - long_conv_1d_op.
  - Gtu.
[Feature Add]
[Bug Fix]
[Benchmark Add]
[Document Add]
[README Add]