Xmixers: A collection of SOTA efficient token/channel mixers

Introduction

This repository aims to implement SOTA efficient token and channel mixers. Contributions of any technique that departs from the vanilla Transformer are welcome. If you are interested in this repository, please join our Discord.

Roadmap

Token Mixers
- Linear Attention
- Linear RNN
- Long Convolution
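
To illustrate what an efficient token mixer in the list above looks like, here is a minimal pure-Python sketch of causal linear attention. It is not the code in this repository: the function names, the `elu(x) + 1` feature map, and the list-of-lists tensor layout are all assumptions chosen for readability. The sketch keeps a running key-value state so each token costs a fixed amount of work, and includes a quadratic reference that computes the same result by revisiting every previous token.

```python
import math


def feature_map(x):
    # Assumed feature map: elu(x) + 1, which keeps every feature positive.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def causal_linear_attention(queries, keys, values):
    """Recurrent form: work per token is independent of sequence length."""
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    norm = [0.0] * d_k                         # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_q, phi_k = feature_map(q), feature_map(k)
        for i in range(d_k):
            norm[i] += phi_k[i]
            for j in range(d_v):
                state[i][j] += phi_k[i] * v[j]
        denom = sum(phi_q[i] * norm[i] for i in range(d_k))
        outputs.append(
            [sum(phi_q[i] * state[i][j] for i in range(d_k)) / denom
             for j in range(d_v)]
        )
    return outputs


def quadratic_reference(queries, keys, values):
    """Same result computed by attending to every previous token explicitly."""
    outputs = []
    for t, q in enumerate(queries):
        phi_q = feature_map(q)
        weights = [
            sum(a * b for a, b in zip(phi_q, feature_map(keys[s])))
            for s in range(t + 1)
        ]
        total = sum(weights)
        outputs.append(
            [sum(w * values[s][j] for s, w in enumerate(weights)) / total
             for j in range(len(values[0]))]
        )
    return outputs


# Hypothetical toy inputs: 3 tokens, key dimension 2, value dimension 2.
q = [[0.1, -0.2], [0.5, 0.3], [-0.4, 0.9]]
k = [[0.3, 0.1], [-0.6, 0.2], [0.7, -0.1]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
fast = causal_linear_attention(q, k, v)
slow = quadratic_reference(q, k, v)
```

Both functions should agree up to floating-point error. The recurrent form is the one that scales linearly with sequence length, since it never revisits earlier tokens.
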
Channel Mixers

Pretrained weights

GPT
- Doreamonzzz/xmixers_gpt_120m_50b
LLaMA
- Doreamonzzz/xmixers_llama_120m_50b

ToDo

- Add special init.

Model

- LLaMA.
- GPT.

Basic

- Add type annotations for classes and functions.

Ops

- long_conv_1d_op.

Token Mixers

- Gtu.

Note

[Feature Add]
[Bug Fix]
[Benchmark Add]
[Document Add]
[README Add]
