Text images are rendered differently under different versions of cairo and fonttools. I highly recommend creating the environment step by step using the following commands.
conda create -n {name} python=3.10
conda activate {name}
conda install -c conda-forge pycairo=1.24.0 cairo=1.16.0 fonttools=4.25.0 pygobject=3.42.1 manimpango=0.4.1 # make sure the versions match, otherwise the rendered text may differ
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers==4.34.1 datasets==2.14.6 tokenizers==0.14.1 diffusers["torch"]==0.17.1
pip install opencv-python==4.6.0.66 opencv-contrib-python==4.6.0.66
pip install ipywidgets wandb==0.15.12 pytesseract==0.3.10 editdistance==0.6.2 adamr scikit-image==0.22.0 scikit-learn==1.3.0
pip install nltk==3.8.1 apache-beam==2.51.0
pip install paddlepaddle==2.5.2 paddleocr==2.7.0.3
pip install flash-attn --no-build-isolation # install FlashAttention https://github.com/Dao-AILab/flash-attention
sudo apt install tesseract-ocr # if you cannot use sudo, please refer to https://tesseract-ocr.github.io/tessdoc/Installation.html
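After installation, it is worth confirming that the pinned rendering libraries were actually picked up, since a silent version mismatch changes the rendered text. A minimal check, assuming the environment above:

```python
# Verify the text-rendering stack matches the pinned versions.
import cairo      # pycairo bindings
import fontTools

print("pycairo:", cairo.version)               # expect 1.24.0
print("cairo:", cairo.cairo_version_string())  # expect 1.16.0
print("fonttools:", fontTools.version)         # expect 4.25.0
```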
- Set up the project directory
Here are the commands to set up the project folder.
cd {where you want to put the project folder}
git clone --recurse-submodules [email protected]:april-tools/pixar.git
cd pixar
pip install -e pixel
cd distro
./install.sh
source ~/.bashrc
cd ../bAbI-tasks
luarocks make babitasks-scm-1.rockspec
cd ..
mkdir storage
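If the Lua toolchain installed correctly, the torch interpreter and the bAbI task generator should now be on your PATH; a quick sanity check (the exact babi-tasks arguments below are my assumption, see the bAbI-tasks README):
th -e "print('torch ok')" # torch REPL installed by distro
babi-tasks 1 10 # generate 10 stories for bAbI task 1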
- Clone the PIXEL weights and fonts
git clone https://huggingface.co/Team-PIXEL/pixel-base {where you want to put the model}
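Cloning from the Hub requires git-lfs for the weight files; if that is inconvenient, the same snapshot can be fetched with huggingface_hub instead (an equivalent alternative; the target path below is just an example):

```python
# Download the PIXEL weights without git/git-lfs.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Team-PIXEL/pixel-base",
    local_dir="storage/pixel-base",  # example path; put it wherever you like
)
```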
- Run the data preparation scripts
The scripts only collate the text files; they do not render them into images. By default, images are rendered on the fly during training (see the sketch after this step).
python prepare_dataset.py
When you run the scripts, remember to adjust the hyperparameters to suit your environment or to change the model settings. Detailed hyperparameter instructions are coming soon.
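For intuition, on-the-fly text rendering with pycairo looks roughly like the sketch below. This is a minimal illustration of the idea, not the project's actual rendering pipeline; the font, size, and canvas dimensions are placeholder choices:

```python
# Render a line of text into a small RGB image (illustrative only).
import cairo

surface = cairo.ImageSurface(cairo.FORMAT_RGB24, 368, 16)
ctx = cairo.Context(surface)
ctx.set_source_rgb(1, 1, 1)  # white background
ctx.paint()
ctx.set_source_rgb(0, 0, 0)  # black text
ctx.select_font_face("sans", cairo.FONT_SLANT_NORMAL, cairo.FONT_WEIGHT_NORMAL)
ctx.set_font_size(12)
ctx.move_to(2, 12)
ctx.show_text("Text rendered on the fly during training.")
surface.write_to_png("sample.png")
```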
- Initialize the backbone model
python scripts/init_models.py
- Stage one pretraining
bash scripts/1M_pretrain/dllama_l2_b0.sh
- Stage two pretraining
bash scripts/gan_pretrain/full_GAN1.sh
Here is the model checkpoint used to test the LAMBADA and bAbI performance in the paper. It was trained for 1M stage-one steps and 200 stage-two steps.
https://drive.google.com/file/d/1uFr78VttIfOOqYxyCcsoc7ewsQvU_BVf/view?usp=sharing
Here are some samples generated from it:
https://drive.google.com/drive/folders/1vHmF0UHGKAzhtOUtP55_m6VrfVQIHc4K?usp=drive_link
- GLUE
bash scripts/llama_glue/dllama_2b1M.sh
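The GLUE data itself comes through the datasets library; for reference, a single task loads like this (standard datasets usage, shown only for inspection, as the script above handles this internally):

```python
from datasets import load_dataset

# Load one GLUE task, e.g. SST-2, to inspect the raw text fields.
sst2 = load_dataset("glue", "sst2")
print(sst2["train"][0])
```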
- bAbI
You can use babi.ipynb to evaluate the performance on the bAbI benchmark.
- LAMBADA
You can use lambada.ipynb to evaluate the performance on the LAMBADA benchmark.
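Because PIXAR reads and writes text as images, generated output has to be OCR-ed before it can be compared against a reference string, which is why pytesseract and editdistance are installed above. A minimal sketch of that idea (the helper below is hypothetical, not the notebooks' actual code):

```python
# Score a generated text image against a reference string (illustrative).
import editdistance
import pytesseract
from PIL import Image

def ocr_score(image_path: str, reference: str) -> float:
    """Return a normalized edit-distance similarity in [0, 1]."""
    predicted = pytesseract.image_to_string(Image.open(image_path)).strip()
    dist = editdistance.eval(predicted, reference)
    return 1.0 - dist / max(len(predicted), len(reference), 1)

print(ocr_score("sample.png", "Text rendered on the fly during training."))
```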