This repository has been archived by the owner on Feb 28, 2024. It is now read-only.


⚡ Lit-GPT-Chinese


Hackable implementation of state-of-the-art open-source large language models for Chinese, released under the Apache 2.0 license.

Supports the following popular model checkpoints (along with all the English models supported by the official Lit-GPT):

| Model and usage | Model size | Reference |
|---|---|---|
| Yi-01 | 6B-Chat, 34B-Chat | Yi |
| Baichuan 2 | 7B-Chat/Base, 13B-Chat/Base | Baichuan 2 |
| ChatGLM3 | 6B, 6B-Base, 6B-32k | ChatGLM3 |
| ChatGLM2 | 6B | ChatGLM2-6B |

This implementation builds on Lit-LLaMA and nanoGPT, and it's powered by Lightning Fabric.

 

Lit-GPT design principles

This repository follows the main principle of openness through clarity.

Lit-GPT is:

  • Simple: Single-file implementation without boilerplate.
  • Correct: Numerically equivalent to the original model.
  • Optimized: Runs fast on consumer hardware or at scale.
  • Open-source: No strings attached.

Avoiding code duplication is not a goal. Readability and hackability are.

 

Setup

Clone the repo:

git clone https://github.com/metame-none/lit-gpt-chinese
cd lit-gpt-chinese

Install the minimal dependencies:

pip install -r requirements.txt

Install with all dependencies (including quantization, sentencepiece, tokenizers for Llama models, etc.):

pip install -r requirements-all.txt

(Optional) Use Flash Attention 2

Flash Attention 2 is used automatically if PyTorch 2.2 (or higher) is installed. At the time of writing, that required a PyTorch nightly build, which you can install by running:

pip uninstall -y torch torchvision torchaudio torchtext
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
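To confirm that the installed build meets the 2.2 requirement, a small version check can help. The following is a minimal sketch, not part of the repository: the helper name `torch_version_at_least` is hypothetical, and it only compares the leading "major.minor" part of the version string.

```python
import re


def torch_version_at_least(version: str, minimum: tuple = (2, 2)) -> bool:
    """Return True if a PyTorch version string is at least `minimum` (major, minor)."""
    # Nightly and CUDA builds look like "2.3.0.dev20231201+cu121"; only the
    # leading "major.minor" part matters for this check.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        ok = torch_version_at_least(torch.__version__)
        print(f"PyTorch {torch.__version__}: meets the 2.2 requirement = {ok}")
```

The check says nothing about whether the GPU itself supports Flash Attention 2; it only mirrors the version condition stated above.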

You are all set! 🎉

 

Use the model

Take ChatGLM3-6B as an example:

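  1. Download the repo and checkpoints (manually or using git lfs). The command below sets GIT_LFS_SKIP_SMUDGE=1, so it clones the repo without the large checkpoint files; $path is a local directory of your choice: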
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm3-6b $path
  2. Convert the checkpoint to the Lit-GPT format:
ln -snf $path checkpoints/chatglm/chatglm3-6b-hf

python scripts/convert_hf_checkpoint.py --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf
  3. Iteratively generate responses:
python chat/base.py --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf  --precision "16-true"
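The link, convert, and chat steps above can be collected in a small helper when the same workflow is repeated for several checkpoints. This is a minimal sketch that only assembles the commands shown above as argument lists; the function name `chatglm3_commands` and the example path `/data/chatglm3-6b` are hypothetical.

```python
import shlex


def chatglm3_commands(hf_path: str,
                      checkpoint_dir: str = "./checkpoints/chatglm/chatglm3-6b-hf",
                      precision: str = "16-true") -> list:
    """Build the link, convert, and chat commands as argument lists."""
    return [
        # Link the downloaded Hugging Face repo into the checkpoints folder.
        ["ln", "-snf", hf_path, checkpoint_dir],
        # Convert the checkpoint to the Lit-GPT format.
        ["python", "scripts/convert_hf_checkpoint.py", "--checkpoint_dir", checkpoint_dir],
        # Start the interactive chat loop.
        ["python", "chat/base.py", "--checkpoint_dir", checkpoint_dir, "--precision", precision],
    ]


if __name__ == "__main__":
    # "/data/chatglm3-6b" stands in for the $path used in step 1.
    for cmd in chatglm3_commands("/data/chatglm3-6b"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch side-effect free; each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run it.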
Optional: check that the Lit-GPT model is numerically equivalent to the original model.
  • Make the following change to the original model (modeling_chatglm.py):
-@torch.jit.script
+# @torch.jit.script
 def apply_rotary_pos_emb(x: torch.Tensor, rope_cache: torch.Tensor) -> torch.Tensor:
  • Check the model difference:
CUDA_VISIBLE_DEVICES=0,1 python tests/test_chatglm3.py model_diff ./checkpoints/chatglm/chatglm3-6b-hf
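To illustrate what "numerically equivalent" can mean in practice, the sketch below compares two output sequences by their largest element-wise difference. It is an assumption-laden illustration, not the logic of tests/test_chatglm3.py: the helper names, the plain-list inputs, and the tolerance `atol=1e-5` are all hypothetical.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def numerically_equivalent(a, b, atol=1e-5):
    """Treat two outputs as equivalent if no element differs by more than `atol`."""
    return max_abs_diff(a, b) <= atol


if __name__ == "__main__":
    reference = [0.1250, -1.5000, 3.2500]     # stand-in for the original model's outputs
    candidate = [0.1250, -1.5000, 3.2500001]  # stand-in for the Lit-GPT outputs
    print(max_abs_diff(reference, candidate))
    print(numerically_equivalent(reference, candidate))
```

With real models the inputs would be flattened logits from both implementations on the same prompt, and the tolerance would need to be looser in half precision than in full precision.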

 

Finetune the model

We provide simple training scripts (finetune/adapter.py, finetune/adapter_v2.py, and finetune/lora.py) that instruction-tune a pretrained model on 10k random samples from the multiturn_chat_0.8M dataset.
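Drawing a fixed-size random subset from a larger dataset can be sketched as follows. This is only an illustration of reproducible subsampling; it is not taken from scripts/prepare_belle_chatglm3.py, and the function name `sample_records`, the seed, and the record layout are assumptions.

```python
import random


def sample_records(records, k=10_000, seed=42):
    """Return k records chosen uniformly at random without replacement."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    if k >= len(records):
        # Fewer records than requested: keep everything.
        return list(records)
    return rng.sample(records, k)


if __name__ == "__main__":
    # Hypothetical record layout; the real dataset fields may differ.
    dataset = [{"id": i, "text": f"dialogue {i}"} for i in range(100)]
    subset = sample_records(dataset, k=5)
    print([record["id"] for record in subset])
```

Using a dedicated `random.Random` instance keeps the sampling independent of any other code that touches the global random state.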

  1. Download the data and generate an instruction tuning dataset:
python scripts/prepare_belle_chatglm3.py
  2. Run the finetuning script:

For example, you can use one of the following:

Adapter (Zhang et al. 2023):

python finetune/adapter.py --data_dir ./data/belle_chat_ramdon_10k_chatglm3 --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --out_dir out/adapter/belle_chatglm3_6b --precision "bf16-true"

# test the finetuned model
python chat/adapter.py --adapter_path ./out/adapter/belle_chatglm3_6b/lit_model_adapter_finetuned.pth --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --precision "16-true"

or Adapter v2 (Gao et al. 2023):

python finetune/adapter_v2.py --data_dir ./data/belle_chat_ramdon_10k_chatglm3 --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --out_dir out/adapter_v2/belle_chatglm3_6b --precision "bf16-true"

# test the finetuned model
python chat/adapter_v2.py --adapter_path ./out/adapter_v2/belle_chatglm3_6b/lit_model_adapter_finetuned.pth --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --precision "16-true"

or LoRA (Hu et al. 2021):

python finetune/lora.py --data_dir ./data/belle_chat_ramdon_10k_chatglm3 --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --out_dir out/lora/belle_chatglm3_6b --precision "16-true"

# test the finetuned model
python chat/lora.py --lora_path ./out/lora/belle_chatglm3_6b/lit_model_lora_finetuned.pth --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf  --precision "16-true"

(See tutorials/finetune_adapter for details on the differences between the two adapter methods.)
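The three finetuning variants above follow the same pattern and differ only in script name, output folder, weights file name, weights flag, and finetuning precision. The sketch below captures that pattern; the function name `finetune_and_chat_commands` and the `METHODS` table are hypothetical, while every path and flag value is copied from the commands above.

```python
import shlex

# Per-method settings copied from the commands above:
# (weights file name, chat flag for the weights, finetuning precision).
METHODS = {
    "adapter": ("lit_model_adapter_finetuned.pth", "--adapter_path", "bf16-true"),
    "adapter_v2": ("lit_model_adapter_finetuned.pth", "--adapter_path", "bf16-true"),
    "lora": ("lit_model_lora_finetuned.pth", "--lora_path", "16-true"),
}

DATA_DIR = "./data/belle_chat_ramdon_10k_chatglm3"
CHECKPOINT_DIR = "./checkpoints/chatglm/chatglm3-6b-hf"


def finetune_and_chat_commands(method: str, run_name: str = "belle_chatglm3_6b"):
    """Return (finetune_command, chat_command) as argument lists for one method."""
    weights, flag, precision = METHODS[method]
    out_dir = f"out/{method}/{run_name}"
    finetune = [
        "python", f"finetune/{method}.py",
        "--data_dir", DATA_DIR,
        "--checkpoint_dir", CHECKPOINT_DIR,
        "--out_dir", out_dir,
        "--precision", precision,
    ]
    chat = [
        "python", f"chat/{method}.py",
        flag, f"./{out_dir}/{weights}",
        "--checkpoint_dir", CHECKPOINT_DIR,
        "--precision", "16-true",
    ]
    return finetune, chat


if __name__ == "__main__":
    for name in METHODS:
        for cmd in finetune_and_chat_commands(name):
            print(shlex.join(cmd))
```

Note that the data directory name keeps the spelling used in the commands above, since changing it would no longer match the folder written by the preparation script.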

Reference

For more details, please refer to the official Lit-GPT repository.

 

License

Lit-GPT-Chinese is released under the Apache 2.0 license.