# Finetuning and Inference

To run or finetune the models you need an NVIDIA GPU with more than 6 GB of VRAM and a working CUDA installation.
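The VRAM requirement can be checked with `nvidia-smi` before installing anything. The sketch below is not part of the original setup; it is a minimal helper that parses the output of `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` (a standard query, one MiB value per GPU line). The function names and the 6 GiB threshold constant are invented here for illustration.

```python
import subprocess

MIN_VRAM_MIB = 6 * 1024  # the >6 GB requirement stated above, in MiB

def parse_vram_mib(query_output: str) -> list[int]:
    """Parse per-GPU total memory (MiB) from the nvidia-smi query output,
    which is one integer per line with --format=csv,noheader,nounits."""
    return [int(line.strip()) for line in query_output.splitlines() if line.strip()]

def has_enough_vram() -> bool:
    """Return True if any local GPU reports more than 6 GiB of VRAM."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        text=True,
    )
    return any(mib > MIN_VRAM_MIB for mib in parse_vram_mib(out))

# Example: query output on a hypothetical machine with one 8 GB card
sample = "8192\n"
print(parse_vram_mib(sample))  # [8192]
```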

xTuring, vLLM, and QLoRA are used for finetuning and inference.

Install each into its own virtual environment, as they depend on conflicting library versions.

```shell
conda create -n xt python=3.10
conda activate xt
pip install xturing==0.1.4
```

```shell
conda create -n vl python=3.10
conda activate vl
pip install vllm==0.1.2
```

```shell
conda create -n ql python=3.10
conda activate ql
git clone https://github.com/artidoro/qlora
cd qlora
pip install -U -r requirements.txt
```