This code uses a modified and slightly smaller version of the spider dataset. Using SQL create commands as context instead of the weird other schema spider uses.
- Inside the llama directory that is the notebook that will fine-tune Code Llama 34B on the dataset
- It uses Huggingface transformers to pull the model and dataset from the huggingface hub
This is a Lora fine-tune quantising the model to 8 bits. Therefore, you'll need just over 40GB of GPU memory. I used an A40. But you could use an 80GB A100 or A6000 too.
Inside the openai folder there are two notebooks and the two dataset files (formatted for OpenAI):
- openai_train.ipynb loads the datasets and sends them to OpenAI to fine-tune GPT 3.5
- openai_dataset_analytics.ipynb reads an OpenAI dataset jsonl file and works out whether there are any errors in the file and the total number of tokens in the dataset.
My full post on the results of this is here!