How do I use other pre-trained models for this project? #417
Hi there! What do I need to change in the notebook to use something like GPT-J or Llama models? The main code currently uses GPT-2, and I want to evaluate the other models and compare their performance with GPT-2.
Hi there,
I have some bonus material on converting the GPT-2 model to a Llama model here: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/07_gpt_to_llama (it's a step-by-step guide for educational purposes).
I also made a notebook with the Llama 3.2 1B standalone code that you could use to replace the GPT-2 model code in Chapter 4 and later: https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/07_gpt_to_llama/standalone-llama32.ipynb
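One thing to watch when comparing the models afterwards: GPT-2 and Llama use different tokenizers, so per-token loss or perplexity is not directly comparable between them. Below is a minimal sketch of one common workaround, normalizing the total log-likelihood by the byte length of the text instead of the token count. The helper names and the log-probability values are made up for illustration; in practice you would collect the per-token log-probabilities from each model's outputs on the same text.

```python
import math

def perplexity(token_log_probs):
    """Per-token perplexity from natural-log probabilities of the target tokens."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def bits_per_byte(token_log_probs, text):
    """Total negative log-likelihood in bits, divided by the UTF-8 byte length of the text."""
    total_bits = -sum(token_log_probs) / math.log(2)
    return total_bits / len(text.encode("utf-8"))

text = "Every effort moves you"

# Hypothetical numbers for illustration only (not real model outputs):
# model A splits the text into 4 tokens, model B into 2 tokens.
log_probs_a = [math.log(0.25)] * 4
log_probs_b = [math.log(0.125)] * 2

ppl_a = perplexity(log_probs_a)
ppl_b = perplexity(log_probs_b)
bpb_a = bits_per_byte(log_probs_a, text)
bpb_b = bits_per_byte(log_probs_b, text)

print(f"Model A: perplexity={ppl_a:.2f}, bits/byte={bpb_a:.3f}")
print(f"Model B: perplexity={ppl_b:.2f}, bits/byte={bpb_b:.3f}")
```

In this toy case model A has the lower per-token perplexity, but model B assigns the whole text a higher probability and so has the lower bits-per-byte score. That is why a tokenizer-independent normalization is the safer basis when comparing across model families.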