How to train a model other than google/byt5-small (e.g., InternLM-Math) on the ReProver dataset? #67
-
Hello! Training the model on google/byt5-small works. There are other models, such as InternLM-Math, that have already been fine-tuned on math-related data, and I'd like to train one of them as a tactic generator (without retrieval) by substituting its name into the training command.
The relevant code appears to be in model.py, where the model is defined on line 87. When I ran training again with InternLM-Math, though, I received an error.
My guess is that some of the training code relies on byt5-small being a T5 model. Thank you!
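To illustrate the mismatch I suspect, here is a minimal sketch (the Hugging Face class names and the InternLM checkpoint id are my assumptions, not taken from this codebase):

```python
from transformers import AutoModelForCausalLM, T5ForConditionalGeneration

# byt5-small is an encoder-decoder T5 model, so the T5 class loads it fine.
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

# InternLM-Math is decoder-only; loading it with the T5 class raises an error.
# A decoder-only checkpoint needs a causal-LM class instead:
model = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm2-math-7b", trust_remote_code=True
)
```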
Replies: 1 comment
-
The training script in this codebase is tailored for encoder-decoder Transformers like ByT5, whereas more recent models are typically decoder-only. I would preprocess LeanDojo's data using something like preprocess.py and then train the model using off-the-shelf tools like LLaMA-Factory or torchtune.
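As a rough illustration of that route, here is a minimal sketch that flattens traced tactics into Alpaca-style records that LLaMA-Factory can consume. The paths are hypothetical, and the field names assume the LeanDojo Benchmark JSON layout, where each theorem carries a `traced_tactics` list with `state_before`/`tactic` entries; adjust to your export:

```python
import json

# Hypothetical paths: the LeanDojo Benchmark ships JSON splits where each
# theorem carries "traced_tactics", a list of {state_before, tactic, ...}.
SRC = "leandojo_benchmark_4/random/train.json"
DST = "train_alpaca.json"

def to_alpaca(theorems):
    """Flatten (state_before, tactic) pairs into Alpaca-style records."""
    records = []
    for thm in theorems:
        for tac in thm.get("traced_tactics", []):
            records.append({
                "instruction": tac["state_before"],  # proof state as the prompt
                "input": "",
                "output": tac["tactic"],             # ground-truth next tactic
            })
    return records

with open(SRC) as f:
    theorems = json.load(f)

with open(DST, "w") as f:
    json.dump(to_alpaca(theorems), f, indent=2)
```

You would then register the resulting file in LLaMA-Factory's dataset_info.json and fine-tune the decoder-only model with its standard SFT recipe.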