[Chatllama] what's supposed to be in the Actor checkpoint dir? #337

Open
StrangeTcy opened this issue Apr 18, 2023 · 3 comments

@StrangeTcy

I'm following the example from your README, and here is what happens:
my config.yaml has these lines for the actor:

actor_config:
  model: "decapoda-research/llama-7b-hf"
  model_folder: "./models"

Then I run `python artifacts/main.py artifacts/config/config.yaml --type ACTOR`.
main.py calls ActorTrainer from actor.py, which calls load_model from llama_model.py, which in turn calls load_checkpoints. That function expects to find *.pth files and a params.json file in my ckpt_dir, which is ./models. My models folder has neither, so I wonder where these files are supposed to come from.
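
For reference, this is the kind of layout I'd expect such a loader to want (a hedged sketch based on the original Meta LLaMA checkpoint release, not on ChatLLaMA's actual code; the `7B` subfolder and shard names are my assumptions):

```python
# Sketch of a Meta-style checkpoint directory and a quick sanity check.
# The ./models/7B nesting and shard names are assumptions taken from the
# original LLaMA release, not from ChatLLaMA's code.
import glob
import json
import os

ckpt_dir = "./models/7B"  # e.g. consolidated.00.pth would live here
shards = sorted(glob.glob(os.path.join(ckpt_dir, "*.pth")))
with open(os.path.join(ckpt_dir, "params.json")) as f:
    params = json.load(f)  # dim, n_layers, n_heads, ...
print(f"{len(shards)} shard(s) found, params: {params}")
```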

@StrangeTcy
Author

Oh. It seems like you mean the single file you'd get from running llama.download from pyllama. Let me try it out...

@PierpaoloSorbellini
Collaborator

PierpaoloSorbellini commented Apr 19, 2023

Hi @StrangeTcy, thanks for reaching out.
The first time the model is loaded, you probably won't have a checkpoints dir in /models.

The folder is created during training, when checkpoints are saved, and it gets populated from there.

To use a model from HF, you just need to set the model field in config.yaml to the name of the HF model; that name is passed to transformers.AutoModel() when the model is instantiated.

Be aware that HF itself had an issue loading the tokenizer for LLaMA. You may want to check whether that is still the case.
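
On the tokenizer point, a common workaround at the time was to load the LLaMA tokenizer class explicitly instead of going through AutoTokenizer (a hedged sketch, assuming the converted repo still names a non-existent `LLaMATokenizer` class in its tokenizer_config):

```python
# Hedged sketch: load the tokenizer class directly to sidestep the
# "LLaMATokenizer does not exist" error seen with some converted LLaMA repos.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
```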

@StrangeTcy
Author

The first time the model is loaded from ./models there are indeed no checkpoints there, but they can be downloaded with the Python or bash script from pyllama.
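
For completeness, the pyllama route is roughly the following (a sketch assuming pyllama's documented `llama.download` entry point; the exact flags may differ by version):

```sh
# Assumption: pyllama's download module; adjust --model_size and --folder as needed.
pip install pyllama
python -m llama.download --model_size 7B --folder ./models
```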

As for HF models and LLaMA, HF transformers are indeed handled by

```python
self.model = AutoModelForCausalLM.from_pretrained(
    config.model,
)
```

in `actor.py`, but pure llama models go through `load_model` from `llama_model.py`.
            
I guess I should try something like `decapoda-research/llama-7b-hf` as an HF model instead of the single-file llama checkpoint.
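
To make the distinction concrete, here's a rough sketch of the two paths as I understand them (a hypothetical helper for illustration, not the repo's actual logic):

```python
# Hypothetical sketch of the two loading paths discussed above; this is my
# reading of the thread, not ChatLLaMA's actual code.
import os

from transformers import AutoModelForCausalLM


def load_actor_model(model_name_or_dir: str):
    if os.path.isdir(model_name_or_dir):
        # A local directory of Meta-format checkpoints (*.pth + params.json)
        # would need the llama-style loader (load_model in llama_model.py).
        raise NotImplementedError("use llama_model.load_model for raw checkpoints")
    # Otherwise treat the string as an HF hub id, as actor.py does.
    return AutoModelForCausalLM.from_pretrained(model_name_or_dir)


# e.g. model = load_actor_model("decapoda-research/llama-7b-hf")
```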
