Possible to train Llama 3.1? #133

Open
mosh98 opened this issue Jul 30, 2024 · 7 comments

Comments

@mosh98

mosh98 commented Jul 30, 2024

Hi,

I tried training Llama 3.1 with run_mntp.py but got an obscure error:

AttributeError: 'LlamaBiModel' object has no attribute 'rotary_emb'

What is that about?

@bzantium

bzantium commented Aug 1, 2024

You can check this commit: 03382c3
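
The thread does not show what the commit changes, so the following is only a general illustration of how this class of error can arise, not the actual fix. It is a toy sketch with hypothetical class names (`BaseModel`, `BrokenBiModel`, `PatchedBiModel`) and no dependency on transformers: a subclass whose custom `__init__` skips the base initializer never creates an attribute that an inherited method later reads.

```python
class BaseModel:
    """Stand-in for a library base class (hypothetical, not the transformers API)."""

    def __init__(self, dim):
        self.dim = dim
        # Assumption for illustration: the base class creates this helper itself.
        self.rotary_emb = lambda positions: [p / self.dim for p in positions]

    def forward(self, positions):
        # The inherited forward pass relies on the attribute set in __init__.
        return self.rotary_emb(positions)


class BrokenBiModel(BaseModel):
    def __init__(self, dim):
        # Custom __init__ that never calls BaseModel.__init__,
        # so self.rotary_emb is never created.
        self.dim = dim


class PatchedBiModel(BaseModel):
    def __init__(self, dim):
        self.dim = dim
        # Patch: create the attribute the inherited forward() expects.
        self.rotary_emb = lambda positions: [p / self.dim for p in positions]


def try_forward(model):
    """Return the forward output, or the AttributeError message if it fails."""
    try:
        return model.forward([0, 1, 2])
    except AttributeError as err:
        return str(err)


if __name__ == "__main__":
    print(try_forward(BrokenBiModel(4)))
    print(try_forward(PatchedBiModel(4)))
```

If the real error has the same shape, the remedy would be to make sure the custom model class defines every attribute that the installed transformers version expects, which is consistent with the later comments about adding lines from the commit and upgrading transformers.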

@mosh98
Author

mosh98 commented Aug 2, 2024

Hmm, I'm still not sure what to do...

@stefanhgm

Hi everyone,

@bzantium Thanks for pointing us to the commit. I added the respective lines and used a more recent version of transformers to make it work. MNTP training for Llama 3.1 now seems to work for me. However, I have not managed to run the MTEB evaluation locally so far; see #123.

Did you make any progress in training Llama 3.1 for LLM2Vec?

@mosh98
Author

mosh98 commented Aug 5, 2024

@bzantium Thanks, I was able to get the embeddings after adding the lines. I haven't been able to train it through MNTP yet, but I'll keep trying.

@andupotorac

@stefanhgm Once Llama 3.1 (I presume the 8B-parameter model) is trained, can it be used for generating images the way ELLA uses T5, with better prompt adherence?

@stefanhgm

@andupotorac I am not familiar with the ELLA project, but you could use the model to create embeddings just as with the other LLM2Vec models.

However, the evaluation on MTEB currently hangs; see #135.

@andupotorac

Thanks, I will keep an eye on it as well.
