
How do I set up conversation data so the model is not trained on the other speaker's turns? #264

Open

petergaoshan opened this issue Oct 10, 2024 · 3 comments

Comments

@petergaoshan

Normally, with conversation data, gradient descent is applied only to the model's replies, e.g. by setting the label index of every non-model token to -100.
Something like this:

<start>user
Hello<end>
<start>assistant
How can I help you?<end>

A Transformer then computes the loss only on the tokens after assistant.
Is an architecture like RWKV less suited to training only on the replies in this way? And if it is suitable, how is this configured when training with RWKV-LM?
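For concreteness, a minimal sketch of this kind of label masking (a generic PyTorch-style setup; `tokenizer.encode` and the `<start>`/`<end>` tokens here are placeholders, not RWKV-LM's actual API):

```python
import torch

IGNORE_INDEX = -100  # ignored by torch.nn.functional.cross_entropy by default

def build_labels(segments, tokenizer):
    """segments: list of (role, text) pairs; only 'assistant' text is supervised.
    Assumes tokenizer.encode(text) returns a list of token ids."""
    input_ids, labels = [], []
    for role, text in segments:
        ids = tokenizer.encode(f"<start>{role}\n{text}<end>")
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)                        # train on these tokens
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # skip user/system tokens
    return torch.tensor(input_ids), torch.tensor(labels)

# Usage with the usual next-token shift:
# loss = F.cross_entropy(logits[:-1], labels[1:], ignore_index=IGNORE_INDEX)
```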

@Triang-jyed-driung

  1. RWKV is trained in exactly the same way as GPT and Llama.
  2. Training on the user tokens does not seem to have any side effects.

@petergaoshan
Author

  1. RWKV is trained in exactly the same way as GPT and Llama.
  2. Training on the user tokens does not seem to have any side effects.

Thanks. The issue is that the user often makes mistakes and the model is expected to correct them; I don't want those mistakes to be learned as well.

@uniartisan

Please consider using a loss with a mask: you can skip the user's tokens when computing the cross-entropy.

Also, you can use:
https://github.com/JL-er/RWKV-PEFT
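
For reference, a minimal sketch of such a masked cross-entropy loss in plain PyTorch (assuming a boolean `loss_mask` that marks the assistant positions to train on; this is not code from RWKV-PEFT):

```python
import torch
import torch.nn.functional as F

def masked_cross_entropy(logits, targets, loss_mask):
    """logits: (B, T, V); targets: (B, T) long; loss_mask: (B, T) bool,
    True where the token should contribute to the loss."""
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    ).view_as(targets)
    mask = loss_mask.float()
    # Average only over the unmasked (assistant) positions.
    return (per_token * mask).sum() / mask.sum().clamp(min=1.0)
```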
