Hi,
Whenever a saved `RewardModel` is loaded via `RewardModel.from_pretrained(model_path, flash_attn=True, fp16=False, bf16=True, low_cpu_mem_usage=True)`, it downloads the entire sharded checkpoint (https://github.com/huggingface/transformers/blob/v4.33.2/src/transformers/modeling_utils.py#L2876), which is already ~30GB because it contains all the weights of the reward model, including both the backbone model and the reward head. It then calls `RewardModel.__init__()` (via this line: https://github.com/huggingface/transformers/blob/v4.33.2/src/transformers/modeling_utils.py#L2966), which loads all the weights of the backbone model (SFT10K, another ~30GB). Surely loading a pretrained model shouldn't require loading the backbone model weights twice? A minimal sketch of the pattern is below.
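For reference, here is a minimal sketch of the pattern that produces the double load. This is not the actual alpaca_farm implementation; the class layout and the `backbone_model_name_or_path` argument are assumptions made for illustration.

```python
import torch
import transformers


class RewardModel(transformers.PreTrainedModel):
    """Sketch of a reward model whose __init__ eagerly loads the backbone."""

    def __init__(self, config, backbone_model_name_or_path=None, **kwargs):
        super().__init__(config)
        # from_pretrained() instantiates the model by calling this __init__,
        # so the backbone (e.g. SFT10K, ~30GB) is read from its own checkpoint
        # here ...
        self.backbone_model = transformers.AutoModelForCausalLM.from_pretrained(
            backbone_model_name_or_path
        )
        # ... with the reward head attached on top of it.
        self.reward_head = torch.nn.Linear(
            self.backbone_model.config.hidden_size, 1
        )
        # from_pretrained() then overwrites every parameter (backbone + head)
        # a second time from the ~30GB sharded reward-model checkpoint.
```

One way to avoid the second read (an assumption, not something the repository necessarily exposes) would be to build the backbone from its config alone inside `__init__`, e.g. `transformers.AutoModelForCausalLM.from_config(backbone_config)`, when loading from a full reward-model checkpoint, so that `from_pretrained()` fills in all weights from the reward-model shards exactly once.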
Thanks!