The fine-tuning process completed successfully; however, when I try to run the inference separately with the following code:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
from peft import PeftModel

base_model = "codellama/CodeLlama-7b-hf"

# Load the base model in 8-bit
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach the fine-tuned LoRA adapter
model = PeftModel.from_pretrained(model, "/home/fm/codellama/sql-code-llama/checkpoint-400")
eval_prompt = """You are a powerful text-to-SQL model. Your job is to answer questions about a database. You are given a question and context regarding one or more tables.
You must output the SQL query that answers the question.
### Input:
Which Class has a Frequency MHz larger than 91.5, and a City of license of hyannis, nebraska?
### Context:
CREATE TABLE table_name_12 (class VARCHAR, frequency_mhz VARCHAR, city_of_license VARCHAR)
### Response:
"""
model_input = tokenizer(eval_prompt, return_tensors="pt").to("cuda")

model.eval()
with torch.no_grad():
    print(tokenizer.decode(model.generate(**model_input, max_new_tokens=100)[0], skip_special_tokens=True))
I am getting this error:
Loading checkpoint shards: 100%|██████████| 2/2 [00:02<00:00, 1.19s/it]
Traceback (most recent call last):
File "/home/fm/codellama/evaluate.py", line 14, in
model = PeftModel.from_pretrained(model, "/home/florin.manaila/codellama/sql-code-llama/checkpoint-400")
File "/home/fm/anaconda3/envs/genai/lib/python3.10/site-packages/peft/peft_model.py", line 332, in from_pretrained
model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
File "/home/fm/anaconda3/envs/genai/lib/python3.10/site-packages/peft/peft_model.py", line 629, in load_adapter
adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
File "/home/fm/anaconda3/envs/genai/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 222, in load_peft_weights
adapters_weights = safe_load_file(filename, device=device)
File "/home/fm/anaconda3/envs/genai/lib/python3.10/site-packages/safetensors/torch.py", line 308, in load_file
with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
Am I doing something wrong?
Thank you,
Florin
Your safetensors file is probably corrupted.
Check its size: if it is very small, that is the issue.
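For a quick sanity check, you can print the adapter file's size before trying to load it (a minimal sketch; adapter_model.safetensors is the filename PEFT normally writes, and the checkpoint path is taken from your traceback):

import os

# A healthy LoRA adapter for a 7B model is usually tens of MB;
# a file of only a few hundred bytes points to a broken save.
path = "/home/fm/codellama/sql-code-llama/checkpoint-400/adapter_model.safetensors"
print(f"{os.path.getsize(path) / 1e6:.2f} MB")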
My suggestion is to comment out the following code to avoid torch.compile. It may speed things up a little, but in my experience it actually messes up checkpoint generation:
model.config.use_cache = False

# Monkey-patch state_dict so checkpoints contain only the PEFT adapter weights
old_state_dict = model.state_dict
model.state_dict = (lambda self, *_, **__: get_peft_model_state_dict(self, old_state_dict())).__get__(
    model, type(model)
)

if torch.__version__ >= "2" and sys.platform != "win32":
    print("compiling the model")
    model = torch.compile(model)
After that, just rerun the training. It should converge in 40-60 steps, so there is no need to run it for ~400. Once the new training is done, you should be good to go to load the adapter model with PEFT.
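As an extra safeguard, you can also save the adapter explicitly once training finishes, rather than relying only on an intermediate Trainer checkpoint (a sketch; the output directory name here is just an example):

# Writes adapter_config.json and adapter_model.safetensors
model.save_pretrained("sql-code-llama/final-adapter")

You can then point PeftModel.from_pretrained at that directory instead of checkpoint-400.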