Update tokenize.py for common >= 1.4.0
patrickvonplaten authored Sep 13, 2024
1 parent 41c793f commit 8520eb8
Showing 1 changed file with 4 additions and 0 deletions.
finetune/data/tokenize.py: 4 additions, 0 deletions
@@ -311,6 +311,10 @@ def tokenize_instruct(
                 is_first=msg_idx == first_user_idx,
                 system_prompt=sample.system_prompt,
             )
+            if isinstance(curr_tokens, tuple):
+                # Versions of mistral_common>1.3.4 return a tuple of tokens (text), tokens (image), spans (image)
+                curr_tokens = curr_tokens[0]
+
             curr_masks = [False] * len(curr_tokens)  # only predict bot answers
         elif isinstance(message, ToolMessage):
             curr_tokens = instruct_tokenizer.encode_tool_message(
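The compatibility check in this change can be sketched in isolation as below. This is only an illustration under stated assumptions: `text_tokens` is a hypothetical helper name that exists in neither mistral_common nor this repository, and the sample token ids are made up. It assumes, as the diff comment says, that newer versions return a tuple whose first element holds the text tokens, while older versions return a plain list.

```python
from typing import List, Tuple, Union

Tokens = List[int]
# Assumed shapes: a plain list (older versions) or a tuple whose
# first element is the text tokens (newer versions, per the diff comment).
EncodeResult = Union[Tokens, Tuple[Tokens, ...]]


def text_tokens(encoded: EncodeResult) -> Tokens:
    """Return only the text token ids, whichever shape was received.

    Hypothetical helper for illustration; not part of any library.
    """
    if isinstance(encoded, tuple):
        # Tuple form: keep the text tokens, drop the image-related entries.
        return list(encoded[0])
    return list(encoded)


# Made-up sample values standing in for real tokenizer output.
old_style = [1, 5, 9]
new_style = ([1, 5, 9], [], [])

curr_tokens = text_tokens(new_style)
curr_masks = [False] * len(curr_tokens)  # one mask entry per text token
```

Normalizing to a list before building the mask keeps `len(curr_tokens)` equal to the number of text tokens; taking `len()` of the tuple directly would instead count its elements.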
