
bug-fix: snprintf prints NULL in place of the last character #10419

Open
wants to merge 2 commits into base: master
Conversation

@kallewoof kallewoof commented Nov 20, 2024

We need to give snprintf enough space to print the last character and the null character, thus we allocate one extra byte and then ignore it when converting to std::string.

char buf[5];
snprintf(buf, 5, "hello"); // buf now contains "hell\0"
printf("%s", buf);         // prints "hell"

Because of this, when copying the C string into the std::string, we get a \u0000 at the end of it, which can cause issues.


ngxson commented Nov 20, 2024

@slaren The return value of snprintf excludes the terminating null byte. Should we add one byte before returning from llama_model_meta_val_str, or it's up to the user?

In any case, I think we should leave a comment in llama.h to clarify this behavior.


kallewoof commented Nov 20, 2024

Edited: never mind, this is C code.

Added a comment. I think a cleaner approach might be to have this allocate and return a char*, which is then strlen()'d instead. Chat templates are not so large that the extra length check would have a noticeable impact.


kallewoof commented Nov 20, 2024

I pushed an alternative PR in #10424 which simplifies the interface but requires a free() call.
Rewritten to not require free() as #10430.


slaren commented Nov 20, 2024

Should we add one byte before returning from llama_model_meta_val_str, or it's up to the user?

It's up to the user. They do not need to use NUL-terminated strings.


kallewoof commented Nov 20, 2024

Unless there are architecture subtleties that I'm not aware of, I think #10430 is a cleaner solution, but keeping both up until a judgement is made.


ngxson commented Nov 20, 2024

Should we also apply this patch to every place where llama_chat_apply_template is called? (We don't have many of those)

@kallewoof

Should we also apply this patch to every places where llama_chat_apply_template is called? (We don't have many of those)

I looked and it doesn't seem like it is affected since the allocated buffer is basically guaranteed to be bigger than the chat template (including NULL term).


ngxson commented Nov 21, 2024

Sorry I mean changing all other places using llama_model_meta_val_str (there is one inside llama_chat_apply_template IIRC)


kallewoof commented Nov 22, 2024

Sorry I mean changing all other places using llama_model_meta_val_str (there is one inside llama_chat_apply_template IIRC)

Right -- the only other place is in server.cpp, specifically in

bool validate_model_chat_template() const {
    std::vector<char> model_template(2048, 0); // longest known template is about 1200 bytes
    std::string template_key = "tokenizer.chat_template";
    int32_t res = llama_model_meta_val_str(model, template_key.c_str(), model_template.data(), model_template.size());
    if (res >= 0) {
        llama_chat_message chat[] = {{"user", "test"}};
        std::string tmpl = std::string(model_template.data(), model_template.size());
        int32_t chat_res = llama_chat_apply_template(model, tmpl.c_str(), chat, 1, true, nullptr, 0);
        return chat_res > 0;
    }
    return false;
}

which also preallocates a big enough buffer, like llama_chat_apply_template.
