Implementing Stop Token #8
Comments
Exllama doesn't expose this as a parameter; instead, it takes it from the tokenizer: https://github.com/turboderp/exllama/blob/e9da6205f432a86c6446755e8454c1d9a89f96db/example_chatbot.py#L207 If the tokenizer in your model is set correctly, it should stop fine. Are you using the correct prompt format for your model?
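For reference, here is a minimal sketch of the idea behind that EOS-based stopping behaviour. The helper names (`encode`, `decode`, `sample_next`) are assumptions for illustration, not exllama's actual API; the point is only that the loop compares each sampled token against the tokenizer's `eos_token_id`.

```python
# Minimal sketch of EOS-based stopping (not exllama's exact code):
# generation ends when the model emits the tokenizer's EOS token, so no
# separate "stop token" parameter is needed if eos_token_id is set correctly.
def generate(model, tokenizer, prompt, max_new_tokens=256):
    ids = tokenizer.encode(prompt)          # prompt -> list of token ids
    for _ in range(max_new_tokens):
        next_id = model.sample_next(ids)    # hypothetical: sample one token id
        if next_id == tokenizer.eos_token_id:
            break                           # model emitted EOS -> stop here
        ids.append(next_id)
    return tokenizer.decode(ids)
```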
Same issue here; it doesn't stop generating for me either, and yes, I am using the prompt template given in the model card.
The issue I see here is all of the extra quotation marks, which are probably messing up the tokenization. I would double-check how you're actually constructing the prompt string.
I'm not quite sure I understand what you mean by this. Here's what I'm currently using: prompt = f"SYSTEM: Your name is Mindy. You are the smartest assistant in the world. You should always communicate in a professional way. Keep messages short. USER: {user_input}, ASSISTANT:"
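To illustrate the point about stray quotation marks, the sketch below builds the same prompt with explicit turn separators and no extra quotes. The exact SYSTEM/USER/ASSISTANT wording and whether turns are separated by newlines or spaces is model-specific, so treat this layout as an assumption and defer to the model card.

```python
# Sketch only: build the prompt without stray quotation marks, using newlines
# between turns. The exact template is model-specific; check the model card
# rather than copying this verbatim.
def build_prompt(user_input: str) -> str:
    system = ("Your name is Mindy. You are the smartest assistant in the world. "
              "You should always communicate in a professional way. "
              "Keep messages short.")
    return f"SYSTEM: {system}\nUSER: {user_input}\nASSISTANT:"
```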
Hey, how do I implement a stop token to prevent the LLM from overgenerating?