@dutsc Hi! The latest version of our paper shows that using a single SSM achieves the best performance.
There is also an assertion in the code that makes sure only one SSM is registered. Please make sure you are using the newest code, and let me know if you still get incorrect output.
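To illustrate the suggestion above, here is a minimal sketch of the reported launch with a single SSM. The flags and paths are copied from the report; the only change is that `-ssm-model` appears once. It assumes the flag names are unchanged in the newest code. The `count_ssm_flags` helper is hypothetical (not part of FlexFlow) and only mirrors the one-SSM requirement as a pre-flight check.

```shell
# Sketch only: arguments taken from the report, with one -ssm-model flag.
# Assumption: the newest code still accepts these flag names.
set -- \
  -ll:gpu 4 \
  -ll:fsize 22000 \
  -ll:zsize 30000 \
  -llm-model /models/opt-13b/ \
  -ssm-model /models/opt-125m/ \
  -prompt /workspace/FlexFlow/prompts/prompt.json \
  -tensor-parallelism-degree 4 \
  --fusion

# Hypothetical helper (not part of FlexFlow): count -ssm-model flags.
# The body runs in a subshell so its variables do not leak.
count_ssm_flags() (
  n=0
  for arg in "$@"; do
    if [ "$arg" = "-ssm-model" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
)

# Stop early if the argument list does not register exactly one SSM.
if [ "$(count_ssm_flags "$@")" -ne 1 ]; then
  echo "expected exactly one -ssm-model flag" >&2
  exit 1
fi

# Print the command for inspection. To run it, drop the leading 'echo',
# start from the build directory, and redirect output to a log if desired.
echo ./inference/spec_infer/spec_infer "$@"
```

The script only prints the command, so it can be checked before anything is launched on the GPUs.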
My machine has 4 x 3090 GPUs, and my example prompt is: "please introduce Kobe Bryant, who played basketball in NBA". I use three SSMs, all of which are opt-125m. Only when the LLM is opt-13b does the generated text look normal, and even then only up to a point, as follows:
When I use smaller LLMs (opt-6.7b, opt-1.3b), the generated text is all .
Why is that?
My script is as follows (run from the directory /workspace/Flexflow/build/). prompt.json contains "please introduce Kobe Bryant, who played basketball in NBA".
```shell
./inference/spec_infer/spec_infer \
  -ll:gpu 4 \
  -ll:fsize 22000 \
  -ll:zsize 30000 \
  -llm-model /models/opt-13b/ \
  -ssm-model /models/opt-125m/ \
  -ssm-model /models/opt-125m/ \
  -ssm-model /models/opt-125m/ \
  -prompt /workspace/FlexFlow/prompts/prompt.json \
  -tensor-parallelism-degree 4 \
  --fusion > ../sclog/spec_infer.log
```
Thank you very much for your valuable time.