
SpecInfer generate '<pad>' #12

Open
dutsc opened this issue Feb 19, 2024 · 1 comment

dutsc commented Feb 19, 2024

My machine configuration is 4×3090, and my example prompt is: "please introduce Kobe Bryant, who played basketball in NBA". I use three SSMs, all of which are opt-125M. Only when the LLM is opt-13b does the generated text look normal, as shown below:

[screenshot: opt-13b output]

When I use smaller LLMs (opt-6.7b, opt-1.3b), the generated text is all '<pad>'.

[screenshot: opt-6.7b output]
[screenshot: opt-1.3b output]

why is that?

My script is as follows (run from the directory /workspace/FlexFlow/build/). The prompt.json contains "please introduce Kobe Bryant, who played basketball in NBA"; a sketch of the file follows the script.

./inference/spec_infer/spec_infer \
    -ll:gpu 4 \
    -ll:fsize 22000 \
    -ll:zsize 30000 \
    -llm-model /models/opt-13b/ \
    -ssm-model /models/opt-125m/ \
    -ssm-model /models/opt-125m/ \
    -ssm-model /models/opt-125m/ \
    -prompt /workspace/FlexFlow/prompts/prompt.json \
    -tensor-parallelism-degree 4 \
    --fusion > ../sclog/spec_infer.log
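
The prompt file itself is not shown in the issue, but assuming spec_infer expects a JSON array of prompt strings (as in FlexFlow's example prompt files), prompt.json would look something like this:

[
    "please introduce Kobe Bryant, who played basketball in NBA"
]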

Thank you very much for your valuable time.

xinhaoc self-assigned this Feb 20, 2024

xinhaoc (Contributor) commented Feb 24, 2024

@dutsc Hi! We have demonstrated in the latest version of our paper that using a single SSM achieves the best performance.
There is also an assertion in the code to make sure only one SSM is registered. Please make sure you are using the newest code, and let me know if you still get incorrect output.
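
For reference, the same command adjusted to register a single SSM would look like the following (a sketch reusing the flags from the script above, not verified on your setup):

./inference/spec_infer/spec_infer \
    -ll:gpu 4 \
    -ll:fsize 22000 \
    -ll:zsize 30000 \
    -llm-model /models/opt-13b/ \
    -ssm-model /models/opt-125m/ \
    -prompt /workspace/FlexFlow/prompts/prompt.json \
    -tensor-parallelism-degree 4 \
    --fusion > ../sclog/spec_infer.log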

lockshaw transferred this issue from flexflow/flexflow-train Dec 16, 2024