
[Feature] a new model adapter to speed up inference performance of many models on Intel CPUs #2554

Closed
a3213105 opened this issue Oct 13, 2023 · 3 comments

@a3213105
Contributor

Hi guys,
We have developed an open-source solution to speed up LLM inference on CPUs (especially on Xeon).
This is our repo: https://github.com/intel/xFasterTransformer .
It already supports many models, and more will be supported soon.
Since FastChat is widely used, we would like to enable our solution in FastChat.
We will submit a new adapter that leverages our solution, so that end users can get better inference performance when they use CPUs.
Can I submit a pull request directly, or do we need to follow some process?
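For illustration, here is a minimal, self-contained sketch of the adapter-registry pattern an integration like this typically plugs into. It is not FastChat's actual code; the class and method names (`BaseModelAdapter`, `match`, `load_model`, `XFTModelAdapter`) are assumptions used only to show where a CPU-optimized backend could hook in.

```python
class BaseModelAdapter:
    """Fallback adapter: matches everything, loads via the default path."""

    def match(self, model_path: str) -> bool:
        return True

    def load_model(self, model_path: str, kwargs: dict):
        # A real adapter would load HF weights here; placeholder for the sketch.
        return f"default-model:{model_path}"


class XFTModelAdapter(BaseModelAdapter):
    """Hypothetical adapter routing models converted for xFasterTransformer."""

    def match(self, model_path: str) -> bool:
        # Assumption for the sketch: pick this adapter when the model path
        # is tagged as an xFasterTransformer-converted checkpoint.
        return "xft" in model_path.lower()

    def load_model(self, model_path: str, kwargs: dict):
        # A real adapter would call into the xfastertransformer package to
        # load converted weights on CPU; placeholder for the sketch.
        return f"xft-model:{model_path}"


# Registry: specific adapters first, fallback last; first match() wins.
model_adapters = [XFTModelAdapter(), BaseModelAdapter()]


def get_model_adapter(model_path: str) -> BaseModelAdapter:
    for adapter in model_adapters:
        if adapter.match(model_path):
            return adapter
    raise ValueError(f"No adapter found for {model_path}")
```

With this pattern, adding CPU-optimized inference is just a matter of registering one more adapter; existing models keep going through the fallback path untouched.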

@merrymercy merrymercy added the enhancement New feature or request label Oct 13, 2023
@merrymercy
Member

Sure. You can submit a PR directly.
Some examples are the ExLlama integration (#2455) and the vLLM worker (https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/vllm_worker.py).

@a3213105
Contributor Author

Thanks, I will prepare the PR :)

@a3213105
Contributor Author

Hi, I have submitted PR #2615 for this feature. Thanks for your review.
