Integrate NPU C++ imple into ipex-llm #12461

plusbang · 2024-11-28T06:41:32Z

Description

Building on https://github.com/intel-analytics/llm.cpp/pull/690, this PR integrates the NPU C++ implementation into ipex-llm, starting with support for qwen2.5-7B.

For the qwen2.5-7B model, the save_directory parameter is required when loading the model with optimize_model=True.
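
A minimal sketch of the requirement above. `build_load_kwargs` and `load_qwen` are hypothetical helpers, not part of this PR. The import path `ipex_llm.transformers.npu_model` is an assumption about where the NPU `AutoModelForCausalLM` lives, and any other arguments your setup needs (low-bit format, context lengths, and so on) are omitted.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(optimize_model: bool = True,
                      save_directory: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for from_pretrained."""
    if optimize_model and not save_directory:
        # Mirrors the rule stated in this PR for qwen2.5-7B.
        raise ValueError("save_directory is required when optimize_model=True")
    kwargs: Dict[str, Any] = {"optimize_model": optimize_model}
    if save_directory:
        kwargs["save_directory"] = save_directory
    return kwargs


def load_qwen(model_path: str, save_directory: str):
    """Hypothetical wrapper around the NPU model loader."""
    # Imported lazily so build_load_kwargs works without ipex-llm installed.
    # Assumption: the NPU loader is exposed at this module path.
    from ipex_llm.transformers.npu_model import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(
        model_path, **build_load_kwargs(True, save_directory))
```

Calling `load_qwen("<path to qwen2.5-7B>", "<output dir>")` would then pass both `optimize_model=True` and `save_directory` to the loader, while omitting the directory fails early with a `ValueError` instead of at load time.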

jason-dai

LGTM

plusbang added 2 commits November 28, 2024 17:22

fix

1937273

fix conflict

a5a8faf

plusbang force-pushed the port-npu-cpp branch from e275f27 to a5a8faf Compare November 28, 2024 09:31

update

ecc0803

plusbang changed the title ~~[WIP] Integrate NPU C++ imple into ipex-llm~~ Integrate NPU C++ imple into ipex-llm Nov 28, 2024

plusbang marked this pull request as ready for review November 28, 2024 09:41

plusbang requested review from rnwang04 and jason-dai November 28, 2024 09:59

jason-dai approved these changes Nov 28, 2024

plusbang merged commit 14d8d3d into intel-analytics:main Nov 29, 2024
1 check passed