update guide for make installation, memory, gguf model link, rm todo for windows build
NeoZhangJianyu committed Feb 1, 2024
1 parent 558007b commit 4190d9a
Showing 2 changed files with 33 additions and 6 deletions.
37 changes: 32 additions & 5 deletions README-sycl.md
@@ -42,6 +42,8 @@ For Intel CPU, recommend to use llama.cpp for X86 (Intel MKL building).

## Intel GPU

### Verified

|Intel GPU| Status | Verified Model|
|-|-|-|
|Intel Data Center Max Series| Support| Max 1550|
@@ -50,6 +52,17 @@ For Intel CPU, recommend to use llama.cpp for X86 (Intel MKL building).
|Intel built-in Arc GPU| Support| built-in Arc GPU in Meteor Lake|
|Intel iGPU| Support| iGPU in i5-1250P, i7-1165G7|

Note: If the iGPU has fewer than 80 EUs (Execution Units), inference will be too slow to be usable.

### Memory

Memory is a limiting factor when running LLMs on GPUs.

When llama.cpp runs, it prints a log line showing the memory allocated on the GPU, so you can see how much memory your case needs. For example: `llm_load_tensors: buffer size = 3577.56 MiB`.

For an iGPU, make sure enough host memory is available to be shared with the GPU. For llama-2-7b.Q4_0, 8 GB or more of host memory is recommended.

For a dGPU, make sure the device memory is sufficient. For llama-2-7b.Q4_0, 4 GB or more of device memory is recommended.
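
As a rough illustration of the check described above, the following Python sketch extracts the reported buffer sizes from llama.cpp log text and compares their sum with a memory budget. The helper names (`buffer_sizes_mib`, `fits_in_memory`) and the 4096 MiB budget are hypothetical and not part of llama.cpp; the only log format assumed is the `buffer size = <number> MiB` line quoted above, and any memory that is not reported in such lines is not counted.

```python
import re

# Matches lines such as "llm_load_tensors: buffer size = 3577.56 MiB".
# Only this format (taken from the log line quoted above) is assumed.
_BUFFER_RE = re.compile(r"buffer size\s*=\s*([0-9]+(?:\.[0-9]+)?)\s*MiB")


def buffer_sizes_mib(log_text):
    """Return every buffer size (in MiB) reported in the log text."""
    return [float(m.group(1)) for m in _BUFFER_RE.finditer(log_text)]


def fits_in_memory(log_text, available_mib):
    """Return True if the sum of the reported buffers is within the budget."""
    return sum(buffer_sizes_mib(log_text)) <= available_mib


if __name__ == "__main__":
    sample = "llm_load_tensors: buffer size = 3577.56 MiB"
    print(buffer_sizes_mib(sample))
    # 4096 MiB stands in for a hypothetical dGPU with 4 GB of device memory.
    print(fits_in_memory(sample, 4096))
```

To use it, capture the llama.cpp output to a file and pass the file contents as `log_text`.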

## Linux

@@ -152,6 +165,8 @@ Note:

1. Put the model file in the **models** folder

You can download [llama-2-7b.Q4_0.gguf](https://huggingface.co/TheBloke/Llama-2-7B-GGUF/blob/main/llama-2-7b.Q4_0.gguf) as an example.

2. Enable the oneAPI running environment

```
@@ -223,6 +238,8 @@ Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device

Please install the Intel GPU driver by following the official guide: [Install GPU Drivers](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/arc/software/drivers.html).

Note: **The driver is mandatory for compute functionality**.

2. Install Intel® oneAPI Base toolkit.

a. Please follow the procedure in [Get the Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html).
@@ -260,15 +277,21 @@ Output (example):
[opencl:cpu:1] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz OpenCL 3.0 (Build 0) [2023.16.10.0.17_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Iris(R) Xe Graphics OpenCL 3.0 NEO [31.0.101.5186]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 1.3 [1.3.28044]
```

3. Install cmake & make

-a. Download & install cmake for windows: https://cmake.org/download/
+a. Download & install cmake for Windows: https://cmake.org/download/

b. Download & install make for Windows provided by mingw-w64

- Download the binary package for Windows from https://github.com/niXman/mingw-builds-binaries/releases.

For example, [x86_64-13.2.0-release-win32-seh-msvcrt-rt_v11-rev1.7z](https://github.com/niXman/mingw-builds-binaries/releases/download/13.2.0-rt_v11-rev1/x86_64-13.2.0-release-win32-seh-msvcrt-rt_v11-rev1.7z).

-b. Download & install make for windows provided by mingw-w64: https://www.mingw-w64.org/downloads/
- Unzip the binary package. In the **bin** sub-folder, rename **xxx-make.exe** to **make.exe**.

- Add the **bin** folder path to the Windows system PATH environment variable.
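
After completing the steps above, it can help to confirm that both tools are actually reachable through PATH before building. The Python sketch below is one way to do that with the standard library; the helper name `find_tools` is hypothetical and not part of llama.cpp or the guide.

```python
import shutil


def find_tools(names=("cmake", "make")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    # shutil.which applies PATHEXT on Windows, so "make" resolves make.exe.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        status = path if path else "NOT FOUND on PATH"
        print(f"{name}: {status}")
```

If a tool is reported as not found, open a new terminal after editing PATH, since an already-open terminal keeps the old value.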

### Build locally:

@@ -309,6 +332,8 @@ Note:

1. Put the model file in the **models** folder

You can download [llama-2-7b.Q4_0.gguf](https://huggingface.co/TheBloke/Llama-2-7B-GGUF/blob/main/llama-2-7b.Q4_0.gguf) as an example.

2. Enable the oneAPI running environment

- In Search, input 'oneAPI'.
@@ -419,8 +444,10 @@ Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device

The oneAPI running environment has not been enabled.

-## Todo
+- A compile error occurs.

-- Support to build in Windows.
+  Remove the **build** folder and try again.
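
The cleanup step above could be scripted as in the following Python sketch. The helper name `clean_build_dir` is hypothetical, and the code assumes the folder is named **build** and sits directly under the llama.cpp source root.

```python
import shutil
from pathlib import Path


def clean_build_dir(repo_root=".", name="build"):
    """Delete <repo_root>/<name> if it exists; return True if it was removed."""
    build_dir = Path(repo_root) / name
    if build_dir.is_dir():
        shutil.rmtree(build_dir)  # removes the folder and everything in it
        return True
    return False
```

Calling `clean_build_dir()` from the source root removes a stale **build** folder, after which the build steps can be repeated from the start.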

## Todo

- Support multiple cards.
2 changes: 1 addition & 1 deletion examples/sycl/win-run-llama2.bat
@@ -2,7 +2,7 @@
:: Copyright (C) 2024 Intel Corporation
:: SPDX-License-Identifier: MIT

-INPUT2="Building a website can be done in 10 simple steps:\nStep 1:"
+set INPUT2="Building a website can be done in 10 simple steps:\nStep 1:"
@call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force

