Fix: Build error seen on Power Architecture #10421

Merged · 24 commits · Nov 19, 2024

Changes from all commits

0ce09ab  Fix: Build error seen on Power Architecture · Nov 18, 2024
201a6da  [Model][LoRA]LoRA support added for glm-4v (#10418) · B-201, Nov 18, 2024
9120161  [Model] Remove transformers attention porting in VITs (#10414) · Isotr0py, Nov 18, 2024
d7b14ce  [Doc] Update doc for LoRA support in GLM-4V (#10425) · B-201, Nov 18, 2024
06de800  [5/N][torch.compile] torch.jit.script --> torch.compile (#10406) · youkaichao, Nov 18, 2024
d3a6317  [Doc] Add documentation for Structured Outputs (#9943) · ismael-dm, Nov 18, 2024
b2a1685  Fix open_collective value in FUNDING.yml (#10426) · andrew, Nov 18, 2024
4c610a5  [Model][Bugfix] Support TP for PixtralHF ViT (#10405) · mgoin, Nov 18, 2024
96d56ba  [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107) · yma11, Nov 18, 2024
21da67d  [Kernel] Explicitly specify other value in tl.load calls (#9014) · angusYuhao, Nov 18, 2024
2b855b1  [Kernel] Initial Machete W4A8 support + Refactors (#9855) · LucasWilkinson, Nov 18, 2024
887d326  [3/N][torch.compile] consolidate custom op logging (#10399) · youkaichao, Nov 18, 2024
db5dddb  [ci][bugfix] fix kernel tests (#10431) · youkaichao, Nov 18, 2024
53e3a96  [misc] partial prefix & random input generation benchmark (#9929) · rickyyx, Nov 18, 2024
ec45058  [ci/build] Have dependabot ignore all patch update (#10436) · khluu, Nov 19, 2024
cce69dc  [Bugfix]Fix Phi-3 BNB online quantization (#10417) · jeejeelee, Nov 19, 2024
2ce7cd4  [Platform][Refactor] Extract func `get_default_attn_backend` to `Plat… · MengqingCao, Nov 19, 2024
6372003  Add openai.beta.chat.completions.parse example to structured_outputs.… · mgoin, Nov 19, 2024
c0482f6  [Bugfix] Guard for negative counter metrics to prevent crash (#10430) · tjohnson31415, Nov 19, 2024
3fcfe67  [Misc] Avoid misleading warning messages (#10438) · jeejeelee, Nov 19, 2024
392acf9  [Doc] Add the start of an arch overview page (#10368) · russellb, Nov 19, 2024
30deada  [misc][plugin] improve plugin loading (#10443) · youkaichao, Nov 19, 2024
cd96cde  Fix for clang-format (3.11) · mikejuliet13, Nov 19, 2024
6b053cb  Merge branch 'vllm-project:main' into vllm-power-issue · mikejuliet13, Nov 19, 2024
14 changes: 10 additions & 4 deletions cmake/cpu_extension.cmake

@@ -16,10 +16,16 @@ include_directories("${CMAKE_SOURCE_DIR}/csrc")
 #
 # Check the compile flags
 #
-list(APPEND CXX_COMPILE_FLAGS
-    "-fopenmp"
-    "-mf16c"
-    "-DVLLM_CPU_EXTENSION")
+if (CMAKE_SYSTEM_PROCESSOR STREQUAL "ppc64le")
+    list(APPEND CXX_COMPILE_FLAGS
+        "-fopenmp"
+        "-DVLLM_CPU_EXTENSION")
+else()
+    list(APPEND CXX_COMPILE_FLAGS
+        "-fopenmp"
+        "-mf16c"
+        "-DVLLM_CPU_EXTENSION")
+endif()
 
 execute_process(COMMAND cat /proc/cpuinfo
                 RESULT_VARIABLE CPUINFO_RET
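For context, `-mf16c` enables the x86 F16C half-precision conversion intrinsics; GCC for powerpc64le does not recognize the flag at all, which is the build error this hunk fixes. Below is an illustrative C++ sketch, not part of the PR, of how the flag's effect shows up at the source level: with `-mf16c` active, GCC and Clang define `__F16C__`, while Power builds can only key off `__powerpc64__`.

// Illustrative only: a compile-time view of the dispatch the build system
// performs. On x86, -mf16c defines __F16C__; the flag does not exist for
// ppc64le targets, so the build must not pass it there.
#include <cstdio>

int main() {
#if defined(__powerpc64__)
  std::puts("ppc64le: built without -mf16c; FP16 data is widened to FP32");
#elif defined(__F16C__)
  std::puts("x86: built with F16C half-precision conversion support");
#else
  std::puts("generic build: no FP16 conversion intrinsics assumed");
#endif
  return 0;
}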
12 changes: 10 additions & 2 deletions csrc/cpu/attention.cpp

@@ -24,12 +24,20 @@ struct KernelVecType<float> {
 
 template <>
 struct KernelVecType<c10::Half> {
+#ifdef __powerpc64__
+  // Power architecture-specific vector types
+  using q_load_vec_type = vec_op::FP32Vec8;
+  using k_load_vec_type = vec_op::FP32Vec16;
+  using v_load_vec_type = vec_op::FP32Vec16;
+#else
+  // Fallback for other architectures, including x86
   using q_load_vec_type = vec_op::FP16Vec8;
-  using q_vec_type = vec_op::FP32Vec16;
   using k_load_vec_type = vec_op::FP16Vec16;
+  using v_load_vec_type = vec_op::FP16Vec16;
+#endif
+  using q_vec_type = vec_op::FP32Vec16;
   using k_vec_type = vec_op::FP32Vec16;
   using qk_acc_vec_type = vec_op::FP32Vec16;
-  using v_load_vec_type = vec_op::FP16Vec16;
 };
 
 #ifdef __AVX512BF16__
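`KernelVecType` is a compile-time trait: the attention kernel is written once against `KernelVecType<scalar_t>`, and the preprocessor selects per-architecture vector types, so on Power (where the CPU backend evidently has no native FP16 vector types) FP16 queries, keys, and values are simply loaded as FP32. Here is a self-contained sketch of the pattern, using empty stand-in types rather than vllm's actual `vec_op` classes:

// Self-contained sketch of the arch-conditional trait pattern (the vector
// types here are empty stand-ins, not vllm's vec_op implementations).
struct FP16Vec8 {};   // 8 half-precision lanes (x86 path)
struct FP32Vec8 {};   // 8 single-precision lanes (Power path)
struct FP32Vec16 {};  // accumulation is FP32 on every architecture

struct Half {};  // stand-in for c10::Half

template <typename T>
struct KernelVecType;

template <>
struct KernelVecType<Half> {
#ifdef __powerpc64__
  using q_load_vec_type = FP32Vec8;  // Power: load already-widened FP32
#else
  using q_load_vec_type = FP16Vec8;  // x86: load FP16, convert via F16C
#endif
  using qk_acc_vec_type = FP32Vec16;  // accumulator type is unconditional
};

// Kernels written against the trait compile unchanged on both targets.
template <typename T>
void attention_kernel() {
  using load_vec = typename KernelVecType<T>::q_load_vec_type;
  load_vec q{};  // real code would load from the query tensor here
  (void)q;
}

int main() {
  attention_kernel<Half>();
  return 0;
}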
6 changes: 6 additions & 0 deletions csrc/cpu/quant.cpp

@@ -25,7 +25,13 @@ struct KernelVecType<c10::BFloat16> {
 
 template <>
 struct KernelVecType<c10::Half> {
+#ifdef __powerpc64__
+  // Power architecture-specific vector type
+  using load_vec_type = vec_op::FP32Vec16;
+#else
+  // Fallback for other architectures
   using load_vec_type = vec_op::FP16Vec16;
+#endif
   using azp_adj_load_vec_type = vec_op::INT32Vec16;
   using cvt_vec_type = vec_op::FP32Vec16;
 };
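The quantization kernels get the same treatment: only `load_vec_type` varies by architecture, while `cvt_vec_type` stays FP32 and `azp_adj_load_vec_type` stays INT32 everywhere, so the arithmetic itself is unchanged. A scalar analogue (illustrative only, not vllm code) of why the two branches agree numerically: both do the math in FP32 and differ only in how FP16 inputs reach FP32.

// Illustrative scalar analogue, not vllm code: quantization math runs in
// FP32 on every architecture; x86 widens FP16 inputs via F16C on load,
// while the Power path reads data that is already FP32.
#include <cstdint>
#include <cstdio>

int8_t quantize_fp32(float x, float inv_scale) {
  float q = x * inv_scale;       // FP32 arithmetic on both architectures
  if (q > 127.0f) q = 127.0f;    // clamp to the int8 range
  if (q < -128.0f) q = -128.0f;
  return static_cast<int8_t>(q);
}

int main() {
  // 0.75 with scale 1/100 quantizes to 75 regardless of the load path
  std::printf("%d\n", quantize_fp32(0.75f, 100.0f));
  return 0;
}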
Expand Down