Commit

Debug
Signed-off-by: wangxiyuan <[email protected]>
wangxiyuan committed Dec 20, 2024
1 parent 80f9986 commit 6331a34
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions vllm/attention/layer.py
@@ -9,12 +9,15 @@
 from vllm.attention.selector import backend_name_to_enum, get_attn_backend
 from vllm.config import CacheConfig, get_current_vllm_config
 from vllm.forward_context import ForwardContext, get_forward_context
+from vllm.logger import init_logger
 from vllm.model_executor.layers.quantization.base_config import (
     QuantizationConfig)
 from vllm.model_executor.layers.quantization.kv_cache import BaseKVCacheMethod
 from vllm.platforms import _Backend, current_platform
 from vllm.utils import direct_register_custom_op

+logger = init_logger(__name__)
+

 class Attention(nn.Module):
     """Attention layer.
@@ -307,6 +310,9 @@ def unified_attention_with_output_fake(
     return


+logger.info("====================current platform===========: ", current_platform.dispatch_key)
Check failure on line 313 in vllm/attention/layer.py
GitHub Actions / ruff (3.12): Ruff (E501)
vllm/attention/layer.py:313:81: E501 Line too long (95 > 80)


+
+
 direct_register_custom_op(
     op_name="unified_attention_with_output",
     op_func=unified_attention_with_output,
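
Note: besides tripping E501, the flagged line would not print the value at runtime. Logger calls interpolate extra arguments into %-style placeholders in the format string, and this string has none, so `current_platform.dispatch_key` would be dropped and the logging machinery would report a formatting error. A minimal sketch of a variant that stays under 80 columns and uses lazy %-formatting (assuming `init_logger` returns a standard-library-style logger):

logger.info("====================current platform===========: %s",
            current_platform.dispatch_key)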
