feature: integrate with kwok to simulate mock GPU/NPU nodes #3830

Open. JesseStutler wants to merge 1 commit into base: master

JesseStutler (Member) commented Nov 20, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Related issue: #3829

Verification

  1. Run ./create-fake-node.sh -n 10 -c 4 -m 8Gi -e volcano.sh/gpu-number=4,volcano.sh/gpu-memory=20 to create 10 fake nodes, each with 4 CPUs, 8Gi of memory, and the extended resources volcano.sh/gpu-number=4 and volcano.sh/gpu-memory=20. Once the nodes have been created, take one of them as an example:
    image
    image

  2. Enable the deviceshare plugin with its deviceshare.GPUNumberEnable argument turned on, then create a fake deployment whose pod requests 1 volcano.sh/gpu-number and 1 volcano.sh/gpu-memory. The pod is scheduled successfully:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fake-gpu-pod
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fake-gpu-pod
  template:
    metadata:
      labels:
        app: fake-gpu-pod
    spec:
      schedulerName: volcano
      tolerations:
      - key: "kwok.x-k8s.io/node"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: fake-container
        image: fake-image
        resources:
          limits:
            volcano.sh/gpu-number: 1
            volcano.sh/gpu-memory: 1

image
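To make step 1 more concrete, here is a minimal sketch of how a script could turn those flags into a kwok-managed Node manifest. It is not the script from this PR: the function name render_fake_node, the pod capacity of 110, and the "type: kwok" label are assumptions for illustration. The annotation and taint follow the convention in the kwok documentation, which is also what the toleration in the deployment above matches.

```shell
# Hypothetical helper (not taken from this PR): print one fake Node manifest.
# Usage: render_fake_node NAME CPU MEMORY [EXTENDED]
# EXTENDED is a comma-separated list such as "volcano.sh/gpu-number=4".
# Written for POSIX sh, so its variables are global rather than local.
render_fake_node() {
  name="$1"
  cpu="$2"
  memory="$3"
  extended="${4:-}"

  # Standard resources; the pod capacity of 110 is an assumed default.
  resources="    cpu: \"${cpu}\"
    memory: \"${memory}\"
    pods: \"110\""

  # Split "key=value,key=value" on commas and append one YAML entry per pair.
  old_ifs="$IFS"
  IFS=','
  for pair in $extended; do
    resources="${resources}
    ${pair%%=*}: \"${pair#*=}\""
  done
  IFS="$old_ifs"

  # The same resource list is reported as both allocatable and capacity.
  cat <<EOF
apiVersion: v1
kind: Node
metadata:
  name: ${name}
  annotations:
    kwok.x-k8s.io/node: fake
  labels:
    type: kwok
spec:
  taints:
  - key: kwok.x-k8s.io/node
    value: fake
    effect: NoSchedule
status:
  allocatable:
${resources}
  capacity:
${resources}
EOF
}
```

Calling render_fake_node fake-node-0 4 8Gi volcano.sh/gpu-number=4,volcano.sh/gpu-memory=20 prints a single manifest. Piping that output to kubectl apply -f - inside a loop over ten names would roughly reproduce the -n 10 case, assuming a kwok controller is already running in the cluster.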

volcano-sh-bot (Contributor) commented

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hwdef
You can assign the PR to them by writing /assign @hwdef in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 20, 2024
@volcano-sh-bot volcano-sh-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 20, 2024
JesseStutler (Member, Author) commented Nov 21, 2024

I found that there is already a script under benchmark that deploys kwok and fake nodes, so it may be better to move these files into benchmark/kwok.
