configure-scheduler.sh script breaks Kubernetes config #126
Comments
Hi @eero-t Thank you for flagging this and for your suggestions. We are looking into them.
Hi @eero-t As for the other suggestions, I have added them to our backlog. I will let you know when they are picked up.
As the script did not provide a backup (and I was not the person updating that cluster to k8s 1.26), I do not know what the starting point was. If I run the command now (with the fixed config), it outputs the following:
And does not change Which looks now like this:
After the initial run of the script, there were additional |
Hi,
Command: ./telemetry-aware-scheduling/deploy/extender-configuration/configure-scheduler.sh -f ./gpu-aware-scheduling/deploy/extender-configuration/scheduler-config-tas+gas.yaml
Initial manifest file
Output
The problem I found is here:
From what I could tell the root cause is this line https://github.com/intel-innersource/libraries.orchestrators.resourcemanagement.platform-aware-scheduling/blob/master/telemetry-aware-scheduling/deploy/extender-configuration/configure-scheduler.sh#L162
I started with the default manifest file (deploy/extender/scheduler-config.yaml), but when I specified a new one with -f ./gpu-aware-scheduling/deploy/extender-configuration/scheduler-config-tas+gas.yaml, $MANIFEST_FILE became scheduler-config-tas+gas.yaml rather than scheduler-config.yaml, so sed cannot find the old entry and does not remove it. I am not sure this is exactly the same problem, because in my case the scheduler was still working (the scheduler pod started successfully, and I was then able to schedule the demo pod according to the TAS health-demo policy).
In terms of priorities, I think we will start with 2 items: fixing the bug above and having the script create backups, as I think these are more urgent. I'll be picking these up in the next sprint (following week).
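To make the failure mode described above concrete, here is a minimal Python sketch. It is not the actual logic of configure-scheduler.sh: the function `strip_config_lines`, the manifest fragment, and the mount path are hypothetical stand-ins chosen only to show why a cleanup pattern built from the new file name misses an entry written under the old name.

```python
import os
import re


def strip_config_lines(manifest_text: str, config_name: str) -> str:
    """Drop every line mentioning config_name (hypothetical stand-in for the sed cleanup)."""
    pattern = re.compile(re.escape(config_name))
    kept = [
        line
        for line in manifest_text.splitlines(keepends=True)
        if not pattern.search(line)
    ]
    return "".join(kept)


# Hypothetical manifest fragment left behind by an earlier run with the default config.
manifest = (
    "    - mountPath: /etc/kubernetes/scheduler-config.yaml\n"
    "      name: schedulerconfig\n"
)

# The file passed with -f in the report above.
new_config = "./gpu-aware-scheduling/deploy/extender-configuration/scheduler-config-tas+gas.yaml"
name = os.path.basename(new_config)

# Cleanup keyed on the new name leaves the old entry in place.
print(strip_config_lines(manifest, name) == manifest)

# Cleanup keyed on the name that was actually written removes it.
print(strip_config_lines(manifest, "scheduler-config.yaml"))
```

If this reading is right, one option would be to key the cleanup on something stable (for example the volume name) rather than on the name of whichever config file is currently being applied.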
IMHO YAML modifications should be done with something that actually understands YAML (e.g. Python script using |
Hi, Apologies for the late reply. I logged an issue in our backlog about this problem (it was also referenced here #86), but we have not been able to pick up the work so far. My plan is to look this week/Monday at what changes we can work on in March and then at what we plan for the next quarter. I'll be able to give a more specific date then. Just to keep things clear, I will continue to update this issue regarding the changes required for the configuration script, and update issue #86 only with regard to your last request.
Updating the Kubernetes scheduler config using the script and the (GAS) config in this repo breaks the Kubernetes 1.26 config:
./telemetry-aware-scheduling/deploy/extender-configuration/configure-scheduler.sh -f ./gpu-aware-scheduling/deploy/extender-configuration/scheduler-config-tas+gas.yaml
It messed up the last volume specification in the config, so that there were a couple of extra lines for that volume.
Besides fixing that in the script, I think the script should take a backup of the config file and show the user a diff of the changes it made.
Maybe it should also ask the user whether the changes should be accepted (unless something like "--yes" is specified), as e.g. Debian package configuration update scripts do.
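As a rough illustration of the backup, diff, and confirmation flow suggested above, here is a minimal Python sketch using only the standard library. The function name `apply_with_backup`, the `.bak` suffix, and the `assume_yes` parameter (standing in for a "--yes" style flag) are assumptions made for this example, not part of the existing script.

```python
import difflib
import shutil
from pathlib import Path


def apply_with_backup(path, new_text, assume_yes=False, ask=input):
    """Show a diff, optionally ask for confirmation, back up the file, then write it.

    Returns True if the file was changed, False otherwise.
    """
    target = Path(path)
    old_text = target.read_text()

    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=str(target),
            tofile=str(target) + " (proposed)",
        )
    )
    if not diff:
        print("No changes needed.")
        return False

    print(diff, end="")
    if not assume_yes and ask("Apply these changes? [y/N] ").strip().lower() != "y":
        print("Aborted; file left untouched.")
        return False

    # Keep a copy of the original next to it before overwriting (assumed .bak suffix).
    backup = target.with_name(target.name + ".bak")
    shutil.copy2(target, backup)
    target.write_text(new_text)
    print(f"Backup saved to {backup}")
    return True
```

The backup is taken only after the user accepts, so a declined run leaves no stray files behind. Whether the real script should instead keep a timestamped backup per run is a separate design choice.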
You might also consider using kustomize to do the file updates, as that is actually designed for semantic updating of k8s YAML files.
(Which still leaves the issue of an upgrade overwriting the changes, as mentioned in #86.)