Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convert some jobs to Ginkgo --label-filter #32911

Open
2 of 21 tasks
carlory opened this issue Jul 5, 2024 · 9 comments
Open
2 of 21 tasks

convert some jobs to Ginkgo --label-filter #32911

carlory opened this issue Jul 5, 2024 · 9 comments
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@carlory
Copy link
Member

carlory commented Jul 5, 2024

Why we need this change:

  • Filtering tests via --label-filter is more readable and easier to review.

  • --label-filter can specify an allow-list of features supported by a job. When adding new tests which depend on a new feature, they don't accidentally run and fail in existing jobs, as they do now when the SKIP regexp doesn't get updated in lockstep.

  • As soon as all active jobs are converted and use the --label-filter=Feature: isEmpty expression or some variant of it, we can remove the requirement that a test which depends only on feature gates must have a [Feature:<xxx>] tag in addition to [FeatureGate:<xxx>] (this blocked PR makes such a change). WithFeatureGate automatically adds Alpha or Beta to the set of required features, so they still get skipped unless a job explicitly allows them.

    At that point we can run tests from different SIGs in the shared alpha/beta jobs (pull-kubernetes-kind-alpha-features, pull-kubernetes-kind-beta-features, pull-kubernetes-e2e-kind-alpha-beta-features, kind-master-alpha, kind-master-beta, kind-master-alpha-beta) without having to constantly update those jobs.

    Replacing [Feature:xxx] with [FeatureGates:xxx] is not part of this issue. Test owners need to do that once this issue is resolved to take advantage of the generic jobs.

Some background references:

Example PRs:

Examples:

e2e_node job:

- '--test_args=--timeout=1h --label-filter="Feature: containsAny DynamicResourceAllocation && Feature: isSubsetOf DynamicResourceAllocation && !Flaky"'

e2e jobs:

- name: LABEL_FILTER
value: "Feature: isEmpty"

- name: FEATURE_GATES
value: '{"AllBeta":true}'
- name: RUNTIME_CONFIG
value: '{"api/beta":"true", "api/ga":"true"}'
- name: LABEL_FILTER
value: "!Slow && !Disruptive && !Flaky && Feature: isSubsetOf Beta"

How to do it

  • Some tips for figuring out how to replace --focus/skip with --label-filter:

    • go test -v ./test/e2e -args -list-tests (same for e2e_node) shows all tests.
    • Note that not all tags in [] are also labels. Only those listed by go test -v ./test/e2e -args -list-labels are proper labels. If you find that you need to filter by something that isn't a label yet, then replace the inline text with WithLabel.
    • _output/bin/ginkgo --dry-run [--focus/skip/label-filter=...] ./test/e2e can be used to compare how many and, when adding -v, which tests would run. Note that statistics for actual runs may differ when tests ask to be skipped when invoked. (install ginkgo via make ginkgo)
  • If a job has a "canary" version, update that version first and try it out.

  • When converting some real jobs:

    • Create a separate tracking issue with a link to the jobs' testgrid entry.
    • Convert the job, linking to that issue with Related-to: ... (not Fixes: ...!).
    • Verify that the job still works as intended, then close the issue.
  • After converting a pre-submit job successfully, do the same for the corresponding periodic job.

  • Some best practices:

    • Replace focus/skip completely with a single label-filter.
    • If a job has a TODO(bentheelder): reduce the skip list further, check whether any of those skip entries really are still needed. Typically those are out-dated workarounds for know issues which got resolved already long ago. Skipping tests by name should not be needed anymore.
    • Include !Flaky even if you currently don't have flaky tests. Having it in the jobs is convenient when some tests turns out to be problematic and needs to be disable temporarily.
    • Exclude tests with special requirements with Feature: isEmpty.
    • Feature: containsAny <my feature> && Feature: isSubsetOf <my feature> runs tests with have the <my feature> label and no other unknown features. Use this in dedicated jobs for that feature. Note that e2e_node uses NodeFeature for the same purpose.
    • Ginkgo can run serial jobs together with parallel ones in a single invocation by running the serial jobs (and only those) sequentially, so it's not absolutely required to use !Serial when running ginkgo -p and there is no need for separate serial/parallel jobs.
    • Beware that serial tests can affect the overall runtime. It might be useful to skip with !Slow and !Serial in pre-submits and only run the skipped tests in periodic jobs.

What should be cleaned up or changed:

@carlory carlory added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Jul 5, 2024
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 5, 2024
@carlory
Copy link
Member Author

carlory commented Jul 5, 2024

/sig testing

@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 5, 2024
@carlory
Copy link
Member Author

carlory commented Jul 5, 2024

/assign
cc @pohly @aojea

@aojea
Copy link
Member

aojea commented Jul 5, 2024

/cc @BenTheElder

@aojea
Copy link
Member

aojea commented Jul 29, 2024

@carlory it will be super useful to write some notes on how to perform the migration, so we document it and achieve some consistency to avoid different groups to end with different solutions to the same problem

@pohly
Copy link
Contributor

pohly commented Jul 30, 2024

Here's a proposal. @carlory: if this looks good, copy it into the description to ensure that it remains visible?

  • Some tips for figuring out how to replace --focus/skip with --label-filter:
    • go test -v ./test/e2e -args -list-tests (same for e2e_node) shows all tests.
    • Note that not all tags in [] are also labels. Only those listed by go test -v ./test/e2e -args -list-labels are proper labels. If you find that you need to filter by something that isn't a label yet, then replace the inline text with WithLabel.
    • _output/bin/ginkgo --dry-run [--focus/skip/label-filter=...] ./test/e2e can be used to compare how many and, when adding -v, which tests would run. Note that statistics for actual runs may differ when tests ask to be skipped when invoked.
  • If a job has a "canary" version, update that version first and try it out.
  • When converting some real jobs:
    • Create a separate tracking issue with a link to the jobs' testgrid entry.
    • Convert the job, linking to that issue with Related-to: ... (not Fixes: ...!).
    • Verify that the job still works as intended, then close the issue.
  • After converting a pre-submit job successfully, do the same for the corresponding periodic job.
  • Some best practices:
    • Replace focus/skip completely with a single label-filter.
    • Include !Flaky even if you currently don't have flaky tests. Having it in the jobs is convenient when some tests turns out to be problematic and needs to be disable temporarily.
    • Exclude tests with special requirements with Feature: isEmpty.
    • Feature: containsAny <my feature> && Feature: isSubsetOf <my feature> runs tests with have the <my feature> label and no other unknown features. Use this in dedicated jobs for that feature. Note that e2e_node uses NodeFeature for the same purpose.
    • Ginkgo can run serial jobs together with parallel ones in a single invocation by running the serial jobs (and only those) sequentially, so it's not absolutely required to use !Serial when running ginkgo -p and there is no need for separate serial/parallel jobs.
    • Beware that serial tests can affect the overall runtime. It might be useful to skip with !Slow and !Serial in pre-submits and only run the skipped tests in periodic jobs.

@aojea aojea added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jul 30, 2024
@pohly
Copy link
Contributor

pohly commented Nov 22, 2024

@BenTheElder and I discussed this again at KubeCon NA. He suggested creating an issue with instructions and links to example PRs. Now I remembered that we already have this one here. All it misses are the examples.

I noticed that I have permission to edit the description, so I went ahead and directly added some examples there.

@carlory, @BenTheElder: can you check the updated issue description? When you are done, I can notify leads and potential contributors that this is something that we want to tackle in 1.33.

@carlory
Copy link
Member Author

carlory commented Nov 22, 2024

As soon as all active jobs are converted and use the --label-filter=Feature: isEmpty expression or some variant of it, we can remove the requirement that a test which depends only on feature gates must have a [Feature:] tag in addition to [FeatureGate:] (this blocked PR makes such a change). WithFeatureGate automatically adds Alpha or Beta to the set of required features, so they still get skipped unless a job explicitly allows them.

@pohly To achieve it, we also need to change the k/k repo because most tests are not using framework.WithFeatureGate when these tests were written.

Step 1: change the k/k repo to use framework.WithFeatureGate and feature.XXXX in the tests if the tests are depending on feature gates.
Step 2: change the prow config to filter the tests with feature label.
Step 3: Once step 1 is done, re-do step 2 to filter the tests with feature-gate label.
Step 4: Remove unnecessary feature labels from k/k.

Todo items of this issue seems only include step 2.

@carlory
Copy link
Member Author

carlory commented Nov 22, 2024

We will do this

sig-storage && Feature: containsAny {Alpha, Beta} && !Slow && !Flaky && Feature: isSubsetOf {VolumeAttributesClass, HonorPVReclaimPolicy, CSIVolumeHealth}

but final expected result is:

sig-storage && Feature: containsAny {Alpha, Beta} && !Slow && !Flaky && FeatureGate: isSubsetOf {VolumeAttributesClass, HonorPVReclaimPolicy, CSIVolumeHealth}

Is it right?

@pohly
Copy link
Contributor

pohly commented Nov 22, 2024

Todo items of this issue seems only include step 2.

Correct, that is the goal. I added one sentence about this:

"Replacing [Feature:xxx] with [FeatureGates:xxx] is not part of this issue. Test owners need to do that once this issue is resolved to take advantage of the generic jobs."

We will do this ... Is it right?

Presumably this is a SIG Storage job which runs tests for features which only depend on feature gates. This looks correct to me. Just note that these tests then also should run in the generic jobs. You probably won't need the separate job anymore. We'll have to work with SIG Testing and SIG Release to ensure that those generic jobs are properly monitored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. sig/testing Categorizes an issue or PR as relevant to SIG Testing.
Projects
None yet
Development

No branches or pull requests

4 participants