New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Missing parameters #59

Merged

dwlehman merged 10 commits into linux-system-roles:master from dwlehman:missing-parameters

Feb 10, 2021

Collaborator

dwlehman commented Oct 24, 2019 •

edited

Loading

This changes the behavior of the role such that omitted parameters do not change settings on existing pools and volumes. Omitted parameters on new pools and volumes will be given the default value, while the same on existing pools and volumes will leave whatever the current setting is intact.

Collaborator Author

dwlehman commented Oct 24, 2019

I see there are CI failures. I'll take a look on Monday.

Member

pcahyna commented Oct 25, 2019

@dwlehman thanks, but I would prefer to wait with this for further user feedback, as the discussion in #48 shows, it is not clear to what extent it is desirable or not.

Collaborator Author

dwlehman commented Oct 25, 2019

I only looked over it briefly, but from what I gathered everyone there is in agreement that we should not change settings for existing volumes because of omitted parameters.

dwlehman force-pushed the missing-parameters branch 2 times, most recently from 981ada0 to fbb0a9e Compare

July 9, 2020 18:49

dwlehman mentioned this pull request

Creating new LV in existing VG requires specifying disk #96

Closed

dwlehman marked this pull request as draft

July 9, 2020 18:55

dwlehman changed the title ~~WIP: Missing parameters~~ Missing parameters

Collaborator Author

dwlehman commented Jul 9, 2020

I just rebased this and added the commit allowing lookup of an existing pool by name only. I ran one random integration test and it passed, so I'm waiting to see what CI turns up.

Collaborator Author

dwlehman commented Jul 9, 2020

To Do:

type-specific parameters (lvm, md, luks)
docs

dwlehman force-pushed the missing-parameters branch 2 times, most recently from 5b36c13 to 8c2beba Compare

August 5, 2020 18:57

richm reviewed

View reviewed changes

tests/run_blivet.yml Outdated Show resolved Hide resolved

richm reviewed

View reviewed changes

tests/run_blivet.yml Outdated

@@ @@ -1,6 +1,6 @@ @@
               ---
               - include_vars:
-                  name: roles/storage/defaults/main.yml
+                  file: roles/storage/defaults/main.yml

Contributor

richm Aug 5, 2020

lgtm

Contributor

richm Aug 31, 2020

please use file: "{{ playbook_dir }}/roles/linux-system-roles.storage/defaults/main.yml"

dwlehman added this to the 2.0 milestone

richm reviewed

View reviewed changes

tests/tests_existing_lvm_pool.yml Outdated

+                tasks:
+                  - include_role:
+                      name: storage

Contributor

richm Aug 31, 2020

please use name: linux-system-roles.storage

richm reviewed

View reviewed changes

tests/tests_existing_lvm_pool.yml Outdated

+                  - name: Create one LVM logical volume under one volume group
+                    include_role:
+                      name: storage

Contributor

richm Aug 31, 2020

please use name: linux-system-roles.storage

richm reviewed

View reviewed changes

tests/tests_existing_lvm_pool.yml Outdated

+                  - name: Create another volume in the existing pool, identified only by name.
+                    include_role:
+                      name: storage

Contributor

richm Aug 31, 2020

please use name: linux-system-roles.storage

richm reviewed

View reviewed changes

tests/tests_existing_lvm_pool.yml Outdated

+                  - name: Clean up.
+                    include_role:
+                      name: storage

Contributor

richm Aug 31, 2020

please use name: linux-system-roles.storage

richm reviewed

View reviewed changes

tests/tests_luks_pool.yml Outdated

    
            @@ -109,10 +110,11 @@
          
                  - include_tasks: verify-role-results.yml

                  - name: Add encryption to the volume

                  - name: Add both encryption and raid to the pool

                    include_role:

                      name: storage

Contributor

richm Aug 31, 2020

please use name: linux-system-roles.storage

richm reviewed

View reviewed changes

tests/tests_luks_pool.yml Outdated

		@@ -133,6 +136,7 @@
		include_role:
		name: storage

Contributor

richm Aug 31, 2020

please use name: linux-system-roles.storage

dwlehman force-pushed the missing-parameters branch from c76c635 to 51edee7 Compare

November 17, 2020 16:17

Contributor

lgtm-com bot commented Nov 17, 2020

This pull request introduces 1 alert when merging 51edee7 into e990c1f - view on LGTM.com

new alerts:

1 for Syntax error

richm reviewed

View reviewed changes

meta/main.yml Outdated

@@ @@ -7,6 +7,6 @@ galaxy_info: @@
                 min_ansible_version: 2.5
                 platforms:
                   - name: Fedora
-                    versions: [ 31, 32 ]
+                    versions: [ 31, 32, 33 ]

Contributor

richm Nov 17, 2020

@jharuda can this be now changed to just versions: all?

Contributor

richm commented Nov 19, 2020

[citest]

dwlehman force-pushed the missing-parameters branch from f221605 to 83681c7 Compare

November 19, 2020 18:14

Collaborator Author

dwlehman commented Nov 19, 2020

Note to self, since I'm taking Friday off:

There are two persistent CI failures that I have been unable to reproduce locally, both of which appear to be triggered by the CI infrastructure's slowness:

udevd is failing to pick up the label after the mkfs; e2label sequence, which might be resolved using flock in blivet (tests_existing_pool.yml on f32)
some devices/filesystems are active on disks, causing a parted commit failure (tests_luks_pool.yml on rhel-7)

I can investigate blivet mitigation for failure 1, but I'll have to figure out how to examine the blivet log from the actual ci run in order to diagnose failure 2.

Collaborator Author

dwlehman commented Nov 19, 2020

[citest]

dwlehman force-pushed the missing-parameters branch 2 times, most recently from 642a053 to 15d8387 Compare

December 1, 2020 22:13

Collaborator Author

dwlehman commented Jan 27, 2021

It's worth noting that this is a pretty big behavioral change, and probably merits a major version bump. @pcahyna @richm thoughts on that?

dwlehman added 8 commits

January 27, 2021 14:25


          Improve supplied defaults for omitted parameters.

ff883ef

For new/non-existent pools and volumes, any missing parameters
will be filled with the role default.

For existing pools and volumes, missing parameters will be filled
in with the current setting for that pool or volume in the system.


          Pass defaults into blivet in run_blivet.yml

7e00df5


          Add empty disk list to pool and volume defaults.

f8422df


          Fix crash checking for duplicate names when pool has no volumes.

5a2a248


          Fix required_packages w/ possible missing parameters.

846be4d


          Use raw string for regex.

36b71fa


          Only expect to find LUKS metadata on encrypted volumes.

da53909


          Fix conditions under which we expect to find a LUKS device.

b0efa6f

dwlehman force-pushed the missing-parameters branch 2 times, most recently from 10af084 to 560a8b8 Compare

January 28, 2021 17:09

Contributor

richm commented Jan 29, 2021

It's worth noting that this is a pretty big behavioral change, and probably merits a major version bump. @pcahyna @richm thoughts on that?

Are there changes which will break the API e.g. if a customer has playbooks that work with the existing role, will those playbooks continue to work the exact same way if they update the role?
If there are breaking changes, we will need to bump the major version - and also figure out a way to communicate this to users of the role so that they can modify their playbooks and/or inventories, and justify such a change.

richm reviewed

View reviewed changes

tasks/main-blivet.yml Outdated

-                  safe_mode: "{{ storage_safe_mode }}"
-                  diskvolume_mkfs_option_map: "{{ __storage_blivet_diskvolume_mkfs_option_map|d(omit) }}"
-                register: blivet_output
+              - command: 'systemctl list-units --all --plain --full --no-legend -t service'

Contributor

richm Jan 29, 2021

Can you use service_facts instead? https://docs.ansible.com/ansible/2.9/modules/service_facts_module.html#service-facts-module

Collaborator Author

dwlehman Jan 29, 2021

Changed to use service_facts in latest version. Bonus fun ensued related to json_query.

richm reviewed

View reviewed changes

tasks/main-blivet.yml

+                    diskvolume_mkfs_option_map: "{{ __storage_blivet_diskvolume_mkfs_option_map|d(omit) }}"
+                  register: blivet_output
+                rescue:
+                  - fail:

Contributor

richm Jan 29, 2021

If you want to check the result of the failure in tests, you'll have to "re-raise" the error - see https://richm.github.io/how-to-catch-and-reraise-errors-in-ansible

Collaborator Author

dwlehman Jan 29, 2021

Fixed in latest version, although tests were passing as it was.

Contributor

richm Jan 29, 2021

Fixed in latest version, although tests were passing as it was.

Then you probably aren't trying to rescue and check the error in a calling context.

Collaborator Author

dwlehman commented Jan 29, 2021

It's worth noting that this is a pretty big behavioral change, and probably merits a major version bump. @pcahyna @richm thoughts on that?

Are there changes which will break the API e.g. if a customer has playbooks that work with the existing role, will those playbooks continue to work the exact same way if they update the role?

No, the behavior will change when they run the role against pre-existing pools or volumes with any parameters omitted. It seems clear that the new behavior is "better" (don't change anything unless explicitly told to), but it's definitely different.

If there are breaking changes, we will need to bump the major version - and also figure out a way to communicate this to users of the role so that they can modify their playbooks and/or inventories, and justify such a change.

As I said above, the new behavior is less disruptive/destructive, but it's still different.

Contributor

richm commented Jan 29, 2021

As I said above, the new behavior is less disruptive/destructive, but it's still different.

Will a playbook run against the new role report any differences at all?

run playbook against old role
upgrade role
run playbook again with new role and no other changes

Will the playbook run report changed? That is, will the upgrade break idempotency (assuming the role is currently idempotent)? If so, then this suggests to me a major version change.

Collaborator Author

dwlehman commented Jan 29, 2021

As I said above, the new behavior is less disruptive/destructive, but it's still different.

Will a playbook run against the new role report any differences at all?
* run playbook against old role

* upgrade role

* run playbook again with new role and no other changes
Will the playbook run report changed? That is, will the upgrade break idempotency (assuming the role is currently idempotent)? If so, then this suggests to me a major version change.

No, it should not report 'changed' in that case.

Contributor

richm commented Jan 29, 2021

No, it should not report 'changed' in that case.

Then IMO a major version change is not needed, only a minor version change.


          Mask systemd cryptsetup services while blivet is running.

c899ef9

The systemd-cryptsetup@ services serve to prompt for a password when
the encrypted/backing device appears, which is not desirable while
blivet is configuring storage.

dwlehman force-pushed the missing-parameters branch from 560a8b8 to c899ef9 Compare

January 29, 2021 21:14

richm reviewed

View reviewed changes

tasks/main-blivet.yml

+              - set_fact:
+                  # For an explanation of the to_json|from_json silliness, see
+                  # https://github.com/ansible-collections/community.general/issues/320
+                  storage_cryptsetup_services: "{{ ansible_facts.services|to_json|from_json|json_query('*.name')|json_query('[?starts_with(@, `\"systemd-cryptsetup@\"`)]') }}"

Contributor

richm Jan 29, 2021

Here is a way to do it without json_query:

    storage_cryptsetup_services: "{{
      ansible_facts.services | dict2items | selectattr('key', 'match', '^systemd-cryptsetup@') |
      map(attribute='value') | map(attribute='name') | list
    }}"

however, on fedora 33, this gives me a value like this: ['systemd-cryptsetup@luks\\x2d7fa23e85\\x2d4729\\x2d43b2\\x2dcc7a\\x2d9608d8cdb732.service']
not sure if that is the same result of using json_query

Collaborator Author

dwlehman Jan 29, 2021

Someone came in a while back and switched all of our selectattr usage to json_query since selectattr doesn't work with Jinja versions < 2.8. You can see the comment, look at blame, etc. in storage/tests/test-verify-volume-mount.yml.

Contributor

richm Jan 29, 2021

Someone came in a while back and switched all of our selectattr usage to json_query since selectattr doesn't work with Jinja versions < 2.8. You can see the comment, look at blame, etc. in storage/tests/test-verify-volume-mount.yml.

Mea culpa. That was based on my incomplete understanding of the problems of selectattr and json_query. The problem with jinja27 is not selectattr in general, it is that certain filters are not available - specifically, the equalto, equal, == filters to test if a value in an object is equal to some given value.

We have just completed converting all of the system roles to work with jinja27, and we figured out how to get around this limitation, by using the match filter instead of equalto or == - linux-system-roles/logging@48d21ef#diff-fe77ca1acb10f971c92988a8aed5bb5ae631443b1a20668629ccc846c608205aL11-L12

In the case of this PR, we have to use match anyway to look for services beginning with systemd-cryptsetup@.

In addition, I have verified the code above works with ansible 2.8/jinja2.7/centos7 controller host.

Contributor

richm Jan 29, 2021

We have CI tests for ansible 2.8/jinja2.7/centos7 controller host - the statuses like */ansible-2.8 - so you can know if you make a change that breaks in this case.

Contributor

richm Jan 29, 2021

However, all of that being said, if you prefer to keep the json_query implementation, that's fine with me.

Collaborator Author

dwlehman Feb 8, 2021

I'm happy with it as it is now if you are. No need to fiddle with it any more.

Collaborator Author

dwlehman commented Feb 8, 2021

[citest bad]

Contributor

richm commented Feb 8, 2021

Don't worry about the (staging) failures. All of the required tests have passed.

Contributor

richm commented Feb 8, 2021

The rhel-x/ansible-2.9 fails because of this:

TASK [Read the /etc/crypttab file] *********************************************
task path: /tmp/tmpppms2nvo/tests/verify-role-results.yml:24
fatal: [/cache/rhel-x.qcow2]: FAILED! => {"changed": false, "cmd": ["cat", "/etc/crypttab"], "delta": "0:00:00.002482", "end": "2021-02-08 20:39:44.912470", "msg": "non-zero return code", "rc": 1, "start": "2021-02-08 20:39:44.909988", "stderr": "cat: /etc/crypttab: No such file or directory", "stderr_lines": ["cat: /etc/crypttab: No such file or directory"], "stdout": "", "stdout_lines": []}

rhel-x uses the internal rhel 8.4.0 snapshot 1 - which is quite old by now - we are in the process of upgrading it - is it possible that this test uses a component that had a bug in this older snapshot, or the test uses a component in rhel 8.4 that is in a later version of 8.4?


          Don't let a missing crypttab file cause tests to fail.

fb717ce

Contributor

richm commented Feb 9, 2021

[citest bad]

dwlehman requested a review from richm

February 10, 2021 00:16

richm approved these changes

View reviewed changes

Contributor

richm left a comment

lgtm

dwlehman merged commit 500e5ab into linux-system-roles:master

Member

pcahyna commented Feb 12, 2021

rhel-x uses the internal rhel 8.4.0 snapshot 1 - which is quite old by now - we are in the process of upgrading it - is it possible that this test uses a component that had a bug in this older snapshot, or the test uses a component in rhel 8.4 that is in a later version of 8.4?

No, crypttab is still missing when testing against the latest nightly snapshot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet