Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing parameters #59

Merged
merged 10 commits into from
Feb 10, 2021
Merged

Conversation

dwlehman
Copy link
Collaborator

@dwlehman dwlehman commented Oct 24, 2019

This changes the behavior of the role such that omitted parameters do not change settings on existing pools and volumes. Omitted parameters on new pools and volumes will be given the default value, while the same on existing pools and volumes will leave whatever the current setting is intact.

@dwlehman
Copy link
Collaborator Author

I see there are CI failures. I'll take a look on Monday.

@pcahyna
Copy link
Member

pcahyna commented Oct 25, 2019

@dwlehman thanks, but I would prefer to wait with this for further user feedback, as the discussion in #48 shows, it is not clear to what extent it is desirable or not.

@dwlehman
Copy link
Collaborator Author

I only looked over it briefly, but from what I gathered everyone there is in agreement that we should not change settings for existing volumes because of omitted parameters.

@dwlehman dwlehman force-pushed the missing-parameters branch 2 times, most recently from 981ada0 to fbb0a9e Compare July 9, 2020 18:49
@dwlehman dwlehman marked this pull request as draft July 9, 2020 18:55
@dwlehman dwlehman changed the title WIP: Missing parameters Missing parameters Jul 9, 2020
@dwlehman
Copy link
Collaborator Author

dwlehman commented Jul 9, 2020

I just rebased this and added the commit allowing lookup of an existing pool by name only. I ran one random integration test and it passed, so I'm waiting to see what CI turns up.

@dwlehman
Copy link
Collaborator Author

dwlehman commented Jul 9, 2020

To Do:

  • type-specific parameters (lvm, md, luks)
  • docs

@dwlehman dwlehman force-pushed the missing-parameters branch 2 times, most recently from 5b36c13 to 8c2beba Compare August 5, 2020 18:57
tests/run_blivet.yml Outdated Show resolved Hide resolved
@@ -1,6 +1,6 @@
---
- include_vars:
name: roles/storage/defaults/main.yml
file: roles/storage/defaults/main.yml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use file: "{{ playbook_dir }}/roles/linux-system-roles.storage/defaults/main.yml"

@dwlehman dwlehman added this to the 2.0 milestone Aug 24, 2020

tasks:
- include_role:
name: storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use name: linux-system-roles.storage


- name: Create one LVM logical volume under one volume group
include_role:
name: storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use name: linux-system-roles.storage


- name: Create another volume in the existing pool, identified only by name.
include_role:
name: storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use name: linux-system-roles.storage


- name: Clean up.
include_role:
name: storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use name: linux-system-roles.storage

@@ -109,10 +110,11 @@

- include_tasks: verify-role-results.yml

- name: Add encryption to the volume
- name: Add both encryption and raid to the pool
include_role:
name: storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use name: linux-system-roles.storage

@@ -133,6 +136,7 @@
include_role:
name: storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use name: linux-system-roles.storage

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Nov 17, 2020

This pull request introduces 1 alert when merging 51edee7 into e990c1f - view on LGTM.com

new alerts:

  • 1 for Syntax error

meta/main.yml Outdated
@@ -7,6 +7,6 @@ galaxy_info:
min_ansible_version: 2.5
platforms:
- name: Fedora
versions: [ 31, 32 ]
versions: [ 31, 32, 33 ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jharuda can this be now changed to just versions: all?

@richm
Copy link
Contributor

richm commented Nov 19, 2020

[citest]

@dwlehman
Copy link
Collaborator Author

Note to self, since I'm taking Friday off:

There are two persistent CI failures that I have been unable to reproduce locally, both of which appear to be triggered by the CI infrastructure's slowness:

  1. udevd is failing to pick up the label after the mkfs; e2label sequence, which might be resolved using flock in blivet (tests_existing_pool.yml on f32)
  2. some devices/filesystems are active on disks, causing a parted commit failure (tests_luks_pool.yml on rhel-7)

I can investigate blivet mitigation for failure 1, but I'll have to figure out how to examine the blivet log from the actual ci run in order to diagnose failure 2.

@dwlehman
Copy link
Collaborator Author

[citest]

@dwlehman dwlehman force-pushed the missing-parameters branch 2 times, most recently from 642a053 to 15d8387 Compare December 1, 2020 22:13
@dwlehman
Copy link
Collaborator Author

It's worth noting that this is a pretty big behavioral change, and probably merits a major version bump. @pcahyna @richm thoughts on that?

@dwlehman dwlehman force-pushed the missing-parameters branch 2 times, most recently from 10af084 to 560a8b8 Compare January 28, 2021 17:09
@richm
Copy link
Contributor

richm commented Jan 29, 2021

It's worth noting that this is a pretty big behavioral change, and probably merits a major version bump. @pcahyna @richm thoughts on that?

Are there changes which will break the API e.g. if a customer has playbooks that work with the existing role, will those playbooks continue to work the exact same way if they update the role?
If there are breaking changes, we will need to bump the major version - and also figure out a way to communicate this to users of the role so that they can modify their playbooks and/or inventories, and justify such a change.

safe_mode: "{{ storage_safe_mode }}"
diskvolume_mkfs_option_map: "{{ __storage_blivet_diskvolume_mkfs_option_map|d(omit) }}"
register: blivet_output
- command: 'systemctl list-units --all --plain --full --no-legend -t service'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use service_facts in latest version. Bonus fun ensued related to json_query.

diskvolume_mkfs_option_map: "{{ __storage_blivet_diskvolume_mkfs_option_map|d(omit) }}"
register: blivet_output
rescue:
- fail:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to check the result of the failure in tests, you'll have to "re-raise" the error - see https://richm.github.io/how-to-catch-and-reraise-errors-in-ansible

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest version, although tests were passing as it was.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in latest version, although tests were passing as it was.

Then you probably aren't trying to rescue and check the error in a calling context.

@dwlehman
Copy link
Collaborator Author

It's worth noting that this is a pretty big behavioral change, and probably merits a major version bump. @pcahyna @richm thoughts on that?

Are there changes which will break the API e.g. if a customer has playbooks that work with the existing role, will those playbooks continue to work the exact same way if they update the role?

No, the behavior will change when they run the role against pre-existing pools or volumes with any parameters omitted. It seems clear that the new behavior is "better" (don't change anything unless explicitly told to), but it's definitely different.

If there are breaking changes, we will need to bump the major version - and also figure out a way to communicate this to users of the role so that they can modify their playbooks and/or inventories, and justify such a change.

As I said above, the new behavior is less disruptive/destructive, but it's still different.

@richm
Copy link
Contributor

richm commented Jan 29, 2021

As I said above, the new behavior is less disruptive/destructive, but it's still different.

Will a playbook run against the new role report any differences at all?

  • run playbook against old role
  • upgrade role
  • run playbook again with new role and no other changes

Will the playbook run report changed? That is, will the upgrade break idempotency (assuming the role is currently idempotent)? If so, then this suggests to me a major version change.

@dwlehman
Copy link
Collaborator Author

As I said above, the new behavior is less disruptive/destructive, but it's still different.

Will a playbook run against the new role report any differences at all?

* run playbook against old role

* upgrade role

* run playbook again with new role and no other changes

Will the playbook run report changed? That is, will the upgrade break idempotency (assuming the role is currently idempotent)? If so, then this suggests to me a major version change.

No, it should not report 'changed' in that case.

@richm
Copy link
Contributor

richm commented Jan 29, 2021

No, it should not report 'changed' in that case.

Then IMO a major version change is not needed, only a minor version change.

The systemd-cryptsetup@ services serve to prompt for a password when
the encrypted/backing device appears, which is not desirable while
blivet is configuring storage.
- set_fact:
# For an explanation of the to_json|from_json silliness, see
# https://github.com/ansible-collections/community.general/issues/320
storage_cryptsetup_services: "{{ ansible_facts.services|to_json|from_json|json_query('*.name')|json_query('[?starts_with(@, `\"systemd-cryptsetup@\"`)]') }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a way to do it without json_query:

    storage_cryptsetup_services: "{{
      ansible_facts.services | dict2items | selectattr('key', 'match', '^systemd-cryptsetup@') |
      map(attribute='value') | map(attribute='name') | list
    }}"

however, on fedora 33, this gives me a value like this: ['systemd-cryptsetup@luks\\x2d7fa23e85\\x2d4729\\x2d43b2\\x2dcc7a\\x2d9608d8cdb732.service']
not sure if that is the same result of using json_query

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone came in a while back and switched all of our selectattr usage to json_query since selectattr doesn't work with Jinja versions < 2.8. You can see the comment, look at blame, etc. in storage/tests/test-verify-volume-mount.yml.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone came in a while back and switched all of our selectattr usage to json_query since selectattr doesn't work with Jinja versions < 2.8. You can see the comment, look at blame, etc. in storage/tests/test-verify-volume-mount.yml.

Mea culpa. That was based on my incomplete understanding of the problems of selectattr and json_query. The problem with jinja27 is not selectattr in general, it is that certain filters are not available - specifically, the equalto, equal, == filters to test if a value in an object is equal to some given value.

We have just completed converting all of the system roles to work with jinja27, and we figured out how to get around this limitation, by using the match filter instead of equalto or == - linux-system-roles/logging@48d21ef#diff-fe77ca1acb10f971c92988a8aed5bb5ae631443b1a20668629ccc846c608205aL11-L12

In the case of this PR, we have to use match anyway to look for services beginning with systemd-cryptsetup@.

In addition, I have verified the code above works with ansible 2.8/jinja2.7/centos7 controller host.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have CI tests for ansible 2.8/jinja2.7/centos7 controller host - the statuses like */ansible-2.8 - so you can know if you make a change that breaks in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, all of that being said, if you prefer to keep the json_query implementation, that's fine with me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with it as it is now if you are. No need to fiddle with it any more.

@dwlehman
Copy link
Collaborator Author

dwlehman commented Feb 8, 2021

[citest bad]

@richm
Copy link
Contributor

richm commented Feb 8, 2021

Don't worry about the (staging) failures. All of the required tests have passed.

@richm
Copy link
Contributor

richm commented Feb 8, 2021

The rhel-x/ansible-2.9 fails because of this:

TASK [Read the /etc/crypttab file] *********************************************
task path: /tmp/tmpppms2nvo/tests/verify-role-results.yml:24
fatal: [/cache/rhel-x.qcow2]: FAILED! => {"changed": false, "cmd": ["cat", "/etc/crypttab"], "delta": "0:00:00.002482", "end": "2021-02-08 20:39:44.912470", "msg": "non-zero return code", "rc": 1, "start": "2021-02-08 20:39:44.909988", "stderr": "cat: /etc/crypttab: No such file or directory", "stderr_lines": ["cat: /etc/crypttab: No such file or directory"], "stdout": "", "stdout_lines": []}

rhel-x uses the internal rhel 8.4.0 snapshot 1 - which is quite old by now - we are in the process of upgrading it - is it possible that this test uses a component that had a bug in this older snapshot, or the test uses a component in rhel 8.4 that is in a later version of 8.4?

@richm
Copy link
Contributor

richm commented Feb 9, 2021

[citest bad]

@dwlehman dwlehman requested a review from richm February 10, 2021 00:16
Copy link
Contributor

@richm richm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dwlehman dwlehman merged commit 500e5ab into linux-system-roles:master Feb 10, 2021
@pcahyna
Copy link
Member

pcahyna commented Feb 12, 2021

rhel-x uses the internal rhel 8.4.0 snapshot 1 - which is quite old by now - we are in the process of upgrading it - is it possible that this test uses a component that had a bug in this older snapshot, or the test uses a component in rhel 8.4 that is in a later version of 8.4?

No, crypttab is still missing when testing against the latest nightly snapshot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants