Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update sds schema validation #1538

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

Conversation

VirajP1002
Copy link
Contributor

@VirajP1002 VirajP1002 commented Oct 17, 2024

What is the context of this PR?

To align with SDS & Author we need to update the validation rules for supplementary data schema versions within Runner in order to support future automation of the releases of new SDS schema versions.

supplementary_data_parser previously used a list of known versions but this will become unmanageable with several versions. Instead, we compare the sds_schema_version set in a questionnaire schema to the schema_version found in the supplementary data payload. We pass sds_schema_version from session.py to supplementary_data_parser for it to be validated. The old logic has been removed and tests have been updated/added to show this change.

A new schema has been added called test_supplementary_data_with_sds_schema_version.json to include this new field.

How to review

  • Ensure changes make sense
  • Check unit, integration & functional tests pass
  • Ensure schema validation passes (using the validator PR)

Checklist

  • New static content marked up for translation
  • Newly defined schema content included in eq-translations repo

@VirajP1002 VirajP1002 marked this pull request as ready for review October 18, 2024 18:15
@@ -80,12 +71,19 @@ def validate_dataset_and_survey_id( # pylint: disable=unused-argument
"Supplementary data did not return the specified Survey ID"
)

if self.context["sds_schema_version"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can merge the nested if here. Also data["data"] is a bit weird but not sure there is a way around it

"survey_id": "123",
"title": "Test Supplementary Data",
"theme": "default",
"description": "A questionnaire to demo using Supplementary data for placeholders, validation and routing in both repeating and non repeating sections.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would update the description here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new schema is not actually being used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in an integration test 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add discussed we decided to remove the schema and the integration test

app/routes/session.py Outdated Show resolved Hide resolved
@@ -67,25 +58,33 @@ class SupplementaryDataMetadataSchema(Schema, StripWhitespaceMixin):

@validates_schema()
def validate_dataset_and_survey_id( # pylint: disable=unused-argument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name should probably change since you also check "sds_schema_version" now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, I moved the sds_schema_version validation in it's own function underneath, just because it makes the name look smaller in addition to the method name making sense 😅 and validation still passes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, that new function will require some unit testing (if it's not already tested).

Copy link
Contributor

@liamtoozer liamtoozer Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V minor - No worries if this has already been discussed, so feel free to ignore me 😆 but we now have 2 similar functions that validate things in the payload as follows:

  • existing validate_dataset_and_survey_id() which validates 2 things in the payload
  • newly added validate_sds_schema_version() which validates 1 thing in the payload

would it be worth splitting validate_dataset_and_survey_id into separate validation functions, if we're making the validation as individual functions?

Personal preference, but I feel like it might make the most sense to combine all validation logic for this into a single function so it checks all 3 things (like it was originally in the PR I think?) and just rename it to something like validate_payload()?

@patch(
"app.questionnaire.questionnaire_store_updater.QuestionnaireStoreUpdaterBase.set_supplementary_data",
)
def test_login_with_sds_schema_version_valid(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably need to move your testing to test_supplementary_data_parser.py where the class and methods you changed are tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this test was just to check if the schema opens, I initially did a functional test, but then we decided it would be better to add it as an integration test. So we chose to put it in test_login as these just check if schemas can open?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this is testing anything at the moment though cause if you change the sds_schema_version in your schema the suite still passes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed we agreed that the unit test did the same as the integration test and so I removed it

dataset_id: str,
identifier: str,
survey_id: str,
sds_schema_version: str | None = None,
) -> dict:
Copy link
Contributor

@petechd petechd Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little detail here but for the return value type hint we recommend in our guide to at least give types of the dict keys and values. https://github.com/ONSdigital/eq-questionnaire-runner/blob/main/doc/python-type-hinting.md#generic-types

)


def test_valid_supplementary_dataset_version():
Copy link
Contributor

@petechd petechd Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create some more readable code by just writing a test where valid dataset version does not raise an error by using some @contextmanager decorator like this one:

from contextlib import contextmanager

@contextmanager
def not_raises(exception):
  try:
    yield
  except exception:
    raise pytest.fail("DID RAISE {0}".format(exception))

Copy link
Contributor

@petechd petechd Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and then use it as with not_raises(ValidationError):
Then you can rename the test to align the wording with the first one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants