-
-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Couch to SQL Repeat Records PR 1: Write to SQL #33978
Conversation
cached_property is less complicated and easier to patch.
Syncing submodels using the Couch document model is expensive. It would be wildly inefficient to rewrite all attempts every time a repeat record is saved, not to mention when a new attempt is being added. However, proper submodel syncing is needed when copying old documents from Couch to SQL. New attempts are only added in non-migration code in one place, RepeatRecord.add_attempt(), which will be addressed in another commit.
@@ -50,7 +67,6 @@ def setUpClass(cls): | |||
domain=cls.domain, | |||
succeeded=True, | |||
repeater_id=cls.repeater_id, | |||
next_check=before, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this line removed/how was it breaking the tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not valid to have a next_check
date in SQL when state == Success
.
commcare-hq/corehq/motech/repeaters/models.py
Lines 1366 to 1373 in f8abe92
models.CheckConstraint( | |
name="next_check_pending_or_null", | |
check=( | |
models.Q(next_check__isnull=True) | |
| models.Q(next_check__isnull=False, state=State.Pending) | |
| models.Q(next_check__isnull=False, state=State.Fail) | |
) | |
), |
repeater=cls.repeater, | ||
registered_at=now, | ||
)) | ||
cls.sql_records = list(SQLRepeatRecord.objects.filter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You can get rid of line 534 (cls.sql_records = [])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This gets refactored in a later commit (it will be in the next PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good overall. Just a minor comment on tests.
I am assuming this is already in process of QA and is deployed on staging?
self.succeeded = False | ||
self.cancelled = False | ||
try: | ||
reason = self.failure_reason |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why will this raise AssertionError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jsonobject's JsonProperty
raises AssertionError
if the record does not have a failure_reason
. That's annoying. It should raise AttributeError
, which would allow standard use of getattr
.
# NOTE a 204 status here could mean either | ||
# 1. the request was not sent, in which case response is | ||
# probably RepeaterResponse(204, "No Content") | ||
# 2. the request was sent, and the remote end responded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it is required to specify this in user facing docs as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly if we want to keep it this way. I think a better long-term fix would be to change the way we internally signal that the request was not sent so there is no ambiguity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point (see docstring) we internally signaled that the request was not sent by passing True
.
How about ...
if response is True or (
isinstance(response, RepeaterResponse)
and response.status_code == 204
):
state = State.Empty
else:
state = State.Success
... and in the future we can come up with a better, and consistent, internal signal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kaapstorm can you point to the place(s) where response is True
originates? I don't see anywhere that does that. Is it possible that they were all changed from return True
to return RepeaterResponse(204, "No content")
?
Edit: a test asserts that a (SQL) record created with response=True
results in state == Success
(not Empty
). Should that test be changed?
commcare-hq/corehq/motech/repeaters/tests/test_models.py
Lines 304 to 305 in d624c4a
self.repeat_record.add_success_attempt(response=True) | |
self.assertEqual(self.repeat_record.state, RECORD_SUCCESS_STATE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that they were all changed from
return True
to returnRepeaterResponse(204, "No content")
?
I checked. They were all changed:
Repeater.send_request() -> requests.Response
Dhis2Repeater.send_request() -> requests.Response | RepeaterResponse
FHIRRepeater.send_request() -> requests.Response | RepeaterResponse
ReferCaseRepeater.send_request() -> requests.Response | RepeaterResponse
DataRegistryRepeater.send_request() -> requests.Response | RepeaterResponse
OpenmrsRepeater.send_request() -> RepeaterResponse
Dhis2EntityRepeater.send_request() -> RepeaterResponse
We should update that docstring, in a separate PR.
a test asserts that a (SQL) record created with
response=True
results instate == Success
(notEmpty
). Should that test be changed?
I guess that test can be removed in the PR that updates the docstring.
Correct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. For parts I couldn't follow super well, I looked for tests—and the way you paired small test changes with each commit made that easy!
@@ -1049,8 +1049,7 @@ def wrap(cls, data): | |||
)] | |||
return self | |||
|
|||
@property | |||
@memoized | |||
@cached_property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to note that cached_property
and memoized
+ property
have somewhat different semantics; if you try to directly write to a memoized
property
it raises an error, whereas if you try to directly write to a cached_property
it will let you override the value defined by the decorated method. https://docs.python.org/3/library/functools.html#functools.cached_property
Not sure if that is disqualifying, but wanted to make sure you'd considered that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, not disqualifying.
|
||
May include repeaters that have been created since the migration | ||
started, whose records are already migrated. Also ignore records | ||
associated with deleted repeaters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds probably fine, but to confirm, what is the behavior now of records for deleted repeaters? Are they hidden from the UI and any other places the user might be able to see that they're still in the system? Or are the effectively abandoned and never shown to the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "deleted" is referring to hard-deleted repeaters. Such repeat records cannot exist in SQL because of the foreign key relationship between (SQL)Repeater
and SQLRepeatRecord
. In Couch they could exist, but would be orphaned.
To answer your question more directly, it is possible to find them in the repeat record report, for example by searching by payload ID, but attempting to view responses or the payload result in an error:
Repeater with id ... could not be found
Clicking the button to Resend Payload or Requeue Payload appears to succeed, but when the record is processed it will not be forwarded because the repeater does not exist.
It seems like a bug that repeat records for deleted repeaters can be viewed at all since the information about where they were sent was deleted with the repeater.
def iter_domain_docs(domain): | ||
return iter_docs(chunk_size, startkey=[domain], endkey=[domain, {}]) | ||
for domain in domains: | ||
yield from iter_domain_docs(domain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are these two new methods called? Couldn't find it in the PR diff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're called by the Couch to SQL migration framework code when migrating individual domains.
commcare-hq/corehq/apps/cleanup/management/commands/populate_sql_model_from_couch_model.py
Lines 368 to 370 in bbe7d35
doc_count = self._get_couch_doc_count_for_domains(domains) | |
sql_doc_count = self._get_sql_doc_count_for_domains(domains) | |
docs = self._iter_couch_docs_for_domains(domains, chunk_size) |
# NOTE a 204 status here could mean either | ||
# 1. the request was not sent, in which case response is | ||
# probably RepeaterResponse(204, "No Content") | ||
# 2. the request was sent, and the remote end responded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point (see docstring) we internally signaled that the request was not sent by passing True
.
How about ...
if response is True or (
isinstance(response, RepeaterResponse)
and response.status_code == 204
):
state = State.Empty
else:
state = State.Success
... and in the future we can come up with a better, and consistent, internal signal?
return super().save(*args, **kw) | ||
|
||
def _migration_sync_submodels_to_sql(self, sql_object): | ||
if self._should_sync_attempts: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't follow the paths of what calls __migration_sync_to_sql()
, but I'm guessing that you are sure that this cannot be reached outside the enable_attempts_sync_to_sql()
context manager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is only called within the context of .save()
, which uses the context manager. Having the attribute not set outside the context manager will alert us (hopefully in tests, if not then in Sentry) if any code path arises that violates that precondition.
Couch can have null attempt messages, SQL cannot.
next_check is only valid if the record is in Pending or Fail state.
Latest commits address data states discovered while running |
self.assertEqual(len(obj.attempts), len(doc.attempts)) | ||
self.assertTrue(obj.attempts) | ||
|
||
def test_migrate_record_with_unynced_sql_attempts2(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_migrate_record_with_unynced_sql_attempts2(self): | |
def test_migrate_record_with_unsynced_sql_attempts2(self): |
What is the difference between this test and the test above it? What extra behaviour is it testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 9e7d4d3, in which I renamed the test and added a comment.
This is the first PR in the Couch to SQL migration process of Repeat Records. Commits for PR 2 (and eventually PR 3) can be found on dm/sql-repeatrecord.
🐠 Review by commit.
https://dimagi-dev.atlassian.net/browse/SAAS-14299
Safety Assurance
Safety story
Automated tests have been added where necessary to cover new functionality and to verify that Couch Repeat Records are synced to SQL as expected. In general, TDD was used to develop this PR. Repeater functionality should be unchanged.
Automated test coverage
Yes.
QA Plan
https://dimagi-dev.atlassian.net/browse/QA-6025
Rollback instructions
A decision will need to be made about what to do with data in
SQLRepeatRecord
andSQLRepeatRecordAttempt
tables If this PR is reverted. Otherwise, this PR can be reverted after deploy with no further considerations.