-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref(normalization): Only run normalization once, with item headers #3730
Conversation
c0b2d5d
to
823ebe4
Compare
868fdff
to
3508fdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There have been no protocol/normalization changes, so it's safe to reuse the same feature flags.
@@ -161,5 +161,5 @@ | |||
"username": "username" | |||
}, | |||
"version": "7", | |||
"location": null | |||
"location": "not-empty-so-that-it-isnt-dropped" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra fields with null values are not serialized. Setting this value to anything else ensures we test future-compatibility.
tests/integration/test_store.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These use cases are covered in new tests in tests/integration/test_normalization.py
.
This PR updates the approach to decide whether to run normalization on an event.
Relay looks at the headers of items in the envelope to decide whether to run normalization, instead of relying on assumptions on whether the previous relay run normalization. Since external systems can rewrite the header, only envelopes coming from relays in the internal infrastructure are trusted.
Relay resets the
fully_normalized
flag after updating the event when processing envelopes, in order to ensure it runs normalization. It follows the same approach for events created in the processing pipeline.Events from attachments
Relay disables the normalization flag for events created from attachments, and it will later normalize them.
This step is required as only processing relays deal with attachments. It could be simpler if we moved the attachment processing out of processing relays, but this is out of scope of this PR.
Background
Currently, PoP relays run some normalization, and processing relays run full normalization. Since both layers are internal and their deployments are controlled by Sentry, running full normalization in PoP relays saves the CPU in processing relays (estimated around 10%) while ensuring it always runs once.
Previous attempt
The previous attempt forced full normalization in PoPs and assumed processing relays would always receive the fully normalized events. However, it didn't consider events created in processing relays (from minidumps and various crash reporting mechanisms), resulting in ingesting events without running normalization.
Rollout and rollback
Feature flags will first enable full normalization in PoPs and then disable normalization in processing. The integration test
test_relay_chain_normalizes_events
validates this flow. Disabling feature flags will roll back this behavior.