[Snowpipe Streaming] Fix IndexOutOfBoundsException thrown when offsets are not continuous during schema-evolution #1037
+61
−73
Connector Exception stacktrace :
Because of this, the connector could not evolve the schema (even if snowflake.role.name had the correct permissions).
Reproduction on ITs : #1036
What was the bug ?
This computation is not right.
When offsets are continuous, let's look at the records, offsets and sinkRecords objects:
sinkRecords : [SinkRecord<2000>, SinkRecord<2001> .... SinkRecord<2100>] (101 records)
records : [Map<2000>, .... Map<2100>]
offsets : [2000, ... 2100]
For idx = 10 => long originalSinkRecordIdx = 2010 - 2000 = 10 (correct).
If offsets are not continuous (with gaps, like when using FilterSMT):
sinkRecords : [SinkRecord<2000>, SinkRecord<2002> .... SinkRecord<2100>] (51 records)
records : [Map<2000>, Map<2002>, .... Map<2100>]
offsets : [2000, 2002, ... 2100]
For idx = 10 => long originalSinkRecordIdx = 2020 - 2000 = 20 (incorrect).
[We want the SinkRecord at index 10. For larger idx values the computed index exceeds the size of sinkRecords, which produces the IndexOutOfBoundsException.]
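The two cases above can be reproduced with a small standalone sketch. It is not the connector's code: the class and method names (OffsetIndexSketch, indexFromOffsetDifference, indexFromPosition) are hypothetical, and using the list position directly is only one possible way to avoid the problem, assuming records, offsets and sinkRecords are built in the same order.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetIndexSketch {

    // Builds an offsets list such as [2000, 2002, ..., 2100] (hypothetical helper).
    public static List<Long> offsets(long first, long last, long step) {
        List<Long> result = new ArrayList<>();
        for (long offset = first; offset <= last; offset += step) {
            result.add(offset);
        }
        return result;
    }

    // Offset-difference computation from the description above.
    // Only valid when the offsets have no gaps.
    public static long indexFromOffsetDifference(List<Long> offsets, int idx) {
        return offsets.get(idx) - offsets.get(0);
    }

    // Assumed alternative: the three lists share the same order,
    // so the position in one list is the position in the others.
    public static int indexFromPosition(int idx) {
        return idx;
    }

    public static void main(String[] args) {
        List<Long> continuous = offsets(2000, 2100, 1); // 101 offsets
        List<Long> gapped = offsets(2000, 2100, 2);     // 51 offsets
        int idx = 10;

        System.out.println(indexFromOffsetDifference(continuous, idx)); // 10
        System.out.println(indexFromOffsetDifference(gapped, idx));     // 20
        System.out.println(indexFromPosition(idx));                     // 10

        // With gaps, a later idx yields an index beyond the 51 sink records.
        long outOfRange = indexFromOffsetDifference(gapped, 30);        // 60
        System.out.println(outOfRange >= gapped.size());                // true
    }
}
```

With gapped offsets, the offset difference drifts further from the real position as idx grows, so any lookup in sinkRecords with that value eventually falls outside the list.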
Pre-review checklist
snowflake.ingestion.method.
Yes - Added end to end and Unit Tests.
No - Suggest why it is not param protected (This is a bug-fix)
Urgency
This review is high priority