Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1859651 Salt prefix for files mapped from different topics #1035

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sfc-gh-dseweryn
Copy link

@sfc-gh-dseweryn sfc-gh-dseweryn commented Dec 20, 2024

Overview

SNOW-1859651

Given two topics that are setup to be ingested to a single table using Snowpipe ingestion, each of the topics has an assigned cleaner for deleting already ingested files. This change prevents these cleaners from incorrectly cleaning files from the topic that should be handled by the other cleaner.
Any topic that is mapped by the topic2table.map should have filename prefix salted with the topic as it is otherwise complicated (and unnecessary) to determine which other topic can have a clash on cleaning filenames.

Supersedes: #1030

Pre-review checklist

  • This change should be part of a Behavior Change Release. See go/behavior-change.
  • This change has passed Merge gate tests
  • Snowpipe Changes
  • Snowpipe Streaming Changes
  • This change is TEST-ONLY
  • This change is README/Javadocs only
  • This change is protected by a config parameter snowflake.snowpipe.stageFileNameExtensionEnabled.
    • Yes - Added end to end and Unit Tests.
    • No - Suggest why it is not param protected
  • Is his change protected by parameter <PARAMETER_NAME> on the server side?
    • The parameter/feature is not yet active in production (partial rollout or PrPr, see Changes for Unreleased Features and Fixes).
    • If there is an issue, it can be safely mitigated by turning the parameter off. This is also verified by a test (See go/ppp).

FileNameUtils.filePrefix(conn.getConnectorName(), tableName, topicName, partition);
} else {
this.prefix = FileNameUtils.filePrefix(conn.getConnectorName(), tableName, "", partition);
FileNameUtils.filePrefix(conn.getConnectorName(), tableName, topicForPrefix, partition);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, if i read this correctly - we will not generate the salted prefix for all cases but only for a subset of them, right?
why not enable this by default for all?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. There are three benefits I can think of when topic2table.map is not used for particular topic:

  • no chances of regression
  • no leftover files just because the connector has been updated
  • easier to identify partition numbers

Copy link
Contributor

@sfc-gh-gjachimko sfc-gh-gjachimko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - left one question though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants