SNOW-1506546: Add support for `INCLUDE_METADATA` copy option for `df.copy_into_table()` #1839
Actually, there is a bigger issue. I was able to work around the string quoting like this:
Using this custom class:
But then the SQL fails:
Is it possible to generate a COPY statement in Snowpark that does not explicitly list columns and instead relies on native schema evolution? That may be required for
Hello @sfc-gh-kgaputis, thanks for raising this issue.
https://docs.snowflake.com/en/release-notes/2024/8_17#new-copy-option-include-metadata
The purpose of a copy transform is to transform columns while loading, so it expects explicit column names. That is why we can't skip the column names in a copy transform.
Regards,
Hello @sfc-gh-kgaputis, I will check and post an update if we have any plans to support it.
Regards,
Hello @sfc-gh-kgaputis, after discussion with the team:
SQL compilation error: include_metadata is not supported with copy transform
Example query history:
At present it is not supported, so I will take this as a feature request.
Regards,
Hello @sfc-gh-kgaputis, you can try the following code to get the metadata. It's working fine and fetches all the metadata:
`from snowflake.snowpark.column import METADATA_FILENAME, METADATA_FILE_ROW_NUMBER, METADATA_FILE_LAST_MODIFIED, METADATA_START_SCAN_TIME`
`schema_for_data_file = StructType([`
`df = session.read.schema(schema_for_data_file)`
`target_column_names = ["seq", "first_name", "last_name", "FILE_LAST_MODIFIED", "FILE_SCAN_TIME", "FILENAME"]`
`session.sql("DROP TABLE IF EXISTS copied_into_table").collect()`
`user_schema = StructType([`
Output
I am running into a similar issue:
`df.copy_into_table('DEV_CORE.RAW.BROADCAST_UNP', force=True, MATCH_BY_COLUMN_NAME='CASE_INSENSITIVE', include_metadata='(CHECKSUM=METADATA$FILE_CONTENT_KEY)')`
Without the quotes around `(CHECKSUM=METADATA$FILE_CONTENT_KEY)` the code does not compile, and with the quotes Snowflake gives the following:
For reference, the following SQL works when executed in Snowflake directly:
I got redirected here from support and have been thinking about how to make it work.
Somehow this
Plus, depending on where this field is located, it can mess with the table structure (my filename is at the end, for example). I came up with this solution:
with raw_files being my DataFrameReader. Only tweaking the `transformations` argument helped me add the filename in my COPY INTO processes.
I'm using Snowpark Python 1.16.0.
I'd like to add metadata columns to my COPY statement using the new `include_metadata` copy option, but there's no way to pass in the value in such a way that it doesn't get quoted:
I think it comes down to this code in `AnalyzerUtils`:
And `convert_value_to_sql_option` doesn't support a list of key-value pairs.
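To make the quoting problem concrete, here is a minimal, self-contained sketch of how a copy-option renderer could treat a string value versus a mapping of key-value pairs. This is not Snowpark's actual code: `render_copy_option` and `render_copy_options` are hypothetical helpers written for illustration, and accepting a `dict` for `include_metadata` is an assumption about what a fix might look like, not an existing API.

```python
from typing import Dict, Union

# Hypothetical value types for a COPY option; the Dict branch is the
# assumed extension, not something Snowpark is known to accept.
OptionValue = Union[str, bool, int, float, Dict[str, str]]


def render_copy_option(value: OptionValue) -> str:
    """Render one COPY option value as SQL text (illustrative only)."""
    if isinstance(value, dict):
        # Assumed extension: key-value pairs become an unquoted,
        # parenthesised list such as (CHECKSUM=METADATA$FILE_CONTENT_KEY).
        pairs = ", ".join(f"{column}={field}" for column, field in value.items())
        return f"({pairs})"
    if isinstance(value, str):
        # Strings are emitted as SQL string literals, which is how the
        # metadata list ends up quoted in the problem reported here.
        escaped = value.replace("'", "''")
        return f"'{escaped}'"
    # Booleans and numbers are emitted via str().
    return str(value)


def render_copy_options(options: Dict[str, OptionValue]) -> str:
    """Join option names and rendered values into one SQL fragment."""
    return " ".join(
        f"{name.upper()} = {render_copy_option(value)}"
        for name, value in options.items()
    )


# Passing the list as a string yields a quoted literal.
quoted = render_copy_options(
    {"include_metadata": "(CHECKSUM=METADATA$FILE_CONTENT_KEY)"}
)
# Passing it as a mapping yields the unquoted form that COPY expects.
unquoted = render_copy_options(
    {"include_metadata": {"CHECKSUM": "METADATA$FILE_CONTENT_KEY"}}
)
print(quoted)
print(unquoted)
```

The first call prints the option with the value wrapped in single quotes, and the second prints it as a bare parenthesised list. Even with unquoted rendering, the server-side restriction mentioned in the comments (include_metadata is not supported with copy transform) would still apply whenever the generated COPY statement lists columns explicitly.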