We got an error like this:
It turns out the column/character references were misleading. What actually happened was that an earlier column in the row had the literal value `\` for a string field. This caused the CSV to serialize the field as a bare `\` immediately followed by the delimiter, and Snowflake interprets that backslash as escaping the comma, offsetting all the columns by one.
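To make the failure mode concrete, here is a minimal sketch (the column names and values are made up, not the real schema): with the default `excel` dialect there is no escape character, and a lone backslash triggers neither escaping nor quoting, so it lands in the file completely bare.

```python
import csv
import io

# Hypothetical columns standing in for the real table.
out = io.StringIO()
writer = csv.DictWriter(out, ["id", "note", "amount"])
writer.writeheader()
writer.writerow({"id": "1", "note": "\\", "amount": "42"})  # "note" is a single literal backslash
print(out.getvalue())
# id,note,amount
# 1,\,42
```

When Snowflake loads `1,\,42` with `\` configured as the escape for unenclosed fields, the `\,` is read as an escaped comma, so the second column swallows the delimiter and everything after it shifts left by one.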
Python `csv` didn't escape or quote the value because it is using the default `excel` dialect, which has no escape character.

The ideal solution would be to tell Python `csv` that `\` is an escape character (`csv.DictWriter(out, csv_headers, escapechar='\\')`). This causes the field to be quoted. However, it also causes `\\N` to be quoted, breaking the null columns.
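For reference, the rejected variant looks roughly like this (column names again hypothetical). The writer's behaviour here is version-dependent; the output shown matches the quoting behaviour described above, while newer CPython releases escape the escape character instead of quoting the field.

```python
import csv
import io

out = io.StringIO()
# escapechar alone, keeping the default QUOTE_MINIMAL quoting
writer = csv.DictWriter(out, ["id", "note"], escapechar="\\")
writer.writeheader()
writer.writerow({"id": "1", "note": "\\N"})  # \N is the null marker in the dump
print(out.getvalue())
# id,note
# 1,"\N"
# The null marker is now enclosed in quotes, so Snowflake loads it as the
# literal string \N rather than NULL.
```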
The best solution I could come up with was to tell it to use `\` as an escape character and use escaping rather than quoting (`quoting=csv.QUOTE_NONE`). I don't know if this will have any unintended side effects. It does mean `FIELD_OPTIONALLY_ENCLOSED_BY = '"'` is no longer necessary, since quoting will never happen; fields with commas/newlines will now be escaped instead of quoted. I'm open to other solutions here.
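Concretely, the change amounts to something like this sketch (the headers are placeholders for the real `csv_headers`): with `quoting=csv.QUOTE_NONE` nothing is ever wrapped in quotes, and the delimiter, the quote character, newlines, and the escape character itself are all prefixed with `\` instead.

```python
import csv
import io

csv_headers = ["id", "note", "amount"]  # placeholder for the real header list
out = io.StringIO()
writer = csv.DictWriter(out, csv_headers, escapechar="\\", quoting=csv.QUOTE_NONE)
writer.writeheader()
writer.writerow({"id": "1", "note": "\\", "amount": "42"})   # the literal backslash from before
writer.writerow({"id": "2", "note": "a,b", "amount": "7"})   # embedded comma
print(out.getvalue())
# id,note,amount
# 1,\\,42
# 2,a\,b,7
```

Since no field is ever enclosed in double quotes under this configuration, `FIELD_OPTIONALLY_ENCLOSED_BY = '"'` on the COPY side has nothing left to match, which is why it can be dropped.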