Commit

improves csv docs
rudolfix committed Jun 26, 2024
1 parent 4cb2646 commit f098e5a
Showing 2 changed files with 26 additions and 1 deletion.
13 changes: 13 additions & 0 deletions docs/website/docs/dlt-ecosystem/destinations/snowflake.md
@@ -152,6 +152,19 @@ When staging is enabled:
When loading from `parquet`, Snowflake will store `complex` types (JSON) in `VARIANT` as a string. Use the `jsonl` format instead or use `PARSE_JSON` to update the `VARIANT` field after loading.
:::

### Custom csv formats
By default, we support the csv format [produced by our writers](../file-formats/csv.md#default-settings), which is comma-delimited, includes a header, and is optionally quoted.

You can configure your own formatting, e.g. when [importing](../../general-usage/resource.md#import-external-files) external `csv` files:
```toml
[destination.snowflake.csv_format]
delimiter="|"
include_header=false
on_error_continue=true
```
This configuration reads a `|`-delimited file without a header and continues on errors.

Note that we ignore missing columns (`ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE`) and insert `NULL` into them.
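
To make the effect of these settings concrete, here is a minimal Python sketch that uses only the standard library. It is not how `dlt` or Snowflake load the file: the helper `read_rows`, the column names, and the sample data are hypothetical and only mimic the behavior described above (`|` as the delimiter, no header row, and missing trailing columns filled with `None` as a stand-in for `NULL`).

```python
import csv
import io


def read_rows(text, column_names, delimiter="|"):
    """Parse headerless delimited text; pad missing trailing columns with None."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for fields in reader:
        # A short row gets None for every column it is missing (analogous to NULL).
        padded = fields + [None] * (len(column_names) - len(fields))
        rows.append(dict(zip(column_names, padded)))
    return rows


# Hypothetical sample: the second row lacks the "city" column.
sample = "1|alice|NY\n2|bob\n"
rows = read_rows(sample, ["id", "name", "city"])
```

Here the second row is one column short, so its `city` value ends up as `None` instead of raising an error.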

## Supported column hints
Snowflake supports the following [column hints](https://dlthub.com/docs/general-usage/schema#tables-and-columns):
14 changes: 13 additions & 1 deletion docs/website/docs/dlt-ecosystem/file-formats/csv.md
@@ -28,11 +28,23 @@ info = pipeline.run(some_source(), loader_file_format="csv")
`dlt` attempts to make both writers generate similar-looking files:
* separators are commas
* quotes are **"** and are escaped as **""**
* `NULL` values are empty strings
* `NULL` values are written both as empty strings and as empty tokens, as in the example below
* UNIX new lines are used
* dates are represented as ISO 8601
* quoting style is "when needed"

Example of NULLs:
```sh
text1,text2,text3
A,B,C
A,,""
```

In the last row, both the `text2` and `text3` values are NULL. The Python `csv` writer
is not able to write unquoted `None` values, so we had to settle for `""`.

Note: all destinations capable of writing csvs must support this NULL representation.
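
As a small illustration of what this means for a consumer, the sketch below parses the example above with Python's standard `csv` module. It assumes, as this section states, that both an empty token and `""` stand for NULL. The mapping to `None` is our own illustrative step, not part of `dlt`.

```python
import csv
import io

# The example file from above: the last row has an empty token and a quoted empty string.
data = 'text1,text2,text3\nA,B,C\nA,,""\n'

reader = csv.DictReader(io.StringIO(data))
rows = [
    # Both forms are read back as "", so a single check covers them.
    {key: (value if value != "" else None) for key, value in row.items()}
    for row in reader
]
```

Because the reader returns an empty string for both spellings, the two NULL forms cannot be told apart after parsing. The same holds for a genuinely empty string.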

### Change settings
You can change basic **csv** settings; this may be handy when working with the **filesystem** destination. Other destinations are tested
with standard settings:
