Add slash at the end of the load path (#1366)
Shredder puts entities from the same batch that share a schema model-revision-addition under the same folder. Say a batch contains events with versions `1-0-0`, `1-0-1` and `1-0-2` of `com.acme.test`. In that case, the resulting run folder will have the following subfolders:

```
output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=0
output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=1
output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=2
```

Before this fix, Loader used S3 paths without a trailing slash (/) in the copy statements it created. This works fine in most cases. The problem starts when the same batch contains events with `1-0-1` and `1-0-11`. In that case, the run folder will have the following subfolders:

```
output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=1
output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=11
```

When the copy statement loads the entities under `/model=1/revision=0/addition=1` into their table, Redshift also tries to copy the entities under `/model=1/revision=0/addition=11`, because both paths share the same prefix. The copy then fails, since the data under `/model=1/revision=0/addition=11` does not have the same structure as `1-0-1`.

Adding a slash at the end of the path solves the problem. With this change, only entities under `model=1/revision=0/addition=1` are copied, as expected.
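The prefix collision can be reproduced with a minimal Python sketch. This is not the Loader's actual code: the helper names `keys_under_prefix` and `with_trailing_slash` and the `part-0000.txt` file names are hypothetical, and plain string-prefix matching stands in for how the S3 path in a copy statement is matched against object keys.

```python
# Object keys as they might appear in a run folder.
# The "part-0000.txt" file names are made up for illustration.
KEYS = [
    "output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=1/part-0000.txt",
    "output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=11/part-0000.txt",
]

# Load path for schema version 1-0-1, without a trailing slash.
BASE = "output=good/vendor=com.acme/name=test/format=tsv/model=1/revision=0/addition=1"


def keys_under_prefix(keys, prefix):
    """Return the keys that start with the prefix (simple string-prefix matching)."""
    return [key for key in keys if key.startswith(prefix)]


def with_trailing_slash(path):
    """Append a slash unless the path already ends with one (hypothetical helper)."""
    return path if path.endswith("/") else path + "/"


# Without the slash, "addition=1" is also a prefix of "addition=11".
without_slash = keys_under_prefix(KEYS, BASE)

# With the slash, only keys inside the addition=1 folder match.
with_slash = keys_under_prefix(KEYS, with_trailing_slash(BASE))

print(len(without_slash), len(with_slash))
```

Here `without_slash` picks up files from both the `addition=1` and `addition=11` folders, while `with_slash` contains only the file under `addition=1/`.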