NEXT-37310 - Added single row import strategy on import error #35
LGTM 👍
Great work 💪 , only added some small notes 🙂
CHANGELOG.md
@@ -1,4 +1,5 @@
# NEXT-RELEASE
- NEXT-37310 - Added single row import strategy when encountering an unhandleable error during a chunk import.
Current:
- NEXT-37310 - Added single row import strategy when encountering an unhandleable error during a chunk import.
Suggested:
- NEXT-37310 - Added single row import strategy when encountering an error without a reference to a specific row during a chunk import.
I would not say that this formulation is better, as there are errors, e.g. a product with a duplicate product number, that reference a specific row but also exit the sync call with an error.
In my opinion it would maybe be better to write something like "[...] when encountering an error that cannot be handled automatically during a chunk import".
> I would not say that this formulation is better
Of course I also wouldn't say it is better, it was just a suggestion my mind came up with. Feel free to discard it 🙂.
> as there are errors, e.g. a product with a duplicate product number, that reference a specific row but also exit the sync call with an error.
Do they? I thought that if we have an "error pointer", we would just remove that invalid entry from the chunk and retry. Of course the user still sees the error.
And if we don't have specific "error pointers", we fall back to the one-by-one import and report any errors encountered there, which still shouldn't fail the whole import.
We only filter the rows that have a corresponding SwError::WriteError. But the error that is thrown when we have, for example, a duplicate product number is of the type SwError::GenericError, so it won't be filtered by remove_invalid_entries_from_chunk.
And with my current implementation these faulty rows will still be skipped during the "single row import", as every row that triggers an error that cannot be handled automatically, e.g. the "duplicate product number" error (not deadlocks and the like!), will just be ignored.
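The two paths described in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from import.rs: the payloads of the `SwError` variants, the `import_chunk` function, and the `sync` callback are all assumptions made for the example, and retryable errors such as deadlocks are not modeled.

```rust
/// Hypothetical stand-in for the project's error type. Only the two variants
/// named in the discussion are modeled; their payloads are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum SwError {
    /// Error that points at specific rows of the chunk (row indices assumed).
    WriteError(Vec<usize>),
    /// Error without a usable row pointer, e.g. a duplicate product number.
    GenericError(String),
}

/// Imports a chunk and returns the imported rows plus every error seen.
/// `sync` is a hypothetical callback standing in for the real sync call.
fn import_chunk<F>(chunk: Vec<String>, sync: F) -> (Vec<String>, Vec<SwError>)
where
    F: Fn(&[String]) -> Result<(), SwError>,
{
    let mut errors = Vec::new();
    match sync(&chunk) {
        Ok(()) => (chunk, errors),
        Err(SwError::WriteError(bad_rows)) => {
            // The faulty rows are identifiable: drop them and retry once.
            let remaining: Vec<String> = chunk
                .into_iter()
                .enumerate()
                .filter(|(i, _)| !bad_rows.contains(i))
                .map(|(_, row)| row)
                .collect();
            errors.push(SwError::WriteError(bad_rows));
            match sync(&remaining) {
                Ok(()) => (remaining, errors),
                Err(e) => {
                    errors.push(e);
                    (Vec::new(), errors)
                }
            }
        }
        Err(first @ SwError::GenericError(_)) => {
            // No row pointer: fall back to importing one row at a time.
            errors.push(first);
            let mut imported = Vec::new();
            for row in chunk {
                let result = sync(std::slice::from_ref(&row));
                match result {
                    Ok(()) => imported.push(row),
                    // The row is skipped, but its error is still reported.
                    Err(e) => errors.push(e),
                }
            }
            (imported, errors)
        }
    }
}
```

In this sketch a chunk `["a", "dup", "b"]`, where syncing `"dup"` yields a `GenericError`, ends with `"a"` and `"b"` imported and two errors reported: one from the failed chunk and one from the single-row attempt.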
6e4a5ba to d41610a
When we encounter an error that cannot be handled yet while importing a chunk, we return the whole error but skip the chunk.
I've added a new strategy where we try to import the chunk piece by piece to specifically filter out the rows that are invalid (the error will still be returned to the user, cf. import.rs ll. 172 - 173).
There was also a small bug in remove_invalid_entries_from_chunk: if a row contained multiple invalid fields, not only the row itself but also some of the following rows were filtered (2 errors -> 1 row after the invalid one, and so on). To prevent that, I added a filter for duplicates in import.rs l. 255.
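One plausible way such a bug arises is index-based removal with repeated indices, shown here as a guess since the original code is not part of this page. The function `remove_rows_by_index` and its `dedup` flag are hypothetical and exist only so that both behaviours can be compared.

```rust
/// Hypothetical reconstruction: removes rows by index, highest index first.
/// `dedup` toggles the duplicate filter so both behaviours can be compared.
fn remove_rows_by_index(
    mut chunk: Vec<String>,
    mut invalid: Vec<usize>,
    dedup: bool,
) -> Vec<String> {
    // Descending order keeps the not-yet-removed indices valid.
    invalid.sort_unstable_by(|a, b| b.cmp(a));
    if dedup {
        // Vec::dedup only drops consecutive repeats, hence the sort above.
        invalid.dedup();
    }
    for index in invalid {
        if index < chunk.len() {
            // A repeated index removes whichever row moved into that slot.
            chunk.remove(index);
        }
    }
    chunk
}
```

Under these assumptions, a row reported twice (two invalid fields) takes the following row with it when the indices are not deduplicated, which matches the "2 errors -> 1 row after the invalid one" pattern from the description.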