Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a test to show error described in #126 (also fixes #126) #128

Merged
merged 5 commits into from
Aug 24, 2023

Conversation

Diggory
Copy link
Contributor

@Diggory Diggory commented Aug 17, 2023

Adds a test to check if creating a CSV instance directly from a string that starts with a BOM correctly creates the headers (without the BOM in the first header.). This is referencing issue #126

@Diggory
Copy link
Contributor Author

Diggory commented Aug 17, 2023

Now also fixes #126

@Diggory Diggory changed the title Adds a test to show error described in #126 Adds a test to show error described in #126 (also fixes #126) Aug 19, 2023
@Diggory
Copy link
Contributor Author

Diggory commented Aug 24, 2023

@DivineDominion Hi, any chance of a quick review? Thanks.

@DivineDominion
Copy link
Contributor

DivineDominion commented Aug 24, 2023

Ah, I see -- the now 'designated' string-based initializer does all the heavy lifting instead of the many URL based ones.

That sounds sensible to me; I'll approve and merge into master. Thank you!

In the meantime:

@lardieri Do you recall any good reason we did not go ahead with this, or did we just not notice when you came up with the first BOM fix?

@DivineDominion DivineDominion merged commit ed520ff into swiftcsv:master Aug 24, 2023
1 check passed
@lardieri
Copy link
Contributor

It's not that we didn't notice. It's that we decided it was irrelevant. BOMs do not apply to Strings, they apply to encodings of Strings.

A String in Swift is a sequence of abstract Characters that does not have an inherent encoding. (The internal storage of a String is none of our business.) Therefore, a String doesn't have and doesn't need to have any indicator of byte order.

The BOM is a hint to the decoder that turns a sequence of bytes into a String. It is not part of the String itself.

If someone is passing you a String that has something that looks like a byte order mark in the first Character, I would argue that it is a malformed String and the decoder that created it is buggy.

Unfortunately, as I mentioned in this comment, Apple's UTF-8 decoder is buggy.

Given all of the above, we decided we would deal with BOMs at the point where we decode a sequence of bytes into a String, namely, in the convenience initializers.

It's not that I strongly object to moving the BOM handler into the designated initializer. I just feel that it is papering over a bug elsewhere, namely in the code that reads the byte contents of the Excel CSV file and turns it into a String.

@DivineDominion
Copy link
Contributor

DivineDominion commented Sep 7, 2023

Thank you for the reminder and summary!

I just feel that it is papering over a bug elsewhere, namely in the code that reads the byte contents of the Excel CSV file and turns it into a String.

It's true that this is none of SwiftCVS's responsibility and we could do without this completely.

But 'Postel's Law' for robust software ("Be conservative in what you do, be liberal in what you accept from others") isn't a bad idea. Apple bugs usually don't go away quickly, so we can make the journey simpler for users of the library. That should be good for developer happiness.

If nobody is objecting:

@Diggory would you be willing to add a comment to the BOM that summarizes why we're doing that (b/c Apple's code is buggy) and point to https://github.com/swiftcsv/SwiftCSV/issues/97#issuecomment-1181599651?

@Diggory
Copy link
Contributor Author

Diggory commented Sep 7, 2023

Will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants