Mixed eol characters in the same csv file is not handled #439
Comments
IMHO this lib does not have to support every kind of incorrectly formatted CSV file; there are users already complaining about the size of the lib. You have to preprocess the file before feeding it into this module. In Node this is quite easy, and you can use a stream reader with this lib.
I don't necessarily disagree. As mentioned, this may be out of scope for this lib. The outcome could be a code fix, simple documentation describing how the newline character is auto-detected and used, or no action at all. I just wanted to bring it up with the maintainers in case this scenario had not been considered.
Maybe I was not clear enough: this module allows stream input, so you can preprocess the data before it reaches the parser. Untested code, something like this:

const fs = require('fs')
const { Transform } = require('stream')
const csv = require('csvtojson')

// Preprocessing step: every chunk passes through here before the parser sees it.
const trans = new Transform({
  transform(chunk, encoding, callback) {
    // Process the chunk; upper-casing is only a placeholder for real cleanup.
    const processedChunk = chunk.toString().toUpperCase()
    callback(null, processedChunk)
  },
})

csv()
  .fromStream(fs.createReadStream('/path/to/file', { encoding: 'utf-8' }).pipe(trans))
  .subscribe(
    (json) => {
      // Called once per parsed row.
      console.log(json)
    },
    (err) => {
      throw err
    },
    () => {
      console.log('success')
    }
  )
Thanks, that's exactly what I used to pre-process the file actually.
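For the mixed-EOL case specifically, the preprocessing step could look roughly like the sketch below. It is only an illustration: `createEolNormalizer` is a hypothetical helper, not part of csvtojson, and it assumes UTF-8 input and that rewriting every `\r\n` or lone `\r` to `\n` is acceptable for the data (carriage returns inside quoted fields would be rewritten too).

```javascript
const { Transform } = require('stream')
const { StringDecoder } = require('string_decoder')

// Hypothetical helper (not part of csvtojson): returns a Transform stream
// that rewrites "\r\n" and lone "\r" line endings to "\n".
function createEolNormalizer() {
  const decoder = new StringDecoder('utf8') // keeps multi-byte characters intact across chunks
  let pendingCr = false // true when the previous chunk ended with "\r"

  return new Transform({
    transform(chunk, encoding, callback) {
      let text = decoder.write(chunk)
      if (pendingCr) {
        // Re-attach the "\r" held back from the previous chunk.
        text = '\r' + text
        pendingCr = false
      }
      if (text.endsWith('\r')) {
        // Hold back a trailing "\r": the next chunk may start with "\n".
        text = text.slice(0, -1)
        pendingCr = true
      }
      const out = text.replace(/\r\n|\r/g, '\n')
      if (out) this.push(out)
      callback()
    },
    flush(callback) {
      // Emit whatever is still buffered when the input ends.
      const rest = decoder.end() + (pendingCr ? '\n' : '')
      if (rest) this.push(rest)
      callback()
    },
  })
}
```

It would slot into the snippet above in place of `trans`, i.e. the read stream is piped through `createEolNormalizer()` before being handed to `fromStream`. Holding back a trailing `\r` matters because a `\r\n` pair can be split across two chunks.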
This may be a non-standard CSV format, but if a CSV file has a carriage return as the eol character in the first row and then uses, say, newline characters for the remaining rows, the lib will parse the remaining lines as one giant row. The output is then an array of one object with a massive number of keys (the example file below yields field7507057 as the last key in the object).
Example of this kind of file is a data file from the US Department of Education: https://nces.ed.gov/surveys/pss/zip/pss1920_pu_csv.zip
This may be outside the scope of this lib to handle, but I wanted to bring it to your attention.
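The failure mode can be illustrated without the library at all. The sketch below is not csvtojson's actual detection code; `detectEol` is a hypothetical stand-in that assumes the parser picks the first line break it sees and then uses it for the whole file. The sample data is made up for the illustration.

```javascript
// Hypothetical stand-in for EOL auto-detection: use the first line break found.
function detectEol(text) {
  const match = /\r\n|\r|\n/.exec(text)
  return match ? match[0] : '\n'
}

// Header row ends with "\r", every later row ends with "\n".
const mixed = 'id,name\r1,Ann\n2,Bob\n3,Cy\n'

const eol = detectEol(mixed)    // "\r", taken from the header row
const rows = mixed.split(eol)   // header plus a single giant "row"
const cells = rows[1].split(',')

console.log(rows.length)        // number of rows the naive splitter sees
console.log(cells.length)       // more cells than there are header columns
```

Under that assumption, every `\n` after the header is treated as ordinary cell content, so all remaining records collapse into one row with far more cells than header columns.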
Repro steps:
Download and unzip the example file
See the results: