Open
Description
- Operating System: MacOS
- Node Version: 20
- NPM Version: 9.6.6
- csv-parser Version: ^3.0.0
Expected Behavior
Parse the file even if the data contains a double quote, or at least emit an error.
Actual Behavior
Parsing silently stops before the end of the file.
How Do We Reproduce?
I kept getting fewer rows streamed than I expected from the file located at https://download.geonames.org/export/dump/admin2Codes.txt
This is a tab-delimited file of 45,784 rows.
I realized that it was because one of the entries contains a double quote:
RU.45.517838 Novotor”yal’skiy Rayon Novotor"yal'skiy Rayon 517838
If I delete it, the file parses properly:
RU.45.517838 Novotor[DELETED]yal’skiy Rayon Novotor"yal'skiy Rayon 517838
import {Writable} from "node:stream";
import csvParser from "csv-parser";
import {Transform} from "stream";
import https from "https";

const repro = async () => {
  let lineCount = 0
  return new Promise<void>((resolve, reject) => {
    https.get("https://download.geonames.org/export/dump/admin2Codes.txt", (response) => {
      response
        .pipe(csvParser({separator: "\t", headers: ["id", "name", "nameAscii", "geonameId"]}))
        .pipe(new Transform({
          objectMode: true,
          // Count every row the parser emits and pass it through unchanged.
          transform(chunk, encoding, callback) {
            lineCount++
            this.push(chunk);
            callback();
          },
        }))
        .pipe(new Writable({
          objectMode: true,
          // Discard the rows; only the count matters for this repro.
          write(chunk, encoding, callback) {
            callback();
          }
        }))
        .on('finish', () => {
          console.log("total lines should be ~45k", lineCount)
          resolve()
        })
        .on('error', reject)
    }).on('error', reject)
  })
}

(async () => {
  await repro()
})()
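For comparison, here is a minimal sketch of a quote-agnostic splitter that treats a double quote as an ordinary character. It is not part of csv-parser and does not explain what csv-parser does internally. The names `parseTsvLine`, `parseTsv`, and `Admin2Row` are hypothetical, the tab positions in the sample row are assumed from the `headers` list in the repro, and the second sample row is a made-up placeholder.

```typescript
// Hypothetical helpers (not csv-parser API): split tab-delimited text
// without giving double quotes any special meaning.
const HEADERS = ["id", "name", "nameAscii", "geonameId"] as const;

type Admin2Row = Record<(typeof HEADERS)[number], string>;

// Turn one line into a record keyed by the header names.
function parseTsvLine(line: string): Admin2Row {
  const fields = line.split("\t");
  const row = {} as Admin2Row;
  HEADERS.forEach((header, i) => {
    // Missing trailing fields become empty strings.
    row[header] = fields[i] ?? "";
  });
  return row;
}

// Split the text on newlines, skip empty lines, and parse each remaining line.
function parseTsv(text: string): Admin2Row[] {
  return text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map(parseTsvLine);
}

// First row: the entry from this report (tab positions assumed).
// Second row: a made-up placeholder, only there to show that parsing continues.
const sample = [
  'RU.45.517838\tNovotor”yal’skiy Rayon\tNovotor"yal\'skiy Rayon\t517838',
  "XX.00.000001\tExample Name\tExample Name\t1",
].join("\n");

const rows = parseTsv(sample);
console.log(rows.length, rows[0].nameAscii);
```

Running the full file through something like this (for example line by line with `node:readline`) would give an independent row count to compare against the `lineCount` reported by the repro.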