Open
Labels: bug (Something isn't working)
Description
I ran into an issue when loading some of the Reddit files, and even more often when breaking up the Twitter files. Loading fails with this error:
Error loading [file_name]: Error tokenizing data.
C error: Buffer overflow caught - possible malformed input file.
It looks like the file might have some malformed rows or unusually long lines that the parser can't handle. I'm assuming the cause is either corrupted or malformed rows (as the error message suggests) or an encoding issue.
Edit: long lines can't be the cause, because this also happens with the Twitter datasets when they're broken up, and most of those rows are 140 characters or less.
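To narrow down the malformed-row theory, here is a minimal diagnostic sketch. It assumes the files are comma-delimited text with a header row, which the issue does not confirm, and `find_suspect_rows` is a hypothetical helper name, not something from this repo. It uses only the standard `csv` module and reports every record whose field count differs from the header's, which is a common trigger for tokenizing errors.

```python
import csv


def find_suspect_rows(path, delimiter=",", encoding="utf-8"):
    """Return (line_number, field_count) for records whose width differs from the header.

    Hypothetical helper: the delimiter, encoding, and header row are assumptions.
    """
    suspects = []
    # newline="" lets the csv module handle embedded line breaks itself;
    # errors="replace" keeps an undecodable byte from aborting the scan.
    with open(path, newline="", encoding=encoding, errors="replace") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            return suspects  # empty file, nothing to check
        expected = len(header)
        for row in reader:
            if len(row) != expected:
                # reader.line_num is the physical line on which this record ended
                suspects.append((reader.line_num, len(row)))
    return suspects
```

Calling `find_suspect_rows("some_file.csv")` on one of the failing chunks (the filename here is a placeholder) should list the lines worth inspecting by hand. If the list is empty, the encoding theory becomes more likely; rerunning with a different `encoding` argument is one way to check.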