
Fix dataset line splitting bug #38

Merged
SYSTEMS-OPERATOR merged 1 commit into main from codex/find-and-fix-bug on Jun 27, 2025

Conversation

@SYSTEMS-OPERATOR
Owner

Summary

  • avoid ValueError when spaces are doubled in token id datasets
  • update dataset line parsing logic to use generic whitespace split
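The changed code is not shown in this conversation, so the following is only a minimal sketch of the kind of failure and fix described above. The function names `parse_token_ids_single_space` and `parse_token_ids` are hypothetical and are not taken from the repository; the sketch assumes each dataset line holds integer token ids separated by spaces.

```python
from typing import List


def parse_token_ids_single_space(line: str) -> List[int]:
    """Hypothetical pre-fix behavior: split on exactly one space character."""
    # "1  2".split(" ") yields ["1", "", "2"], and int("") raises ValueError.
    return [int(tok) for tok in line.strip().split(" ")]


def parse_token_ids(line: str) -> List[int]:
    """Hypothetical post-fix behavior: split on any run of whitespace."""
    # str.split() with no argument collapses consecutive whitespace
    # (spaces, tabs, newlines) and never produces empty strings.
    return [int(tok) for tok in line.split()]


if __name__ == "__main__":
    sample = "101  2023 2003\n"  # note the doubled space after 101
    print(parse_token_ids(sample))
    try:
        parse_token_ids_single_space(sample)
    except ValueError as exc:
        print("single-space split failed:", exc)
```

Calling `split()` without a separator also tolerates tabs and trailing newlines, which is presumably why the summary calls it a "generic whitespace split".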

Testing

  • pytest -q

https://chatgpt.com/codex/tasks/task_e_685e526131e0832484f5a7a517bf0078

SYSTEMS-OPERATOR merged commit 05d8eea into main on Jun 27, 2025
1 check passed
SYSTEMS-OPERATOR deleted the codex/find-and-fix-bug branch on June 27, 2025 09:22