Open
Labels
enhancement (New feature or request)
Description
When evaluating file specifications to create file collections, we should follow this:
- If a file has a known extension, mark it as text or binary based on the dictionary (implemented)
- Include files that do not have a file extension, and files with extensions not covered by the dictionary
- Guess whether the file is (canonically) text; otherwise, mark it as binary
- I'd prefer zlib's algorithm: https://github.com/madler/zlib/blob/master/doc/txtvsbin.txt
- If a file does not have any content, then mark it as binary
- Document this flow in the specification section
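The flow above could be sketched roughly as follows. This is only an illustration, not the project's implementation: `KNOWN_EXTENSIONS`, `looks_like_text`, and `classify` are hypothetical names, and the dictionary contents are placeholders. The byte classes follow the allow/gray/block lists described in zlib's txtvsbin.txt, under which data is text only if it contains at least one allow-listed byte and no block-listed byte, so an empty file comes out as binary without a special case.

```python
from pathlib import Path

# Hypothetical stand-in for the existing extension dictionary.
KNOWN_EXTENSIONS = {".txt": "text", ".md": "text", ".png": "binary", ".zip": "binary"}

# Byte classes as described in zlib's doc/txtvsbin.txt.
# Allow list: TAB, LF, CR, and 32..255.
ALLOW_LIST = frozenset({9, 10, 13}) | frozenset(range(32, 256))
# Block list: 0..6 and 14..31, except SUB (26) and ESC (27).
BLOCK_LIST = frozenset(range(0, 7)) | (frozenset(range(14, 32)) - {26, 27})
# Remaining bytes (7, 8, 11, 12, 26, 27) form the tolerated gray list.


def looks_like_text(data: bytes) -> bool:
    """Return True if data has at least one allow-listed byte and no block-listed byte."""
    seen_allowed = False
    for byte in data:
        if byte in BLOCK_LIST:
            return False
        if byte in ALLOW_LIST:
            seen_allowed = True
    # Empty input, or input made only of gray-list bytes, is treated as binary.
    return seen_allowed


def classify(path: str, data: bytes) -> str:
    """Classify a file as 'text' or 'binary': dictionary first, then guess from content."""
    kind = KNOWN_EXTENSIONS.get(Path(path).suffix.lower())
    if kind is not None:
        return kind
    return "text" if looks_like_text(data) else "binary"
```

The sketch scans whatever bytes it is handed; whether to read the whole file or only a leading sample is left open here.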