Description
Currently, the hash field set in schemas/hash.yml defines fields for common hashes (MD5, SHA1, etc.) but does not explicitly advise on the format of the hex-encoded hash values themselves. While the examples provided in the schema are lowercase, there is no normative text suggesting that users should normalize these values.
This ambiguity allows users to populate these fields with uppercase, lowercase, or mixed-case letters. If the underlying storage or query engine is case-sensitive (e.g. Elasticsearch keyword fields 😉), a query for a hash written in one case will not match the same hash stored in another, so the user might miss a threat indicator.
Proposal
Update the description in schemas/hash.yml to explicitly recommend that hash values be normalized to lowercase.
Current Description:
The hash fields represent different bitwise hash algorithms and their values.
Proposed Addition:
The hash fields represent different bitwise hash algorithms and their values. Field values should be normalized to lowercase (e.g. efd6... instead of EFD6...).
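For illustration, here is a minimal Python sketch of what this normalization could look like on the producer side before a document is indexed. The helper name normalize_hex_hash is hypothetical and not part of ECS or any Elastic tooling. It assumes the value is a hex-encoded digest (MD5, SHA1, SHA256, etc.); it should not be applied to non-hex hash fields, where case may be significant.

```python
import string

# Characters allowed in a hex-encoded digest (0-9, a-f, A-F).
HEX_DIGITS = frozenset(string.hexdigits)


def normalize_hex_hash(value: str) -> str:
    """Return the hex digest in lowercase (hypothetical helper).

    Raises ValueError for empty or non-hex input so that values in
    other encodings are not silently altered.
    """
    stripped = value.strip()
    if not stripped or not all(ch in HEX_DIGITS for ch in stripped):
        raise ValueError(f"not a hex-encoded hash: {value!r}")
    return stripped.lower()


# An indicator ingested in uppercase, normalized before storage.
stored = {normalize_hex_hash("D41D8CD98F00B204E9800998ECF8427E")}

# The same MD5 observed in lowercase now matches under exact comparison.
observed = "d41d8cd98f00b204e9800998ecf8427e"
print(observed in stored)  # True
```

Without the normalization step, the exact-match lookup above would return False, which is the missed-indicator case described in this issue.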
Future Consideration
Ideally, a lowercase normalizer should be applied to these fields in the generated Elasticsearch templates so this is handled automatically at index and query time.
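As a rough sketch of that idea, the Python snippet below builds the kind of settings and mappings fragment such a template might contain. It is not the output of the ECS generator. The normalizer name hash_lowercase, the function build_template_fragment, and the chosen subset of hash fields are assumptions made for illustration. The fragment relies on Elasticsearch's custom normalizer with the lowercase filter, attached to keyword fields via the normalizer mapping parameter.

```python
import json

# Assumed subset of hex-encoded hash fields; non-hex hash fields are
# deliberately left out because lowercasing could change their meaning.
HEX_HASH_FIELDS = ("md5", "sha1", "sha256", "sha512")


def build_template_fragment(fields=HEX_HASH_FIELDS, normalizer_name="hash_lowercase"):
    """Build an illustrative settings/mappings fragment (hypothetical helper)."""
    return {
        "settings": {
            "analysis": {
                "normalizer": {
                    # Custom normalizer that lowercases keyword values.
                    normalizer_name: {"type": "custom", "filter": ["lowercase"]}
                }
            }
        },
        "mappings": {
            "properties": {
                "hash": {
                    "properties": {
                        field: {"type": "keyword", "normalizer": normalizer_name}
                        for field in fields
                    }
                }
            }
        },
    }


print(json.dumps(build_template_fragment(), indent=2))
```

With a normalizer on the field, both the indexed value and term-level query input are lowercased, so producers that emit uppercase hashes would still match.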
Related PR/discussion from 2019 for additional context: https://github.com/elastic/ecs/pull/426/changes#r276024283