Recommend lowercase format for hash values in schema guidance #2586

@andrewkroh

Description

Currently, the hash field set in schemas/hash.yml defines fields for common hashes (MD5, SHA1, etc.) but does not explicitly advise on the format of the hex-encoded hash values themselves. While the examples provided in the schema are lowercase, there is no normative text suggesting that users should normalize these values.

This ambiguity allows users to populate these fields with uppercase, lowercase, or mixed-case letters. If the underlying storage or query engine is case-sensitive (e.g. Elasticsearch keyword fields 😉), then a query for a threat indicator hash written in one case will not match an event that stored the same hash in another case.


Proposal

Update the description in schemas/hash.yml to explicitly recommend that hash values be normalized to lowercase.

Current Description:

The hash fields represent different bitwise hash algorithms and their values.

Proposed Description:

The hash fields represent different bitwise hash algorithms and their values. Field values should be normalized to lowercase (e.g. efd6... instead of EFD6...).
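
For illustration, here is a minimal Python sketch of what that normalization could look like in an ingest step. The function name `normalize_hash` and the choice to strip surrounding whitespace and reject non-hex input are assumptions made for this example, not something ECS defines.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def normalize_hash(value: str) -> str:
    """Return a hex-encoded hash in lowercase (hypothetical helper).

    Raises ValueError if the value is empty or contains non-hex characters.
    """
    cleaned = value.strip()  # assumption: surrounding whitespace is noise
    if not cleaned or any(ch not in HEX_DIGITS for ch in cleaned):
        raise ValueError(f"not a hex-encoded hash: {value!r}")
    return cleaned.lower()


# MD5 of the empty string, once as a tool might emit it and once as a
# threat feed might publish it.
event_hash = "D41D8CD98F00B204E9800998ECF8427E"
indicator_hash = "d41d8cd98f00b204e9800998ecf8427e"

# A case-sensitive comparison misses the match.
print(event_hash == indicator_hash)  # False

# After normalizing both sides, the same hash matches.
print(normalize_hash(event_hash) == normalize_hash(indicator_hash))  # True
```

Normalizing on both the event side and the indicator side is what makes the exact match reliable; lowercasing only one of them leaves the same gap.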


Future Consideration

Ideally, a lowercase normalizer should be applied to these fields in the generated Elasticsearch templates so this is handled automatically at index and query time.
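
As a rough sketch of what that might look like, the fragment below builds an index template body with a custom lowercase normalizer attached to one hash field. The normalizer name `hash_lowercase` and the choice of `file.hash.sha256` as the example field are assumptions for illustration; this is not what the ECS generator currently emits.

```python
import json

# Hypothetical template fragment: a custom normalizer that lowercases
# keyword values, attached to a single hash field.
template_fragment = {
    "settings": {
        "analysis": {
            "normalizer": {
                "hash_lowercase": {  # assumed name, not defined by ECS
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "file": {
                "properties": {
                    "hash": {
                        "properties": {
                            "sha256": {
                                "type": "keyword",
                                "normalizer": "hash_lowercase",
                            }
                        }
                    }
                }
            }
        }
    },
}

print(json.dumps(template_fragment, indent=2))
```

With a normalizer on the keyword field, Elasticsearch lowercases the value before indexing and also applies it to term-level and match queries against that field, so neither the producer nor the person querying has to remember the convention.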

Related PR/discussion from 2019 for additional context: https://github.com/elastic/ecs/pull/426/changes#r276024283
