add normalizer for keyword fields #415
Open
cherry-picked from #412 and based on #414.
This PR adds a `normalizer`, which is the nearest thing to an `analyzer` for `keyword` fields.

More info here: elastic/elasticsearch#18064
This allows us to perform some basic normalization on fields such as `layer`, `source` and `category`, forcing them to be lowercased and applying some ICU normalization.

One notable change here is that those fields were previously case-sensitive and will now be case-insensitive, which I think is preferable, despite there being a test which covered the old behaviour.
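
To illustrate the behaviour change, here is a rough local approximation in Python. It is not what Elasticsearch runs: `approximate_normalize` is a hypothetical helper that uses the standard library's NFKC normalization plus lowercasing as a stand-in for the ICU filters, and the sample values are made up for the example.

```python
import unicodedata


def approximate_normalize(value: str) -> str:
    """Stand-in for the normalizer: Unicode NFKC normalization, then lowercase.

    Assumption: this only approximates the lowercase + ICU filters;
    the real ICU filters handle more cases than this helper does.
    """
    return unicodedata.normalize("NFKC", value).lower()


# Values as they might be stored for a `source` field (illustrative only).
stored_terms = {approximate_normalize(v) for v in ["OpenStreetMap", "WhosOnFirst"]}


def matches(query_value: str) -> bool:
    # The same normalization is applied to the query value before comparing,
    # so the comparison no longer depends on case.
    return approximate_normalize(query_value) in stored_terms


print(matches("OPENSTREETMAP"))
print(matches("geonames"))
```

Because the stored value and the query value pass through the same function, `OpenStreetMap`, `openstreetmap` and `OPENSTREETMAP` all resolve to the same term.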
Note that not all `keyword` fields should have a normalizer specified; for instance, verbatim fields such as `bounding_box` and `addendum` are probably best left with the default `null` normalizer.

Normalizers are applied both at index-time and at query-time.
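
As a minimal sketch of what such a configuration could look like, the Python snippet below builds the relevant settings and field definitions as plain dictionaries. The normalizer name `lowercase_normalizer` is hypothetical, the exact filter list is an assumption (`icu_folding` requires the `analysis-icu` plugin), and treating `bounding_box` as a `keyword` field is inferred from the description rather than copied from this PR's diff.

```python
import json

# Hypothetical name; the PR description does not state the name actually used.
NORMALIZER_NAME = "lowercase_normalizer"

# Analysis settings declaring a custom normalizer.
# Assumption: "lowercase" (built in) plus "icu_folding" (analysis-icu plugin).
analysis_settings = {
    "analysis": {
        "normalizer": {
            NORMALIZER_NAME: {
                "type": "custom",
                "filter": ["lowercase", "icu_folding"],
            }
        }
    }
}

# Field definitions: normalized keyword fields versus a verbatim one.
field_definitions = {
    "layer": {"type": "keyword", "normalizer": NORMALIZER_NAME},
    "source": {"type": "keyword", "normalizer": NORMALIZER_NAME},
    "category": {"type": "keyword", "normalizer": NORMALIZER_NAME},
    # No "normalizer" key: the field keeps the default (null) normalizer.
    "bounding_box": {"type": "keyword"},
}

print(json.dumps({"settings": analysis_settings, "fields": field_definitions}, indent=2))
```

Omitting the `normalizer` key on a field is what leaves it verbatim, so opting a field in or out is a one-line change in its definition.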
I would like to add some additional filters such as `trim` and `unique`, but they are not available until version `6.4` of elasticsearch and so will come in a subsequent PR which can be merged independently of this.