Description
UPDATE 3: Full discussion:
algolia/instantsearch#3999
Some facets appear as TitleCase in Algolia dashboard, but appear as UPPERCASE in index.
Results in errors when refining certain facets.
To reproduce:
- Visit Wiregrass Foundation
- Click on CITY OF DOTHAN in hits table
- RefinementList shows the TitleCase version as well as the UPPERCASE version
Can also reproduce in Algolia Dashboard.
So far, have noticed this for the following facet values:
Facet: grantee_name
CITY OF DOTHAN
WIREGRASS MUSEUM OF ART
Curiously, there's only one instance in grant_purpose:
CORE PROGRAM
The MongoDB document shows UPPERCASE as expected
{
"_id" : "200897153_2018_48",
"objectID" : "200897153_2018_48",
"ein" : "200897153",
"organization_name" : "WIREGRASS FOUNDATION",
"city" : "Dothan",
"state" : "AL",
"tax_year" : 2018.0,
"aws_index_year" : "2019",
"last_updated_grantmakers" : "2019-06-20T19:46:38.155Z",
"last_updated_irs" : "2019-06-19T01:50:06.8779991Z",
"grant_amount" : 1000.0,
"grant_purpose" : "CORE PROGRAM",
"grantee_name" : "LIGHTHOUSE FAMILY RETREAT",
"grantee_city" : "Atlanta",
"grantee_state" : "GA",
"grantee_state_displayed" : "GA",
"grantee_country" : "US",
"grantee_is_foreign" : false,
"grant_number" : 49.0
}
UPDATE 2
It appears the root cause is that the Algolia engine creates each facet value using the casing of the first record it sees. Subsequent records appear to be case-normalized to match.
Thus, UPPERCASE "CITY OF DOTHAN" appears as Title Case "City of Dothan" in facets for the Wiregrass Foundation profile because the facet was created from another foundation's donation to Title Case "City of Dothan".
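To make the suspected behavior concrete, here is a minimal Python sketch that simulates it. This is not Algolia's implementation: `build_facet_labels` is a hypothetical helper, and it assumes only that values are grouped case-insensitively and that the first casing seen becomes the displayed label.

```python
def build_facet_labels(values):
    """Group facet values case-insensitively.

    Assumption (hypothesis from this issue, not confirmed engine behavior):
    the first casing seen for a value becomes the label for all later variants.
    """
    labels = {}  # case-folded key -> first-seen casing
    counts = {}  # case-folded key -> number of records
    for value in values:
        key = value.casefold()
        labels.setdefault(key, value)  # keep the first casing only
        counts[key] = counts.get(key, 0) + 1
    return {labels[key]: counts[key] for key in counts}


# Title Case value from another foundation is indexed first,
# then the UPPERCASE values from the Wiregrass Foundation grants.
indexed_order = ["City of Dothan", "CITY OF DOTHAN", "WIREGRASS MUSEUM OF ART"]
facets = build_facet_labels(indexed_order)
print(facets)
```

Under this assumption, both variants collapse into a single "City of Dothan" facet, which would match what the RefinementList displays even though the record itself stores the UPPERCASE value.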
UPDATE
Possibly related to Algolia using UCS-2 encoding
Further research
- MongoDB defaults to UTF-8
- Confirmed only UPPERCASE appears in source collection in MongoDB (e.g. grants collection)
- TitleCase only appears in Algolia facets - records are fine
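The source-side check can be sketched roughly as follows. This is an illustration only: `find_non_uppercase` is a hypothetical helper, the records are in-memory dicts rather than a live MongoDB query, and the third sample record is invented to show what a hit would look like.

```python
def find_non_uppercase(records, field):
    """Return the distinct values of `field` that are not fully uppercase."""
    return sorted(
        {
            record[field]
            for record in records
            if field in record and record[field] != record[field].upper()
        }
    )


# Illustrative records; in practice these would be read from the grants collection.
sample_records = [
    {"grantee_name": "LIGHTHOUSE FAMILY RETREAT"},
    {"grantee_name": "CITY OF DOTHAN"},
    {"grantee_name": "City of Dothan"},  # hypothetical mixed-case record
]

mixed = find_non_uppercase(sample_records, "grantee_name")
print(mixed)
```

An empty result for one foundation's records would be consistent with the finding above that the Title Case value comes from elsewhere in the index rather than from that foundation's source data.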