Description
I'm seeing odd behavior when using the search for selection (Control+H, referred to as ^H below) feature when the search string contains certain characters. The specific character I'm seeing issues with is ſ (long s). In a nutshell, the search results vary (and sometimes match strings that aren't the search string) depending on where/how the search string characters are selected (see examples below). A sample of text that shows issues is:
# l s ſ U+017F latin small letter long S
# f s ſ U+017F latin small letter long s
ſ U
[The above text contains both space and tab characters.]
Some of the issues that I've observed (all related to searching for strings containing the character ſ) are:
1. Manually select the character ſ on the first line.
   i. Press ^H; the character ſ on the second line is selected.
   ii. Press ^H; the selection remains the same (ditto for then pressing ^G).
2. Manually select the character ſ on the second line.
   i. Press ^H; the character ſ on the third line is selected.
   ii. Press ^H; the selection remains the same (ditto for then pressing ^G).
3. Manually select the character ſ on the third line.
   i. Press ^H; the character ſ on the first line is selected.
   ii. Press ^H; the selection remains the same (ditto for then pressing ^G).
4. Manually select ſ U (four characters) on the third line.
   i. Press ^H; the three-character string starting with ſ on the third line is selected [the trailing character is a tab].
   ii. Press ^H; the two-character string starting with ſ on the third line is selected.
   iii. Press ^H; the two-character string starting with ſ on the first line is selected.
   iv. Press ^H; the two-character string starting with ſ on the second line is selected.
   v. Press ^H; the space character ( ) before the word small on the second line is selected.
   vi. All subsequent presses of ^H select subsequent space ( ) characters but not tab characters.

   The behavior in v. and vi. may vary. I've found that adding any text to the tail of the file changes the latter matches to s (rather than just a space). This behavior often (but not always) persists even if the added text is removed manually or with undo.

   Also, pasting using the middle mouse button into another window (Firefox in my case) before pressing ^H results in the following being pasted:
   - For 4.: ſ U
   - For 4. i.: ſ
   - For 4. ii.: ſ
   - But for 4. iii. it's Å

   [Not sure if it's relevant but I did type/paste/select/search for Å earlier today.]
5. Change the string ſ U to ſ U (change third character to a tab); the same behaviour as for 4. is seen.
6. Manually select the single character ſ or the four-character string ſ U in a different program (Firefox in my test).
   i. With the Xnedit window focused press ^H; no text is selected.
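One possible reading of the Å seen for 4. iii. is sketched below. It is only a hypothesis and has not been checked against the Xnedit source: if some byte-oriented step cut the UTF-8 encoding of ſ in half, the surviving lead byte would display as Å when interpreted as Latin-1. The variable names are illustrative and the truncation step is an assumption.

```python
# Hypothetical diagnostic sketch; it does not use or reflect any Xnedit code.
LONG_S = "\u017f"  # ſ, LATIN SMALL LETTER LONG S

# ſ occupies two bytes in UTF-8.
utf8_bytes = LONG_S.encode("utf-8")
print(utf8_bytes.hex(" "))  # c5 bf

# Assumption: a byte-oriented step drops the trailing byte of the sequence.
truncated = utf8_bytes[:1]

# Interpreted as Latin-1, the lone lead byte 0xC5 is U+00C5.
print(truncated.decode("latin-1"))  # Å

# The same lone byte is not valid UTF-8 on its own.
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as exc:
    print("invalid UTF-8:", exc.reason)
```

If this is what happens, the Å would come from the search string itself rather than from the earlier typing of Å, and it would fit a mismatch between character counts and byte counts somewhere in the selection or search path.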
Normal behavior (for comparison):
1. Select any occurrence of the character U.
   i. Press ^H; the subsequent U character is selected (exactly which one depends on cursor position).
   ii. All subsequent presses of ^H select subsequent U characters.
2. Clear the Xnedit selection then select the character U in a different program (Firefox in my test).
   i. With the Xnedit window focused press ^H; one of the U characters is selected (exactly which one depends on cursor position).
   ii. All subsequent presses of ^H select subsequent U characters.
It also seems that in the context of searching the characters S and s are considered to be the same letter but ſ is not (i.e. something awry with converting ſ to normal form). Likewise, maybe ss and ß should be considered the same letter/string when searching (though this doesn't appear to be noted in the Unicode spec). [Perhaps this paragraph should be a separate bug report.]
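As a point of comparison only (this says nothing about how Xnedit matches text), the sketch below shows how Python's standard library, which follows the Unicode character data, treats these characters. Canonical normalization leaves ſ alone, compatibility normalization (NFKC) maps it to s, and full case folding maps ſ to s and ß to ss.

```python
import unicodedata

LONG_S = "\u017f"   # ſ, LATIN SMALL LETTER LONG S
SHARP_S = "\u00df"  # ß, LATIN SMALL LETTER SHARP S

# Canonical normalization (NFC) keeps ſ; compatibility normalization (NFKC) yields s.
nfc = unicodedata.normalize("NFC", LONG_S)
nfkc = unicodedata.normalize("NFKC", LONG_S)
print(nfc == LONG_S, nfkc == "s")  # True True

# Full case folding, intended for caseless matching.
print(LONG_S.casefold() == "s")    # True
print(SHARP_S.casefold() == "ss")  # True

# Simple lowercasing is not enough: ſ is already lowercase, but its uppercase is S.
print(LONG_S.lower() == LONG_S)  # True
print(LONG_S.upper() == "S")     # True
```

A case-insensitive search that lowercases both sides would therefore treat S and s as equal while still keeping ſ distinct, whereas one based on case folding would match all three.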
I'm running Xnedit 1.6.0 (built from source) on Ubuntu Linux 22.04.5.
[Sorry for the deluge; I tried to provide a sizable set of test cases to make it easier to isolate the problem.]