Search for selection acts weird for some characters #153

@Alex-Kent

I'm seeing odd behavior when using the search for selection (Control+H, referred to as ^H below) feature when the search string contains certain characters. The specific character I'm seeing issues with is ſ (long s). In a nutshell, the search results vary (and sometimes match strings that aren't the search string) depending on where/how the search string characters are selected (see examples below). A sample of text that shows issues is:

# l s ſ	U+017F	latin small letter long S
# f s ſ	U+017F	latin small letter long s
 ſ U

[The above text contains both space and tab characters.]
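
Since the whitespace is easy to lose when copying, here is a small Python sketch that rebuilds the sample with the whitespace written out explicitly. It assumes the first two lines are tab-separated around U+017F as shown and that the third line is space, ſ, space, U; the names `SAMPLE`, `describe` and `write_sample` are made up for this illustration.

```python
# Sample text from this report with whitespace made explicit.
# Assumption: tabs surround "U+017F" on the first two lines, and the
# third line is exactly four characters: space, long s, space, U.
SAMPLE = (
    "# l s \u017f\tU+017F\tlatin small letter long S\n"
    "# f s \u017f\tU+017F\tlatin small letter long s\n"
    " \u017f U\n"
)


def describe(text):
    """Return (char, code point, UTF-8 byte length) for each non-ASCII char."""
    return [
        (ch, f"U+{ord(ch):04X}", len(ch.encode("utf-8")))
        for ch in text
        if ord(ch) > 0x7F
    ]


def write_sample(path):
    """Write the sample to a UTF-8 file that can be opened in the editor."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(SAMPLE)


for entry in describe(SAMPLE):
    print(entry)
```

Calling `write_sample("long_s_sample.txt")` (a hypothetical file name) produces a file for the steps below. The output also shows that ſ takes two bytes in UTF-8 while s takes one, which may or may not be related to the mismatched selection lengths.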

Some of the issues that I've observed (all related to searching for strings containing the character ſ) are:

  1. Manually select the character ſ on the first line.
    1. Press ^H; the character ſ on the second line is selected.
    2. Press ^H; the selection remains the same (ditto for then pressing ^G).
  2. Manually select the character ſ on the second line.
    1. Press ^H; the character ſ on the third line is selected.
    2. Press ^H; the selection remains the same (ditto for then pressing ^G).
  3. Manually select the character ſ on the third line.
    1. Press ^H; the character ſ on the first line is selected.
    2. Press ^H; the selection remains the same (ditto for then pressing ^G).
  4. Manually select ſ U (four characters) on the third line.
    1. Press ^H; the three-character string ſ on the third line is selected [the trailing character is a tab].
    2. Press ^H; the two-character string ſ on the third line is selected.
    3. Press ^H; the two-character string ſ on the first line is selected.
    4. Press ^H; the two-character string ſ on the second line is selected.
    5. Press ^H; the space character ( ) before the word small on the second line is selected.
    6. All subsequent presses of ^H select subsequent space ( ) characters but not tab characters.
      The behavior in sub-steps 5 and 6 may vary. I've found that adding any text to the tail of the file changes the latter matches to " s" (a space followed by s) rather than just " ". This behavior often (but not always) persists even if the added text is removed manually or with undo.
      Also, pasting with the middle mouse button into another window (Firefox in my case) before pressing ^H results in the following being pasted:
        For 4.: ſ U
        For 4.1: ſ
        For 4.2: ſ
        But for 4.3 it's Å
      [Not sure if it's relevant, but I did type/paste/select/search for Å earlier today.]
  5. Change the string ſ U to ſ U (change third character to a tab); the same behaviour as for 4. is seen.
  6. Manually select the single character ſ or the four-character string ſ U in a different program (Firefox in my test).
    1. With the Xnedit window focused press ^H; no text is selected.

Normal behavior (for comparison):

  1. Select any occurrence of the character U.
    1. Press ^H; the subsequent U character is selected (exactly which one depends on cursor position).
    2. All subsequent presses of ^H select subsequent U characters.
  2. Clear the Xnedit selection then select the character U in a different program (Firefox in my test).
    1. With the Xnedit window focused press ^H; one of the U characters is selected (exactly which one depends on cursor position).
    2. All subsequent presses of ^H select subsequent U characters.

It also seems that in the context of searching the characters S and s are considered to be the same letter but ſ is not (i.e. something awry with converting ſ to normal form). Likewise, maybe ss and ß should be considered the same letter/string when searching (though this doesn't appear to be noted in the Unicode spec). [Perhaps this paragraph should be a separate bug report.]
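
For comparison, this is how Python's built-in Unicode support treats these characters. It is only a reference point for what caseless matching could look like and says nothing about how Xnedit implements its search; `caseless_equal` is a helper name invented for this sketch.

```python
import unicodedata

LONG_S = "\u017f"   # ſ, latin small letter long s
SHARP_S = "\u00df"  # ß, latin small letter sharp s


def caseless_equal(a, b):
    """Compare two strings using Unicode full case folding."""
    return a.casefold() == b.casefold()


# Simple lowercasing leaves ſ alone, because it is already a lowercase letter.
print(LONG_S.lower() == "s")                           # False
# Uppercasing maps ſ to S, the same as for a regular s.
print(LONG_S.upper() == "S")                           # True
# Full case folding treats ſ and s, and ß and ss, as equivalent.
print(caseless_equal(LONG_S, "s"))                     # True
print(caseless_equal(SHARP_S, "ss"))                   # True
# Compatibility normalization (NFKC) maps ſ to s but leaves ß unchanged.
print(unicodedata.normalize("NFKC", LONG_S) == "s")    # True
print(unicodedata.normalize("NFKC", SHARP_S) == "ss")  # False
```

So a search that compares by lowercasing would treat S and s as equal but not ſ, which matches what I see. Note that ß and ss differ in length, so matching them would need a comparison that allows for length changes; a per-character mapping can't do it.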

I'm running 1.6.0 (built from source) on Ubuntu Linux 22.04.5.

[Sorry for the deluge; I tried to provide a sizable set of test cases to make it easier to isolate the problem.]
