Make string handling unicode-preserving #25

Open
m7a wants to merge 1 commit into mazenharake:master from m7a:string_handling_unicode

@m7a (Contributor) commented Apr 13, 2024

Hello,

I tried to get cecho to work with non-ASCII characters, but they always came out mangled. After some digging into the code, I came up with a solution for the string-processing functions that keeps UTF-8 data as-is, resulting in correct display on my terminal (urxvt).

Feel free to include it upstream if it makes sense to you 😄

I think this patch is good, but it could still be incomplete: for example, I have not yet tested it with non-ASCII border-drawing characters. A similar issue might come up there, too, but perhaps that could be resolved in a separate commit?

Thanks in advance
Linux-Fan (@m7a)

Commit Message

Previously, it was impossible to output non-ASCII characters using cecho for multiple reasons:

  • Strings were processed as flattened lists, so encoding information was not preserved properly.
  • Data was transmitted as a string from the Erlang side to the C side, so encoding information did not arrive correctly.
  • The library was linked against libncurses rather than libncursesw, so no “wide character” support was available.

This commit fixes this by switching the string handling to be based on iolists and binaries. String values are now transferred to the C side as binaries rather than strings.

Additionally, at compile time, the C part is now linked against `libncursesw` instead of the previously used `libncurses`.

The API remains backward compatible: all of the string functions still accept strings.
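
To illustrate the first two points of the commit message, here is a minimal C sketch of why a code point is not the same thing as its UTF-8 bytes. It assumes (this is not taken from the cecho sources) that the mangling came from code points such as 228 ("ä", U+00E4) being handled as single bytes somewhere on the way to the C side. The helper name `utf8_encode` is hypothetical and only serves the illustration.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only (not part of cecho).
 * Encodes one Unicode code point as UTF-8 into out, which must have
 * room for at least 4 bytes. Returns the number of bytes written,
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {                       /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* two-byte sequence */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* three-byte sequence */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* four-byte sequence */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

For "ä", `utf8_encode(0xE4, buf)` writes the two bytes `C3 A4`. Storing the same code point as the single byte `E4` yields a sequence that is not valid UTF-8 on its own, which a UTF-8 terminal cannot display correctly. Passing already-encoded binaries through untouched, as this commit does, avoids that conversion step entirely.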
