Accent marks in Russian files #1355

@javnik36

Continuation of #1354

I ran a script that looks for combining diacritical marks (from the Combining Diacritical Marks Unicode block) in the i18n files. My main goal was to find cases where a single Unicode character should be used instead of a combination of two code points [letter + combining accent].

Example:

ę in Polish can be mistakenly written as [e+U+0328 combining ogonek]. Both cases look exactly the same > ę vs ę < but the latter is incorrect and may cause issues when editing/displaying.
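The difference is invisible on screen but easy to show in code. The following is only an illustrative sketch using Python's standard `unicodedata` module, not the script used for this issue; the variable names are made up for the example:

```python
import unicodedata

precomposed = "\u0119"   # single code point: LATIN SMALL LETTER E WITH OGONEK
decomposed = "e\u0328"   # two code points: "e" followed by COMBINING OGONEK

# The two strings render identically but are not equal.
print(precomposed == decomposed)            # False
print(len(precomposed), len(decomposed))    # 1 2

# NFC normalization folds the pair into the single precomposed character.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```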

The script found several accent marks in the Russian files. I'm leaving its output here for native Russian speakers to review and (maybe) act on. From my brief research, it looks like the acute accent (https://www.compart.com/en/unicode/U+0301) may legitimately be used in Russian in combination with Cyrillic letters, but the other combinations have their own precomposed glyphs, so they may currently be incorrect.

Char | Unicode of accent found after Char | String | Found in File | Line number
--- | --- | --- | --- | ---
о | U+0301 | по́том | ru: dwl/armitages_fate.po | 29
о | U+0301 | во́роны | ru: tcu/union_and_disillusion.po | 19
а | U+0301 | замка́ | ru: tfa/heart_of_the_elders_part_1.po | 47
о | U+0301 | Про́кляты(...) | ru: tfa/heart_of_the_elders_part_2.po | 57
, | U+0301 | (...)удно,́ | ru: tfa/the_city_of_archives.po | 130
о | U+0301 | мо́чи | ru: tfa/those_held_captive.po | 100
о | U+0301 | по́том | ru: tfa/those_held_captive.po | 109
и | U+0306 | (...)льшой | ru: tskc/10_dancing_mad.po | 126
е | U+0308 | (...)удалённые | ru: tskc/10_dancing_mad.po | 201
е | U+0308 | её | ru: tskc/10_dancing_mad.po | 201
е | U+0308 | «стёрты | ru: tskc/10_dancing_mad.po | 207
е | U+0308 | N/A | ru: tskc/10_dancing_mad.po | 210
е | U+0308 | N/A | ru: tskc/10_dancing_mad.po | 213
й | U+0306 | Кай̆т-Бей | ru: tskc/15_dogs_of_war.po | 76
и | U+0306 | N/A | tskc/27_congress_of_the_keys.po | 613
и | U+0306 | (...)ертой | tskc/27_congress_of_the_keys.po | 613
и | U+0306 | N/A | tskc/27_congress_of_the_keys.po | 619
и | U+0306 | (...)ертой | tskc/27_congress_of_the_keys.po | 619
и | U+0306 | (...)ругой | tskc/27_congress_of_the_keys.po | 619
и | U+0306 | (...)ячейка | tskc/28_epilogue.po | 75
и | U+0306 | (...)Новый | tskc/campaign.po | 873
и | U+0306 | (...)хожий | tskc/campaign.po | 934
и | U+0306 | (...)остей | rules\ru\rules.json | N/A
и | U+0306 | (...)енной | rules\ru\rules.json | N/A
и | U+0306 | (...)ыграйте | rules\ru\rules.json | N/A
и | U+0306 | свойства | rules\ru\rules.json | N/A
и | U+0306 | (...)аждой | rules\ru\rules.json | N/A
и | U+0306 | (...)остей | rules\ru\rules.json | N/A
ы | U+0301 | (...)менны́х | rules\ru\rules.json | N/A

^ The string is truncated wherever (...) is written.
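For anyone who wants to reproduce a scan like this, here is a minimal sketch of the idea. It is not the script that produced the table above; the function name `find_combining_marks` and the sample string are hypothetical, and it only checks the Combining Diacritical Marks block (U+0300 to U+036F):

```python
# Bounds of the Combining Diacritical Marks Unicode block.
COMBINING_START, COMBINING_END = 0x0300, 0x036F

def find_combining_marks(text):
    """Return (preceding char, mark code point, containing word) for each mark."""
    hits = []
    for word in text.split():
        for i, ch in enumerate(word):
            if COMBINING_START <= ord(ch) <= COMBINING_END:
                # The character right before the mark is the one it attaches to.
                base = word[i - 1] if i > 0 else ""
                hits.append((base, f"U+{ord(ch):04X}", word))
    return hits

# Hypothetical sample: an acute accent and a decomposed ё.
sample = "во\u0301роны и е\u0308ж"
for hit in find_combining_marks(sample):
    print(hit)
```

A real scan would additionally need to read each .po or .json file and record the msgid or line where the hit occurred.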

Interpretation of the table:

и | U+0306 | (...)льшой | ru: tskc/10_dancing_mad.po | 126

In msgid number 126 of the Russian file tskc/10_dancing_mad.po there is a string like (...)льшой in which the и character is combined with the U+0306 accent. This may be incorrect, either because й exists as a separate Unicode character (U+0439) or because the U+0301 accent was supposed to be used instead of U+0306.
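One way to check whether a given letter + accent pair has its own single-character form is to see whether NFC normalization collapses it to one code point. This is a sketch under that assumption, with a hypothetical helper name `has_precomposed_form`:

```python
import unicodedata

def has_precomposed_form(base, mark):
    """True if NFC folds base + mark into a single code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1

print(has_precomposed_form("\u0438", "\u0306"))  # и + U+0306 -> й (U+0439): True
print(has_precomposed_form("\u0435", "\u0308"))  # е + U+0308 -> ё (U+0451): True
print(has_precomposed_form("\u043e", "\u0301"))  # о + U+0301 stays two code points: False
```

Under this check, the U+0306 and U+0308 rows could be replaced by single characters, while the U+0301 stress marks on Cyrillic vowels have no single-character equivalent.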

Reference (all unicode accent from the table):

Disclaimer: I may be completely wrong, hence treat this issue as information only :)
