Description
Following up on the previous issue (issue 57), we propose adding a new function to unshape this text.
Salam,
I tested the given words with pyarabic, as follows.
The words contain encoded glyphs (Arabic presentation forms), not standard letters, so they must be converted to ordinary letters. To convert a glyph-based word into a string of letters, you can use:
NB: the second unshaping_word call is used only to reverse the resulting word back into its original order.
word = "ﻣﺴﺎﻣﻌﻬﻢ"
from pyarabic.unshape import unshaping_word
unshaping_word(unshaping_word(word))
'مسامعهم'
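As an alternative to the pyarabic call above, the standard library can perform a similar conversion. The sketch below assumes that Unicode NFKC normalization is acceptable for your text: it maps each presentation-form glyph to its base letter and keeps the character order, so no second call is needed. The helper name `normalize_presentation_forms` is hypothetical and not part of pyarabic, and NFKC also rewrites other compatibility characters, which may or may not be what you want.

```python
import unicodedata


def normalize_presentation_forms(word: str) -> str:
    """Hypothetical helper (not a pyarabic function).

    Applies NFKC normalization, which replaces compatibility characters,
    including Arabic presentation forms, with their base letters.
    """
    return unicodedata.normalize("NFKC", word)


# The glyph-encoded word from the report, written as escapes:
# 65251 65204 65166 65251 65228 65260 65250
shaped = "\ufee3\ufeb4\ufe8e\ufee3\ufecc\ufeec\ufee2"
plain = normalize_presentation_forms(shaped)

# Print the code points so the result can be inspected without relying
# on how the terminal renders Arabic text.
print([ord(c) for c in plain])
```

After normalization, every character should fall in the standard Arabic block (U+0600 to U+06FF), which is what a check such as is_arabicword expects.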
- The test used to detect the problem
```
>>> import pyarabic.araby as ar
>>> lst=["اﻟﻤﺴﺌﻮﻟﻴﺔ","ﻣﺴﺎﻣﻌﻬﻢ","ﻓﻜﻠﻨﺎ","ﻣﺒﺎدراﺗﻨﺎ","ﻓﻬﻢ","اﻟﻤﻨﻈﻮﻣﺔ"]
>>> for i in lst:
... print(i, ar.is_arabicword(i))
...
اﻟﻤﺴﺌﻮﻟﻴﺔ False
ﻣﺴﺎﻣﻌﻬﻢ False
ﻓﻜﻠﻨﺎ False
ﻣﺒﺎدراﺗﻨﺎ False
ﻓﻬﻢ False
اﻟﻤﻨﻈﻮﻣﺔ False
>>> for i in lst:
... print("%s"%i, ar.is_arabicword(i))
...
اﻟﻤﺴﺌﻮﻟﻴﺔ False
ﻣﺴﺎﻣﻌﻬﻢ False
ﻓﻜﻠﻨﺎ False
ﻣﺒﺎدراﺗﻨﺎ False
ﻓﻬﻢ False
اﻟﻤﻨﻈﻮﻣﺔ False
>>> for i in lst:
... for c in i :
... print(c, ord(c), ar.name(c))
...
ا 1575 ألف
ﻟ 65247
ﻤ 65252
ﺴ 65204
ﺌ 65164
ﻮ 65262
ﻟ 65247
ﻴ 65268
ﺔ 65172
ﻣ 65251
ﺴ 65204
ﺎ 65166
ﻣ 65251
ﻌ 65228
ﻬ 65260
ﻢ 65250
ﻓ 65235
ﻜ 65244
ﻠ 65248
ﻨ 65256
ﺎ 65166
ﻣ 65251
ﺒ 65170
ﺎ 65166
د 1583 دال
ر 1585 راء
ا 1575 ألف
ﺗ 65175
ﻨ 65256
ﺎ 65166
ﻓ 65235
ﻬ 65260
ﻢ 65250
ا 1575 ألف
ﻟ 65247
ﻤ 65252
ﻨ 65256
ﻈ 65224
ﻮ 65262
ﻣ 65251
ﺔ 65172
```
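The code points printed in the test (65164 to 65268) all fall inside the Unicode block Arabic Presentation Forms-B, while the letters that pyarabic can name (1575, 1583, 1585) are in the standard Arabic block. A minimal sketch of a check that flags such words before they reach is_arabicword might look like the following. The function name `presentation_form_chars` and the constant `PRESENTATION_RANGES` are hypothetical and not part of pyarabic.

```python
# Unicode blocks that hold positional glyph variants of Arabic letters.
PRESENTATION_RANGES = (
    (0xFB50, 0xFDFF),  # Arabic Presentation Forms-A
    (0xFE70, 0xFEFF),  # Arabic Presentation Forms-B
)


def presentation_form_chars(word: str) -> list:
    """Hypothetical helper: return the characters of `word` that are
    presentation-form glyphs rather than standard Arabic letters."""
    return [
        c for c in word
        if any(lo <= ord(c) <= hi for lo, hi in PRESENTATION_RANGES)
    ]


# First three characters of the first test word: 1575, 65247, 65252.
sample = "\u0627\ufedf\ufee4"
flagged = presentation_form_chars(sample)
print([ord(c) for c in flagged])
```

A non-empty result indicates that the word should be unshaped or normalized before further processing.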