
Morphologically rich languages #6

@sanjanasri

Dear all,

I trained the system on a monolingual corpus of around 1 lakh (100,000) and a parallel corpus of 68k, but I am not sure I got proper embeddings. When I check with bidist, I do not get relevant words. I tried this on a morphologically rich language. Do I need to do some preprocessing beforehand? Please give me your suggestions.

For example, I queried 'he' (English translations are given in parentheses) and got:

நீ (you) 0.997540
? 0.995166
எனக்கு (me) 0.994865
அவள் (she) 0.994344
நாங்கள் (we) 0.993825
அவன் (he) 0.993440
அவர்கள் (they) 0.990702
உனக்கு (you) 0.987608
என்னால் 0.986669

When I queried 'she', I got:

இந்த (this) 0.986125
உள்ளது (is) 0.970964
இருக்கிறது (is) 0.966018
இடம் (place) 0.965937
இங்கு (here) 0.963809
எல்லா (all) 0.960640
அதன் 0.959040
ஒரு (a) 0.958450
அதனுடைய 0.958169

Of these two queries, the output for 'he' seems good, but for 'she' the returned words are not relevant. The scores are also all very close to one another. What should I do? Please help.
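A minimal sketch of the kind of lookup shown above, assuming the scores are cosine similarities between a source-language vector and every target-language vector (the post does not say how bidist computes them). The function names and toy vectors here are hypothetical and are not taken from the trained model. The `score_spread` helper simply measures how tightly the returned scores are bunched, which is the symptom described for 'she'.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query_vec, target_vecs, k=3):
    """Return the k target words most similar to query_vec, best first."""
    scored = [(word, cosine(query_vec, vec)) for word, vec in target_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def score_spread(neighbours):
    """Difference between the highest and lowest score in a neighbour list."""
    scores = [score for _, score in neighbours]
    return max(scores) - min(scores)

# Hypothetical toy vectors, only to make the sketch runnable.
toy_source = {"he": [1.0, 0.1, 0.0]}
toy_target = {
    "t_he": [0.9, 0.2, 0.0],
    "t_she": [0.8, 0.3, 0.1],
    "t_place": [0.0, 0.1, 1.0],
}

top = nearest(toy_source["he"], toy_target, k=3)
for word, score in top:
    print(f"{word} {score:.6f}")
print("spread:", round(score_spread(top), 6))
```

A very small spread across the top neighbours, as in both lists above, would mean the ranking separates candidates only weakly under this assumption.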
