Dear all,
I trained the system with roughly 1 lakh (100,000) of monolingual data and 68k of parallel data, but I am not sure I got proper embeddings. When I check with bidist, I do not get relevant words. I am working with a morphologically rich language. Do I need to do some preprocessing beforehand? Please give me your suggestions.
For example, I queried 'he' (English translations are given in parentheses) and got:
நீ (you) 0.997540
? 0.995166
எனக்கு (me) 0.994865
அவள் (she) 0.994344
நாங்கள் (we) 0.993825
அவன் (he) 0.993440
அவர்கள் (they) 0.990702
உனக்கு (you) 0.987608
என்னால் 0.986669
When I queried 'she', I got:
இந்த (this) 0.986125
உள்ளது (is) 0.970964
இருக்கிறது (is) 0.966018
இடம் (place) 0.965937
இங்கு (here) 0.963809
எல்லா (all) 0.960640
அதன் 0.959040
ஒரு (a) 0.958450
அதனுடைய 0.958169
Of these two queries, the output for 'he' seems good, but for 'she' the words are not relevant. All the scores are also very close to each other. What should I do? Please help.
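For reference, here is a minimal sketch of the kind of check described above: ranking target-language words by similarity to a query vector and measuring how tightly the top scores cluster. It is not bidist's own code. It assumes the scores are cosine similarities, and the names `cosine`, `nearest_neighbours`, `score_spread` and the toy vectors are hypothetical, made up purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat zero vectors as having no similarity
    return dot / (norm_u * norm_v)


def nearest_neighbours(query_vec, target_vecs, k=5):
    """Return the k target words with the highest cosine score, best first."""
    scored = [(word, cosine(query_vec, vec)) for word, vec in target_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def score_spread(neighbours):
    """Gap between the best and worst score in a neighbour list."""
    scores = [score for _, score in neighbours]
    return max(scores) - min(scores)


# Toy 2-dimensional vectors, invented for illustration only.
toy_target = {
    "w1": [1.0, 0.0],
    "w2": [0.0, 1.0],
    "w3": [1.0, 1.0],
}

top = nearest_neighbours([1.0, 0.0], toy_target, k=3)
for word, score in top:
    print(f"{word}\t{score:.6f}")
print("spread:", score_spread(top))
```

A very small spread across the top neighbours, as in the lists above where every score is above 0.95, would suggest that many vectors point in nearly the same direction, so the ranking carries little information.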