mostSimilar outputs numbers when using Fasttext word vectors #14

@please-wait

Hi,

First of all, thanks for the awesome work!

I am trying to import the pre-trained vector files from the fastText repo: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md

The model loads without a problem; however, when I try mostSimilar, the most similar words appear to be numbers:

loadedModel.mostSimilar('hi')

> [ { word: '73301', dist: 0.4461598818767161 },
  { word: '266', dist: 0.44462500361860946 },
  { word: '399', dist: 0.44260747560473973 },
  { word: '-0.13061', dist: 0.4250619904094889 },
  { word: '745', dist: 0.4089746546859616 },
  { word: '7', dist: 0.39388342200258686 },
  { word: '233', dist: 0.38675386429631425 },
  { word: '.33347', dist: 0.38672456155896373 },
  { word: '999', dist: 0.3798941950492955 },
  { word: '.5158', dist: 0.3761412428047805 },
  { word: '4785', dist: 0.3756878374324986 },
  { word: '', dist: 0.3753017613199615 },
  { word: '4091', dist: 0.3728785618174816 },
  { word: '0.18393', dist: 0.3702285209309231 },
  { word: '5', dist: 0.3694416515730196 },
  { word: '', dist: 0.3682340927295216 },
  { word: '2', dist: 0.3682152969462404 },
  { word: '68', dist: 0.36721353813091373 },
  { word: '10285', dist: 0.36564681449501635 },
  { word: '', dist: 0.36526450978156066 },
  { word: '014575', dist: 0.36389461240841203 },
  { word: '468', dist: 0.36371019302454455 },
  { word: '-0.00046764', dist: 0.3637013226972051 },
  { word: '.012665', dist: 0.36367885124101007 },
  { word: '142', dist: 0.3636392745394945 },
  { word: '574', dist: 0.36060934864973193 },
  { word: '0.6865', dist: 0.3602319353978014 },
  { word: '91', dist: 0.357913584485305 },
  { word: '53', dist: 0.35790250493633724 },
  { word: '925', dist: 0.3576282053138198 },
  { word: '1942', dist: 0.35588944804722655 },
  { word: '', dist: 0.3558833583782604 },
  { word: '3', dist: 0.3546257354328858 },
  { word: '-0.059739', dist: 0.3546232535404894 },
  { word: '', dist: 0.35400407472165496 },
  { word: '08', dist: 0.3536348589615367 },
  { word: '093', dist: 0.35353088901048624 },
  { word: '0.11736', dist: 0.3529077373455495 },
  { word: '.12359', dist: 0.3511316591255266 },
  { word: '10224', dist: 0.35079793819829935 } ]

I also tried 'hello', but it is reported as not being in the dictionary. How can I import the fastText files so that this doesn't happen?
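
For anyone trying to narrow this down, here is a minimal sketch of a sanity check on the file itself, independent of this library. It assumes the file uses the fastText text (.vec) layout: a header line with the vocabulary size and the vector dimension, followed by one line per word holding the word and its space-separated components. `parseVecText` is a hypothetical helper written for illustration only; it is not part of this library or of fastText.

```javascript
// Hypothetical helper (not part of this library): parses text in the
// fastText .vec layout into a Map from word to Float32Array.
function parseVecText(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim().length > 0);
  // Header line: "<vocabulary size> <dimension>"
  const [vocabSize, dim] = lines[0].trim().split(/\s+/).map(Number);
  const vectors = new Map();
  for (const line of lines.slice(1)) {
    // Lines may end with a trailing space, so trim before splitting.
    const parts = line.trim().split(/\s+/);
    // The last `dim` fields are the components; what precedes them is the word.
    const word = parts.slice(0, parts.length - dim).join(' ');
    const values = parts.slice(parts.length - dim).map(Number);
    if (values.length !== dim || values.some(Number.isNaN)) {
      throw new Error(`Malformed line for word "${word}"`);
    }
    vectors.set(word, Float32Array.from(values));
  }
  return { vocabSize, dim, vectors };
}

// Tiny made-up input in the assumed layout (not real fastText data).
const exampleText = '2 3\nhi 0.1 0.2 0.3 \nhello -0.5 0.25 1\n';
const example = parseVecText(exampleText);
console.log(example.vocabSize, example.dim, [...example.vectors.keys()]);
```

If a check like this finds real words as keys when run on a small excerpt of the file, the file is probably laid out as expected and the numeric "words" above would more likely come from how the lines are split at load time. The pre-trained files are large, so reading the whole file into a single string as done here is only reasonable for an excerpt.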
