
Training took forever to finish #4

@nganhtua

For testing purposes, I extracted only 200 files (100 pairs) from the VietBibleVox zip data. I then ran the prepare_vbx_tfdata.ipynb notebook, which resulted in the following:

  • JSON files were created in the "./data/VietBibleVox" directory.
  • The "./data/tfdata/test" directory was created with one file named "part_000.tfrecords" that is approximately 56 MB in size.
  • The "./data/tfdata/train" directory was created with 256 files named "part_*.tfrecords", but all of them are empty (0 bytes).
  • The files "lexicon.dict", "lexicon.txt", "phone_set.json", and "vbx_mfa.zip" were created and are all non-empty.
  • A directory named "MFA" was created in the "$HOME/Documents" directory, with a total size of 86 MB.
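The shard sizes listed above can be checked with a short script. This is only a minimal sketch using the standard library: the helper names `shard_sizes` and `summarize` are made up for illustration, and the only assumption about the output layout is the "part_*.tfrecords" naming described above.

```python
from pathlib import Path


def shard_sizes(directory):
    """Map each *.tfrecords file name in `directory` to its size in bytes."""
    return {
        p.name: p.stat().st_size
        for p in sorted(Path(directory).glob("*.tfrecords"))
    }


def summarize(directory):
    """Return (number of shards, number of empty shards, total bytes)."""
    sizes = shard_sizes(directory)
    empty = [name for name, size in sizes.items() if size == 0]
    return len(sizes), len(empty), sum(sizes.values())
```

Calling `summarize("./data/tfdata/train")` and `summarize("./data/tfdata/test")` would then give a quick per-split count of empty shards.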

Afterwards, I ran "python3 train.py", but the process just kept printing "0it [00:00, ?it/s]" to the screen. I waited for approximately 1 hour before interrupting it, which seems excessively long for such a small dataset.

According to the discussion in #2 (comment), the tfrecords files should not be empty, so I suspect that something went wrong during the preparation step, but I am unable to identify the specific issue.
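To see how many examples each shard actually holds, the records can be counted without loading TensorFlow. The sketch below walks the standard TFRecord framing (an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload) and does not verify the CRCs. It assumes the notebook writes uncompressed TFRecord files, which I have not confirmed; the function name `count_tfrecords` is hypothetical.

```python
import struct


def count_tfrecords(path):
    """Count records in an uncompressed TFRecord file (CRCs are not checked)."""
    count = 0
    with open(path, "rb") as f:
        while True:
            # Header: uint64 payload length + uint32 CRC of that length.
            header = f.read(12)
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError(f"truncated record header in {path}")
            (length,) = struct.unpack("<Q", header[:8])
            # Payload followed by its uint32 CRC.
            body = f.read(length + 4)
            if len(body) < length + 4:
                raise ValueError(f"truncated record payload in {path}")
            count += 1
    return count
```

A 0-byte shard yields a count of 0, which would be consistent with a training loop that reports "0it" because it never receives a batch.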

My setup:

  • OS: Debian testing, Wayland session.
  • CPU: Intel i5-6300HQ.
  • RAM: 12 GB.
  • GPU: GTX 950M.
