Cached files #13

@boersmamarcel

Hi Aleksei,

The cached files feature is awesome! However, it crashes if the cache directory doesn't exist yet:

Traceback (most recent call last):
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-dev-12/b_experiments/experiment.py", line 21, in <module>
    embds = get_embs_TF(df, embed_size = 2, walks_per_node = 2, num_steps=200, use_cached_skip_grams= False)
  File "/Users/mboersma/Documents/phd/students/alex/NetEmbs-dev-12/NetEmbs/SkipGram/tensor_flow.py", line 233, in get_embs_TF
    pd.DataFrame(embs).to_pickle(WORK_FOLDER[0] + WORK_FOLDER[1] + "cache/snapshot.pkl")
  File "/Users/mboersma/PycharmProjects/networkembedding/venv/lib/python3.7/site-packages/pandas/core/generic.py", line 2593, in to_pickle
    protocol=protocol)
  File "/Users/mboersma/PycharmProjects/networkembedding/venv/lib/python3.7/site-packages/pandas/io/pickle.py", line 73, in to_pickle
    is_text=False)
  File "/Users/mboersma/PycharmProjects/networkembedding/venv/lib/python3.7/site-packages/pandas/io/common.py", line 430, in _get_handle
    f = open(path_or_buf, mode)
FileNotFoundError: [Errno 2] No such file or directory: '2_walks30_pressure30_window3/TFsteps200000batch64_emb32/cache/snapshot.pkl'

I added a couple of lines so that the directory is created when it is not found. This seems to work:

In utils.py I added:

        skip_gr = tr.encode_pairs(get_pairs(N_JOBS, version, walk_length, walks_per_node, direction))
        if not os.path.exists(WORK_FOLDER[0]):
            os.makedirs(WORK_FOLDER[0])
        with open(WORK_FOLDER[0] + "skip_grams_cached.pkl", "wb") as file:
            pickle.dump(skip_gr, file)

In tensor_flow.py I added:


    if not os.path.exists(WORK_FOLDER[0] + WORK_FOLDER[1] + 'cache/'):
        os.makedirs(WORK_FOLDER[0] + WORK_FOLDER[1] + "cache/")
    pd.DataFrame(embs).to_pickle(WORK_FOLDER[0] + WORK_FOLDER[1] + "cache/snapshot.pkl")

so that the cache folder is created before the snapshot is written.
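
A possible simplification, offered only as a sketch: since Python 3.2, `os.makedirs` accepts `exist_ok=True`, which makes the separate `os.path.exists` check unnecessary and also avoids failing if the directory appears between the check and the call. The helper names `ensure_parent_dir` and `save_pickle` below are hypothetical and not part of NetEmbs, and the temporary directory only stands in for `WORK_FOLDER[0] + WORK_FOLDER[1]`.

```python
import os
import pickle
import tempfile


def ensure_parent_dir(path):
    """Create the directory that will contain `path` if it is missing, then return `path`."""
    parent = os.path.dirname(path)
    if parent:  # an empty string means the current working directory
        os.makedirs(parent, exist_ok=True)  # no error if it already exists
    return path


def save_pickle(obj, path):
    """Pickle `obj` to `path`, creating missing parent directories first."""
    with open(ensure_parent_dir(path), "wb") as file:
        pickle.dump(obj, file)


# Demo in a throwaway directory (hypothetical folder names).
base = tempfile.mkdtemp()
snapshot_path = os.path.join(base, "run", "model", "cache", "snapshot.pkl")

save_pickle({"embs": [0.1, 0.2]}, snapshot_path)  # creates run/model/cache/
save_pickle({"embs": [0.3, 0.4]}, snapshot_path)  # directory already exists, still fine

with open(snapshot_path, "rb") as file:
    restored = pickle.load(file)
```

The same helper could presumably wrap the path passed to `to_pickle`, for example `pd.DataFrame(embs).to_pickle(ensure_parent_dir(WORK_FOLDER[0] + WORK_FOLDER[1] + "cache/snapshot.pkl"))`, so that both places where files are cached share one code path.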
