This repository was archived by the owner on Sep 7, 2020. It is now read-only.

Add way to add each document's file_hash and pages value to database #11

@anthonydb

When a file is uploaded, the DocumentCloud API response does not include values for file_hash or pages, probably because they are calculated while the document is being processed and are not yet available when the file is dropped off.

I'd like to add a function in db.py to walk through the database of uploaded files and retrieve those values for each doc. It should include multiprocessing on supported platforms.
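
A rough sketch of what such a function could look like, not a finished implementation. It assumes the database is SQLite with a `documents` table holding `id`, `file_hash` and `pages` columns; those names, along with `fetch_details` and `update_details`, are hypothetical and not taken from the project. The actual DocumentCloud lookup is left as a placeholder, since the exact client call is not specified here.

```python
import sqlite3
from multiprocessing import Pool


def fetch_details(doc_id):
    """Hypothetical placeholder for the per-document lookup.

    A real version would ask DocumentCloud for the processed document
    and return a (doc_id, file_hash, pages) tuple.
    """
    raise NotImplementedError("replace with a real DocumentCloud lookup")


def update_details(conn, fetch=fetch_details, processes=1):
    """Fill in file_hash and pages for rows that are missing either value.

    Table and column names are assumptions. `fetch` must be a top-level
    function when processes > 1 so that it can be pickled for the workers.
    Returns the number of rows updated.
    """
    rows = conn.execute(
        "SELECT id FROM documents WHERE file_hash IS NULL OR pages IS NULL"
    ).fetchall()
    doc_ids = [row[0] for row in rows]

    if processes > 1:
        # Fan the lookups out to worker processes; only IDs and plain
        # tuples cross the process boundary, never the DB connection.
        with Pool(processes) as pool:
            results = pool.map(fetch, doc_ids)
    else:
        # Serial fallback for platforms where multiprocessing is not an option.
        results = [fetch(doc_id) for doc_id in doc_ids]

    # All writes happen in the parent process on the single connection.
    conn.executemany(
        "UPDATE documents SET file_hash = ?, pages = ? WHERE id = ?",
        [(file_hash, pages, doc_id) for doc_id, file_hash, pages in results],
    )
    conn.commit()
    return len(results)
```

Keeping the database writes in the parent process means only the network-bound lookups run in parallel, which avoids sharing a SQLite connection between workers. Calling `update_details(conn, processes=4)` would use four workers, while the default of `processes=1` runs everything serially.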
