
Feature: Storing associated blobs and data with Entries #23

@gavento


Some computations have multiple outputs, and some of those are naturally files. E.g. training a neural net outputs: the model parameters (data or file), resulting stats (data), TF SummaryWriter logs (file), sometimes graphs or images (files), and stdout/stderr captures (data). It would be great if some of those types could also be displayed in the browser (images, text files, logs, ...).

Table / properties

Add a table for storing blobs; every currently valid Entry has its associated blobs stored there. It would make sense to also include the serialized output value as a blob (for consistency, or to allow e.g. external blob storage).

Every blob has:

  • id
  • entry - reference to entry, M:1 (TODO: update to match the current schema)
  • data (blob)
  • name - filename (relative to the workdir), or empty (for the pickled returned value), or any name without a slash (just a blob, which may still be instantiated as a file)
  • Some notion of kind/type/intent - which should be displayed in browser, which are images, which are (viewable) text files, how to highlight the text, etc. Plugins may define more (e.g. tensorboard).
    • A MIME type seems both too much and insufficient (e.g. for TF logs)? (But it is good for browser open/download.)
    • We could have just a tag field mixing role (full, thumbnail, ...) and type (text, json, png, jpg).
    • Or we could have both: MIME for type, and tags for role/intent/plugin (for distinguishing e.g. TF logs).
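
To make the last option concrete, here is a minimal sketch of the blob table in SQLite, with both a MIME column and a tags column. The table and column names, the space-separated tag encoding, and the stub `entries` table are all assumptions for illustration and do not reflect the current schema.

```python
import sqlite3

# Hypothetical schema: names and constraints are illustrative only.
SCHEMA = """
CREATE TABLE entries (
    id INTEGER PRIMARY KEY
);
CREATE TABLE blobs (
    id INTEGER PRIMARY KEY,
    entry INTEGER NOT NULL REFERENCES entries(id) ON DELETE CASCADE,  -- M:1
    name TEXT NOT NULL DEFAULT '',  -- '' stands for the pickled returned value
    mimetype TEXT,                  -- e.g. 'image/png'; NULL if unknown
    tags TEXT NOT NULL DEFAULT '',  -- space-separated role/intent/plugin tags
    data BLOB NOT NULL,
    UNIQUE (entry, name)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

insert = "INSERT INTO blobs (entry, name, mimetype, tags, data) VALUES (?, ?, ?, ?, ?)"
conn.execute("INSERT INTO entries (id) VALUES (1)")
# A file-like blob, named relative to the workdir.
conn.execute(insert, (1, "plots/loss.png", "image/png", "thumbnail", b"\x89PNG..."))
# The serialized output value, stored under the empty name.
conn.execute(insert, (1, "", None, "pickled", b"..."))

rows = conn.execute(
    "SELECT name, mimetype, tags FROM blobs WHERE entry = 1 ORDER BY name"
).fetchall()
```

The `UNIQUE (entry, name)` constraint is one possible way to make names usable as dictionary keys per entry; whether names must be unique is not decided above.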

API

Managed through context for creation (see #22):

  • ctx.add_blob(data, name, mimetype=None, tags=()) - add a data blob
  • ctx.add_file(path, name=None, mimetype=None, tags=()) - add an existing file
    And some type-specific functions (more for text/logs, etc.):
  • ctx.add_figure(fig, name, tags=('thumbnail',)) - render and insert a Matplotlib/plotly/bokeh/... image
  • ctx.add_pickled(obj, name, tags=('pickled',)) - pickle and add an object
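
A rough sketch of how these calls might fit together, assuming an in-memory context. The `Context` class here is a hypothetical stand-in for the one discussed in #22; the `mimetype=None` defaults, the MIME guessing, and defaulting `name` to the bare file name (rather than a workdir-relative path) are assumptions, not settled design.

```python
import mimetypes
import pickle
import tempfile
from pathlib import Path


class Context:
    """Hypothetical stand-in for the context from #22; keeps blobs in memory."""

    def __init__(self):
        self.blobs = {}  # name -> (data, mimetype, tags)

    def add_blob(self, data, name, mimetype=None, tags=()):
        if name in self.blobs:
            raise ValueError(f"blob {name!r} already added")
        self.blobs[name] = (bytes(data), mimetype, tuple(tags))

    def add_file(self, path, name=None, mimetype=None, tags=()):
        path = Path(path)
        if name is None:
            name = path.name  # simplification: ignores the workdir-relative part
        if mimetype is None:
            mimetype, _ = mimetypes.guess_type(path.name)  # may stay None
        self.add_blob(path.read_bytes(), name, mimetype, tags)

    def add_pickled(self, obj, name, tags=("pickled",)):
        self.add_blob(pickle.dumps(obj), name, "application/octet-stream", tags)


ctx = Context()
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "log.txt"
    log.write_bytes(b"epoch 1\n")
    ctx.add_file(log, tags=("log",))
ctx.add_blob(b"{}", "stats.json", mimetype="application/json")
ctx.add_pickled({"accuracy": 0.5}, "result")
```

`add_figure` is left out because rendering depends on the plotting library; it would presumably end in an `add_blob` call with the rendered image bytes.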

Properties and methods on Entry:

  • Entry.files - dictionary mapping name to EntryFile

EntryFile (bikesheddable) has similar properties to the table above. In addition, it has methods:

  • EntryFile.write_file(filename=None) - write as real file, returns Path object
  • EntryFile.as_file() - return a readable file-like object (SQLite supports this)
  • EntryFile.data() - return binary data
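
One possible shape for `EntryFile`, as a sketch only. The field names are assumptions (the bytes are stored in a field called `content` so that `data()` can remain a method), and `as_file()` uses an in-memory `io.BytesIO` as a stand-in rather than streaming from the database.

```python
import io
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass(frozen=True)
class EntryFile:
    """Hypothetical in-memory EntryFile; field names are illustrative."""

    name: str
    content: bytes
    mimetype: Optional[str] = None
    tags: Tuple[str, ...] = ()

    def data(self) -> bytes:
        """Return the binary data."""
        return self.content

    def as_file(self) -> io.BytesIO:
        """Return a readable file-like object over the data."""
        return io.BytesIO(self.content)

    def write_file(self, filename=None) -> Path:
        """Write the blob as a real file and return its path."""
        if filename is None:
            if not self.name:
                raise ValueError("unnamed blob needs an explicit filename")
            filename = self.name  # relative to the current directory
        path = Path(filename)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(self.content)
        return path


# Entry.files as a plain dictionary keyed by blob name.
files = {f.name: f for f in [EntryFile("logs/train.txt", b"epoch 1\n", "text/plain")]}

with tempfile.TemporaryDirectory() as tmp:
    written = files["logs/train.txt"].write_file(Path(tmp) / "logs" / "train.txt")
    roundtrip = written.read_bytes()
```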
