Description
This is a catch-all issue to remind everyone that we want to collect 'relevant' datasets (whatever that means -- there is really no perfect way to determine that). These would be datasets we imagine could be used as inputs for one or several of our workflows.
In addition to the datasets themselves, we want to collect as much metadata about them as possible (origin/authors, description, schema, how they were produced, whether and how they were already pre-processed, etc.).
Also, it would be great to describe each dataset's imperfections, what would ideally be needed to make it a 'perfect' input to a workflow, and what it would look like once that was done (meaning no further pre-processing steps would be needed and it could be reliably used as an input without worrying about data quality).
Task list
- determine how/where to store those datasets
- create a minimal schema of information we want to collect for each dataset (see the sketch after this list)
- create a schema for optional metadata we would like to collect for each dataset, if easily possible
- collect datasets (open-ended)
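
As a starting point for the two schema tasks above, here is a rough sketch of what the required and optional fields mentioned in the description could look like, written as Python dataclasses. All field names and types here are assumptions meant to kick off the discussion, not decisions.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of the minimal (required) metadata per dataset.
# Field names and types are assumptions and open for discussion.
@dataclass
class DatasetRecord:
    name: str               # short identifier for the dataset
    origin: str             # where the data comes from (URL, institution, ...)
    authors: list[str]      # people/organisations that produced the data
    description: str        # free-text summary of the content
    schema: str             # description of (or link to) the data schema
    storage_location: str   # where we keep our copy (see first task above)

# Optional metadata, collected only if easily available.
@dataclass
class DatasetExtras:
    production_method: Optional[str] = None  # how the dataset was produced
    preprocessing: Optional[str] = None      # pre-processing already applied, if any
    known_imperfections: list[str] = field(default_factory=list)  # issues blocking use as a 'perfect' workflow input
    ideal_state: Optional[str] = None        # what the dataset would look like once those issues were fixed
```

Whether this ends up as Python classes, a YAML/JSON template, or columns in a shared spreadsheet is exactly what the first two task items should decide; the sketch only lists the fields we already mentioned in the description.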