Data organization issue #39

Context:

To run a dataset, we need to pass a task_name to pipeline.py, and this task_name is the directory containing the dataset, for example, /hypothesis-generation/data/task_A

When the task runs, a BaseTask object is created from this task_name and then retrieves the data, metadata, config, etc.

In the config.yaml files, users need to specify another "task_name", which is **only used to find its extract_label register**.

Defining task_name in two places, with two different meanings, can cause confusion or bugs.

Maybe we can change the name or the handling of this. We will likely need better-organized datasets in the future, since the number of datasets is growing quickly. Also, one task can sometimes have several configurations, and creating a separate extract_label function for each configuration of the same task should not be necessary.
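
One possible direction, as a rough sketch only: give the config its own key for the label extractor, so that task_name keeps a single meaning (the data directory). Everything below is hypothetical and not taken from the current code: the `label_extractor` key, the `EXTRACT_LABEL_REGISTRY` dict, the `register_extract_label` decorator, and the toy yes/no extractor are all made-up names for illustration, and plain dicts stand in for parsed config.yaml files.

```python
from typing import Callable, Dict

# Hypothetical registry: extractor name -> extract_label function.
EXTRACT_LABEL_REGISTRY: Dict[str, Callable[[str], str]] = {}


def register_extract_label(name: str):
    """Decorator that registers an extract_label function under `name`."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        EXTRACT_LABEL_REGISTRY[name] = func
        return func
    return decorator


@register_extract_label("binary_yes_no")
def extract_yes_no(text: str) -> str:
    # Toy extractor: looks for "yes" or "no" in the model output.
    lowered = text.lower()
    if "yes" in lowered:
        return "yes"
    if "no" in lowered:
        return "no"
    return "other"


def resolve_extract_label(config: dict) -> Callable[[str], str]:
    # `label_extractor` is an assumed config key that would replace the
    # second "task_name" currently read from config.yaml.
    name = config["label_extractor"]
    if name not in EXTRACT_LABEL_REGISTRY:
        raise KeyError(f"no extract_label function registered as {name!r}")
    return EXTRACT_LABEL_REGISTRY[name]


# Two configurations of the same task share one extractor; dicts stand in
# for two parsed config.yaml files.
config_a = {"label_extractor": "binary_yes_no", "prompt_style": "short"}
config_b = {"label_extractor": "binary_yes_no", "prompt_style": "long"}

extract_a = resolve_extract_label(config_a)
extract_b = resolve_extract_label(config_b)
```

With something like this, the task_name passed to pipeline.py would only ever refer to the data directory, and several configs (or several tasks with the same label format) could point at the same extractor instead of each registering its own.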
