Better tracking of source annotations for adjudication #5

Currently, sources are passed around as a collection of `json.dumps` strings attached to the annotation dataclasses. This is bloated, and it also creates duplicates when we convert the internal annotation objects back to dictionaries for dumping to Label Studio input. Each annotation dataclass corresponds to one annotation type we want to score individually (DocTimeRel, Adverse Event, etc.), but because we track sources by ID and offset, a single source can be shared by several annotation objects.
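A minimal sketch of the duplicate problem, for illustration only. The `Annotation` class, its fields, and the source keys (`id`, `start`, `end`, `text`) are hypothetical stand-ins, not the actual internal dataclasses:

```python
import json
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Hypothetical stand-in for one internal annotation dataclass."""
    label_type: str
    value: str
    # Sources carried as json.dumps strings, as described above.
    sources: list[str] = field(default_factory=list)


# One source span (assumed keys: id, start, end, text) shared by two annotation types.
shared = json.dumps({"id": "abc", "start": 10, "end": 17, "text": "nausea"}, sort_keys=True)
annotations = [
    Annotation("DocTimeRel", "OVERLAP", [shared]),
    Annotation("Adverse Event", "True", [shared]),
]

# Naive conversion back to dictionaries emits the shared source once per annotation.
naive = [json.loads(s) for a in annotations for s in a.sources]


def unique_sources(annotations):
    """Collapse sources that share the same (id, start, end) key."""
    seen = {}
    for a in annotations:
        for s in a.sources:
            d = json.loads(s)
            seen.setdefault((d["id"], d["start"], d["end"]), d)
    return list(seen.values())
```

Here `naive` contains the same source twice, while `unique_sources(annotations)` returns it once. Deduplicating at dump time only treats the symptom, though, and the strings still get re-parsed on every conversion.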

To me there seem to be three better possibilities:
1. Keep a mapping from ID and offsets to the collection of internal annotations (problem: a giant omniscient data structure)
2. Have each source be exactly one entity (still some bloat)
3. (Favorite, obviously) Develop better models of the Label Studio annotation types and add a conversion method from the internal dataclasses to those (we can use this as a pivot to bring in Pydantic and fold into Ian's code if that's ever helpful)
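A rough sketch of what option 3 could look like, using plain dataclasses so it runs without extra dependencies (the same shapes could become Pydantic models later). Everything here is an assumption for illustration: `InternalAnnotation`, `SpanValue`, `LabelStudioResult`, and their fields are hypothetical names. The result layout (`id`, `from_name`, `to_name`, `type`, `value`) follows my understanding of Label Studio's span-label results and should be checked against the actual labeling config.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SpanValue:
    """Assumed shape of the `value` field for a span-label result."""
    start: int
    end: int
    text: str
    labels: tuple[str, ...]


@dataclass(frozen=True)
class LabelStudioResult:
    """Assumed shape of one Label Studio result entry."""
    id: str
    from_name: str
    to_name: str
    type: str
    value: SpanValue

    def to_dict(self):
        d = asdict(self)
        # asdict keeps tuples as tuples; convert to a list for JSON-style output.
        d["value"]["labels"] = list(self.value.labels)
        return d


@dataclass
class InternalAnnotation:
    """Hypothetical internal dataclass: one annotation type over one source span."""
    source_id: str
    start: int
    end: int
    text: str
    label_type: str  # e.g. "DocTimeRel"
    label: str

    def to_label_studio(self, to_name="text"):
        # Assumption: the region id is derived from the source ID and offsets,
        # so annotation types that share a source map onto the same region.
        return LabelStudioResult(
            id=f"{self.source_id}-{self.start}-{self.end}",
            from_name=self.label_type,
            to_name=to_name,
            type="labels",
            value=SpanValue(self.start, self.end, self.text, (self.label,)),
        )
```

With this, the source lives on the annotation as typed fields rather than as serialized strings, and the dictionary form is only produced at dump time via `to_label_studio().to_dict()`.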
