query output duplicates 'entities' header for each batch #36

@ththvseo

Description

One would expect '''query''' to produce an output file that '''upsert''' can read back.
This does not work: query repeats the '''entities:''' key for each batch, so the resulting YAML file is corrupt.
(I have only tested YAML. Other formats may be affected in a similar way: JSON probably has the same problem, but I have not tried it. CSV is likely unaffected, because there is no header?)
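
To make the expected output concrete, here is a minimal Python sketch of a writer that emits the `entities:` header once and then appends every batch under it. This is not dsio's code: `write_entities`, the batch layout, and the entity fields are hypothetical and only illustrate the shape of a file that could be read back.

```python
import io
import json


def write_entities(batches, out):
    """Write all batches under a single top-level 'entities:' key.

    Hypothetical helper for illustration; not part of dsio.
    """
    out.write("entities:\n")  # header emitted once, before any batch
    for batch in batches:
        for entity in batch:
            # For simple values, a JSON object is also a valid YAML flow
            # mapping, so each entity can be written as one list item
            # without needing a YAML library.
            out.write("- " + json.dumps(entity) + "\n")


# Two batches with made-up entities.
buf = io.StringIO()
write_entities([[{"id": 1}, {"id": 2}], [{"id": 3}]], buf)
output = buf.getvalue()
```

The point of the sketch is only that the header write sits outside the per-batch loop, so the number of batches does not change the number of `entities:` keys.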

Even worse, upsert accepts such a YAML file as input, but apparently uses only the last batch (perhaps because the parser internally overwrites repeated keys?).
That is probably not a bug in dsio itself, though, because the YAML parser comes from a library and is not part of dsio.
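
As an analogy for the suspected "last key wins" behaviour, the sketch below uses Python's standard `json` module, not the YAML library dsio relies on, so it does not show what dsio actually does. It shows how a parser can silently keep only the last of several repeated keys, and how duplicates could be rejected instead. The document string and `reject_duplicates` are made up for the example.

```python
import json

# A document with the same top-level key repeated, one per "batch".
doc = '{"entities": ["batch1"], "entities": ["batch2"]}'

# Python's json module keeps the last value for a repeated key,
# so the first batch is dropped without any error.
parsed = json.loads(doc)


def reject_duplicates(pairs):
    """Build a dict from key/value pairs, failing on repeated keys."""
    seen = set()
    for key, _ in pairs:
        if key in seen:
            raise ValueError("duplicate key: " + key)
        seen.add(key)
    return dict(pairs)


# With a strict hook, the same document is rejected instead.
error = None
try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    error = str(exc)
```

If the YAML parser behaves in a similar way, a strict duplicate-key check on the upsert side would turn the silent data loss into a visible error.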
