huggingface interface #21

@cmatKhan

This is going to be a new repo, but it will be closely based on this one. In particular, we're going to need a modified AbstractAPI (maybe called HfAbstractAPI) that provides an interface to the Hugging Face Hub API for downloading and querying data.

methods:

  • parse_datacard()
    Each repo has a datacard that provides the structure, factor levels, etc. Right now, in particular, I'm interested in ways of reducing how much is in the datacard -- is there a better way of extracting the factor levels, especially those available at each level of partitioning, programmatically?

  • download()
    This will wrap huggingface_hub.snapshot_download

  • query()
    An interface for submitting SQL commands to DuckDB
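
To make the three methods concrete, here is a minimal sketch of what the class could look like. It is not a settled design. The constructor arguments, the `partition_levels()` helper, and the assumption that the parquet files are stored in Hive-style `key=value` directories are all hypothetical. If that layout holds, the factor levels available at each level of partitioning could be read from the directory names instead of being listed in the datacard. `huggingface_hub` and `duckdb` are imported lazily, so the directory-walking part runs without either package installed.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import Path
from typing import Any, Optional


class HfAbstractAPI:
    """Hypothetical sketch of the interface discussed in this issue."""

    def __init__(self, repo_id: str, repo_type: str = "dataset") -> None:
        self.repo_id = repo_id
        self.repo_type = repo_type
        self.local_dir: Optional[Path] = None

    def parse_datacard(self) -> dict[str, Any]:
        """Return the YAML metadata block of the repo's dataset card."""
        from huggingface_hub import DatasetCard  # lazy import

        return DatasetCard.load(self.repo_id).data.to_dict()

    def download(self, **kwargs: Any) -> Path:
        """Thin wrapper around huggingface_hub.snapshot_download."""
        from huggingface_hub import snapshot_download  # lazy import

        self.local_dir = Path(
            snapshot_download(
                repo_id=self.repo_id, repo_type=self.repo_type, **kwargs
            )
        )
        return self.local_dir

    def partition_levels(self, root: Optional[str] = None) -> dict[str, set[str]]:
        """Collect the values seen for each key=value directory.

        Assumption: parquet files live under Hive-style partition
        directories, e.g. <root>/key_a=x/key_b=y/file.parquet.
        """
        base = Path(root) if root is not None else self.local_dir
        if base is None:
            raise ValueError("call download() first or pass root explicitly")
        levels: dict[str, set[str]] = defaultdict(set)
        for path in base.rglob("*.parquet"):
            for part in path.relative_to(base).parent.parts:
                key, sep, value = part.partition("=")
                if sep:  # only directories named key=value count
                    levels[key].add(value)
        return dict(levels)

    def query(self, sql: str) -> list[tuple]:
        """Run a SQL statement in an in-memory DuckDB connection."""
        import duckdb  # lazy import

        con = duckdb.connect()
        try:
            return con.execute(sql).fetchall()
        finally:
            con.close()
```

With something like this, `partition_levels()` would return a mapping from each partition key to the set of values found on disk, which might let the datacard stay limited to descriptions rather than enumerated levels.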

Things to think about:

  1. What if you have multiple datasets and want to combine them? E.g., a callingcards hops record and a promoter set?
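
One possible answer, sketched under assumptions: expose each downloaded dataset as a named DuckDB view over its parquet files, then combine them with an ordinary SQL join in the same connection. The helper name `view_statements`, the view names, the file globs, and every column name in the example join are made up for illustration. Only `read_parquet(..., hive_partitioning = true)` is an actual DuckDB function.

```python
from __future__ import annotations


def view_statements(datasets: dict[str, str]) -> list[str]:
    """Build one CREATE VIEW statement per dataset.

    `datasets` maps a view name to a parquet glob on local disk.
    """
    statements = []
    for name, glob in datasets.items():
        if not name.isidentifier():
            raise ValueError(f"not a valid view name: {name!r}")
        escaped = glob.replace("'", "''")  # escape quotes inside the SQL literal
        statements.append(
            f"CREATE OR REPLACE VIEW {name} AS "
            f"SELECT * FROM read_parquet('{escaped}', hive_partitioning = true)"
        )
    return statements


# Hypothetical join once both views exist; all column names are placeholders.
EXAMPLE_JOIN = """
SELECT p.promoter_id, count(*) AS n_hops
FROM hops AS h
JOIN promoters AS p
  ON h.chrom = p.chrom
 AND h.position BETWEEN p.promoter_start AND p.promoter_end
GROUP BY p.promoter_id
"""
```

Running the generated statements and then the join through a single DuckDB connection would keep both datasets queryable side by side without merging the underlying files.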
