Neko: Add more details on datasets #4

@bhavul

Ideally, instead of the simple bullet list, we would have a table that captures a few key properties of each dataset.

As an example, it could look like the following, though we can iterate and add or modify anything in the design:

| Dataset | Source | Size | Approx # Tokens | Modalities | Remarks |
| --- | --- | --- | --- | --- | --- |
| Conceptual Captions | https://ai.google.com/research/ConceptualCaptions/ | X GB/TB | XYZ | | |

This would help us identify the right datasets to include while training our Neko model.
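
To keep such a table easy to extend, one option is to store each dataset as a small record and generate the markdown from it. The sketch below is only an illustration: `DatasetEntry`, `HEADERS`, and `to_markdown` are hypothetical names, not part of the Neko codebase, and the size and token values are the same placeholders used in the example row above.

```python
from dataclasses import dataclass, fields

# Hypothetical record type mirroring the proposed table columns.
@dataclass
class DatasetEntry:
    dataset: str
    source: str
    size: str = ""           # left empty until measured
    approx_tokens: str = ""  # left empty until estimated
    modalities: str = ""
    remarks: str = ""

# Column headers, in the same order as the dataclass fields.
HEADERS = ["Dataset", "Source", "Size", "Approx # Tokens", "Modalities", "Remarks"]

def to_markdown(entries):
    """Render a list of DatasetEntry records as a markdown table."""
    lines = [
        "| " + " | ".join(HEADERS) + " |",
        "| " + " | ".join("---" for _ in HEADERS) + " |",
    ]
    for entry in entries:
        cells = [getattr(entry, f.name) for f in fields(entry)]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)

# Placeholder values copied from the example row; they are not real figures.
entries = [
    DatasetEntry(
        dataset="Conceptual Captions",
        source="https://ai.google.com/research/ConceptualCaptions/",
        size="X GB/TB",
        approx_tokens="XYZ",
    )
]
table = to_markdown(entries)
print(table)
```

Adding a dataset then means appending one more `DatasetEntry`, and a new column only needs a new field and a matching header.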

Labels: documentation, enhancement, good first issue