
Add tokenization benchmarks comparing with YAML, XML, JSON #209

@konard

Description


We need to see how many tokens are used in GPT contexts.

See https://github.com/toon-format/toon for a benchmark reference. Do not include TOON itself in the benchmark, but you can take similar examples to benchmark from there and adapt them to Links Notation style.

We also need the benchmarks to measure UTF-8 character count.
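A minimal sketch of the character-count comparison, in Rust since that is the version that would run in CI. The sample payloads and the Links Notation shape are assumptions for illustration, not actual benchmark data; a real benchmark would serialize the same dataset through proper libraries rather than hardcoded strings.

```rust
// Hypothetical sketch: compare UTF-8 character counts (and byte lengths)
// of the same record expressed in each format. All sample strings below
// are assumptions, including the Links Notation form.

fn report(label: &str, text: &str) {
    // chars().count() gives Unicode scalar values; len() gives UTF-8 bytes.
    println!("{label}: {} chars, {} bytes", text.chars().count(), text.len());
}

fn main() {
    let json = r#"{"id":1,"name":"Ada","tags":["a","b"]}"#;
    let yaml = "id: 1\nname: Ada\ntags:\n  - a\n  - b\n";
    let xml = "<item><id>1</id><name>Ada</name><tags><tag>a</tag><tag>b</tag></tags></item>";
    let lino = "(item (id 1) (name Ada) (tags a b))"; // assumed Links Notation shape

    report("JSON", json);
    report("YAML", yaml);
    report("XML", xml);
    report("LiNo", lino);
}
```

Token counts would be layered on top of this with whatever tokenizer we pick for the GPT-context comparison.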

Make sure we have the benchmark implemented in all our supported languages, but we should use the Rust version in CI/CD to automatically generate the benchmark markdown pages on push to the main branch. These should be separate workflows, and we should update the markdown documents only if something changed and the benchmark didn't fail, using GITHUB_TOKEN (or whatever the default GitHub Actions token is called).
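Something like the following workflow shape could work. This is only a sketch: the workflow name, binary name, and output file path are assumptions. The `permissions: contents: write` line is what lets the default `GITHUB_TOKEN` push the regenerated pages, and the `git diff` guard skips the commit when nothing changed.

```yaml
# Hypothetical sketch; names and paths are assumptions.
name: benchmarks
on:
  push:
    branches: [main]
permissions:
  contents: write  # allows the default GITHUB_TOKEN to push
jobs:
  bench:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo run --release --bin benchmark > BENCHMARKS.md
      - run: |
          if ! git diff --quiet BENCHMARKS.md; then
            git config user.name "github-actions[bot]"
            git config user.email "github-actions[bot]@users.noreply.github.com"
            git commit -am "Update benchmarks"
            git push
          fi
```

Because the commit step runs only after `cargo run` succeeds, a failed benchmark never updates the markdown, which matches the requirement above.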

Metadata

Labels: documentation (Improvements or additions to documentation), enhancement (New feature or request)
