toon-py

Token-Oriented Object Notation (TOON) for Python

A compact, human-readable format for passing structured data to LLMs with 30-60% fewer tokens than JSON.

Python port of @byjohann/toon.

Why TOON?

LLM tokens cost money. TOON reduces token usage by:

Removing redundant punctuation (braces, brackets, most quotes)
Using indentation for structure
Tabularizing arrays of objects
Writing inline primitive arrays without spaces
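
A rough way to see where the savings come from is to compare raw character counts for the same data. The sketch below uses only the standard library. The TOON text is copied by hand from the tabular example in this README rather than produced by toon-py, and character counts are only a crude proxy for token counts.

```python
import json

# Same data as the "Arrays of Objects (Tabular)" example in this README.
data = {
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5},
    ]
}

# TOON text copied by hand from this README (not generated here).
toon_text = "items[2]{sku,qty,price}:\n  A1,2,9.99\n  B2,1,14.5"

pretty_json_text = json.dumps(data, indent=2)
compact_json_text = json.dumps(data, separators=(",", ":"))

# Character counts are a stand-in for tokens; real savings depend on the tokenizer.
print("pretty JSON :", len(pretty_json_text))
print("compact JSON:", len(compact_json_text))
print("TOON        :", len(toon_text))
```

Most of the difference comes from the repeated keys and punctuation that JSON emits for every object in the array.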

Installation

As a CLI tool

For standalone CLI usage:

# Using uv (recommended - installs in isolated environment)
uv tool install toon-py

# Using pip (installs CLI in current Python environment)
pip install toon-py

As a Python library

To use in your Python project:

# Using uv (adds to project dependencies)
uv add toon-py

# Using pip (installs library + CLI in current environment)
pip install toon-py

Quick Start

Python API

from toon_py import encode

data = {
    "user": {
        "id": 123,
        "name": "Ada",
        "tags": ["reading", "gaming"],
        "active": True
    }
}

print(encode(data))

Output:

user:
  id: 123
  name: Ada
  tags[2]: reading,gaming
  active: true

CLI

# From file
toon data.json

# From stdin
cat data.json | toon

# From string
toon '{"tags": ["foo", "bar"]}'

# With options
toon data.json --delimiter tab --length-marker -o output.toon

Token Savings

Note

Measured with gpt-tokenizer using o200k_base encoding (used by GPT-5 and other modern models). Savings will vary across models and tokenizers.

Retrieval Accuracy

Accuracy across 3 LLMs on 159 data retrieval questions:

gpt-5-nano
  toon         ████████████████████  99.4% (158/159)
  yaml         ███████████████████░  95.0% (151/159)
  csv          ██████████████████░░  92.5% (147/159)
  json         ██████████████████░░  92.5% (147/159)
  xml          ██████████████████░░  91.2% (145/159)

claude-haiku-4-5
  toon         ███████████████░░░░░  75.5% (120/159)
  xml          ███████████████░░░░░  75.5% (120/159)
  csv          ███████████████░░░░░  75.5% (120/159)
  json         ███████████████░░░░░  75.5% (120/159)
  yaml         ███████████████░░░░░  74.2% (118/159)

gemini-2.5-flash
  xml          ██████████████████░░  91.8% (146/159)
  csv          █████████████████░░░  86.2% (137/159)
  toon         █████████████████░░░  84.9% (135/159)
  json         ████████████████░░░░  81.8% (130/159)
  yaml         ████████████████░░░░  78.6% (125/159)

Advantage: TOON achieves 86.6% accuracy (vs JSON's 83.2%) while using 46.3% fewer tokens.
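
The headline accuracy figures appear to be pooled over all three models. That is an assumption, since the aggregation method is not stated here, but recomputing from the per-model counts above reproduces both numbers:

```python
# Correct answers per model (out of 159), copied from the charts above.
QUESTIONS_PER_MODEL = 159
correct = {
    "toon": {"gpt-5-nano": 158, "claude-haiku-4-5": 120, "gemini-2.5-flash": 135},
    "json": {"gpt-5-nano": 147, "claude-haiku-4-5": 120, "gemini-2.5-flash": 130},
}


def pooled_accuracy(per_model: dict) -> float:
    """Percentage of correct answers pooled over all models."""
    total_questions = QUESTIONS_PER_MODEL * len(per_model)
    return 100 * sum(per_model.values()) / total_questions


for fmt, per_model in correct.items():
    print(f"{fmt}: {pooled_accuracy(per_model):.1f}%")
# toon: 86.6%
# json: 83.2%
```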

Features

Objects

encode({"id": 1, "name": "Ada"})

id: 1
name: Ada

Primitive Arrays (Inline)

encode({"tags": ["admin", "ops", "dev"]})

tags[3]: admin,ops,dev
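
To make the inline layout concrete, here is a minimal standard-library sketch of how such a line could be assembled. `inline_array` is a hypothetical helper written for this illustration, not part of toon-py, and it deliberately ignores quoting.

```python
def inline_array(key: str, values: list, delimiter: str = ",") -> str:
    """Render a list of primitives as `key[N]: v1,v2,...` (illustrative only)."""

    def fmt(value) -> str:
        # None and booleans use JSON-style lowercase spellings.
        if value is None:
            return "null"
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    if not values:
        return f"{key}[0]:"  # matches the empty-array form shown in this README
    return f"{key}[{len(values)}]: {delimiter.join(fmt(v) for v in values)}"


print(inline_array("tags", ["admin", "ops", "dev"]))
# tags[3]: admin,ops,dev
```

The explicit length in brackets lets a reader (or a model) check that no elements were dropped.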

Arrays of Objects (Tabular)

encode({
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5}
    ]
})

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5

Encoding Options

from toon_py import encode, EncodeOptions

data = {"items": [{"id": 1, "name": "Widget"}]}

# Tab delimiter
options = EncodeOptions(delimiter="\t")
print(encode(data, options))

# Pipe delimiter
options = EncodeOptions(delimiter="|")
print(encode(data, options))

# Length marker
options = EncodeOptions(length_marker="#")
print(encode(data, options))
# Output: items[#1]{id,name}: ...

# Custom indent
options = EncodeOptions(indent=4)
print(encode(data, options))

CLI Options

toon [INPUT] [OPTIONS]

Arguments:
  INPUT                 JSON file, JSON string, or stdin

Options:
  -i, --indent INT      Spaces per indent level (default: 2)
  -d, --delimiter TEXT  Delimiter: comma, tab, or pipe (default: comma)
  -l, --length-marker   Add '#' prefix to array lengths
  -o, --output PATH     Output file (default: stdout)
  --help                Show help message

Format Rules

Quoting

Keys and values are quoted only when necessary:

# Unquoted
{"name": "hello world"}  # -> name: hello world

# Quoted (contains comma)
{"note": "hello, world"}  # -> note: "hello, world"

# Quoted (looks like number)
{"code": "123"}  # -> code: "123"

# Quoted (key with space)
{"full name": "Ada"}  # -> "full name": Ada
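
The checks below sketch how a "quote only when necessary" decision might be made. The exact rules are an assumption inferred from the four examples above; toon-py's real rules may be stricter or looser, and both function names are hypothetical.

```python
import re

# Bare strings that would be read back as a non-string literal.
_RESERVED = {"true", "false", "null"}
_NUMBER = re.compile(r"-?\d+(\.\d+)?([eE][+-]?\d+)?")
# Assumed shape of a key that is safe to leave unquoted.
_BARE_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def value_needs_quotes(text: str, delimiter: str = ",") -> bool:
    """Guess whether a string value must be quoted (assumed rules)."""
    if text == "" or text != text.strip():
        return True  # empty, or leading/trailing whitespace
    if text in _RESERVED or _NUMBER.fullmatch(text) is not None:
        return True  # would be mistaken for a literal or a number
    # Structural characters that could confuse a parser.
    return any(ch in text for ch in (delimiter, ":", '"', "\\", "\n"))


def key_needs_quotes(key: str) -> bool:
    """Guess whether an object key must be quoted (assumed rule)."""
    return _BARE_KEY.fullmatch(key) is None


print(value_needs_quotes("hello world"))   # False
print(value_needs_quotes("hello, world"))  # True
print(value_needs_quotes("123"))           # True
print(key_needs_quotes("full name"))       # True
```

Note that the delimiter is a parameter: a comma inside a value only forces quoting when the comma is also the active delimiter.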

Tabular Format

Arrays of objects use tabular format when:

All elements are objects
All objects have identical keys
All values are primitives (no nested arrays/objects)

encode({
    "users": [
        {"id": 1, "name": "Alice", "active": True},
        {"id": 2, "name": "Bob", "active": False}
    ]
})

users[2]{id,name,active}:
  1,Alice,true
  2,Bob,false
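
The three conditions above can be expressed as a short predicate. This is a standard-library sketch, not toon-py's implementation: `is_tabular` is a hypothetical name, and treating key order as irrelevant and an empty list as non-tabular are both assumptions.

```python
def is_primitive(value) -> bool:
    """True for values that fit in a single table cell."""
    return value is None or isinstance(value, (bool, int, float, str))


def is_tabular(rows: list) -> bool:
    """Check the three tabular conditions for a non-empty list."""
    # Condition 1: every element is an object (dict).
    if not rows or not all(isinstance(row, dict) for row in rows):
        return False
    keys = set(rows[0])
    # Condition 2: identical key sets. Condition 3: only primitive values.
    return all(
        set(row) == keys and all(is_primitive(v) for v in row.values())
        for row in rows
    )


users = [
    {"id": 1, "name": "Alice", "active": True},
    {"id": 2, "name": "Bob", "active": False},
]
print(is_tabular(users))                          # True
print(is_tabular([{"id": 1}, {"name": "Bob"}]))   # False: keys differ
print(is_tabular([{"id": 1, "tags": ["a"]}]))     # False: nested array
```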

Empty Containers

encode({})               # -> (empty output)
encode({"items": []})    # -> items[0]:
encode({"config": {}})   # -> config:

Type Conversions

| Python Type | TOON Output |
| --- | --- |
| `None` | `null` |
| `True`/`False` | `true`/`false` |
| `123` | `123` |
| `-0.0` | `0` |
| `float('nan')` | `null` |
| `float('inf')` | `null` |
| `datetime(...)` | `"2025-01-01T00:00:00Z"` |
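
A normalization step along these lines would produce the conversions listed above. This is a sketch under stated assumptions, not the library's code: `normalize` is a hypothetical name, and treating naive datetimes as UTC is a guess.

```python
import math
from datetime import datetime, timezone


def normalize(value):
    """Map a Python value to a JSON-like primitive, following the conversions above."""
    if isinstance(value, float):
        if math.isnan(value) or math.isinf(value):
            return None  # rendered as null
        if value == 0:
            return 0  # collapses -0.0 (and 0.0) to 0
        return value
    if isinstance(value, datetime):
        # Assumption: naive datetimes are treated as UTC.
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return value  # None, bool, int, str pass through unchanged


print(normalize(float("nan")))         # None
print(normalize(-0.0))                 # 0
print(normalize(datetime(2025, 1, 1))) # 2025-01-01T00:00:00Z
```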

Use in LLM Prompts

Wrap TOON data in code blocks:

Here's the data in TOON format:

```
user:
  id: 123
  tags[2]: reading,gaming
  active: true
```

Please analyze this data...
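
Building such a prompt in code is a matter of string assembly. The helper below is a hypothetical sketch that takes TOON text that has already been encoded (hard-coded here so the example runs without toon-py installed).

```python
FENCE = "`" * 3  # built programmatically so the fence can nest inside this example


def build_prompt(toon_text: str, instruction: str) -> str:
    """Wrap TOON text in a fenced block and append an instruction."""
    return (
        "Here's the data in TOON format:\n\n"
        f"{FENCE}\n{toon_text}\n{FENCE}\n\n"
        f"{instruction}"
    )


toon_text = "user:\n  id: 123\n  tags[2]: reading,gaming\n  active: true"
print(build_prompt(toon_text, "Please analyze this data..."))
```

In a real project, `toon_text` would presumably come from `encode(data)` as shown in Quick Start.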

Development

# Clone and setup
git clone https://github.com/shammianand/toon-py.git
cd toon-py
uv sync --all-extras

# Run tests
uv run pytest

# Format and lint code
uv run black src/
uv run ruff check src/

License

MIT License - see LICENSE

Credits

Python port of @byjohann/toon by Johann Schopplich
