TOON vs JSON for LLMs: Performance & Accuracy Deep Dive
While many laud TOON for its token savings, treating it as "cheaper JSON" for LLMs misses the point.
The real difference between TOON and JSON in AI workflows is structural: how clearly the format tells a model what the data looks like, and how much room it leaves for the model to guess.
And every time an LLM has to guess about structure, your risk of hallucinations and subtle data errors goes up.
What Are TOON and JSON, and Why Does This Comparison Matter for LLMs?

For two decades, JSON (JavaScript Object Notation) has been the default format for data exchange on the web. It's flexible, human-readable, and universally supported. Most APIs, webhooks, and config systems speak JSON fluently.
But JSON was never designed with large language models in mind.
- It's hierarchical, with arbitrarily nested objects and arrays.
- It doesn't enforce a schema by itself; that lives in your code or JSON Schema.
- And when you tell an LLM "return JSON like this" in a prompt, there's no built-in structural safety net.
TOON (Token-Oriented Object Notation) is almost the opposite philosophy:
- It encodes the same JSON data model, but in a syntax optimised for LLM prompts.
- It is schema-aware and tabular-first: you declare field names and array lengths up front.
- It's indentation-based, closer to YAML + CSV than to classic `{}`/`[]` JSON.
Here's a simple example: a list of users.
JSON:

```json
[
  {
    "id": 101,
    "username": "alex",
    "role": "admin"
  },
  {
    "id": 102,
    "username": "casey",
    "role": "editor"
  }
]
```
TOON:

```
users[2]{id,username,role}:
  101,alex,admin
  102,casey,editor
```
Both encodings represent the same logical structure. But in the TOON version:
- The schema is declared once (`id`, `username`, `role`).
- Each row is just values, like a CSV line.
- There's less punctuation and duplication for the LLM to deal with.
That's why multiple benchmarks find 30–60% fewer tokens when using TOON instead of JSON for similar data, without losing structure.
For LLMs, this isn't just about cost. It's about how clearly the input data's shape is described.
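You can sanity-check the token claims on your own data by counting tokens directly. Here's a minimal sketch using the `tiktoken` tokenizer; the TOON string is built by hand, so it skips the spec's quoting rules, and exact savings will vary with the tokenizer and how repetitive the data is:

```python
import json
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# 50 uniform records, the kind of data where TOON's tabular layout shines.
users = [{"id": 100 + i, "username": f"user{i}", "role": "editor"} for i in range(50)]

as_json = json.dumps(users)
as_toon = "users[50]{id,username,role}:\n" + "\n".join(
    f"  {u['id']},{u['username']},{u['role']}" for u in users
)

print("JSON tokens:", len(enc.encode(as_json)))
print("TOON tokens:", len(enc.encode(as_toon)))
```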
Why Does JSON's Structure Lead to More LLM Mistakes?

First, an important nuance:
JSON itself isn't "bad".
It's a generic data syntax. The problems show up when we ask LLMs to generate or interpret JSON without any enforced schema, purely from a prompt.
In a lot of real-world LLM workflows, the pattern looks like this:
- "Here is some JSON. Answer questions about it."
- "Return JSON in exactly this shape: `{ "field1": ..., "field2": ... }`."
Two big issues appear:
1. Models struggle with strict syntax + escaping
Benchmarks from tools like Aider show that when you force models to wrap code or answers inside JSON, reliability often drops versus plain text or markdown: more failures, more syntax issues, and more time spent fixing quotes and braces instead of solving the actual task.
Even with today's "strict JSON" modes, JSON is still a lot of punctuation for the model to keep perfectly aligned.
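To see why escaping hurts, here's what a tiny code snippet looks like once it has to live inside a JSON string (plain Python, no special libraries):

```python
import json

# Two lines of code that the model must return inside a JSON field.
code = 'print("hello")\nif ok:\n    label = "a\\"b"'

# Every quote becomes \", every newline \n, and every backslash doubles;
# each escape is one more place for a model to slip.
print(json.dumps({"answer": code}))
# {"answer": "print(\"hello\")\nif ok:\n    label = \"a\\\"b\""}
```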
2. There are no built-in structural guardrails
Unless you use a separate mechanism (JSON Schema, OpenAI's structured outputs, Vertex AI `responseSchema`, etc.), JSON doesn't tell the model:
- How many items are expected.
- Which keys are mandatory vs optional.
- What types each field must have.
So the model has to infer structure from examples in the prompt. That's where hallucinations sneak in:
- Extra fields that werenât in the original data.
- Misinterpreting a number as a string or vice versa.
- Dropping or duplicating records in long arrays.
For human developers, these are obvious bugs. For an LLM, they're just "plausible text continuations".
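A common mitigation is to validate every model response before using it. Here's a minimal sketch with the `jsonschema` package; the field names are illustrative:

```python
import jsonschema  # pip install jsonschema

schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "username": {"type": "string"},
            "role": {"type": "string"},
        },
        "required": ["id", "username", "role"],
        "additionalProperties": False,  # reject hallucinated extra fields
    },
}

# Typical LLM slip-ups: a number returned as a string, an invented field.
llm_output = [
    {"id": 101, "username": "alex", "role": "admin"},
    {"id": "102", "username": "casey", "role": "editor", "team": "web"},
]

try:
    jsonschema.validate(llm_output, schema)
except jsonschema.ValidationError as err:
    print("structure error:", err.message)  # reports the first violation found
```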
Key takeaway: JSON's flexibility is great for systems and humans, but when LLMs work only from prompt examples, that same flexibility becomes ambiguity.
How Do TOON's Schema-Aware Guardrails Help LLMs?

TOON's design goal is simple:
"Encode the JSON data model in a way that is compact and obvious to an LLM."
The key ideas:
- Schema first: TOON encourages declaring the structure up front (field names and, for tabular arrays, the row count).
- Less noise: It removes a lot of repeated keys and punctuation that JSON needs.
- Table-friendly: For arrays of similar objects, it behaves like a table with headers and rows, a pattern models handle well.
Here's a slightly larger TOON example:

```
users[3]{name,age,city}:
  Alice,30,New York
  Bob,25,London
  Charlie,35,Paris
```
A model reading this can immediately infer:
- There are 3 user records.
- Each has exactly three fields: `name`, `age`, `city`.
- The `age` values are numeric, the rest are strings.
Compare that with JSON:

```json
[
  {"name": "Alice", "age": 30, "city": "New York"},
  {"name": "Bob", "age": 25, "city": "London"},
  {"name": "Charlie", "age": 35, "city": "Paris"}
]
```
JSON is still clear to us, but for a model:
- The schema is implicit and repeated in every row.
- A single missing key or comma can shift everything.
- It has to piece together the structure from multiple examples.
With TOON, the shape is declared once, and every row must conform. That acts as a strong bias against hallucinating extra columns or misaligning values.
Key takeaway: TOON shifts work from "LLM guessing the structure" to "LLM being told the structure", which is exactly what you want when you care about reliable structured data.
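To make the "declared once" idea concrete, here's a deliberately simplified encoder for flat, uniform records. It's a sketch of the tabular idea, not the official encoder; the real spec adds quoting, nesting, and delimiter rules:

```python
def to_toon(name: str, rows: list[dict]) -> str:
    """Encode uniform, flat records as a TOON-style table (simplified)."""
    fields = list(rows[0].keys())
    # Header declares the array length and field names exactly once.
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    # Each row is just comma-separated values, like a CSV line.
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])

users = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "London"},
    {"name": "Charlie", "age": 35, "city": "Paris"},
]
print(to_toon("users", users))
# users[3]{name,age,city}:
#   Alice,30,New York
#   Bob,25,London
#   Charlie,35,Paris
```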
TOON vs JSON: What Do Benchmarks Actually Show?

Several public benchmarks now compare TOON and JSON for LLM workloads:
- The official TOON repo reports that on its structured retrieval benchmark, TOON reaches 73.9% accuracy vs 69.7% for JSON while using about 39–40% fewer tokens.
- An independent analysis summarised in TOON Benchmarks: A Critical Analysis of Different Results finds a similar pattern on a slightly different setup: 68.7% vs 65.7% with roughly 39.5% fewer tokens.
- Another dev-focused write-up shows 70.1% accuracy for TOON vs 65.4% for JSON and roughly 46% token reduction across several models (GPT-5 nano, Claude Haiku, Gemini Flash, Grok).
Across these sources, the pattern is consistent:
- Token usage: TOON often reduces tokens by 30–60% for structured/tabular data.
- Accuracy: TOON usually delivers a 3–5 percentage point improvement in retrieval accuracy on those benchmarks.
That might sound small, but in production this can mean:
- Fewer mis-parsed records in logs or analytics.
- Fewer hallucinated fields or misaligned values.
- Less manual clean-up of model outputs.
Important caveats
It's not magic, and it's not always better:
- Independent tests highlight cases where TOON underperforms JSON or markdown-style formats, especially with deeply nested or irregular data that doesn't fit a neat table (see the Towards Deep Learning analysis in the references).
- TOON's strength is uniform, structured data (lists of customers, products, transactions, events), not arbitrary document trees.
Honest summary: TOON tends to win on "rows and columns" style data, with better accuracy and fewer tokens. For weird, deeply nested data, JSON (or even YAML / markdown) may still be the better choice.
When Should You Use TOON vs JSON in LLM Projects?

A simple way to decide:
Is the primary consumer of this data an LLM, or another system/human?
Use TOON when…
- You're feeding large, uniform datasets into a model: customer lists, transaction logs, product catalogs, events.
- You're building agents or tools that repeatedly query structured data and you want:
  - fewer hallucinated fields,
  - fewer off-by-one errors in arrays,
  - more predictable parsing behaviour.
- You care about token cost and context window limits and are happy to adopt a more specialised format for LLM-facing data.
In these scenarios, benchmarks suggest TOON can give you both cost savings and a few percentage points of extra accuracy; the accuracy gain is often more valuable than the cost savings themselves.
Stick with JSON when…
- You're building public or general-purpose APIs, webhooks, or config files.
- Your main consumers are services and humans, not LLMs.
- You need universally understood, battle-tested tooling (every language has a JSON parser; that's not yet true for TOON).
- Your data is deeply nested and irregular, where TOON's tabular bias doesn't buy you much.
In those cases, JSON remains the pragmatic choice. You can still combine it with:
- JSON Schema
- OpenAI structured outputs / Vertex AI `responseSchema`
- Strong validation in your application code

…to get robust structure without changing the wire format.
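For example, here's a hedged sketch of OpenAI's structured outputs; the schema and model name are illustrative, so check the current SDK docs before relying on the details:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()

user_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "username": {"type": "string"},
        "role": {"type": "string", "enum": ["admin", "editor"]},
    },
    "required": ["id", "username", "role"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Extract the user from: alex, admin, id 101"}],
    # strict=True makes the API constrain generation to the schema.
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "user", "schema": user_schema, "strict": True},
    },
)
print(response.choices[0].message.content)  # JSON matching the schema
```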
Quick decision table
| Scenario | Recommended Format | Why |
|---|---|---|
| LLM agent over customers / products / events | TOON | Better token density + clearer structure for the LLM. |
| Repeated structured queries (analytics, logs) | TOON | Fewer hallucinations, more predictable retrieval. |
| Public REST APIs / webhooks / configs | JSON | Ubiquitous support and human readability. |
| One-off, conversational LLM calls | JSON / plain text | Flexibility is enough; structure is less critical. |
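If you want that table as a rule of thumb in code, a hypothetical heuristic might look like this (`llm_format_hint` is an illustrative name, not from any library):

```python
def llm_format_hint(data) -> str:
    """Suggest TOON for uniform arrays of flat records, JSON otherwise."""
    if (
        isinstance(data, list)
        and data
        and all(isinstance(row, dict) for row in data)
        # Every record has the same keys...
        and all(row.keys() == data[0].keys() for row in data)
        # ...and only scalar values (no nested objects or arrays).
        and all(not isinstance(v, (dict, list)) for row in data for v in row.values())
    ):
        return "TOON"  # rows and columns: tabular encoding pays off
    return "JSON"  # nested or irregular: stick with the default
```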
Bottom line:
JSON is still the default language of the web. TOON is emerging as a specialised "LLM-first" format: worth adopting where structured accuracy and token efficiency really matter, but not a universal replacement.
References
- TOON Format Specification (GitHub): official spec and rationale. https://github.com/toon-format/spec
- TOON Benchmarks: A Critical Analysis of Different Results (Towards Deep Learning). https://www.towardsdeeplearning.com/toon-benchmarks-a-critical-analysis-of-different-results-d2a74563adca
- TOON GitHub Repository & Benchmarks: official repo with accuracy and token metrics. https://github.com/toon-format/toon
- TOON vs JSON: The New Format Designed for AI (dev.to): 70.1% vs 65.4% accuracy and ~46% token reduction. https://dev.to/akki907/toon-vs-json-the-new-format-designed-for-ai-nk5
- What the TOON Format Is (Token-Oriented Object Notation) (OpenAPI.com): overview and 30–60% token savings. https://openapi.com/blog/what-the-toon-format-is-token-oriented-object-notation
- LLMs are bad at returning code in JSON (Aider): benchmark discussion of JSON-wrapped outputs vs plain text. https://aider.chat/2024/08/14/code-in-json.html


