YAML vs JSON vs TOML: strengths, use cases, and YAML's implicit-typing foot-guns
Where each format is canonical
Every data-serialization format wins on its own turf. JSON (JavaScript Object Notation, standardised as ECMA-404 and RFC 8259) is the universal language of HTTP APIs: virtually every REST endpoint speaks it, nearly every mainstream language ships a JSON parser in its standard library or has one a single dependency away, and its grammar fits on a single page. When a Python script, a Go service, and a browser extension need to exchange structured data, JSON removes almost all negotiation. YAML (YAML Ain't Markup Language) dominates configuration files in DevOps tooling: Kubernetes manifests, GitHub Actions workflows, Ansible playbooks, and Docker Compose files are all YAML. Its multiline strings and comment support make it far more comfortable than JSON for files humans edit daily. TOML (Tom's Obvious, Minimal Language) is the canonical choice for project-level configuration: Cargo.toml for Rust packages, pyproject.toml for Python build metadata, and Hugo's default site configuration all use TOML. Its strongly-typed literals and [section] syntax make config files easy to read and hard to mistype silently.
These domains are not accidents — they reflect each format's design priorities. JSON prioritises interoperability and determinism; YAML prioritises human readability and expressiveness; TOML prioritises type safety and predictable parsing. Understanding those priorities makes it easier to choose the right tool and avoid importing one format's assumptions into another's problem space.
YAML's implicit typing: the Norway problem
YAML's most notorious footgun is its aggressive implicit type coercion. In YAML 1.1, the version still implemented by many widely used parsers (PyYAML and Ruby's Psych among them), an unquoted scalar is tested against a sequence of regular expressions to determine its type. The boolean pattern matches yes, no, on, off, true, and false in lowercase, Capitalised, and UPPERCASE spellings. This caused real-world data-integrity bugs in software using ISO 3166-1 alpha-2 country codes: the code for Norway is NO, which a YAML 1.1 parser silently converts to the boolean false. The same coercion hits any value that is exactly yes, no, on, or off without quotes, and it applies to keys as well: an unquoted key on is loaded as the boolean true. A mapping like norway: NO becomes {norway: false} at runtime, with no error and no warning.
YAML 1.2 (published 2009, implemented by ruamel.yaml and other newer parsers) narrowed the boolean set in its core schema to true and false (also accepted as True/TRUE and False/FALSE), eliminating the yes/no/on/off variants. However, millions of production files and much of the tooling that reads them still go through YAML 1.1 parsers. The safe rule: always quote string values that could be misread, such as country codes, flag-like words, and numeric-looking strings. Use 'NO' not NO, '1e2' not 1e2 (the float 100.0 under the YAML 1.2 core schema, yet left as a string by some 1.1 parsers such as PyYAML), and '2024-01-01' not 2024-01-01 (which some parsers convert to a date object). Quoting removes the dependence on which schema a given parser happens to implement.
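The difference between the two boolean rules can be modelled in a few lines of standard-library Python. This is a simplified sketch, not a YAML parser: the names YAML11_BOOL, YAML12_CORE_BOOL, and resolve_plain_scalar are hypothetical, and the 1.1 pattern follows the word list commonly used by 1.1 parsers (the 1.1 type definition additionally lists y and n, which some parsers omit).

```python
import re

# Simplified models of implicit boolean resolution for unquoted scalars.
# Real parsers also apply rules for integers, floats, nulls and timestamps.
YAML11_BOOL = re.compile(
    r"^(?:yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
YAML12_CORE_BOOL = re.compile(r"^(?:true|True|TRUE|false|False|FALSE)$")

_TRUE_WORDS = {"yes", "true", "on"}


def resolve_plain_scalar(text: str, bool_pattern: re.Pattern) -> object:
    """Return a bool if the scalar matches the pattern, otherwise the string."""
    if bool_pattern.match(text):
        return text.lower() in _TRUE_WORDS
    return text


for code in ["NO", "SE", "on", "true"]:
    as_11 = resolve_plain_scalar(code, YAML11_BOOL)
    as_12 = resolve_plain_scalar(code, YAML12_CORE_BOOL)
    print(f"{code!r}: YAML 1.1 -> {as_11!r}, YAML 1.2 core -> {as_12!r}")
```

Under this model the country code NO resolves to False with the 1.1 pattern and stays the string 'NO' with the 1.2 core pattern, while SE is a string in both.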
Other YAML footguns: tabs, anchors, and the document separator
Indentation is YAML's syntax, and tabs are never valid YAML indentation. The specification explicitly forbids tab characters in indentation positions. Yet tabs are invisible in most editors by default, and a single accidental tab produces a scanner error whose reported position can be some distance from the real mistake. The fix is to enable 'show whitespace' in your editor and configure it to expand tabs to spaces in .yaml files. A related trap is mixing block and flow style: key: {a: 1, b: 2} is valid (a flow mapping as a value), but multi-line flow content is subtle. Inside a flow collection, structure comes from brackets and commas rather than from indentation, so continuation lines that look neatly aligned carry no structural meaning and can parse differently from what the layout suggests.
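A pre-commit style check for stray tabs needs nothing beyond string handling. The sketch below is illustrative only; find_tab_indentation is a hypothetical helper, and it looks purely at leading whitespace, so tabs that appear after content on a line are ignored.

```python
def find_tab_indentation(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything from the first non-blank character onward.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            flagged.append(number)
    return flagged


sample = "services:\n  web:\n\timage: nginx\n    ports: []\n"
print(find_tab_indentation(sample))  # [3]
```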
YAML anchors (&) and aliases (*) allow one node to be reused across a document, which is genuinely useful for long Kubernetes manifests with repeated container specs. But they also enable 'billion laughs'-style expansion attacks on consumers that expand aliases without a limit: a document with 9 levels of aliases, each referencing the level below 10 times, expands to a billion (10^9) leaf nodes. Protection varies by parser: some enforce a cap (SnakeYAML's LoaderOptions, for example, limits aliases for collections), while others leave it to the application, so check your library's documentation and always use its safe-load API for untrusted input. The --- document separator allows multiple YAML documents in one file (kubectl apply -f relies on this), but single-document load functions may return only the first document, silently discarding the rest, or raise an error; the remaining documents are read only when you call the multi-document API (for example PyYAML's safe_load_all) and iterate over the result.
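The arithmetic behind the expansion is easy to check without loading anything. The following sketch builds the text of such a document and computes how many leaf scalars a naive, copy-everything expansion would produce; build_alias_bomb and expanded_leaf_count are hypothetical names, and the code never parses the YAML it generates.

```python
def build_alias_bomb(levels: int = 9, fanout: int = 10) -> str:
    """Build YAML text in which each level aliases the previous one `fanout` times."""
    lines = ["l0: &l0 lol"]
    for level in range(1, levels + 1):
        refs = ", ".join([f"*l{level - 1}"] * fanout)
        lines.append(f"l{level}: &l{level} [{refs}]")
    return "\n".join(lines) + "\n"


def expanded_leaf_count(levels: int = 9, fanout: int = 10) -> int:
    """Leaf scalars under the top level if every alias were copied out in full."""
    return fanout ** levels


document = build_alias_bomb()
print(len(document), "characters of YAML text")
print(expanded_leaf_count(), "leaf scalars after naive expansion")  # 1000000000
```

The document itself stays well under a kilobyte, which is why size limits on the input alone do not protect a consumer that expands aliases eagerly.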
TOML's explicit types and datetime advantage
TOML assigns a type to every value at the syntax level, with no implicit coercion. Integers are written without quotes (port = 8080), floats need a fractional part or an exponent (timeout = 1.5), booleans are lowercase true or false, and string values always need quotes. This means a typo that produces an unquoted word is a parse error, not a silent type change: the parser refuses to load the file rather than proceeding with wrong data. TOML 1.0 (released 2021, after years of 0.x releases) also specifies four datetime types that are first-class literals: offset date-time (1979-05-27T07:32:00Z), local date-time, local date, and local time. Of the three formats, only TOML builds these into its core grammar: JSON has no date type at all, and YAML 1.1 defines a timestamp type whose support is inconsistent across implementations.
The [table] and [[array of tables]] syntax maps cleanly to nested objects and arrays of objects, which is TOML's answer to YAML's indented block structure. A [[servers]] block followed by another [[servers]] block appends a second element to the servers array: explicit, readable, and independent of indentation, which TOML ignores. The trade-off: TOML does not support anchors or multi-document files, so deep config hierarchies with repeated subtrees are more verbose than their YAML equivalents. For project metadata files that are written once and read often, that verbosity is rarely a problem.
JSON's strictness as a feature
JSON's grammar has no optional features, no implicit typing, and no document-level directives. A compliant parser reading {"active": true} on any platform will produce a boolean, not the string "true" and not the integer 1. This determinism is why JSON became the default wire format for APIs despite being less human-friendly than YAML. RFC 8259 (which obsoleted RFC 7159, itself the successor to RFC 4627) pins down the remaining details: a JSON text is a single serialised value, which may be an object, array, string, number, boolean, or null. Names within an object SHOULD be unique (implementations may accept duplicates, but the resulting behaviour is unpredictable), and JSON exchanged between systems outside a closed ecosystem must be encoded as UTF-8, with no byte order mark added by the sender.
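Because duplicate names are tolerated by many parsers, a consumer that wants strictness has to ask for it. A minimal sketch with Python's json module follows; reject_duplicate_keys and strict_loads are hypothetical helper names, while object_pairs_hook is the module's documented hook for seeing every name/value pair.

```python
import json


def reject_duplicate_keys(pairs):
    """Build a dict from (name, value) pairs, refusing repeated names."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


def strict_loads(text: str):
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


print(type(strict_loads('{"active": true}')["active"]).__name__)  # bool

# Python's default behaviour keeps the last value for a repeated name.
print(json.loads('{"a": 1, "a": 2}'))  # {'a': 2}

try:
    strict_loads('{"a": 1, "a": 2}')
except ValueError as error:
    print(error)  # duplicate key: 'a'
```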
The lack of comments is the most-cited JSON complaint. Workarounds exist: JSON5 (a superset with comments and trailing commas), JSONC (used by VS Code's settings.json), and simply stripping // comment lines before parsing. For configuration files read by both humans and machines, these comment extensions help, but they are not standard JSON and many parsers reject them. The idiomatic solution for config that needs comments is to use TOML or YAML for authoring and JSON only as a machine interchange format, generated by a build step. The JavaScript tooling around package.json shows the same split: humans edit commented formats such as .eslintrc.yaml or a tsconfig.json parsed as JSONC, while data exchanged between tools stays plain JSON.
Choosing the right format
The practical decision rule is simple: JSON for any data that crosses a service boundary (HTTP APIs, message queues, config generated by code); YAML for configuration files that operators edit by hand and that need comments or multiline strings, especially in ecosystems that mandate it (Kubernetes, GitHub Actions); TOML for project-level configuration where strong typing and a flat-file structure are more important than YAML's expressiveness. When in doubt, favour the format your ecosystem already uses: fighting the convention costs more than it is worth.
If you inherit a large YAML configuration file and suspect implicit-typing bugs, a quick audit is to run it through a linter such as yamllint (its truthy rule flags boolean-like words such as yes and on) and to search for unquoted values that match boolean or numeric patterns. For new projects, consider adding a schema (JSON Schema works for all three formats via tooling), which catches type errors before they reach production regardless of which format you chose. The TeaFun YAML / JSON / TOML converter lets you move between all three in the browser to spot structural differences without any install.
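The search for suspicious unquoted values can be approximated with a line-based scan. This is a rough heuristic rather than a YAML parser: audit_yaml_text and SUSPICIOUS are hypothetical names, the scan only understands simple key: value lines, and it deliberately leaves plain true/false and ordinary integers alone because those are usually intended.

```python
import re

# Boolean-like words from YAML 1.1, exponent-style numbers, and ISO-style dates.
SUSPICIOUS = re.compile(
    r"^(?:y|n|yes|no|on|off|[-+]?\d+(?:\.\d*)?e[-+]?\d+|\d{4}-\d{2}-\d{2})$",
    re.IGNORECASE,
)


def audit_yaml_text(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, key, value) for unquoted values worth reviewing."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        key, separator, value = line.partition(": ")
        if not separator:
            continue
        value = value.split(" #", 1)[0].strip()  # drop a trailing comment
        if value and value[0] not in "'\"" and SUSPICIOUS.match(value):
            findings.append((number, key.strip(), value))
    return findings


sample = """\
country: NO
region: 'NO'
released: 2024-01-01
threshold: 1e2
port: 8080
enabled: true
debug: off  # temporary
"""
for finding in audit_yaml_text(sample):
    print(finding)
```

On this sample the scan reports lines 1, 3, 4, and 7; each hit is a prompt to decide whether the value should be quoted, not proof of a bug.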