Why not Just Use JSON?

Tony Garnock-Jones tonyg@leastfixedpoint.com
September 2018.

JSON offers syntax for numbers, strings, booleans, null, arrays and string-keyed maps. However, it suffers from two major problems. First, it offers no semantics for the syntax: it is left to each implementation to determine how to treat each JSON term. This causes interoperability and even security issues. Second, JSON’s lack of support for type tags leads to awkward and incompatible encodings of type information in terms of the fixed suite of constructors on offer.

There are other minor problems with JSON having to do with its syntax. Examples include its relative verbosity and its lack of support for binary data.

JSON syntax doesn’t mean anything

When are two JSON values the same? When are they different?

The specifications are largely silent on these questions. Different JSON implementations give different answers.

Specifically, JSON does not:

assign any meaning to numbers,¹
determine how strings are to be compared,²
determine whether object key ordering is significant,³ or
determine whether duplicate object keys are permitted, what it would mean if they were, or how to determine a duplicate in the first place.⁴

In short, JSON syntax doesn’t denote anything.⁵ ⁶

Some examples:

are the JSON values 1, 1.0, and 1e0 the same or different?
are the JSON values 1.0 and 1.0000000000000001 the same or different?
are the JSON strings "päron" (UTF-8 70c3a4726f6e) and "päron" (UTF-8 7061cc88726f6e) the same or different?
are the JSON objects {"a":1, "b":2} and {"b":2, "a":1} the same or different?
which, if any, of {"a":1, "a":2}, {"a":1} and {"a":2} are the same? Are all three legal?
are {"päron":1} and {"päron":1} the same or different?
is "\uD834" a legal string? Is "\uDD1E"? If so, is either one the same as ""?⁷
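To make these questions concrete, here is a minimal sketch of how one particular implementation, the `json` module in Python's standard library, happens to answer several of them. The variable names are illustrative only, and the answers shown belong to this one decoder rather than to JSON itself; other implementations may legitimately answer differently.

```python
import json
import unicodedata

# Numbers: this decoder maps "1" to int and "1.0" / "1e0" to float,
# yet all three compare equal under Python's == operator.
ones = [json.loads(text) for text in ("1", "1.0", "1e0")]
one_types = [type(value).__name__ for value in ones]
ones_equal = ones[0] == ones[1] == ones[2]

# Precision: both literals are rounded to the same IEEE 754 double here.
doubles_equal = json.loads("1.0") == json.loads("1.0000000000000001")

# Strings: precomposed U+00E4 versus "a" followed by combining U+0308.
composed = json.loads('"p\\u00e4ron"')
decomposed = json.loads('"pa\\u0308ron"')
strings_equal = composed == decomposed            # compared code point by code point
nfc_equal = unicodedata.normalize("NFC", decomposed) == composed

# Objects: Python dict equality ignores key order, and by default the
# last duplicate key silently replaces the earlier ones.
order_equal = json.loads('{"a":1, "b":2}') == json.loads('{"b":2, "a":1}')
last_wins = json.loads('{"a":1, "a":2}')
all_pairs = json.loads('{"a":1, "a":2}', object_pairs_hook=list)

# Unpaired surrogate: accepted by this decoder, but the resulting string
# cannot be encoded as UTF-8.
lone = json.loads('"\\uD834"')
try:
    lone.encode("utf-8")
    lone_encodes = True
except UnicodeEncodeError:
    lone_encodes = False

print(one_types, ones_equal, doubles_equal)
print(strings_equal, nfc_equal)
print(order_equal, last_wins, all_pairs)
print(len(lone), lone_encodes)
```

In this decoder the three spellings of one have different types but compare equal, the two "päron" strings are different unless the caller normalizes them, key order is invisible, and duplicate keys vanish unless `object_pairs_hook` is used to observe them. None of these choices is mandated by the JSON specifications.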

JSON can multiply nicely, but it can’t add very well

JSON includes a fixed set of types: numbers, strings, booleans, null, arrays and string-keyed maps. Domain-specific data must be encoded into these types. For example, dates and email addresses are often represented as strings with an implicit internal structure.

There is no convention for labelling a value as belonging to a particular category. Instead, JSON-encoded data are often labelled in an ad-hoc way. Multiple incompatible approaches exist. For example, a “money” structure containing a currency field and an amount may be represented in any number of ways:

{ "_type": "money", "currency": "EUR", "amount": 10 }
{ "type": "money", "value": { "currency": "EUR", "amount": 10 } }
[ "money", { "currency": "EUR", "amount": 10 } ]
{ "@money": { "currency": "EUR", "amount": 10 } }
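The cost of this variety falls on every consumer. As a rough sketch, a program that wanted to accept all four of the encodings above would need something like the following. The function name `decode_money` and its error handling are hypothetical, chosen only for illustration.

```python
import json


def decode_money(value):
    """Return (currency, amount) from any of the four encodings shown above.

    Hypothetical helper: raises ValueError for anything it does not recognise.
    """
    body = None
    if isinstance(value, dict):
        if value.get("_type") == "money":
            body = value                      # { "_type": "money", ... }
        elif value.get("type") == "money" and isinstance(value.get("value"), dict):
            body = value["value"]             # { "type": "money", "value": {...} }
        elif isinstance(value.get("@money"), dict):
            body = value["@money"]            # { "@money": {...} }
    elif (isinstance(value, list) and len(value) == 2
          and value[0] == "money" and isinstance(value[1], dict)):
        body = value[1]                       # [ "money", {...} ]

    if body is None or "currency" not in body or "amount" not in body:
        raise ValueError("not a recognised money encoding")
    return body["currency"], body["amount"]


encodings = [
    '{ "_type": "money", "currency": "EUR", "amount": 10 }',
    '{ "type": "money", "value": { "currency": "EUR", "amount": 10 } }',
    '[ "money", { "currency": "EUR", "amount": 10 } ]',
    '{ "@money": { "currency": "EUR", "amount": 10 } }',
]
decoded = [decode_money(json.loads(text)) for text in encodings]
print(decoded)
```

Each branch encodes a private convention between producer and consumer. Nothing in the JSON text itself says which convention is in force, so a fifth encoding would simply be rejected.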

This causes particular problems when JSON is used to represent sum or union types, such as “either a value or an error, but not both”. Again, multiple incompatible approaches exist.

For example, imagine an API for depositing money in an account. The response might be either a “success” response indicating the new balance, or one of a set of possible errors.

Sometimes, a pair of values is used, with null marking the option not taken.⁸

{ "ok": { "balance": 210 }, "error": null }
{ "ok": null, "error": "Unauthorized" }

The branch not chosen is sometimes present, sometimes omitted as if it were an optional field:

{ "ok": { "balance": 210 } }
{ "error": "Unauthorized" }

Sometimes, an array of a label and a value is used:

[ "ok", { "balance": 210 } ]
[ "error", "Unauthorized" ]

Sometimes, the shape of the data is sufficient to distinguish among the alternatives, and the label is left implicit:

{ "balance": 210 }
"Unauthorized"

JSON itself does not offer any guidance for which of these options to choose. In many real cases on the web, poor choices have led to encodings that are irrecoverably ambiguous.
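As an illustration, consider a consumer of the first two encodings above (the `ok`/`error` pair, with the unused branch either null or omitted). The sketch below uses a hypothetical function, `decode_result`, and shows the decisions the consumer is forced to make for itself.

```python
import json


def decode_result(doc):
    """Decode the "ok"/"error" pair encoding into a (label, value) tuple.

    Hypothetical helper. It treats a missing key and an explicit null alike,
    and rejects documents where both or neither branch is present.
    """
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object")
    ok = doc.get("ok")
    error = doc.get("error")
    if ok is not None and error is not None:
        raise ValueError("both 'ok' and 'error' are present")
    if ok is None and error is None:
        raise ValueError("neither 'ok' nor 'error' is present")
    return ("ok", ok) if ok is not None else ("error", error)


success = decode_result(json.loads('{ "ok": { "balance": 210 }, "error": null }'))
failure = decode_result(json.loads('{ "error": "Unauthorized" }'))
print(success, failure)
```

Rejecting the "both" and "neither" cases is only one possible policy; another consumer might prefer `ok`, or prefer `error`, and the two would then disagree about the same document. This sketch also cannot represent a successful response whose value is itself null, which is one way such encodings become ambiguous.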

Update 20230123. This article discusses another subtle aspect of the problems caused by the lack of tagging in JSON.

Update 20231016. Lack of tagging sometimes causes implementors to rely on specific key-value orderings in JSON objects to make sure their "type" tag appears first in the text, to allow use of streaming parsers in deserialization.
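A small sketch of why such reliance is fragile, again using Python's standard `json` module: a serializer that sorts keys (as canonicalizing tools often do) moves a "type" tag away from the front of the text. The document shown is illustrative only.

```python
import json

doc = {"type": "money", "currency": "EUR", "amount": 10}

as_written = json.dumps(doc)                     # keeps insertion order
key_sorted = json.dumps(doc, sort_keys=True)     # sorts keys alphabetically


def first_key(text):
    # Observe the textual order of members by decoding to a list of pairs.
    return json.loads(text, object_pairs_hook=list)[0][0]


print(as_written, first_key(as_written))
print(key_sorted, first_key(key_sorted))
```

Both texts decode to equal Python dictionaries, but a streaming consumer that expects the tag first would only be able to handle the first of them without buffering.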

Notes

¹ Section 6 of RFC 8259 does go so far as to indicate “good interoperability can be achieved” by imagining that parsers are able reliably to understand the syntax of numbers as denoting an IEEE 754 double-precision floating-point value.
² Section 8.3 of RFC 8259 suggests that if an implementation compares strings used as object keys “code unit by code unit”, then it will interoperate with other such implementations, but neither requires this behaviour nor discusses comparisons of strings used in other contexts.
³ Section 4 of RFC 8259 remarks that “[implementations] differ as to whether or not they make the ordering of object members visible to calling software.”
⁴ Section 4 of RFC 8259 is the only place in the specification that mentions the issue. It explicitly sanctions implementations supporting duplicate keys, noting only that “when the names within an object are not unique, the behavior of software that receives such an object is unpredictable.” Implementations are free to choose any behaviour at all in this situation, including signalling an error, or discarding all but one of a set of duplicates.
⁵ The XML world has the concept of XML infoset. Loosely speaking, XML infoset is the denotation of an XML document; the meaning of the document.
⁶ Most other recent data languages are like JSON in specifying only a syntax with no associated semantics. While some do make a sketch of a semantics, the result is often underspecified (e.g. in terms of how strings are to be compared), overly machine-oriented (e.g. treating 32-bit integers as fundamentally distinct from 64-bit integers and from floating-point numbers), overly fine (e.g. giving visibility to the order in which map entries are written), or all three.
⁷ Section 8.2 of RFC 8259 discusses unpaired UTF-16 surrogate code points such as these, and remarks that implementations differ in their treatment of them. Some reject unpaired surrogates, some discard them, and some retain them.
⁸ What is the meaning of a document where both ok and error are non-null? What might happen when a program is presented with such a document?