Compressing GraphQL Global Node ID

Hyeseong Kim - Apr 10 '23 - - Dev Community

You may be familiar with Global Object Identification (GOI), especially if you've used Relay.

GOI is one of the best practices for building a good GraphQL API; it lets clients interact with the API more efficiently.

It only requires implementing this interface:

interface Node {
  id: ID!
}

type Query {
  node(id: ID!): Node
}

But how?

Typical implementation

const query = {
  node(root, args) {
    // root => {}
    // args => { id: string }
    // 
    // You need a way to specify which type of node you should load.
    // ... But you only have args.id here.
  },
};

const node = {
  __resolveType(node) {
    // node is the value returned by the `query.node` resolver above.
    // How can you determine the typename?
  },
};

You have to implement these resolvers to make GOI work, but as you can see, the id argument is the only input available.

Generally, to load an entity, you need at least the typename and id (unless you're using a Graph DB).

So you need to find out the entity type with only the ID, but the database does not support this (again, unless you're using a Graph DB).

Instead, you can convert all ids in the GraphQL application layer to include typenames.

base64(`${typename}:${id}`)

As a result, when a client sends a request with that ID, your API can determine the typename by extracting it back from the ID.

This is a very typical GOI implementation, the same as the one in graphql-relay.
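A minimal sketch of this scheme in plain Node.js may make it more concrete. The helper names (encodeGlobalId, decodeGlobalId) and the in-memory loaders are assumptions made up for illustration; this is not graphql-relay's actual source.

```javascript
// Hypothetical helpers mirroring the base64(`${typename}:${id}`) scheme.
function encodeGlobalId(typename, id) {
  return Buffer.from(`${typename}:${id}`, 'utf8').toString('base64');
}

function decodeGlobalId(globalId) {
  const decoded = Buffer.from(globalId, 'base64').toString('utf8');
  const separator = decoded.indexOf(':');
  if (separator === -1) {
    throw new Error(`Invalid global ID: ${globalId}`);
  }
  // Split on the first ':' only, so raw IDs may contain ':' themselves.
  return {
    typename: decoded.slice(0, separator),
    id: decoded.slice(separator + 1),
  };
}

// Hypothetical loaders standing in for real database lookups.
const loaders = new Map([
  ['User', (id) => ({ __typename: 'User', id, name: 'Alice' })],
  ['Post', (id) => ({ __typename: 'Post', id, title: 'Hello' })],
]);

const query = {
  node(root, args) {
    const { typename, id } = decodeGlobalId(args.id);
    const load = loaders.get(typename);
    return load ? load(id) : null; // unknown typename => null
  },
};

const node = {
  __resolveType(value) {
    // The loader tagged the entity, so the typename is known here.
    return value.__typename;
  },
};

const globalId = encodeGlobalId('User', 1234567);
console.log(globalId); // VXNlcjoxMjM0NTY3
console.log(node.__resolveType(query.node({}, { id: globalId }))); // User
```

Note that the decoded id always comes back as a string, so a loader for numeric keys would need to parse it.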

However, this implementation has several drawbacks.

  • The resulting ID is much longer than the original ID.
  • A standard Base64 string is not URL-safe: it can contain +, /, and = characters.

Those drawbacks make the IDs hard to use in permalink URLs (e.g. /posts/{node.id}).

But it's just an implementation, not a constraint of the definition. How can we make a better one?
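Both drawbacks are easy to reproduce in Node.js. The snippet below is only an illustration; the sample values are arbitrary, and the second one uses raw bytes picked so that the problematic characters show up.

```javascript
// A 7-byte payload needs Base64 padding, and encodeURIComponent escapes '='.
const padded = Buffer.from('User:12', 'utf8').toString('base64');
console.log(padded);                     // VXNlcjoxMg==
console.log(encodeURIComponent(padded)); // VXNlcjoxMg%3D%3D

// Bytes picked to hit the last two Base64 alphabet entries, '+' and '/'.
const bytes = Buffer.from([0xfb, 0xff]);
const standard = bytes.toString('base64');
const urlSafe = bytes.toString('base64url');
console.log(standard); // +/8=
console.log(urlSafe);  // -_8

// Length overhead: a 7-character raw ID becomes a 16-character global ID.
const rawId = '1234567';
const longId = Buffer.from(`User:${rawId}`, 'utf8').toString('base64');
console.log(rawId.length, longId.length); // 7 16
```

A '/' inside the ID is the worst case for a route like /posts/{node.id}, because it would be read as a path separator unless the ID is percent-encoded.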

Compressing it!

Technically, the previous ID encoding can be represented as:

[typename, id] -> CSV(':') -> Base64

You can change it to a better compression algorithm. Here is my suggestion:

[version, dict(version, typename), id] -> CBOR -> Base64URL

  • dict(version, typename): since the typenames are well-known strings, each one can be compressed to a short integer using a dictionary.
  • version: using the ID in permalink URLs means it can be exposed permanently, so you have to maintain every version of the dictionary you ever publish.
  • CBOR (Concise Binary Object Representation) is a compact binary format for JSON-like data, similar to MessagePack but published as an Internet standard (RFC 8949).
  • Base64URL is a variant of Base64 that uses only URL-safe characters.
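The versioned dictionary could look roughly like the sketch below. The DICTIONARIES table, the Comment type, and both helper names are hypothetical; the point is only that a published version is never edited, and lookups work in both directions.

```javascript
// Hypothetical dictionaries. A published version is frozen forever;
// schema changes go into a new version instead.
const DICTIONARIES = new Map([
  [1, new Map([['User', 1], ['Post', 2], ['Category', 3]])],
  [2, new Map([['User', 1], ['Post', 2], ['Category', 3], ['Comment', 4]])],
]);

// Used when building an ID: typename -> short integer.
function typenameToCode(version, typename) {
  const code = DICTIONARIES.get(version)?.get(typename);
  if (code === undefined) {
    throw new Error(`No code for ${typename} in dictionary v${version}`);
  }
  return code;
}

// Used when resolving an ID: short integer -> typename.
function codeToTypename(version, code) {
  const dict = DICTIONARIES.get(version);
  if (dict) {
    for (const [typename, candidate] of dict) {
      if (candidate === code) return typename;
    }
  }
  throw new Error(`No typename for code ${code} in dictionary v${version}`);
}

console.log(typenameToCode(1, 'Post')); // 2
console.log(codeToTypename(2, 4));      // Comment
```

Because the version travels inside the ID, an old permalink created with dictionary 1 keeps resolving even after dictionary 2 is introduced.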

How well does it compress compared to the previous one?

import { toGlobalId } from 'graphql-relay';
import { encode } from 'cbor-x';

const node = {
  typename: 'User',
  id: 1234567,
};

function toCompressedGlobalIdV1(typename, id) {
  const dict = {
    User: 1,
    Post: 2,
    Category: 3,
  };
  return encode([1, dict[typename], id]).toString('base64url');
}

console.log('graphql-relay toGlobalId :', toGlobalId(node.typename, node.id));
console.log('toCompressedGlobalIdV1   :', toCompressedGlobalIdV1(node.typename, node.id));

// graphql-relay toGlobalId : VXNlcjoxMjM0NTY3
// toCompressedGlobalIdV1   : gwEBGgAS1oc

It does pretty well if the original ID is numeric. If the original ID is a string such as a cuid or UUID, the savings are small, but the result generally won't be longer than the old way, and it is still guaranteed to be URL-safe.

Your implementation doesn't have to be identical to this one. To summarize the key points:
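To see where the savings for numeric IDs come from, here is a hand-rolled subset of CBOR that handles only a 3-element array of unsigned integers below 2^32. It is a sketch for inspecting the bytes and for showing the decode direction, not a replacement for a real CBOR library such as cbor-x; the function names are made up for this example.

```javascript
// CBOR unsigned integer: values below 24 fit in the head byte itself,
// larger ones use a 1-, 2- or 4-byte big-endian payload.
function encodeUint(n) {
  if (!Number.isInteger(n) || n < 0 || n > 0xffffffff) {
    throw new RangeError(`Unsupported integer: ${n}`);
  }
  if (n < 24) return Buffer.from([n]);
  if (n < 0x100) return Buffer.from([0x18, n]);
  if (n < 0x10000) {
    const buf = Buffer.alloc(3);
    buf[0] = 0x19;
    buf.writeUInt16BE(n, 1);
    return buf;
  }
  const buf = Buffer.alloc(5);
  buf[0] = 0x1a;
  buf.writeUInt32BE(n, 1);
  return buf;
}

// 0x83 is the CBOR head for "array of 3 items".
function encodeTriple(version, typeCode, id) {
  return Buffer.concat([
    Buffer.from([0x83]),
    encodeUint(version),
    encodeUint(typeCode),
    encodeUint(id),
  ]);
}

function decodeTriple(buf) {
  if (buf[0] !== 0x83) throw new Error('Expected a 3-element CBOR array');
  let offset = 1;
  const values = [];
  for (let i = 0; i < 3; i++) {
    const head = buf[offset++];
    if (head < 24) {
      values.push(head);
    } else if (head === 0x18) {
      values.push(buf.readUInt8(offset));
      offset += 1;
    } else if (head === 0x19) {
      values.push(buf.readUInt16BE(offset));
      offset += 2;
    } else if (head === 0x1a) {
      values.push(buf.readUInt32BE(offset));
      offset += 4;
    } else {
      throw new Error('Unsupported CBOR item');
    }
  }
  return values;
}

const bytes = encodeTriple(1, 1, 1234567);
console.log(bytes.toString('hex'));       // 8301011a0012d687
console.log(bytes.toString('base64url')); // gwEBGgAS1oc
console.log(decodeTriple(Buffer.from('gwEBGgAS1oc', 'base64url'))); // [ 1, 1, 1234567 ]
```

The version and the typename cost one byte each and the numeric ID costs five, so the whole payload is 8 bytes, compared with the 12 bytes of the text "User:1234567".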

  • Compress typenames using a dictionary
  • Use Base64URL, Base62, or Base58 to get a URL-safe string
  • Keep the original ID short

(Or, you can use a Graph DB in the first place 😉)
