Postgres in Ibiza is more than a conference. It gathers a community of PostgreSQL users and contributors for three days of conference and unconference talks and discussions around modern topics in PostgreSQL and its compatible forks and extensions. This year, JSON in PostgreSQL was a hot topic: FerretDB, which adds a MongoDB API on top of PostgreSQL, was a sponsor, and a great talk covered the right datatypes for storing JSON in Postgres.
PostgreSQL datatypes for JSON
Modern applications often store data in both relational structures and JSON documents within the same SQL database. PostgreSQL offers powerful formats, operators, and indexes for this purpose. The question arises: which format to choose: text, json, or jsonb? JSONB is often the best default choice, but the answer may depend on your access patterns.
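As a minimal sketch of the three options, here is one illustrative table per datatype (table and column names are hypothetical, not from the benchmark):

```sql
-- One table per storage option (illustrative names)
CREATE TABLE movies_text  (id bigserial PRIMARY KEY, data text);   -- no validation, stored verbatim
CREATE TABLE movies_json  (id bigserial PRIMARY KEY, data json);   -- validated on insert, stored as text
CREATE TABLE movies_jsonb (id bigserial PRIMARY KEY, data jsonb);  -- validated, stored in a decomposed binary format

-- jsonb rejects malformed documents and supports operators such as @>
INSERT INTO movies_jsonb (data)
VALUES ('{"title": "Alien", "year": 1979, "score": 8.5}');
```

The text column accepts anything; json validates syntax but keeps the original text; jsonb parses once at write time, which is what makes the JSON operators and GIN indexing efficient later.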
During this conference, Waltton Morais delivered an insightful session titled "Leveraging the Power of JSON/JSONB Fields: Key Considerations for Effective Utilization in PostgreSQL". This presentation is an excellent tech talk, rooted in facts that can be easily reproduced. If you have the opportunity to attend Waltton's presentation, I highly recommend it. His investigation was on PostgreSQL; I'll run the same tests on YugabyteDB (an open-source, PostgreSQL-compatible distributed SQL database).
YugabyteDB: scaling-out PostgreSQL
Some considerations discussed for PostgreSQL apply to YugabyteDB, but not all. YugabyteDB employs a PostgreSQL fork for the query layer but stores rows as documents in DocDB, its distributed transactional storage. This approach offers several advantages, including no vacuum, no block size limitations (no TOAST needed), transparent compression and encryption, and horizontal scalability for improved elasticity and resilience.
I followed Waltton's method, measuring the size and performance on a reproducible test case, using his code available on GitHub:
The goal of this project is to help compare how different data types and indexes behave with inserts and queries on JSON data.
After running the tests, pg_json_bench publishes the results to a Prometheus push gateway, and you can see them on the prepared Grafana dashboards.
I've run this from an Amazon Linux 2 machine (t2.xlarge) with VPC peering to my YugabyteDB Managed service (3 x 4vCPU/16GB).
git clone https://github.com/waltton/pg_json_bench.git
cd pg_json_bench
# start grafana and prometheus with pushgateway
docker compose up -d
# get data
mkdir ./data
wget https://raw.githubusercontent.com/algolia/datasets/master/movies/records.json -O ./data/records.json
# build
go build -o pg_json_bench
To conduct these tests, I configured my PostgreSQL environment to align with the connection parameters of my YugabyteDB Managed cluster. Here are the environment variables I set:
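The variable values below are placeholders, assuming the benchmark honors the standard libpq environment variables (host, port, and credentials are illustrative, not my actual cluster's):

```shell
# Placeholder connection settings for a YugabyteDB Managed cluster
export PGHOST="your-cluster.aws.ybdb.io"
export PGPORT="5433"            # YSQL, YugabyteDB's PostgreSQL-compatible port
export PGUSER="admin"
export PGPASSWORD="your-password"
export PGDATABASE="yugabyte"
export PGSSLMODE="require"      # YugabyteDB Managed requires TLS
```
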
This command initiated the insertion process for each table, enabling us to evaluate insertion speed, table size, and the impact of indexing on overall performance.
YugabyteDB employs a storage mechanism based on LSM-Tree (Log-Structured Merge-Tree), where the first level resides in memory and is protected from failures by on-disk Write Ahead Logging (WAL). This design is part of the resilience to failure and enables seamless synchronization for followers that may have been disconnected for a period. The default retention period for the WAL is 15 minutes, allowing disconnected replicas to catch up on the latest changes without requiring a full bootstrap.
The size displayed here is gathered by pg_total_relation_size. It's important to note that this size approximation can vary due to factors such as WAL expiration and background compaction. However, despite these variations, storing data as JSONB consistently emerges as the most efficient option.
The regular index, referred to as BTREE (though it is, in fact, an LSM-Tree in YugabyteDB), indexes just one value per row. This results in rapid insertions and low overhead. One of the advantages of YugabyteDB's LSM-Tree architecture is the reduced number of reads when updating a non-unique secondary index.
Indexing all attributes in a GIN index takes considerably more time, with a slight increase in time when indexing all keys:
Considering the roughly equal sizes, size should not be the primary criterion for selecting the appropriate datatype. The difference will be primarily driven by the access patterns of the queries.
Query: Select All
Now, let's execute the first query, which is select_all:
In all scenarios, this results in a Seq Scan, and the execution time remains roughly consistent, with a minor overhead when reading from JSONB:
If YugabyteDB is primarily utilized for storing and scanning JSON documents, it's worth noting that storing them as TEXT or JSON can offer faster performance. The benefits of using JSONB become more apparent when the additional overhead aids in validating the structure and facilitating access to data within the document. It's important to remember that reading and fetching all documents can take approximately 5 seconds on this data set. Subsequent queries involving filtering and aggregation are expected to be faster.
It's important to note that when using the text datatype, an additional cast to JSON is required, as follows: WHERE CAST(CAST(data AS JSON)->>'score' AS FLOAT) > 7.0
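For comparison, a sketch of the same score filter on each datatype (assuming a `data` column holding the movie documents, as in the tables above; names are illustrative):

```sql
-- text: two casts, first to JSON, then the extracted attribute to FLOAT
SELECT * FROM movies_text  WHERE CAST(CAST(data AS JSON)->>'score' AS FLOAT) > 7.0;

-- json and jsonb: the ->> operator applies directly, only the value needs a cast
SELECT * FROM movies_json  WHERE CAST(data->>'score' AS FLOAT) > 7.0;
SELECT * FROM movies_jsonb WHERE CAST(data->>'score' AS FLOAT) > 7.0;
```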
In this scenario, the regular index enables an Index Scan.
The execution time is significantly faster, approximately 50 milliseconds, compared to the 5 seconds observed previously:
The other queries, which lack an index, do not exhibit a noticeable improvement over the previous Seq Scan, as they essentially perform the same type of work:
All rows were read, and the filtering occurred subsequently with the indication of Rows Removed by Filter. This approach is not efficient and often suggests the necessity of having an index for this specific condition.
As with the previous case, the only fast execution occurs when utilizing the index for the range scan:
It's worth noting that if you defined the index without specifying ASC, it would have been hash sharded, limiting its use to equality queries. Therefore, it's crucial to ensure that range queries have a corresponding range index.
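A sketch of a range-sharded expression index on the score attribute (index and table names are illustrative):

```sql
-- ASC makes the index range sharded in YugabyteDB,
-- so it can serve range predicates such as score > 7.0.
-- Without ASC/DESC, the first column would be hash sharded
-- and usable only for equality predicates.
CREATE INDEX movies_jsonb_score_range
    ON movies_jsonb ((CAST(data->>'score' AS FLOAT)) ASC);
```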
Interestingly, the query using the GIN index exhibits the slowest performance, even slower than the Seq Scan:
The underlying reason for this behavior may be that the selectivity of the query is not high enough to take full advantage of the Index Cond filtering. Additionally, GIN indexes may introduce false positives during indexing, which are subsequently removed, contributing to the slower performance:
The rapid execution observed with the TEXT and JSON types is simply due to a promptly returned error, as the @> operator cannot be applied to these datatypes; those timings can be ignored.
In contrast, the other queries, with execution times around 100 milliseconds, involve Seq Scans but benefit from expression and aggregate pushdowns. In these cases, rows are filtered and counted on each storage node, contributing to their performance:
The Remote Filter is the opposite of Rows Removed by Filter: it indicates that the predicate was pushed down to the storage nodes. It's essential to inspect the execution plan to assess the scalability of the queries effectively.
In this query, the JSON operator @> is not used to search for records from the year 2000. Instead, it filters on the attribute by knowing the full path:
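The two ways of expressing the year-2000 filter can be sketched as follows (illustrative table name, matching the examples above):

```sql
-- Containment with @>: requires jsonb and can use a GIN index
SELECT count(*) FROM movies_jsonb WHERE data @> '{"year": 2000}';

-- Path filter on a known attribute: works on json and jsonb,
-- and can use a regular expression index on that path
SELECT count(*) FROM movies_jsonb WHERE CAST(data->>'year' AS INT) = 2000;
```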
Since this query does not have an index, all queries perform a Sequential Scan (Seq Scan) with expression and aggregate pushdowns:
While the predicate can be applied to the JSON datatype, it's worth noting that it performs more efficiently on the JSONB datatype.
Quick comparison with Aurora PostgreSQL
I've quickly run the same on Amazon Aurora to compare. As it is a single-writer deployment, without the same resilience and elasticity as YugabyteDB, the response time can be faster. What matters is that the higher latency stays acceptable and scalable, so that throughput can be increased by scaling out.
Insert: The size is smaller: 100MB, or 166MB with the GIN index.
select_all: 800 milliseconds. It was 5 seconds on distributed YugabyteDB because all data had to be transferred between availability zones. That's the cost of resilience and elasticity: data is distributed to multiple nodes. Adding more nodes is the way to increase the throughput.
score_over_7: 20 milliseconds with B-Tree. It was 50ms in YugabyteDB, which is still acceptable given the resilience provided (no downtime on zone failure).
count_score_over_7: 8 milliseconds with B-Tree. It was 40ms with YugabyteDB, which still has to move the rows to count them. For analytic queries like select_all or this low-selectivity count_score_over_7, it may be preferable to enable Follower Reads to avoid cross-zone transfer.
count_year_2000_at_gt: Without GIN index, the time is around 90ms, which is comparable to YugabyteDB (thanks to pushdowns, fetching the rows through the network is not a problem). In Aurora, the GIN index access is fast with 4ms, without showing the same problem as YugabyteDB (6 seconds because of 'Index Recheck'). In general, PostgreSQL is faster on GIN indexes using Bitmap Scan:
Bitmap Scan relies on shared buffers, as it keeps a bitmap of buffers to visit in a more efficient way. But shared buffers are in local RAM and cannot be distributed. YugabyteDB distributes rows and cannot use this Bitmap Heap Scan.
count_year_2000_eq: 80ms with a full scan. This one is faster on YugabyteDB as it is run in parallel on all nodes with filtering and aggregation pushed down. It can be further reduced by adding more nodes to the cluster.
Note that I compared with Aurora to give an idea of the additional latency in Distributed SQL, but in terms of resilience, there's no comparison: YugabyteDB provides High Availability with no downtime on failure, where PostgreSQL and Aurora provide fast recovery but with full downtime. I explained this in a previous post “Multi-AZ”.
Follower Reads in YugabyteDB
I mentioned that for analytic queries you can reduce the data transfer by reading from the Raft followers. You can test it with the following settings, which read a snapshot from 15 seconds in the past:
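A sketch of the session settings involved (parameter names as in current YugabyteDB versions; the staleness value is in milliseconds):

```sql
-- Follower reads apply only to read-only transactions
SET default_transaction_read_only = on;
SET yb_read_from_followers = on;
-- Read a consistent snapshot from 15 seconds in the past
SET yb_follower_read_staleness_ms = 15000;
```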
To run the pg_json_bench query workload with this, you need to comment out the call to singleInsertNoMetrics() in query.go and run it after an insert workload, or set the transaction to read-only afterwards rather than as a PGOPTIONS setting. In my configuration, select_all is twice as fast when it doesn't have to read from the Raft leader.
To Summarize...
Here are some comments and recommendations regarding storing and indexing JSON in YugabyteDB:
Preference for JSONB: It's advisable to prioritize using JSONB over JSON or TEXT when dealing with well-formatted JSON data that you intend to query as JSON.
Seq Scan for low selectivity: When querying a significant portion of rows, especially with low selectivity predicates, additional indexes may not be necessary. However, it's crucial to verify that the query planner is pushing down expressions efficiently.
Index Scan for High Selectivity Predicates: For queries that retrieve a smaller subset of rows with high selectivity predicates, particularly when filtered based on attributes defined by their path, consider adding a regular secondary index. These indexes typically have a modest overhead on inserts. Additionally, you can use partial indexes and specify ASC or DESC for range queries.
Index Only Scan on additional columns: In scenarios where you anticipate querying a larger number of rows, it might be beneficial to move some attributes to a table column to create a covering index. This approach can optimize query performance. It's important to note that if the columns within the JSONB document are frequently updated, it's advisable to consider moving them out of the JSONB document to further enhance performance and reduce the overhead associated with frequent updates.
GIN Indexes: While GIN indexes offer more flexibility, allowing queries without specifying paths and enabling indexing of arrays, they are not necessarily faster than regular indexes when a suitable alternative exists. Note that you can index one or multiple subsets of the document with GIN index rather than the whole JSON.
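The "Index Only Scan on additional columns" recommendation above can be sketched as follows, with hypothetical column and index names:

```sql
-- Promote frequently queried attributes out of the JSONB document
ALTER TABLE movies_jsonb ADD COLUMN score float;
ALTER TABLE movies_jsonb ADD COLUMN title text;
UPDATE movies_jsonb
   SET score = CAST(data->>'score' AS FLOAT),
       title = data->>'title';

-- Range-sharded covering index: score for the predicate, title included
CREATE INDEX movies_jsonb_score_covering
    ON movies_jsonb (score ASC) INCLUDE (title);

-- This query can now be served by an Index Only Scan
SELECT title FROM movies_jsonb WHERE score > 7.0;
```

Updating the promoted columns no longer rewrites the whole JSONB document, which is what reduces the update overhead mentioned above.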
Above all, it's imperative to thoroughly understand your specific context and conduct testing to validate the conclusions. This blog post, utilizing Waltton's tool and method, demonstrates how to effectively perform these tests and make informed decisions about data storage and indexing in YugabyteDB.