Postgres in Ibiza is more than a conference. It gathers a community of PostgreSQL users and contributors for three days of conference and unconference talks and discussions around modern topics in PostgreSQL and its compatible forks and extensions. This year, JSON in PostgreSQL was a hot topic: FerretDB, which adds a MongoDB API on top of PostgreSQL, was a sponsor, and a great talk covered the right datatypes for storing JSON in Postgres.
PostgreSQL datatypes for JSON
Modern applications often store data in both relational structures and JSON documents within the same SQL database. PostgreSQL offers powerful formats, operators, and indexes for this purpose. The question arises: which format to choose: text, json, or jsonb? JSONB is often the best default choice, but the answer may depend on your access patterns.
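As a minimal sketch of the three options, here is one illustrative table per datatype (table and column names are hypothetical, not from the benchmark):

```sql
-- One table per storage option (illustrative names)
CREATE TABLE movies_text  (id bigserial PRIMARY KEY, data text);   -- no validation, stored verbatim
CREATE TABLE movies_json  (id bigserial PRIMARY KEY, data json);   -- validated on insert, stored as text
CREATE TABLE movies_jsonb (id bigserial PRIMARY KEY, data jsonb);  -- validated, stored in a decomposed binary format

-- jsonb rejects malformed documents and supports operators such as @>
INSERT INTO movies_jsonb (data)
VALUES ('{"title": "Alien", "year": 1979, "score": 8.5}');
```

The text column accepts anything; json validates syntax but keeps the original text; jsonb parses once at write time, which is what makes the JSON operators and GIN indexing efficient later.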
During this conference, Waltton Morais delivered an insightful session titled "Leveraging the Power of JSON/JSONB Fields: Key Considerations for Effective Utilization in PostgreSQL". This presentation is an excellent tech talk, rooted in facts that can be easily reproduced. If you have the opportunity to attend Waltton's presentation, I highly recommend it. His investigation was on PostgreSQL; I'll run the same tests on YugabyteDB (an open-source, PostgreSQL-compatible distributed SQL database).
YugabyteDB: scaling-out PostgreSQL
Some considerations discussed for PostgreSQL apply to YugabyteDB, but not all. YugabyteDB employs a PostgreSQL fork for the query layer but stores rows as documents in DocDB, its distributed transactional storage. This approach offers several advantages, including no vacuum, no block size limitations (no TOAST needed), transparent compression and encryption, and horizontal scalability for improved elasticity and resilience.
I followed Waltton's method, measuring the size and performance on a reproducible test case, using his code available on GitHub:
The goal of this project is to help compare how different data types and indexes behave with inserts and queries on JSON data.
After running the tests, pg_json_bench publishes the results to a Prometheus push gateway, and you can see them on the prepared Grafana dashboards.
I've run this from an Amazon Linux 2 machine (t2.xlarge) with VPC peering to my YugabyteDB Managed service (3 x 4vCPU/16GB).
git clone https://github.com/waltton/pg_json_bench.git
cd pg_json_bench
# start grafana and prometheus with pushgateway
docker compose up -d
# get data
mkdir ./data
wget https://raw.githubusercontent.com/algolia/datasets/master/movies/records.json -O ./data/records.json
# build
go build -o pg_json_bench
To conduct these tests, I configured my PostgreSQL environment to align with the connection parameters of my YugabyteDB Managed cluster. Here are the environment variables I set:
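The variable values below are placeholders, assuming the benchmark honors the standard libpq environment variables (host, port, and credentials are illustrative, not my actual cluster's):

```shell
# Placeholder connection settings for a YugabyteDB Managed cluster
export PGHOST="your-cluster.aws.ybdb.io"
export PGPORT="5433"            # YSQL, YugabyteDB's PostgreSQL-compatible port
export PGUSER="admin"
export PGPASSWORD="your-password"
export PGDATABASE="yugabyte"
export PGSSLMODE="require"      # YugabyteDB Managed requires TLS
```
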
This command initiated the insertion process for each table, enabling us to evaluate insertion speed, table size, and the impact of indexing on overall performance.
YugabyteDB employs a storage mechanism based on LSM-Tree (Log-Structured Merge-Tree), where the first level resides in memory and is protected from failures by on-disk Write Ahead Logging (WAL). This design is part of the resilience to failure and enables seamless synchronization for followers that may have been disconnected for a period. The default retention period for the WAL is 15 minutes, allowing disconnected replicas to catch up on the latest changes without requiring a full bootstrap.
The size displayed here is gathered by pg_total_relation_size. It's important to note that this size approximation can vary due to factors such as WAL expiration and background compaction. However, despite these variations, storing data as JSONB consistently emerges as the most efficient option.
The regular index, referred to as BTREE (though it is, in fact, an LSM-Tree in YugabyteDB), indexes just one value per row. This results in rapid insertions and low overhead. One of the advantages of YugabyteDB's LSM-Tree architecture is the reduced number of reads when updating a non-unique secondary index.
Indexing all attributes in a GIN index takes considerably more time, with a slight increase in time when indexing all keys:
Considering the roughly equal sizes, size should not be the primary criterion for selecting the appropriate datatype. The difference will be primarily driven by the access patterns of the queries.
Query: Select All
Now, let's execute the first query, which is select_all:
In all scenarios, this results in a Seq Scan, and the execution time remains roughly consistent, with a minor overhead when reading from JSONB:
If YugabyteDB is primarily utilized for storing and scanning JSON documents, it's worth noting that storing them as TEXT or JSON can offer faster performance. The benefits of using JSONB become more apparent when the additional overhead aids in validating the structure and facilitating access to data within the document. It's important to remember that reading and fetching all documents can take approximately 5 seconds on this data set. Subsequent queries involving filtering and aggregation are expected to be faster.
It's important to note that when using the text datatype, an additional cast to JSON is required, as follows: WHERE CAST(CAST(data AS JSON)->>'score' AS FLOAT) > 7.0
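For comparison, a sketch of the same score filter on each datatype (assuming a `data` column holding the movie documents, as in the tables above; names are illustrative):

```sql
-- text: two casts, first to JSON, then the extracted attribute to FLOAT
SELECT * FROM movies_text  WHERE CAST(CAST(data AS JSON)->>'score' AS FLOAT) > 7.0;

-- json and jsonb: the ->> operator applies directly, only the value needs a cast
SELECT * FROM movies_json  WHERE CAST(data->>'score' AS FLOAT) > 7.0;
SELECT * FROM movies_jsonb WHERE CAST(data->>'score' AS FLOAT) > 7.0;
```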
In this scenario, the regular index enables an Index Scan.
The execution time is significantly faster, approximately 50 milliseconds, compared to the 5 seconds observed previously:
The other queries, which lack an index, do not exhibit a noticeable improvement over the previous Seq Scan, as they essentially perform the same type of work:
All rows were read, and the filtering occurred subsequently with the indication of Rows Removed by Filter. This approach is not efficient and often suggests the necessity of having an index for this specific condition.
As with the previous case, the only fast execution occurs when utilizing the index for the range scan:
It's worth noting that if you defined the index without specifying ASC, it would have been hash sharded, limiting its use to equality queries. Therefore, it's crucial to ensure that range queries have a corresponding range index.
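A sketch of a range-sharded expression index on the score attribute (index and table names are illustrative):

```sql
-- ASC makes the index range sharded in YugabyteDB,
-- so it can serve range predicates such as score > 7.0.
-- Without ASC/DESC, the first column would be hash sharded
-- and usable only for equality predicates.
CREATE INDEX movies_jsonb_score_range
    ON movies_jsonb ((CAST(data->>'score' AS FLOAT)) ASC);
```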
Interestingly, the query using the GIN index exhibits the slowest performance, even slower than the Seq Scan:
The underlying reason for this behavior may be that the selectivity of the query is not high enough to take full advantage of the Index Cond filtering. Additionally, GIN indexes may introduce false positives during indexing, which are subsequently removed, contributing to the slower performance:
The rapid execution observed with the TEXT and JSON types is simply due to a promptly returned error, as the @> operator cannot be applied to these datatypes; those timings can be ignored.
In contrast, the other queries, with execution times around 100 milliseconds, involve Seq Scans but benefit from expression and aggregate pushdowns. In these cases, rows are filtered and counted on each storage node, contributing to their performance:
The Remote Filter is the opposite of Rows Removed by Filter: it indicates that the predicate was pushed down to the storage nodes. It's essential to inspect the execution plan to assess the scalability of the queries effectively.
In this query, the JSON operator @> is not used to search for records from the year 2000. Instead, it filters on the attribute by knowing the full path:
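The two ways of expressing the year-2000 filter can be sketched as follows (illustrative table name, matching the examples above):

```sql
-- Containment with @>: requires jsonb and can use a GIN index
SELECT count(*) FROM movies_jsonb WHERE data @> '{"year": 2000}';

-- Path filter on a known attribute: works on json and jsonb,
-- and can use a regular expression index on that path
SELECT count(*) FROM movies_jsonb WHERE CAST(data->>'year' AS INT) = 2000;
```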
Since this query does not have an index, all queries perform a Sequential Scan (Seq Scan) with expression and aggregate pushdowns:
While the predicate can be applied to the JSON datatype, it's worth noting that it performs more efficiently on the JSONB datatype.
Quick comparison with Aurora PostgreSQL
I've quickly run the same on Amazon Aurora to compare. As it is a single-writer deployment, without the same resilience and elasticity as YugabyteDB, the response time can be faster. What matters is that the higher latency stays acceptable and scalable, so that throughput can be increased by scaling out.
Insert: The size is smaller: 100MB, or 166MB with the GIN index.
select_all: 800 milliseconds. It was 5 seconds on distributed YugabyteDB because all data had to be transferred between availability zones. That's the cost of resilience and elasticity: data is distributed to multiple nodes. Adding more nodes is the way to increase the throughput.
score_over_7: 20 milliseconds with B-Tree. It was 50ms in YugabyteDB, which is still acceptable given the resilience provided (no downtime on zone failure).
count_score_over_7: 8 milliseconds with B-Tree. It was 40ms with YugabyteDB, which still has to move the rows to count them. For analytic queries like select_all or this low-selectivity count_score_over_7, it may be preferable to enable Follower Reads to avoid cross-zone transfer.
count_year_2000_at_gt: Without GIN index, the time is around 90ms, which is comparable to YugabyteDB (thanks to pushdowns, fetching the rows through the network is not a problem). In Aurora, the GIN index access is fast with 4ms, without showing the same problem as YugabyteDB (6 seconds because of 'Index Recheck'). In general, PostgreSQL is faster on GIN indexes using Bitmap Scan:
Bitmap Scan relies on shared buffers, as it keeps a bitmap of buffers to visit in a more efficient way. But shared buffers are in local RAM and cannot be distributed. YugabyteDB distributes rows and cannot use this Bitmap Heap Scan.
count_year_2000_eq: 80ms with a full scan. This one is faster on YugabyteDB as it is run in parallel on all nodes with filtering and aggregation pushed down. It can be further reduced by adding more nodes to the cluster.
Note that I compared with Aurora to give an idea of the additional latency in Distributed SQL, but in terms of resilience, there's no comparison: YugabyteDB provides High Availability with no downtime on failure, where PostgreSQL and Aurora provide fast recovery but with full downtime. I explained this in a previous post “Multi-AZ”.
Follower Reads in YugabyteDB
I mentioned that for analytic queries you can reduce the data transfer by reading from the Raft followers. You can test it with the following settings, which read a snapshot from 15 seconds in the past:
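A sketch of the session settings involved (parameter names as in current YugabyteDB versions; the staleness value is in milliseconds):

```sql
-- Follower reads apply only to read-only transactions
SET default_transaction_read_only = on;
SET yb_read_from_followers = on;
-- Read a consistent snapshot from 15 seconds in the past
SET yb_follower_read_staleness_ms = 15000;
```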
To run the pg_json_bench query workload with this, you need to comment out the call to singleInsertNoMetrics() in query.go and run it after an insert workload, or set the transaction to read-only afterwards rather than as a PGOPTIONS setting. In my configuration, select_all is twice as fast when it doesn't have to read from the Raft leader.
To Summarize...
Here are some comments and recommendations regarding storing and indexing JSON in YugabyteDB:
Preference for JSONB: It's advisable to prioritize using JSONB over JSON or TEXT when dealing with well-formatted JSON data that you intend to query as JSON.
Seq Scan for low selectivity: When querying a significant portion of rows, especially with low selectivity predicates, additional indexes may not be necessary. However, it's crucial to verify that the query planner is pushing down expressions efficiently.
Index Scan for High Selectivity Predicates: For queries that retrieve a smaller subset of rows with high selectivity predicates, particularly when filtered based on attributes defined by their path, consider adding a regular secondary index. These indexes typically have a modest overhead on inserts. Additionally, you can use partial indexes and specify ASC or DESC for range queries.
Index Only Scan on additional columns: In scenarios where you anticipate querying a larger number of rows, it might be beneficial to move some attributes to a table column to create a covering index. This approach can optimize query performance. It's important to note that if the columns within the JSONB document are frequently updated, it's advisable to consider moving them out of the JSONB document to further enhance performance and reduce the overhead associated with frequent updates.
GIN Indexes: While GIN indexes offer more flexibility, allowing queries without specifying paths and enabling indexing of arrays, they are not necessarily faster than regular indexes when a suitable alternative exists. Note that you can index one or multiple subsets of the document with GIN index rather than the whole JSON.
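The "Index Only Scan on additional columns" recommendation above can be sketched as follows, with hypothetical column and index names:

```sql
-- Promote frequently queried attributes out of the JSONB document
ALTER TABLE movies_jsonb ADD COLUMN score float;
ALTER TABLE movies_jsonb ADD COLUMN title text;
UPDATE movies_jsonb
   SET score = CAST(data->>'score' AS FLOAT),
       title = data->>'title';

-- Range-sharded covering index: score for the predicate, title included
CREATE INDEX movies_jsonb_score_covering
    ON movies_jsonb (score ASC) INCLUDE (title);

-- This query can now be served by an Index Only Scan
SELECT title FROM movies_jsonb WHERE score > 7.0;
```

Updating the promoted columns no longer rewrites the whole JSONB document, which is what reduces the update overhead mentioned above.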
Above all, it's imperative to thoroughly understand your specific context and conduct testing to validate the conclusions. This blog post, utilizing Waltton's tool and method, demonstrates how to effectively perform these tests and make informed decisions about data storage and indexing in YugabyteDB.