When executing a SQL statement, the YugabyteDB query layer (YSQL, which is based on PostgreSQL) sends read and write operations to the storage layer (DocDB). These are remote procedure calls (RPCs), and they must time out if they take too long, for whatever reason. This is where you can encounter an error such as: ERROR: Perform RPC (request call id 6171) to <ip_redacted>:9100 timed out after 600.000s
The parameters that control this timeout:
- client_read_write_timeout_ms cluster flag for YCQL and YSQL
- ysql_client_read_write_timeout_ms cluster flag for YSQL
- statement_timeout session parameter (PostgreSQL)
- hardcoded value of 600 seconds in YugabyteDB code
Here is the logic in YBCSetTimeout(): there is a timeout set at the YSQL level (statement_timeout) and a timeout set at the cluster level (ysql_client_read_write_timeout_ms). When both are set, the effective timeout is the maximum of the YSQL and DocDB settings, which means that raising one above the other allows a longer wait.
If statement_timeout is set to zero, which is the default, it is considered not set, and only the DocDB setting matters.
Here are the defaults for DocDB parameters:
--client_read_write_timeout_ms=60000
--ysql_client_read_write_timeout_ms=-1
What is not visible from the parameters is that when ysql_client_read_write_timeout_ms is negative, it does not fall back to client_read_write_timeout_ms but to a hardcoded value of 600 seconds (in YsqlClientReadWriteTimeoutMs()).
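The fallback can be pictured with a small Python model. This is only a sketch of the behavior as described here, not the YugabyteDB source: the function name resolve_ysql_rpc_timeout_ms and the constant HARDCODED_DEFAULT_MS are hypothetical names chosen for illustration.

```python
# Hypothetical model of the fallback described in the text; this is not
# YugabyteDB source code, and the names below are made up for illustration.
HARDCODED_DEFAULT_MS = 600_000  # the 600-second value mentioned in the text


def resolve_ysql_rpc_timeout_ms(ysql_client_read_write_timeout_ms: int) -> int:
    """Return the DocDB-side timeout (in ms) that YSQL ends up using."""
    if ysql_client_read_write_timeout_ms < 0:
        # A negative flag means "not set": the value falls back to the
        # hardcoded 600 seconds, not to client_read_write_timeout_ms.
        return HARDCODED_DEFAULT_MS
    return ysql_client_read_write_timeout_ms


if __name__ == "__main__":
    # The default flag value (-1) and an explicitly set 30-second value.
    for flag_value in (-1, 30_000):
        print(flag_value, "->", resolve_ysql_rpc_timeout_ms(flag_value), "ms")
```

With the default of -1, the model returns 600000 ms regardless of the 60000 ms default of client_read_write_timeout_ms.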
This explains the error timed out after 600.000s in YSQL. If you want to allow longer-running calls to the storage layer, the solution is to increase statement_timeout to over 600000 (all these parameters are in milliseconds). If you want to reduce the timeout, the solution is to set ysql_client_read_write_timeout_ms