This is a blogpost about the open source yb_stats tool for YugabyteDB](https://yugabyte.com) to look into the database endpoints to look for the generated logs of the Yugabyte database components.
How are the logs obtained?
The logs of a server process in the Yugabyte database is available at the /logs
endpoint of the tablet servers and the masters.
Caveats. Important!
There is a caveat for using the logs in a YugabyteDB cluster via the /logs
endpoint. That caveat is that it will only provide maximally the last 1 megabyte of the logs.
The logs can be quite verbose, and therefore taking yb_stats
snapshots at an interval of for example one hour is likely to not contain all the log information. This is very much dependent on the busyness of the cluster.
Another caveat is that the logging does not provide a full timestamp, because the year and the timezone information from the timestamp are missing.
How can I use and look at the logs via yb_stats
?
-
When
yb_stats
is invoked with the--print-log
switch without a snapshot number.
In this situationyb_stats
uses either the hosts and ports set with--hosts
and--ports
, the hosts and ports set in.env
or the default values. The logs are directly obtained from the hosts, and thus requires network access. For example:
% yb_stats --print-log --log-severity IWEF --hostname-match '(7000|9000)' 192.168.66.81:7000 2022-12-19 15:25:22.487398 +01:00 I catalog_manager.cc:7115 Peer 3152ba00f652431baeeafa1b36336093 sent incremental report for 1d56fd1c5f52402eb30826afe4afc470, prev state op id: -1, prev state term: 5, prev state has_leader_uuid: 1. Consensus state: current_term: 5 leader_uuid: "9e982449051e43f3902a4fb24d639b84" config { opid_index: -1 peers { permanent_uuid: "3152ba00f652431baeeafa1b36336093" member_type: VOTER last_known_private_addr { host: "yb-1.local" port: 9100 } cloud_info { placement_cloud: "local" placement_region: "local" placement_zone: "local1" } } peers { permanent_uuid: "64dc6522080c433c9cbd2c83efccb025" member_type: VOTER last_known_private_addr { host: "yb-3.local" port: 9100 } cloud_info { placement_cloud: "local" placement_region: "local" placement_zone: "local3" } } peers { permanent_uuid: "9e982449051e43f3902a4fb24d639b84" member_type: VOTER last_known_private_addr { host: "yb-2.local" port: 9100 } cloud_info { placement_cloud: "local" placement_region: "local" placement_zone: "local2" } } } 192.168.66.80:9000 2022-12-19 15:26:27.369014 +01:00 I reactor.cc:466 TabletServer_R000: DEBUG: Closing idle connection: Connection (0x0000000007598138) server 192.168.66.81:56232 => 192.168.66.80:9100 - it has been idle for 65.0998s 192.168.66.80:7000 2022-12-19 15:42:55.568128 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x00000000086d5e78) server 192.168.66.80:42320 => 192.168.66.80:7100 - it has been idle for 65.0005s 192.168.66.81:7000 2022-12-19 15:42:55.599397 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x000000000a8c6a38) server 192.168.66.80:45682 => 192.168.66.81:7100 - it has been idle for 65.0989s 192.168.66.81:7000 2022-12-19 15:42:55.599557 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x000000000a8c6b58) server 192.168.66.81:53784 => 192.168.66.81:7100 - it has been idle for 65.0989s 192.168.66.82:7000 2022-12-19 15:42:55.647686 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x0000000006a18918) server 192.168.66.80:59494 => 192.168.66.82:7100 - it has been idle for 65.1008s
When
yb_stats
is invoked with the--print-log
switch with a snapshot number.
In this situationyb_stats
requires the snapshots to be available in theyb_stats.snapshots
directory in the current working directory, and the snapshot number to be available. The log information is read from the information in the snapshot, and does not require the hosts to be available. For example:
% yb_stats --print-log 0 --log-severity IWEF --hostname-match 7000
192.168.66.81:7000 2022-12-19 15:25:22.487398 +01:00 I catalog_manager.cc:7115 Peer 3152ba00f652431baeeafa1b36336093 sent incremental report for 1d56fd1c5f52402eb30826afe4afc470, prev state op id: -1, prev state term: 5, prev state has_leader_uuid: 1. Consensus state: current_term: 5 leader_uuid: "9e982449051e43f3902a4fb24d639b84" config { opid_index: -1 peers { permanent_uuid: "3152ba00f652431baeeafa1b36336093" member_type: VOTER last_known_private_addr { host: "yb-1.local" port: 9100 } cloud_info { placement_cloud: "local" placement_region: "local" placement_zone: "local1" } } peers { permanent_uuid: "64dc6522080c433c9cbd2c83efccb025" member_type: VOTER last_known_private_addr { host: "yb-3.local" port: 9100 } cloud_info { placement_cloud: "local" placement_region: "local" placement_zone: "local3" } } peers { permanent_uuid: "9e982449051e43f3902a4fb24d639b84" member_type: VOTER last_known_private_addr { host: "yb-2.local" port: 9100 } cloud_info { placement_cloud: "local" placement_region: "local" placement_zone: "local2" } } }
192.168.66.80:7000 2022-12-19 15:42:55.568128 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x00000000086d5e78) server 192.168.66.80:42320 => 192.168.66.80:7100 - it has been idle for 65.0005s
192.168.66.81:7000 2022-12-19 15:42:55.599397 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x000000000a8c6a38) server 192.168.66.80:45682 => 192.168.66.81:7100 - it has been idle for 65.0989s
192.168.66.81:7000 2022-12-19 15:42:55.599557 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x000000000a8c6b58) server 192.168.66.81:53784 => 192.168.66.81:7100 - it has been idle for 65.0989s
192.168.66.82:7000 2022-12-19 15:42:55.647686 +01:00 I reactor.cc:466 Master_R000: DEBUG: Closing idle connection: Connection (0x0000000006a18918) server 192.168.66.80:59494 => 192.168.66.82:7100 - it has been idle for 65.1008s
- When
yb_stats
is invoked with the--tail-log
switch. This switch does not accept a snapshot number, because it doesn't make sense to tail a file that does not change. In this situationyb_stats
uses either the hosts addresses and ports set with--hosts
and--ports
, the hosts and ports set in.env
or the default values. Only log lines that are created after--tail-log
is started are shown. The output of--tail-log
output is identical to the above, so therefore no example of that here.
How does --tail-log
work?
The tail-log functionality works by taking the list of log lines two times with an interval between them, and create a third list from lines that only do exist in the second list, which are the new lines, which are the lines that are shown.
Filtering
In most cases, "just" querying the logs will give you a lot of lines, which is too much to eyeball. In almost any case, you might want to add filters using the --hostname-match
, --log-severity
and --stat-name-match
options.