Overview
In my previous post we simulated a Multi-Region CockroachDB cluster on Minikube.
Today we add tools for Monitoring & Alerting, plus an S3-compatible object storage service for backups.
If you have not done so already, create that cluster first before proceeding.
You can also read more about the stack in the post for the Docker deployment.
Setup
Apply the Kubernetes definition file to create the monitoring stack.
kubectl apply -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/monitoring.yaml
Check that all Pods and Services are up and running, then ask Minikube for each service's address and open it in your browser. Take note of the port numbers and see if you can locate them in the deployment YAML file.
$ minikube service minio --url
http://192.168.64.6:31900
$ minikube service prom --url
http://192.168.64.6:31990
$ minikube service alertmgr --url
http://192.168.64.6:31993
$ minikube service grafana --url
http://192.168.64.6:32000
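If you prefer to script this check, the steps above can be sketched as two small shell helpers. This is only a sketch: the function names are made up for this post, it assumes the stack lives in your current namespace alongside long-running Pods only, and it assumes a Minikube driver where `minikube service --url` returns immediately rather than holding a tunnel open.

```shell
# Print "<service> <url>" for each service in the monitoring stack.
# The service names are the ones used in this post.
stack_urls() {
  for svc in minio prom alertmgr grafana; do
    printf '%s %s\n' "$svc" "$(minikube service "$svc" --url)"
  done
}

# Block until every Pod in the current namespace reports Ready (5 minute cap).
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all --timeout=300s
}
```

Run `wait_for_pods && stack_urls` to get the four addresses in one go once the Pods are ready.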
Good job, the stack is ready! Let's review what we have deployed.
MinIO
MinIO is an S3-compatible object storage service that is very popular in private cloud deployments.
From the UI, log in using minioadmin as both username and password, then create a bucket named cockroach.
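The bucket can also be created from the command line with the MinIO client. The helper below is a sketch, not part of the original setup: it assumes a recent `mc` is installed on your machine, and both the function name and the `local` alias are arbitrary choices made for this example.

```shell
# Create a bucket with the MinIO client instead of the web UI.
# Arguments: MinIO base URL, bucket name.
# "local" is an arbitrary alias name chosen for this sketch;
# minioadmin/minioadmin are the credentials used in this post.
make_bucket() {
  mc alias set local "$1" minioadmin minioadmin &&
    mc mb "local/$2"
}
```

For example: `make_bucket "$(minikube service minio --url)" cockroach`.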
Load the MovR dataset, then connect to the database
$ cockroach workload init movr "postgresql://root@`minikube ip`:31257/movr?sslmode=disable"
[...]
$ cockroach sql --url "postgresql://`minikube ip`:31257/movr?sslmode=disable"
Execute a backup job pointing at the MinIO server. Notice the endpoint URL and the keys used
BACKUP TO 's3://cockroach?AWS_ENDPOINT=http://minio:9000&AWS_ACCESS_KEY_ID=minioadmin&AWS_SECRET_ACCESS_KEY=minioadmin'
AS OF SYSTEM TIME '-10s';
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
610281440766132226 | succeeded | 1 | 1 | 3 | 11524
(1 row)
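The backup URI is long and easy to mistype, so it can help to assemble it from its parts. The sketch below uses hypothetical helper names; it builds the same URI as the statement above and runs the backup through the `cockroach` CLI rather than the SQL prompt.

```shell
# Build the s3:// URI that BACKUP and SHOW BACKUP expect for a MinIO target.
# Arguments: bucket, endpoint, access key, secret key.
minio_backup_uri() {
  printf 's3://%s?AWS_ENDPOINT=%s&AWS_ACCESS_KEY_ID=%s&AWS_SECRET_ACCESS_KEY=%s' \
    "$1" "$2" "$3" "$4"
}

# Run a full-cluster backup from the shell.
# Arguments: connection URL, backup URI.
run_backup() {
  cockroach sql --url "$1" \
    --execute "BACKUP TO '$2' AS OF SYSTEM TIME '-10s';"
}
```

Note that the endpoint is the in-cluster address http://minio:9000, because the CockroachDB nodes, not your laptop, connect to MinIO. The helper does no URL-encoding, so keys containing special characters would need to be encoded first.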
Confirm the backup went well. Also check the MinIO UI, where the backup files should now appear in the cockroach bucket
SHOW BACKUP 's3://cockroach?AWS_ENDPOINT=http://minio:9000&AWS_ACCESS_KEY_ID=minioadmin&AWS_SECRET_ACCESS_KEY=minioadmin';
database_name | parent_schema_name | object_name | object_type | start_time | end_time | size_bytes | rows | is_full_cluster
----------------+--------------------+----------------------------+-------------+------------+----------------------------------+------------+------+------------------
NULL | NULL | system | database | NULL | 2020-11-25 14:05:27.012204+00:00 | NULL | NULL | true
system | public | users | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 99 | 2 | true
system | public | zones | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 201 | 7 | true
system | public | settings | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 374 | 5 | true
system | public | ui | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 155 | 1 | true
system | public | jobs | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 17994 | 21 | true
system | public | locations | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 360 | 7 | true
system | public | role_members | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 94 | 1 | true
system | public | comments | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 0 | 0 | true
system | public | role_options | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 0 | 0 | true
system | public | scheduled_jobs | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 0 | 0 | true
NULL | NULL | defaultdb | database | NULL | 2020-11-25 14:05:27.012204+00:00 | NULL | NULL | true
NULL | NULL | postgres | database | NULL | 2020-11-25 14:05:27.012204+00:00 | NULL | NULL | true
NULL | NULL | movr | database | NULL | 2020-11-25 14:05:27.012204+00:00 | NULL | NULL | true
movr | public | users | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 4911 | 50 | true
movr | public | vehicles | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 3182 | 15 | true
movr | public | rides | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 156387 | 500 | true
movr | public | vehicle_location_histories | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 73918 | 1000 | true
movr | public | promo_codes | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 219973 | 1000 | true
movr | public | user_promo_codes | table | NULL | 2020-11-25 14:05:27.012204+00:00 | 0 | 0 | true
Very good, you can now use MinIO as your Backup & Restore solution!
Monitoring stack: Prometheus, Alertmanager, Grafana
Our Monitoring and Alerting stack is made up of 3 components: Prometheus, Alertmanager and Grafana.
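Prometheus is an open-source systems monitoring and alerting toolkit. You can use Prometheus to scrape the same metrics that populate the CockroachDB Admin UI and feed them into your own, separate monitoring and alerting setup.
Alertmanager is also a product of the Prometheus project. It receives the alerts fired by Prometheus, then deduplicates, groups and routes them to the configured receivers.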
Grafana is a very popular visualization tool and can connect to Prometheus as a source for the metrics.
Start by downloading Cockroach Labs' pre-made Grafana dashboards
wget https://raw.githubusercontent.com/cockroachdb/cockroach/master/monitoring/grafana-dashboards/runtime.json
wget https://raw.githubusercontent.com/cockroachdb/cockroach/master/monitoring/grafana-dashboards/sql.json
wget https://raw.githubusercontent.com/cockroachdb/cockroach/master/monitoring/grafana-dashboards/replicas.json
wget https://raw.githubusercontent.com/cockroachdb/cockroach/master/monitoring/grafana-dashboards/storage.json
In Grafana, add Prometheus as a Data Source. Grafana reaches Prometheus through its in-cluster Service, prom, so the source URL is http://prom:9090.
Then, import each dashboard JSON file you downloaded.
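The data source step can also be done without the UI, through Grafana's HTTP API. The following is a sketch under assumptions: the function names are hypothetical, and `admin:admin` is Grafana's default login, which may not match what the deployment file configures.

```shell
# Emit the JSON body for Grafana's "create data source" API call.
# Argument: the Prometheus URL as seen from the Grafana Pod.
prom_datasource_json() {
  printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' "$1"
}

# POST the data source to Grafana.
# Arguments: Grafana base URL, Prometheus URL.
# admin:admin is Grafana's default login, assumed here, not taken from the YAML.
add_prom_datasource() {
  prom_datasource_json "$2" |
    curl --silent --show-error --user admin:admin \
      --header 'Content-Type: application/json' \
      --data @- "$1/api/datasources"
}
```

Call it with the Grafana address reported by Minikube as the first argument and, as the second, the same Prometheus source URL you would otherwise type into the Data Source form.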
As a test, run the YCSB workload
# initiate YCSB dataset
cockroach workload init ycsb "postgresql://root@`minikube ip`:31257/ycsb?sslmode=disable"
# run the YCSB workload B load balancing to all 9 nodes (3 services)
cockroach workload run ycsb "postgresql://root@`minikube ip`:31257/ycsb?sslmode=disable" "postgresql://root@`minikube ip`:31258/ycsb?sslmode=disable" "postgresql://root@`minikube ip`:31259/ycsb?sslmode=disable"
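While the workload runs, you can check that Prometheus is actually scraping the cluster before switching to Grafana. The helper below is a minimal sketch with a hypothetical name; it calls the instant-query endpoint of the Prometheus HTTP API and defaults to the built-in `up` metric, which is 1 for every target scraped successfully.

```shell
# Run an instant query against the Prometheus HTTP API and print the raw JSON.
# Arguments: Prometheus base URL, PromQL expression (defaults to "up").
prom_query() {
  curl --silent --show-error --get \
    --data-urlencode "query=${2:-up}" \
    "$1/api/v1/query"
}
```

For example: `prom_query "$(minikube service prom --url)"`. If some targets are missing from the result or report 0, the dashboards will stay empty for those nodes.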
And this is what is displayed in Grafana
Perfect! You're all set!
Clean up
Removing the stack is as easy as creating it. Please note that this will also remove the Persistent Volumes
kubectl delete -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/monitoring.yaml