In this blog I will demonstrate how you can trick an application or service, in our case CockroachDB, into thinking it's communicating with AWS S3 when in fact another service lies behind the scenes.
CockroachDB is perfectly capable of interacting with any S3-compatible service. Following the instructions of the AWS SDK, you can add the parameter AWS_ENDPOINT to your URI to tell CockroachDB which server to hit instead of the default AWS servers. This is clearly documented here.
However, this blog comes as a follow-up to a customer interaction where the requirement was to NOT use AWS_ENDPOINT in the URI, while still using their own private S3-compatible service.
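To make the difference concrete, here is a minimal sketch that prints both forms of the storage URI. The helper s3_uri is hypothetical (it is not part of CockroachDB), the keys are placeholders, and the endpoint value is only an assumption based on the public IP of the instance used in this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of CockroachDB): print an S3 storage URI.
# Usage: s3_uri BUCKET ACCESS_KEY SECRET_KEY [ENDPOINT]
s3_uri() {
  uri="s3://$1?AWS_ACCESS_KEY_ID=$2&AWS_SECRET_ACCESS_KEY=$3"
  # Append AWS_ENDPOINT only when a custom endpoint is given
  if [ -n "${4:-}" ]; then
    uri="$uri&AWS_ENDPOINT=$4"
  fi
  printf '%s\n' "$uri"
}

# The documented approach: name the custom server explicitly (assumed endpoint)
s3_uri fabio MY_ACCESS_KEY MY_SECRET_KEY https://54.83.105.95

# What the customer asked for: a plain S3 URI with no endpoint
s3_uri fabio MY_ACCESS_KEY MY_SECRET_KEY
```

The rest of this post is about making the second form work against a private server.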
Setup
Create a t2.micro or similar instance on your favorite cloud provider.
For this exercise, the instance has the below details:
OS : Ubuntu 20.04 LTS
Public IP : 54.83.105.95
Private IP : 10.10.81.189
Install and Configure MinIO
MinIO is an S3-compatible service and it is my favorite choice when I want to play locally with the S3 API. We will therefore use the MinIO Server as our custom endpoint instead of AWS S3.
Below are the steps to run MinIO, also available in the official docs.
# download the binary
wget https://dl.min.io/server/minio/release/linux-amd64/minio
chmod +x minio
./minio server data
At this point MinIO is running in insecure mode.
It will also have created the configuration folders at ${HOME}/.minio/certs.
Stop the server with Ctrl+C, then configure secure access with TLS.
# allow minio to start on port 443
sudo setcap cap_net_bind_service=+ep minio
# create the private key
cd ~/.minio/certs
openssl genrsa -out private.key 2048
chmod 400 private.key
Create a file named openssl.conf with the content below. Check the alt_names section carefully: in this context, fabio is the name of the S3 bucket.
[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no
[req_distinguished_name]
C = US
ST = VA
L = Somewhere
O = MyOrg
OU = MyOU
CN = MyServerName
[v3_req]
subjectAltName = @alt_names
[alt_names]
IP.1 = 127.0.0.1
IP.2 = 54.83.105.95
IP.3 = 10.10.81.189
DNS.1 = localhost
DNS.2 = s3.amazonaws.com
DNS.3 = fabio.s3.amazonaws.com
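Because the bucket name becomes part of the hostname, the certificate needs a DNS entry for each bucket you plan to use. As a sketch, the hypothetical helper below (alt_names is my own name, not an OpenSSL tool) prints the [alt_names] section for a list of buckets, so you can paste it into openssl.conf.

```shell
#!/bin/sh
# Hypothetical helper: print an [alt_names] section for openssl.conf.
# Usage: alt_names PUBLIC_IP PRIVATE_IP BUCKET...
alt_names() {
  public_ip=$1
  private_ip=$2
  shift 2
  printf '[alt_names]\n'
  printf 'IP.1 = 127.0.0.1\n'
  printf 'IP.2 = %s\n' "$public_ip"
  printf 'IP.3 = %s\n' "$private_ip"
  printf 'DNS.1 = localhost\n'
  printf 'DNS.2 = s3.amazonaws.com\n'
  # One virtual-hosted-style name per bucket, numbered from DNS.3
  n=3
  for bucket in "$@"; do
    printf 'DNS.%d = %s.s3.amazonaws.com\n' "$n" "$bucket"
    n=$((n + 1))
  done
}

# Reproduces the section used in this post
alt_names 54.83.105.95 10.10.81.189 fabio
```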
Create the public cert
openssl req -new -x509 -nodes -days 730 -key private.key -out public.crt -config openssl.conf
Set up MinIO Server to use virtual-hosted-style requests, as per here.
This is required to match AWS S3 style.
export MINIO_DOMAIN=s3.amazonaws.com
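For readers unfamiliar with the two S3 addressing styles, the sketch below shows where the bucket name ends up in each. The function names and the object name backup.csv are made up for illustration; DOMAIN mirrors the MINIO_DOMAIN value set above.

```shell
#!/bin/sh
# Illustrative helpers: how the two S3 addressing styles place the bucket name.
DOMAIN=s3.amazonaws.com

# Path-style: the bucket is the first path segment
path_style_url() {
  printf 'https://%s/%s/%s\n' "$DOMAIN" "$1" "$2"
}

# Virtual-hosted-style: the bucket is a subdomain of the endpoint
virtual_hosted_url() {
  printf 'https://%s.%s/%s\n' "$1" "$DOMAIN" "$2"
}

path_style_url fabio backup.csv      # https://s3.amazonaws.com/fabio/backup.csv
virtual_hosted_url fabio backup.csv  # https://fabio.s3.amazonaws.com/backup.csv
```

With MINIO_DOMAIN set, MinIO can serve the second form, which is why fabio.s3.amazonaws.com appears in the certificate.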
Finally, start the MinIO Server
$ ~/minio server --console-address ":9001" --address ":443" ~/data
API: https://10.10.82.1 https://127.0.0.1
RootUser: minioadmin
RootPass: minioadmin
Console: https://10.10.82.1:9001 https://127.0.0.1:9001
RootUser: minioadmin
RootPass: minioadmin
Command-line: https://docs.min.io/docs/minio-client-quickstart-guide
$ mc alias set myminio https://10.10.82.1 minioadmin minioadmin
Documentation: https://docs.min.io
WARNING: Detected default credentials 'minioadmin:minioadmin', we recommend that you change these values with 'MINIO_ROOT_USER' and 'MINIO_ROOT_PASSWORD' environment variables
Point your browser at https://54.83.105.95:9001 to access the MinIO Console. As the certificate is self-signed, the browser will flag it as untrusted and might not allow you to continue. In Brave Browser, and other Chromium-based browsers, type thisisunsafe anywhere on the screen (not in the address bar) to continue and disregard the warning.
Log in with minioadmin/minioadmin, which are the default login details. Once logged in, you'll see the Dashboard page.
Using the menu on the left, create a bucket called fabio.
Then create a User fabio with any password you want. Make sure to assign the consoleAdmin, readwrite and diagnostic policies to the user, so you don't run into any permission issues. Later on, you can refine these permissions.
Then, click on the user you just created and create a Service Account, and make sure you save the keys. In my example, the keys are:
Access Key: I6SX7TO71RZY79FXDK1T
Secret Key: 7TxSqDyrvOXCV+EWkRvZKSOXETriFik5LKNWmkQm
Finally, go to Settings and change the region to us-east-1.
This will prompt you to RESTART the server. Do so by stopping the server, and restarting it.
Perfect, you're all set! MinIO Server is up and running in secure mode, and you have a service account to interact with it.
Next, we configure the OS.
Linux OS configuration
For CockroachDB to trust the certificate, we must add it to the system cert pool. We also need to redirect the traffic aimed at AWS servers to MinIO's.
Allow self-signed certs
On Ubuntu, this is how you go about it, as per here:
sudo mkdir /usr/local/share/ca-certificates/extra
sudo cp ~/.minio/certs/public.crt /usr/local/share/ca-certificates/extra/
sudo update-ca-certificates
Routing
Edit the file /etc/hosts and add these entries to route all traffic directed to the AWS S3 servers to the MinIO Server instead. Again, fabio is the name of the bucket.
54.83.105.95 s3.amazonaws.com
54.83.105.95 s3.us-east-1.amazonaws.com
54.83.105.95 fabio.s3.amazonaws.com
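If you use more than one bucket, each one needs its own line. As a sketch, the hypothetical helper hosts_entries below (my own name, not a system tool) prints the entries for a given MinIO IP and list of buckets.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/hosts lines that point the S3 hostnames
# at the MinIO server. Usage: hosts_entries MINIO_IP BUCKET...
hosts_entries() {
  ip=$1
  shift
  printf '%s s3.amazonaws.com\n' "$ip"
  printf '%s s3.us-east-1.amazonaws.com\n' "$ip"
  # One virtual-hosted-style hostname per bucket
  for bucket in "$@"; do
    printf '%s %s.s3.amazonaws.com\n' "$ip" "$bucket"
  done
}

# Reproduces the three entries used in this post
hosts_entries 54.83.105.95 fabio
```

You could append the output with hosts_entries 54.83.105.95 fabio | sudo tee -a /etc/hosts, then check the result with getent hosts fabio.s3.amazonaws.com, which should print the MinIO address.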
Good, we're all set from the OS point of view. Next, we test with CockroachDB.
CockroachDB
Install and start CockroachDB. We will use the single-node feature as we're only trying to prove it works from a functionality perspective.
As always, the official installation docs are here.
curl https://binaries.cockroachdb.com/cockroach-v21.1.10.linux-amd64.tgz | tar -xz && sudo cp -i cockroach-v21.1.10.linux-amd64/cockroach /usr/local/bin/
cockroach start-single-node --insecure --background
Connect to the SQL prompt as root:
cockroach sql --insecure
Create a user (it doesn't have to be an admin), then log out and log back in as that user:
CREATE USER fabio WITH CREATEDB;
cockroach sql --insecure -u fabio
As user fabio, create a simple test table and export it. Make sure you update the keys with your values!
CREATE DATABASE fabio;
USE fabio;
CREATE TABLE test AS
SELECT a
FROM generate_series(1,100) AS a;
-- notice that we don't use AWS_ENDPOINT in the URI
EXPORT INTO CSV
's3://fabio?AWS_ACCESS_KEY_ID=I6SX7TO71RZY79FXDK1T&AWS_SECRET_ACCESS_KEY=7TxSqDyrvOXCV+EWkRvZKSOXETriFik5LKNWmkQm'
FROM TABLE test;
filename | rows | bytes
-------------------------------------------------------------------+------+--------
export16ad4c7f75f4fe180000000000000001-n701156393427501057.0.csv | 100 | 292
(1 row)
Time: 38ms total (execution 38ms / network 0ms)
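A side note on the keys: the secret key in this example contains a +, which has a special meaning in URL query strings. The statement above worked as written, but if your key contains characters such as +, / or =, percent-encoding the value before pasting it into the URI is the safer choice. Below is a minimal sketch in plain POSIX shell; urlencode is a hypothetical helper of my own and assumes ASCII input.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode a string for use in a URI query value.
# Assumes ASCII input; unreserved characters (RFC 3986) are left untouched.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      # "'$c" makes printf use the character's numeric code
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

urlencode '7TxSqDyrvOXCV+EWkRvZKSOXETriFik5LKNWmkQm'
# prints 7TxSqDyrvOXCV%2BEWkRvZKSOXETriFik5LKNWmkQm
```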
As root, take a full cluster backup:
BACKUP INTO
's3://fabio?AWS_ACCESS_KEY_ID=I6SX7TO71RZY79FXDK1T&AWS_SECRET_ACCESS_KEY=7TxSqDyrvOXCV+EWkRvZKSOXETriFik5LKNWmkQm';
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
701160820164395009 | succeeded | 1 | 126 | 14 | 5920
(1 row)
Refresh the MinIO Console to see your files
Awesome! The CockroachDB end-user has no clue that the engine behind the scenes is MinIO instead of AWS, and CockroachDB is happy to IMPORT, EXPORT, BACKUP and RESTORE from S3, be it AWS's or MinIO's.