In this series I defined an API Rate Limiter with a simple table + function. I still have the table rate_limiting_token_bucket
and function rate_limiting_request
created, you can find their definitions in Part 2. To call it, I'd created a small Java program in Part 3 but we don't need it here. I'll call my function from a REST API and this very simple to implement, with no code. I'll use PostgREST, a standalone web server that turns your PostgreSQL database directly into a RESTful API. And what works on PostgreSQL can also run on YugabyteDB. I have a 3 nodes cluster setup and I'll install PostgREST directly on each node, connecting to the local endpoint. YugabyteDB is a distributed SQL database, you can connect to any node and see the whole database because there is a stateless PostgreSQL on top of the distributed storage and transaction layer.
The simplest install ever
The install is easy, just check the latest release and it extracts to a single executable:
curl -qsL https://github.com/PostgREST/postgrest/releases/download/v9.0.0/postgrest-v9.0.0-linux-static-x64.tar.xz | \
tar -xJvf -
This creates a 16MB postgrest
binary in the current directory.
The simplest config ever
Better than an online help, the executable can generate a configuration file template ready to customize:
./postgrest -e > my_rest_configuration.conf
You can update this configuration, defining at least the connection url db-uri
in the libpq format.
But I'll do even simpler for this demo, using environment variables only:
PGRST_DB_URI="postgres://yugabyte:yugabyte@$(hostname):5433/yugabyte" \
PGRST_DB_SCHEMAS="public" PGRST_DB_ANON_ROLE="yugabyte" \
PGRST_DB_CHANNEL_ENABLED=false ./postgrest &
This starts the PostgREST server with a pool of connections to the local endpoint, database "yugabyte", schema "public". I disabled the "DB Channel" because LISTEN/NOTIFY is not enabled in YugabyteDB (this mecanism is not appropriate for a distributed database).
If you created the procedure after starting the PostgREST deamon, you need to refresh the cache with: pkill -10 postgrest
to signal for cache re-load.
I'm keeping it simple for this demo, but of course, it can be configured for performance and security. With all defaults, it listens to port 3000.
REST calls
I can now use this REST proxy to call my rate limiting function.
There is nothing more to define because PostgREST reads the schema. The namespace of the stored procedures and functions is exposed in http://localhost:3000/rpc
and the parameters are passed as a JSON document. Here is a call getting a token for "user1" with a refill rate of 1 token per second:
echo $(
curl -qs \
http://localhost:3000/rpc/rate_limiting_request \
-X POST -H "Content-Type: application/json" -d '
{ "rate_id": "user1", "refill_per_sec": 1 }
')
Here is the result:
I have 60 tokens remaining because my function has a default window of 60 seconds, filled with the rate of 1 tocken per second.
Threads
From my Java program and with this cluster, I was able to get 1000 tokens per second from threads requesting tokens from different users and 120 per second. Here is a python script to do the same though the REST API:
import threading
import time
import requests
import sys
def request_tokens(id,num):
tokens=0
start = time.time()
for i in range(1,num):
response=requests.post('http://localhost:3000/rpc/rate_limiting_request', json={ "rate_id": id, "refill_per_sec": 1000000 })
if response.status_code==200:
tokens=tokens+1
print("{0} tokens/s".format(tokens/(time.time()-start)))
threads=[]
for i in range(0,int(sys.argv[1])):
threads.append(threading.Thread(target=request_tokens,args=(i,1000)))
for t in threads:
t.start()
total=0
for t in threads:
t.join()
This runs 1000 loops in 50 thread:
python franck.py 50
What is interesting is that the overhead is very small:
On this node, the postgrest
deamon uses less CPU than the calling program python
and the database itself yb-tserver
. You can see 10 postgres
backends because PostgREST has a connection pool, which is the best to guarantee predictable performances.
The beauty of it is that it is easy to install one postgrest
on each database node, opening a REST access for the database schemas.