🦆 💏 🐘 Let PostgreSQL & duckdb "sql" together

adriens - Aug 19 - - Dev Community

❔ About

So you're both a - maybe early - PostgreSQL fan... and a recent duckdb adopter.
You like both databases for their strengths and ecosystems... and wonder how it would be

possible to seamlessly send data from/to each other databases... without having to code anything, I mean nothing more that playing sql shell commands in a terminal

: no Python, no Java,...

👉 Well this is exactly what I will talk about in this post thanks to :

🤔 ... but why this post ?

The 3 main reasons of this article and why to pull/put from PostgreSQL/duckdb at this point are :

DuckDB PostgreSQL
Database Model Columnar database Relational database
License MIT BSD like
Serverless Yes No

🎯 What we'll do

We will, only from terminal :

  1. 🐋 Install & boot a containerized PostgreSQL database (with Podman)
  2. 🐘 Create a database
  3. 🔁 Create and feed a little table
  4. 🪄 Read the psql table from duckdb
  5. 🗜️ Export the psql table to a parquet file
  6. 🔬 Inspect parquet file with parquet-cli

Also we will do the reverse move :

  1. Create a table in PostgreSQL from within duckdb
  2. Test table contents from sql

🍿 Demo

📜 shell scripts

Install & boot a PostgreSQL instance:



export PGPASSWORD=mysecretpassword

# Boot a postgresql instance
podman run --name postgres -e POSTGRES_PASSWORD=$PGPASSWORD -d\
    -p 5432:5432 docker.io/library/postgres

# Check container status
podman ps -a


Enter fullscreen mode Exit fullscreen mode

Install psql so we can reach PostgrSQL from outside the contenair:



# Install `psql` on the host so the database can be accessed
# from outside de container
sudo apt install -y postgresql-client


Enter fullscreen mode Exit fullscreen mode

Now, create some PostgreSQL objects:



# Create a demo database
psql -h localhost -p 5432 -U postgres -c "CREATE DATABASE demo;"

# Create a table
psql -h localhost -p 5432 -U postgres -d demo\
    -c "CREATE table customers(id varchar primary key);"

# Feed the PostgreSQL table with some data
psql -h localhost -p 5432 -U postgres -d demo\
    -c "insert into customers values \
    ('Duffy duck'),\
    ('Daisy Duck'),\
    ('Donald Duck'),\
    ('Ludwig Von Drake');"


Enter fullscreen mode Exit fullscreen mode

Install duckdb :



# (Quick and dirty) duckdb install
wget https://github.com/duckdb/duckdb/releases/download/v1.0.0/duckdb_cli-linux-amd64.zip
unzip duckdb_cli-linux-amd64.zip
cp duckdb /usr/bin/
rm duckdb duckdb_cli-linux-amd64.zip


Enter fullscreen mode Exit fullscreen mode

Now do the fun stuff...

Reach PostgreSQL database from duckdb

Let's reach postgres database from duckdb :



duckdb -c "ATTACH 'dbname=demo user=postgres password=mysecretpassword host=127.0.0.1'\
    AS db (TYPE POSTGRES, READ_ONLY);
show all tables;
select * from db.customers;
COPY db.customers TO 'db.customers.parquet' (FORMAT PARQUET);"


Enter fullscreen mode Exit fullscreen mode

... then check the output parquet file:



ls -ltr
file db.customers.parquet


Enter fullscreen mode Exit fullscreen mode

... and read the resulting parquet file from duckdb :



duckdb -c "select * from 'db.customers.parquet';"


Enter fullscreen mode Exit fullscreen mode

Test resulting parquet file with parquet-cli :



pip install parquet-cli
parq -h

parq db.customers.parquet --count
parq db.customers.parquet --head
parq db.customers.parquet --tail

Enter fullscreen mode Exit fullscreen mode




duckdb ➡️ PostgreSQL

Let's:

  1. "Attach" the remote PostgreSQL instance from duckdb runtime
  2. Create a table
  3. Feed the table
  4. Select table contents from psql


duckdb -c "ATTACH 'dbname=demo user=postgres password=mysecretpassword host=127.0.0.1'</span>
AS db (TYPE POSTGRES);
</span>
create table db.heroes(name varchar primary key);
</span>
insert into db.heroes values
</span>
('Dumbo'),
</span>
('Man-Elephant'),
</span>
('Tantra'),
</span>
('Elephant Man'),
</span>
('The Elephantmen'),
</span>
('Mammomax') ;
"

psql -h localhost -p 5432 -U postgres -d demo</span>
-c "select * from heroes;"

Enter fullscreen mode Exit fullscreen mode




⚖️ More about DuckDB vs PostgreSQL

See below this very synthetic breakdown from influxdata:

Image description

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .