I tried a spatial search with Amazon Redshift and DBeaver ๐
This time, I used DBeaver to connect to Amazon Redshift, a cloud data warehouse provided by AWS, and tried to import location data and execute spatial functions!
Advance Preparation of Location Data
In advance, I used QGIS to prepare random points (1 million in blue) and polygons for spatial search (three in yellow) in shapefile format, a standard supported by Amazon Redshift.
Also, upload the data set to S3 for importing into Amazon Redshift.
Creating Amazon Redshift Clusters
To begin, we will create an Amazon Redshift cluster.
Select Amazon Redshift in the AWS console โ Clusters โ Click Create Cluster.
Set the cluster name, free trial, user name, and password โ Click Create Cluster.
After a while, the cluster will be created.
This completes the creation of the Amazon Redshift cluster ๐
Configuring Roles and Public Access
Next, we will configure the role settings for accessing S3 from Amazon Redshift and the public access settings for connecting to Amazon Redshift from DBeaver.
Assign the role you created in the cluster details screen of Amazon Redshift.
Click Actions โ Click "Change Public Accessible Settings" on the Amazon Redshift cluster details screen.
Enable Publicly Accessible โ Click Save Changes.
Verify that it has been enabled.
On the Amazon Redshift cluster details screen, click Security Groups โ Add Inbound Rule and set the type to Redshift โ Click Save Rule.
Verify that the inbound rule is configured.
This will complete the configuration of roles and public access ๐
Importing Location Data with DBeaver
Next, we will use DBeaver to connect to Amazon Redshift and import location data.
Connect to Amazon Redshift with DBeaver. Set the hostname (cluster endpoint), port, database name, user name, password, and role name.
Let's import a random point (1 million points). Create a table in advance, specify the role and destination of the shapefile to be imported, and execute.
CREATE TABLE points (
wkb_geometry GEOMETRY,
id BIGINT
);
COPY points FROM 's3://redshift-geo-data/points.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::xxxxxxx:role/redshift-geo-role-2112';
Once the data is imported, you can check the information while visualizing it on the map.
Let's try importing the three polygons for spatial search. Create a table in advance, specify the destination and role of the shapefile to be imported, and execute.
CREATE TABLE polygon (
wkb_geometry GEOMETRY,
id BIGINT
);
COPY polygon FROM 's3://redshift-geo-data/polygon.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::xxxxxxx:role/redshift-geo-role-2112';
Once the data is imported, you can check the information while visualizing it on the map.
This completes the import of location data in DBeaver ๐
Running a Spatial Function
Finally, we will try to see if we can perform spatial functions in Amazon Redshift. As a typical example, we will use "ST_AsGeoJSON" to convert data into GeoJSON format and "ST_Within" to execute a spatial function that extracts points within a polygon from 1 million points.
First, let's use "ST_AsGeoJSON" to convert the data to GeoJSON format.
SELECT ST_AsGeoJSON(wkb_geometry) FROM public.polygon;
The data is now output in GeoJSON format!
Next, try extracting the points inside the polygon using "ST_Within."
SELECT public.points.id, public.points.wkb_geometry FROM public.points, public.polygon WHERE ST_Within(public.points.wkb_geometry, public.polygon.wkb_geometry);
The data was output only inside the polygon!
Next, use "ST_Within" to extract the count of points within the polygon.
SELECT COUNT(*) FROM public.points, public.polygon WHERE ST_Within(public.points.wkb_geometry, public.polygon.wkb_geometry);
The count โ5167 pointsโ only in the polygon was output!
Finally, I would like to check if the spatial search results match those using QGIS.
We got the same "5167 points"!
We were able to perform a spatial search with Amazon Redshift and DBeaver ๐
Using Amazon Redshift and DBeaver, we confirmed that we could import location data and perform a spatial search with Amazon Redshift. It looks like it can be used successfully for location data analysis in the future.
Initially, Iโve tried to use Query Editor V2, which is available in the service, but it hasnโt seemed to support location data import or spatial functions at present. If it is supported in the future, it may become easier to use, or it may become easier to use with Amazon Redshift Serverless, whose preview version was announced this week ๐