Censorship resistance is a vital topic for decentralized storage solutions like Arweave. Permanent storage is just one variable of the equation. Governments can still prevent access to the perpetual storage, and users might face repercussions when accessing it anyway. Tor solves these two issues by anonymizing the connections between a user and a service like an Arweave gateway.
This article will explain how to turn an AR.IO node into an Onion Service by leveraging Docker Compose.
Target Audience
You should be comfortable with the Linux command line and the Git version control system; a basic understanding of HTTP and Docker won't hurt either.
Prerequisites
You need Git, Docker, and a browser that's able to handle .onion addresses (e.g., Tor Browser or Brave).
Background
Let's get some background on the technology you will use first.
What is Tor?
The Tor Project is a set of technologies that allow you to use the Internet anonymously; someone who monitors your traffic when using the Tor network can't find out what websites or services you're accessing. Not even the services themselves know who you are when accessing them.
When you send a service request into the Tor network, it gets relayed through multiple Tor nodes, and in the end, one random exit node calls the service, and its response is relayed back to you. On the way, the request is encrypted repeatedly, in layers, like an onion. That's where the name Tor comes from; initially, it was an acronym for The Onion Router (TOR).
Tor Network Architecture with HTTP Service
What is an Onion Service?
An Onion Service is a Tor proxy that runs alongside a service and makes that service reachable from within the Tor network.
Usually, you go through 3 types of Tor nodes when accessing services via Tor.
- An entry node accepting a regular connection from a client outside the Tor network.
- Several intermediate nodes that relay the connection and add layers of encryption, so no single node in the network has all the information needed to trace the request back to the client.
- An exit node that will proxy the connection to the desired service on the Internet.
The big issue in this scenario is the exit node. After the connection leaves the Tor network, Tor's encryption no longer protects it, so anyone who runs an exit node can read unencrypted traffic (such as plain HTTP) and try to use it to identify users of particular services.
An Onion Service gets around this issue by removing the need for an exit node. It's a Tor node that runs alongside the service, making it directly accessible on the Tor network. The service provider terminates the Tor connection on the same machine, or at least the same local network, that hosts the service.
Tor Network Architecture with Onion Service
Why Run an Arweave Gateway as Onion Service?
First, it makes the data on Arweave accessible for everyone, even people in countries where governments block access to Arweave or track who is accessing it for persecution purposes.
Second, converting a service to an Onion Service means putting it into the Dark Web. User identities are unknown to the service, and the server's location and its owner's identity are also unknown to the users. Since hosting an Arweave gateway might be an issue in some countries—it can potentially distribute problematic information—running it as an Onion Service allows a gateway operator to stay safe.
Also, an Onion Service comes with quality-of-life improvements for gateway operators like encryption and NAT punching; you no longer need SSL certificates and can run a gateway on a local network behind a NAT.
Implementation
Now that you understand what this is all about, let's get going!
The tasks that await us are:
- Adding a Tor server to the AR.IO node's Docker Compose cluster
- Optional: Improving security by filtering headers via Envoy
- Optional: Improving UX by advertising the Tor address via headers and lowering hops in the Tor network
Creating the docker-compose.yaml File
To get started, create a docker-compose.yaml with the following content:
---
version: '3.0'
services:
  onion:
    image: fphammerle/onion-service
    ports:
      - 80:8000
    environment:
      VIRTUAL_PORT: '8000'
      TARGET: envoy:3000
    volumes:
      - type: volume
        target: /var/lib/tor
      - type: volume
        target: /onion-service
      - type: tmpfs
        target: /tmp
        tmpfs: {size: 4k}
    read_only: true
    cap_drop: [ALL]
    security_opt: [no-new-privileges]
  envoy:
    image: ghcr.io/ar-io/ar-io-envoy:latest
    build:
      context: envoy/
    expose:
      - 3000:3000
      - 9901:9901
    environment:
      - LOG_LEVEL=info
      - TVAL_AR_IO_HOST=core
      - TVAL_AR_IO_PORT=4000
      - TVAL_GATEWAY_HOST=${TRUSTED_GATEWAY_HOST:-arweave.net}
      - TVAL_GRAPHQL_HOST=${GRAPHQL_HOST:-core}
      - TVAL_GRAPHQL_PORT=${GRAPHQL_PORT:-4000}
      - TVAL_ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
  core:
    image: ghcr.io/ar-io/ar-io-core:latest
    expose:
      - 4000:4000
    volumes:
      - ${CHUNKS_DATA_PATH:-./data/chunks}:/app/data/chunks
      - ${CONTIGUOUS_DATA_PATH:-./data/contiguous}:/app/data/contiguous
      - ${HEADERS_DATA_PATH:-./data/headers}:/app/data/headers
      - ${SQLITE_DATA_PATH:-./data/sqlite}:/app/data/sqlite
      - ${TEMP_DATA_PATH:-./data/tmp}:/app/data/tmp
    environment:
      - NODE_ENV=${NODE_ENV:-production}
      - LOG_FORMAT=${LOG_FORMAT:-simple}
      - TRUSTED_NODE_URL=${TRUSTED_NODE_URL:-}
      - TRUSTED_GATEWAY_URL=https://${TRUSTED_GATEWAY_HOST:-arweave.net}
      - START_HEIGHT=${START_HEIGHT:-}
      - STOP_HEIGHT=${STOP_HEIGHT:-}
      - SKIP_CACHE=${SKIP_CACHE:-}
      - SIMULATED_REQUEST_FAILURE_RATE=${SIMULATED_REQUEST_FAILURE_RATE:-}
      - INSTANCE_ID=${INSTANCE_ID:-}
      - AR_IO_WALLET=${AR_IO_WALLET:-}
      - ADMIN_API_KEY=${ADMIN_API_KEY:-}
      - BACKFILL_BUNDLE_RECORDS=${BACKFILL_BUNDLE_RECORDS:-}
      - FILTER_CHANGE_REPROCESS=${FILTER_CHANGE_REPROCESS:-}
      - ANS104_UNBUNDLE_FILTER=${ANS104_UNBUNDLE_FILTER:-}
      - ANS104_INDEX_FILTER=${ANS104_INDEX_FILTER:-}
      - ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
      - SANDBOX_PROTOCOL=${SANDBOX_PROTOCOL:-}
This file is the Docker Compose configuration that ships with the AR.IO node, with a few changes:
- An added onion service handles all the Tor traffic and relays it to the envoy service, which, in turn, forwards it to the AR.IO node.
- The ports of the envoy and core services are now private to the cluster, so the gateway is only accessible as an Onion Service.
Running the Cluster
You can run the cluster immediately; no build step is required.
docker compose up
Testing the Gateway
If every container starts correctly, this command will get you the .onion address:
docker compose exec onion cat /onion-service/hostname
It should look like a random string with .onion at the end.
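As a quick sanity check, you can test the shape of the address before pasting it into a browser. This is a minimal sketch that assumes the service generates a current (version 3) onion address, which is 56 base32 characters (a-z and 2-7) followed by .onion. The function name is_v3_onion is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 looks like a version 3 onion address
# (56 base32 characters followed by ".onion"). It only checks the shape of
# the string, not whether the service is actually reachable.
is_v3_onion() {
  printf '%s\n' "$1" | grep -Eq '^[a-z2-7]{56}\.onion$'
}
```

When you feed it the output of the docker compose exec command above, consider adding the -T flag (docker compose exec -T onion ...), so no pseudo-terminal is allocated and no stray carriage return ends up in the address.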
You can use it with clients that can handle .onion addresses, like the Tor Browser and Brave.
Getting the network info:
http://<ONION_ADDRESS>/info
Getting the UDL:
http://<ONION_ADDRESS>/yRj4a5KMctX_uOmKWCFJIjmY8DeJcusVk6-HzLiM_t8
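If you prefer the command line over a browser, the same requests can be made with curl through a Tor SOCKS proxy. The sketch below assumes a Tor daemon is running locally and listening on 127.0.0.1:9050 (Tor Browser typically uses 9150 instead). TOR_SOCKS, onion_url, and onion_get are hypothetical names, not part of the AR.IO node.

```shell
#!/bin/sh
# Assumption: a local Tor SOCKS proxy; override TOR_SOCKS if yours differs.
TOR_SOCKS="${TOR_SOCKS:-127.0.0.1:9050}"

# Hypothetical helper: builds a gateway URL from an onion host ($1)
# and a path ($2), tolerating a leading slash on the path.
onion_url() {
  printf 'http://%s/%s\n' "$1" "${2#/}"
}

# Hypothetical helper: fetches the URL through Tor. --socks5-hostname lets
# the proxy resolve the name, which is required for .onion addresses.
onion_get() {
  curl --silent --show-error --socks5-hostname "$TOR_SOCKS" "$(onion_url "$1" "$2")"
}
```

With this in place, onion_get <ONION_ADDRESS> /info should return the same network info you see in the browser.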
Optional: Filtering Headers
While turning any service into an Onion Service is quite simple, actually staying anonymous depends on more factors. Check out the Tor docs on operational security before attempting to run this setup in production!
One of the steps you can take here is to filter out headers an attacker might use to identify your server or what software it runs.
Cloning the ar-io-node Repository
Since you must update the Envoy image to remove the headers, clone the ar-io-node repository from GitHub.
git clone https://github.com/ar-io/ar-io-node.git
Updating the Envoy Configuration
Then make the following changes to the ar-io-node/envoy/envoy.template.yaml file.
Add this code directly under the route_config: line:
response_headers_to_remove:
- server
- x-server-upstream-envoy
Note: Ensure correct indentation!
If you find any other problematic headers, add them to this list.
Replacing the docker-compose.yaml File
Replace the ar-io-node/docker-compose.yaml file with the one created above and run Docker Compose again, this time with the --build flag, so the Envoy Docker image contains your changes.
docker compose up --build
After this, the problematic headers are gone.
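To confirm a header is really gone, you can inspect the raw response headers. The helper below is a small sketch; has_header is a hypothetical name, and it simply looks for a header name at the start of a line in whatever header dump you pipe into it.

```shell
#!/bin/sh
# Hypothetical helper: reads raw response headers on stdin and succeeds
# if a header named $1 is present (case-insensitive match on the name).
has_header() {
  grep -qi "^$1:"
}
```

Assuming a local Tor SOCKS proxy on 127.0.0.1:9050, a check could look like curl -s -D - -o /dev/null --socks5-hostname 127.0.0.1:9050 http://<ONION_ADDRESS>/info | has_header server, which should fail once the filter is active.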
Optional: Using the Onion-Location Header and Lowering Hops
In many cases, it's fine that the gateway itself isn't anonymous, but you still want to give users the option to access it anonymously.
A gateway operator in Europe might not mind the authorities knowing that they run an Arweave gateway, but a person in China might not be allowed to access it.
If this is the case, you have some UX optimization potential!
- Expose Envoy publicly so the gateway works over HTTP again.
- Add an Onion-Location header via Envoy to advertise the .onion addresses for the resources the gateway exposes.
- Reduce latency by lowering the number of hops between users and the Onion Service. Usually, there are six hops, three from the client and three from the server, but Onion Services can use one hop from the server if only the users need to be anonymous.
Cloning the ar-io-node Repository
Clone the ar-io-node repository to make changes to the Envoy configuration.
Note: This step is unnecessary if you already did the optional header removal step.
git clone https://github.com/ar-io/ar-io-node.git
Updating the docker-compose.yaml File
As the onion service will write its .onion address on the filesystem, the onion and envoy services will share a volume.
The Envoy proxy will be public again since this scenario doesn't require hiding it.
Finally, the single-hop mode lowers the latency in the Tor network.
Replace the ar-io-node/docker-compose.yaml content with the following code:
---
version: "3.0"
volumes:
  onion-data:
services:
  onion:
    image: fphammerle/onion-service
    ports:
      - 80:8000
    environment:
      VIRTUAL_PORT: "8000"
      TARGET: envoy:3000
      NON_ANONYMOUS_SINGLE_HOP_MODE: 1
    volumes:
      - type: volume
        target: /var/lib/tor
      - type: volume
        source: onion-data
        target: /onion-service
      - type: tmpfs
        target: /tmp
        tmpfs: { size: 4k }
    read_only: true
    cap_drop: [ALL]
    security_opt: [no-new-privileges]
  envoy:
    image: ghcr.io/ar-io/ar-io-envoy:latest
    build:
      context: envoy/
    ports:
      - 3000:3000
      - 9901:9901
    environment:
      - LOG_LEVEL=info
      - TVAL_AR_IO_HOST=core
      - TVAL_AR_IO_PORT=4000
      - TVAL_GATEWAY_HOST=${TRUSTED_GATEWAY_HOST:-arweave.net}
      - TVAL_GRAPHQL_HOST=${GRAPHQL_HOST:-core}
      - TVAL_GRAPHQL_PORT=${GRAPHQL_PORT:-4000}
      - TVAL_ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
    volumes:
      - type: volume
        source: onion-data
        target: /onion-service
  core:
    image: ghcr.io/ar-io/ar-io-core:latest
    build:
      context: .
    expose:
      - 4000:4000
    volumes:
      - ${CHUNKS_DATA_PATH:-./data/chunks}:/app/data/chunks
      - ${CONTIGUOUS_DATA_PATH:-./data/contiguous}:/app/data/contiguous
      - ${HEADERS_DATA_PATH:-./data/headers}:/app/data/headers
      - ${SQLITE_DATA_PATH:-./data/sqlite}:/app/data/sqlite
      - ${TEMP_DATA_PATH:-./data/tmp}:/app/data/tmp
    environment:
      - NODE_ENV=${NODE_ENV:-production}
      - LOG_FORMAT=${LOG_FORMAT:-simple}
      - TRUSTED_NODE_URL=${TRUSTED_NODE_URL:-}
      - TRUSTED_GATEWAY_URL=https://${TRUSTED_GATEWAY_HOST:-arweave.net}
      - START_HEIGHT=${START_HEIGHT:-}
      - STOP_HEIGHT=${STOP_HEIGHT:-}
      - SKIP_CACHE=${SKIP_CACHE:-}
      - SIMULATED_REQUEST_FAILURE_RATE=${SIMULATED_REQUEST_FAILURE_RATE:-}
      - INSTANCE_ID=${INSTANCE_ID:-}
      - AR_IO_WALLET=${AR_IO_WALLET:-}
      - ADMIN_API_KEY=${ADMIN_API_KEY:-}
      - BACKFILL_BUNDLE_RECORDS=${BACKFILL_BUNDLE_RECORDS:-}
      - FILTER_CHANGE_REPROCESS=${FILTER_CHANGE_REPROCESS:-}
      - ANS104_UNBUNDLE_FILTER=${ANS104_UNBUNDLE_FILTER:-}
      - ANS104_INDEX_FILTER=${ANS104_INDEX_FILTER:-}
      - ARNS_ROOT_HOST=${ARNS_ROOT_HOST:-}
      - SANDBOX_PROTOCOL=${SANDBOX_PROTOCOL:-}
Updating Envoy's docker-entrypoint.sh File
You must put the .onion address into an environment variable to use in the Envoy configuration.
Add the following line into the ar-io-node/envoy/docker-entrypoint.sh file, just below the # update env vars comment.
export TVAL_ONION_HOST=$(cat /onion-service/hostname)
The ytt line that follows will gather all environment variables starting with TVAL_ and use them to generate the envoy.yaml. Since you mounted the volume the onion service uses to save its address, the envoy service can read it.
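One thing the steps above don't cover is startup order. If the envoy container happens to start before the onion service has written its hostname file, the cat call would fail. Assuming that can occur in your setup, a small wait loop is one way to guard against it. This is only a sketch; wait_for_file is a hypothetical helper, and the 30-second default is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: waits until file $1 exists and is non-empty,
# polling once per second for at most $2 seconds (default 30).
# Returns success if the file is ready, failure if the wait timed out.
wait_for_file() {
  file=$1
  tries=${2:-30}
  while [ ! -s "$file" ] && [ "$tries" -gt 0 ]; do
    sleep 1
    tries=$((tries - 1))
  done
  [ -s "$file" ]
}
```

In docker-entrypoint.sh you could then call wait_for_file /onion-service/hostname before the export line and abort if it fails.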
Updating the Envoy Configuration
Now that you have the .onion address, you must put it in the right header.
Add this code to the ar-io-node/envoy/envoy.template.yaml file directly below the route_config: line.
response_headers_to_add:
  - header:
      key: "Onion-Location"
      value: #@ "http://" + data.values.ONION_HOST + "%REQ(:path)%"
Note: Check the indentation!
A few things will now happen every time the envoy service starts:
- It mounts the onion-data volume shared with the onion service.
- It reads the .onion address from the onion-data volume and puts it in the TVAL_ONION_HOST environment variable.
- It replaces the #@ "http://" + data.values.ONION_HOST + "%REQ(:path)%" with "http://<ONION_HOST>%REQ(:path)%", where <ONION_HOST> is the content of the TVAL_ONION_HOST variable.
Then, on every request the Envoy proxy handles:
- It will replace %REQ(:path)% with the path that was requested.
- It will append that path to the .onion address.
- It will add the new URL as an Onion-Location header to the response.
If you request https://example.com/info, the header could look like this:
Onion-Location: http://32r2f29g9gc9gd3rtap1e10qj0d38h4f.onion/info
This header allows Tor-compatible browsers to display a button that opens the gateway via Tor.
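The substitution Envoy performs can be mimicked in a few lines of shell, which may help when you want to predict the header for a given path. This is an illustration only, not Envoy's implementation; onion_location is a hypothetical function name, and the address is the example one from above.

```shell
#!/bin/sh
# Hypothetical helper that mirrors the template: it joins the onion host ($1)
# and the request path ($2) into an Onion-Location header line.
onion_location() {
  printf 'Onion-Location: http://%s%s\n' "$1" "$2"
}

# Prints the header for the example request to /info.
onion_location 32r2f29g9gc9gd3rtap1e10qj0d38h4f.onion /info
```

To see the real header, you could dump the response headers of the publicly exposed Envoy port, for example with curl -s -D - -o /dev/null http://localhost:3000/info, and look for the Onion-Location line.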
Summary
Setting up an Arweave gateway as an Onion Service is surprisingly easy, especially if you're building on Docker.
Getting the address of the Onion Service into your own service is more work, especially if it runs in a Docker container you can't build yourself.
To review, we now have three ways to set up an Arweave gateway, each with its pros and cons.
Three Different AR.IO/Onion Setups
- The default AR.IO setup has the lowest latency.
- The pure Onion Service setup gives the best privacy for clients and servers but also has quite high latency because of six additional indirections in the Tor network.
- The hybrid setup only protects the clients, but has lower latency than the pure setup, since only four indirections happen in Tor.