Decentralized Databases: WeaveDB

K - Jun 21 '23 - Dev Community

This is the second part of my series on decentralized databases, where I check out whether Web3 has more to offer than slow and expensive blockchains.

In the last article, I looked into ComposeDB, the IPFS-powered graph database that runs on the Ceramic Network; this time, I chose WeaveDB. So, if you're into censorship-resistant infrastructure and want to know where to store your data, read on!

What is WeaveDB?

WeaveDB is a NoSQL database similar to Firestore. Its main data structure is nestable JSON documents, which are stored in collections but can also hold sub-collections of their own, making them very flexible.

WeaveDB is a smart contract database, meaning the whole database code is implemented as a smart contract on Arweave.

Permissions are handled via blockchain network addresses. For database administration, supported network addresses are EVM, Dfinity, Intmax, and Arweave. Regular users can also use Lens Protocol and WebAuthn.

The database owner adds their key and becomes the database admin at creation time. After creating the database, the owner can add other keys or define access controls for collections based on keys in documents.

Excursion: Arweave Basics

For people who don't know Arweave, here is a quick intro.

What is Arweave?

Arweave is a bit like a blockchain; the difference is that it's optimized for permanent storage. It uses a data structure called blockweave, which offers more and cheaper storage than a blockchain. The "miners" in the network ask each other about random data in the blockweave and only get rewards (mine tokens) when they can prove they stored the data. This incentivizes miners to hold as much of the blockweave as possible.

The smart contract specification used on Arweave is called SmartWeave. Unlike Ethereum's smart contracts, SmartWeaves aren't executed by the miners/nodes but by the clients (i.e., browsers, Node.js).

What is SmartWeave?

A SmartWeave contract is just JavaScript or WebAssembly and an initial state JSON object stored as transactions (TXs) on Arweave. The actual contract is a pure reduce/fold function that takes the initial state and applies inputs to it to calculate the current state. The inputs are also stored as TXs on Arweave. A client loads all TXs and calculates the latest state locally before it can send a new input TX.

Here is an illustration in pseudo-code:



// code in contract TX
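function handler(state, { input }) {
  // return a new state object instead of mutating the old one
  if (input.function === "inc") return { value: state.value + 1 }
  if (input.function === "dec") return { value: state.value - 1 }
  return state // ignore unknown inputs
}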

// code in initial state TX
{"value": 0}

// code in action TXs
{"input": {"function": "inc"}}
{"input": {"function": "dec"}}
{"input": {"function": "inc"}}

// client code
const contractCode = await getContract("<CONTRACT_TX_ID>")
eval(contractCode) // creates a handler function

const initialState = await getState("<INITIAL_STATE_TX_ID>")
let state = JSON.parse(initialState)

const actions = await getActions("<CONTRACT_TX_ID>")

state = actions.reduce(handler, state) // state is now {value: 1}



This architecture eliminates the need for arbitrary computations from the network nodes and allows each client to scale up performance independently.

You can see SmartWeave contracts as CQRS systems, where Arweave is the event store and the state is the projection.

How does Consensus Work?

In the pseudo-code example, everyone can add new inputs to the contract; the only validity criterion is the correct format.

If you add your blockchain address or public key to the initial state, the contract can check that every input was signed with your private key before applying it.

An attacker can still add invalid inputs to Arweave, but your contract would filter them out at evaluation time.
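To make this filtering more concrete, here is a minimal sketch that extends the counter from above. It assumes the runtime hands the handler a `caller` field derived from the TX signature; the `owner` field and the names `alice` and `mallory` are made up for illustration, and a real contract would work with actual addresses.

```javascript
// Hypothetical sketch: the initial state names an owner, and the handler
// leaves the state untouched for every input signed by someone else.
function handler(state, { input, caller }) {
  if (caller !== state.owner) return state // invalid input: filtered out
  if (input.function === "inc") return { ...state, value: state.value + 1 }
  if (input.function === "dec") return { ...state, value: state.value - 1 }
  return state
}

const initialState = { owner: "alice", value: 0 }

const actions = [
  { caller: "alice", input: { function: "inc" } },
  { caller: "mallory", input: { function: "inc" } }, // stored, but ignored
  { caller: "alice", input: { function: "inc" } },
]

const finalState = actions.reduce(handler, initialState)
console.log(finalState) // { owner: 'alice', value: 2 }
```

The attacker's input is still part of the TX list, but every client that replays the list ends up with the same state, as if that input never existed.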

Which Implementations of SmartWeave Exist?

While writing this article, I found two implementations of the SmartWeave specification.

The first is the reference implementation by the Arweave team. It's rather basic and works directly with the Arweave network without additional services, but it's also very slow.

The second is Warp Contracts, which comes with many quality-of-life improvements over the reference implementation. It uses services hosted by the company that maintains Warp Contracts and other third parties (like Bundlr and Streamr) but is quite performant.

What's a Smart Contract Database?

Since gas prices are no concern for SmartWeave contracts and you can write your contract in JavaScript or WebAssembly, this platform is much more flexible than other smart contract specifications.

People are creating all kinds of software with the SmartWeave spec, for example, databases that have their code and initial state stored on Arweave and are executed on the client. Every time a client writes data, a TX is sent to Arweave so other clients can read that TX and update their state.
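As a rough illustration of the idea, the following sketch models a tiny document store as a pure reducer. It is not how WeaveDB is implemented; the `op`, `collection`, `id`, and `data` fields are names I made up for this example.

```javascript
// Hypothetical document store as a pure reducer: every write is an input,
// and the current state is the result of replaying all inputs in order.
function dbHandler(state, { input }) {
  const { op, collection, id, data } = input
  const docs = { ...(state[collection] || {}) }
  if (op === "set") docs[id] = data
  else if (op === "delete") delete docs[id]
  else return state // unknown operation: ignore
  return { ...state, [collection]: docs }
}

// These objects stand in for the TXs a client would load from Arweave.
const writes = [
  { input: { op: "set", collection: "notes", id: "n1", data: { text: "hello" } } },
  { input: { op: "set", collection: "notes", id: "n2", data: { text: "world" } } },
  { input: { op: "delete", collection: "notes", id: "n1" } },
]

const dbState = writes.reduce(dbHandler, {})
console.log(dbState) // { notes: { n2: { text: 'world' } } }
```

Any client that replays the same list of writes ends up with the same database state, which is what makes the replicas agree without a central server.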

Since it stores the data on Arweave, everyone who wants to write to the database has to pay for the Arweave TXs. The payer can be either the database operator (as in Web2) or the user.

Note: Services hosted by Bundlr or Redstone (the company behind Warp Contracts) are currently subsidizing TXs. This means small TXs are free to write.

If the state grows too large for a client, or you want to subsidize TXs payments for your users, you can put a server between Arweave and your client. Since the data is stored permanently and publicly, everyone can verify that the data is valid.

An intriguing concept, right? So, let's use WeaveDB to build a blog!

Architecture

The architecture of a WeaveDB-powered backend would look something like this:

WeaveDB Architecture

While the code and the data are on Arweave, using it directly would be slow; that's why the WeaveDB SDK uses the Warp SDK to interact with the Warp Network.

The Warp Sequencer ensures the interactions are ordered and sends them to the Bundlr Network, which, in turn, persists them on Arweave.

The Warp Gateway will index the interactions on Arweave and the ones not yet finalized (and still only on the Bundlr Network) for quick access.

The Warp Evaluator will gather the indexed interactions and send them to a decentralized network of nodes that calculate the state. This relieves the browser from doing all these calculations on the user's device.

The replicas run either directly in the browser or on Node.js. WeaveDB first writes changes locally and then sends them as TXs to Arweave.

On Node.js, the WeaveDB SDK will automatically subscribe to Warp's notification service, so each local replica will get notified when other replicas add TXs to Arweave that need to be replayed locally.

If WeaveDB runs in the browser, you must poll for updates by other means, either periodically or when a user navigates between pages.

To sum it up, the whole architecture is based on third-party services. The interactions are stored on Arweave, and Warp will calculate the current state when a browser loads the application.

Prerequisites

Functional Steps

We will start with the functional steps because WeaveDB comes with a web console that allows us to create a database with a few clicks.

Creating a New Database

Open the WeaveDB Console in your browser and click on the "Deploy WeaveDB" button.

Database Creation Dialog

Click the "Connect Owner Wallet" button and select an account/address.

Note: Ensure you use a development account with access to the private key since we need to copy it to the Node.js code that will connect to the database later.

Deploying can take a bit, but once it's done, the new database appears in the WeaveDB instances list.

WeaveDB Instances

After that, you can click the "Connect with DB" button at the top right corner and choose the same address again. Now, we can model the data.

Modeling the Data

First, we create collections for profiles, articles, and comments.

To create a collection, you have to click on the "Data Collections" tab in the navigation on the left and then click the small + symbol beside the "Collections" header of the table.

Create Collection

Create three collections, each with one of these collection IDs:

  1. profiles
  2. articles
  3. comments

By default, everyone who can sign a transaction can write documents to these collections. The default access rule looks like this:



{
  "allow write": true
}



Replace it for every collection with the following access rules:



{
  "allow create": {
    "and": [
      { "!=": [{ "var": "request.auth.signer" }, null] },
      {
        "==": [
          { "var": "request.auth.signer" },
          { "var": "resource.newData.user_address" }
        ]
      }
    ]
  },
  "allow update": {
    "==": [
      { "var": "request.auth.signer" },
      { "var": "resource.data.user_address" }
    ]
  },
  "allow delete": {
    "==": [
      { "var": "request.auth.signer" },
      { "var": "resource.data.user_address" }
    ]
  }
}



First, we require everyone who wants to create a document to sign with their blockchain address, so anonymous users have no write access.

Also, users can only create documents that contain their own address in the user_address field. The allow update and allow delete rules use the user_address field to check if a user has permission to modify that document.

  • resource.data contains the document before modification
    • Used when updating or deleting a document to ensure users only modify their own documents.
  • resource.newData contains the document after modification
    • Used when creating a document to ensure the user supplied their address.
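If the JSON notation of the rules is hard to read, the following plain JavaScript mirrors what they check. This is only my reading of the rules for illustration; WeaveDB evaluates the JSON rules itself, and `isAllowed` and the example addresses are hypothetical.

```javascript
// Illustration only: a plain-JavaScript mirror of the access rules above.
// signer  -> request.auth.signer
// data    -> resource.data (document before the modification)
// newData -> resource.newData (document after the modification)
function isAllowed(op, signer, data, newData) {
  if (op === "create") {
    // must be signed, and the signer must match the address in the new document
    return signer != null && signer === newData.user_address
  }
  if (op === "update" || op === "delete") {
    // the signer must match the address stored in the existing document
    return signer === data.user_address
  }
  return false
}

const alice = "0xaaa"
const bob = "0xbbb"
const doc = { user_address: alice, name: "Alice" }

console.log(isAllowed("create", alice, null, doc)) // true
console.log(isAllowed("create", null, null, doc)) // false: anonymous
console.log(isAllowed("create", bob, null, doc)) // false: wrong address
console.log(isAllowed("update", bob, doc, { ...doc, name: "Bob" })) // false
console.log(isAllowed("delete", alice, doc, null)) // true
```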

In the case of the profiles collection, it should look like this:

Profiles collection creation dialog

Repeat this process for the articles and comments collections.

After we set up collections and their access rules, let's give each collection a schema that makes more sense than the default.

  • Click on the "Schema" tab on the navigation on the left
  • Choose the profiles collection in the table
  • Click the edit symbol on the top right corner of the table

The default schema for a collection looks like this:



{
  "type": "object",
  "required": [],
  "properties": {}
}



This means the collection accepts any document, regardless of its shape.

To prevent users from adding documents that our frontend couldn't understand, update the schema for the profiles collection to the following JSON:



{
  "type": "object",
  "required": ["profile_id", "user_address", "name"],
  "properties": {
    "profile_id": {
      "type": "string"
    },
    "user_address": {
      "type": "string"
    },
    "name": {
      "type": "string"
    },
    "bio": {
      "type": "string"
    }
  }
}


  • Click on the articles collection
  • Click the edit symbol on the top right corner of the table

Update the articles schema to the following JSON:



{
  "type": "object",
  "required": ["article_id", "user_address", "title", "content"],
  "properties": {
    "article_id": {
      "type": "string"
    },
    "user_address": {
      "type": "string"
    },
    "title": {
      "type": "string"
    },
    "content": {
      "type": "string"
    }
  }
}


  • Click on the comments collection
  • Click the edit symbol on the top right corner of the table

Update the comments schema to the following JSON:



{
  "type": "object",
  "required": ["comment_id", "article_id", "user_address", "content", "date"],
  "properties": {
    "comment_id": {
      "type": "string"
    },
    "article_id": {
      "type": "string"
    },
    "user_address": {
      "type": "string"
    },
    "content": {
      "type": "string"
    },
    "date": {
      "type": "number"
    }
  }
}



The schemas are very simple and use IDs for relations; another way of modeling this would be to embed the comments directly into the articles, so we can load them in one go.
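To get a feeling for what these schemas reject, here is a simplified check written by hand. It only covers the required list and the primitive type of each property, so treat it as a sketch and not as a full JSON Schema validator; `checkDocument` is a hypothetical helper, not part of WeaveDB.

```javascript
// Simplified check covering only "required" and primitive "type" keywords.
// It relies on "string" and "number" matching JavaScript's typeof results.
function checkDocument(schema, doc) {
  const errors = []
  for (const field of schema.required || []) {
    if (!(field in doc)) errors.push(`missing required field: ${field}`)
  }
  for (const [field, rule] of Object.entries(schema.properties || {})) {
    if (field in doc && typeof doc[field] !== rule.type) {
      errors.push(`${field} should be of type ${rule.type}`)
    }
  }
  return errors
}

const commentSchema = {
  type: "object",
  required: ["comment_id", "article_id", "user_address", "content", "date"],
  properties: {
    comment_id: { type: "string" },
    article_id: { type: "string" },
    user_address: { type: "string" },
    content: { type: "string" },
    date: { type: "number" },
  },
}

// user_address is missing and date is a string instead of a number
const badComment = { comment_id: "c1", article_id: "a1", content: "Hi", date: "today" }
console.log(checkDocument(commentSchema, badComment))
```

The call reports two problems: the missing user_address field and the wrong type of the date field.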

Implementation

Now that we've set up our database with schema and access controls, we can implement queries that read and write documents.

Creating a Node.js Project

First, we have to create a new Node.js project.

$ mkdir weavedb-blog
$ cd weavedb-blog 
$ npm init

Installing Dependencies

Next, we need the WeaveDB SDK to connect to our database. To install it, use the following command:

$ npm i weavedb-sdk weavedb-sdk-node

Setting up the Database Connection

To connect to WeaveDB, create an index.js file with this code:



const WeaveDB = require("weavedb-sdk-node")

const contractTxId = "<DATABASE_TX_ID>"
const wallet = {
  getAddressString: () => "<ADMIN_ADDRESS>".toLowerCase(),
  getPrivateKey: () => Buffer.from("<ADMIN_PRIVATE_KEY>", "hex"),
}

async function main() {
  const db = new WeaveDB({ contractTxId })
  await db.initializeWithoutWallet()
  db.setDefaultWallet(wallet, "evm")

  process.exit(0)
}

main()



First, we need to replace the <DATABASE_TX_ID>, which is available in the WeaveDB console.

Select your database under "WeaveDB Instances," and then you can copy the TX ID from the "Settings" on the right.

The <ADMIN_ADDRESS> and the <ADMIN_PRIVATE_KEY> are the ones you used to create the database.

The address needs to be lower case, and the private key needs to be a buffer, but the code accommodates that.
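If you copy the private key from a wallet export, it might come with a 0x prefix, which wouldn't decode as hex. A small helper can normalize both values before they go into the wallet object. Note that `makeWallet` is a hypothetical helper of mine, and the address and key below are made-up placeholders.

```javascript
// Hypothetical helper that builds the wallet object used above.
function makeWallet(address, privateKeyHex) {
  // strip an optional "0x" prefix so the remaining string is pure hex
  const hex = privateKeyHex.startsWith("0x") ? privateKeyHex.slice(2) : privateKeyHex
  return {
    getAddressString: () => address.toLowerCase(),
    getPrivateKey: () => Buffer.from(hex, "hex"),
  }
}

// Placeholder values, not a real account.
const demoWallet = makeWallet(
  "0xABCDEF0123456789abcdef0123456789ABCDEF01",
  "0x" + "11".repeat(32)
)

console.log(demoWallet.getAddressString()) // 0xabcdef0123456789abcdef0123456789abcdef01
console.log(demoWallet.getPrivateKey().length) // 32
```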

We have to call initializeWithoutWallet because we don't use an Arweave wallet; if we had one, we could pass it directly to the constructor. Calling setDefaultWallet allows us to use the EVM address and private key as the default identity for our write operations.

After that, we can read from and write to the database. Remember that the database will load all TXs and replicate the current state locally inside the Node.js process.

Creating a Profile

We need to insert a document into the profiles collection to create a profile.

The collection's schema requires every profile to have a profile_id, a user_address, and a name. It also allows an optional field, bio, to describe the user.

To write a document with a custom ID, we have to use the set method and pass it an object with our data, the collection name, and our desired ID for the document. This operation signs with the default wallet we set when creating the database connection. If you want to sign with a different wallet, you can pass it after the ID argument.



const data = {
  profile_id: "p1",
  name: "K",
  bio: "Web enthusiast.",
  user_address: wallet.getAddressString(),
}

const tx = await db.set(
  data,
  "profiles",
  data.profile_id
)



There can't be a profile without a user_address, and it has to be the same one that has signed the related TX, so we take it right from the wallet object.

WeaveDB allows us to create documents without IDs inside the object data, but it's pretty handy to have it inside the document, so we added it manually.

While the call to set is asynchronous, the data isn't necessarily persisted on Arweave when it resolves. At that point, the data is available in the local replica of WeaveDB and is enqueued to be persisted as a TX later. Persisting can take a few seconds, so if the operation isn't time critical, you can keep working locally with the optimistic update.

To wait until the data is really on Arweave and available for all other replicas, we can call the getResult method of the tx object.



const result = await tx.getResult()



Note: The tx object won't have a getResult method if the set method fails because of missing permissions or required fields in the data object. So, ensure you catch any errors before calling it!
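One way to guard against this is a small wrapper that checks for the method before calling it. The sketch below assumes db.set behaves as described in this article; `writeAndConfirm` is a hypothetical helper, and the two db objects are stand-ins so the example runs without a real database.

```javascript
// Hypothetical wrapper: write a document and wait for it to be persisted,
// but only if the write was accepted in the first place.
async function writeAndConfirm(db, data, collection, id) {
  let tx
  try {
    tx = await db.set(data, collection, id)
  } catch (error) {
    return { ok: false, error }
  }
  if (!tx || typeof tx.getResult !== "function") {
    return { ok: false, error: new Error("write rejected: tx has no getResult") }
  }
  return { ok: true, result: await tx.getResult() }
}

// Stand-in objects, not the real WeaveDB SDK.
const acceptingDb = { set: async () => ({ getResult: async () => "persisted" }) }
const rejectingDb = { set: async () => ({ success: false }) }

writeAndConfirm(acceptingDb, { name: "K" }, "profiles", "p1").then((r) => console.log(r.ok)) // true
writeAndConfirm(rejectingDb, {}, "profiles", "p2").then((r) => console.log(r.ok)) // false
```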

Creating an Article

An article needs an article_id, user_address, title, and content.

In this case, the user_address will facilitate the link between a user's profile and their articles. Yet, we could also use a different foreign key or remove relations completely and add the article as a nested document to the profile. Stand-alone article collections just make interacting with articles independently from profiles easier. For example, when you want to list, sort, and filter all articles on the home page later.



const data = {
  article_id: "a1",
  title: "What is WeaveDB?",
  content: "WeaveDB is a decentralized database built with SmartWeave contracts on top of Arweave.",
  user_address: wallet.getAddressString(),
}

const tx = await db.set(
  data,
  "articles",
  data.article_id
)



Creating a Comment

To comment on an article, we need an article ID to know which article we're commenting on.

Our comment schema also includes a date field, which holds a timestamp of when the comment was posted.
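In a real application, you probably wouldn't hard-code that timestamp. A small factory function could fill it in with Date.now(), which returns milliseconds as a number and therefore fits the schema. `buildComment` is a hypothetical helper for this sketch.

```javascript
// Hypothetical factory that creates a comment document matching the schema.
// The second parameter makes the timestamp overridable, e.g. for tests.
function buildComment({ commentId, articleId, userAddress, content }, now = Date.now()) {
  return {
    comment_id: commentId,
    article_id: articleId,
    user_address: userAddress.toLowerCase(),
    date: now, // milliseconds since the Unix epoch
    content,
  }
}

const newComment = buildComment({
  commentId: "c2",
  articleId: "a1",
  userAddress: "0xABC",
  content: "Nice read!",
})

console.log(typeof newComment.date) // number
console.log(newComment.user_address) // 0xabc
```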



const data = {
  comment_id: "c1",
  article_id: "a1",
  user_address: wallet.getAddressString(),
  date: 1687337848805,
  content: "WeaveDB seems interesting. Thanks for the article, I'll check it out!",
}

const tx = await db.set(
  data,
  "comments",
  data.comment_id
)



Reading Data

Now that we created a few documents, we can read them with the get command. It takes a collection and an optional ID. If no ID is passed, it will return an array of documents.

So, we could use the following code to create a profile list:



const profiles = await db.get("profiles")

const html = `
<ul>
  ${profiles
    .map(
      ({ profile_id, name }) => `
    <li>
      <a href="/profiles/${profile_id}">${name}</a>
    </li>`
    )
    .join("")}
</ul>`



To display the profile details with the articles of a user on a separate page, we can use this code:



const profile = await db.get("profiles", profileId)
const articles = await db.get(
  "articles",
  ["user_address"],
  ["user_address", "==", profile.user_address]
)

const html = `
<h1>Profile of ${profile.name}</h1>
<h2>Bio</h2>
<p>${profile.bio}</p>
<h2>Articles</h2>
<ul>
  ${articles
    .map(
      ({ article_id, title }) => `
    <li>
      <a href="/articles/${article_id}">${title}</a>
    </li>`
    )
    .join("")}
</ul>`



Finally, a page that displays an article and all its comments.



const article = await db.get("articles", articleId);
const comments = await db.get(
  "comments",
  ["article_id"],
  ["article_id", "==", article.article_id]
);

const html = `
<h1>${article.title}</h1>
<p>${article.content}</p>
<ul>
  ${comments
    .map(
      ({ comment_id, content, user_address, date }) => `
      <li>
        "${content}" 
        - by ${user_address} 
        at ${new Date(date).toLocaleString()}
      </li>`
    )
    .join("")}
</ul>`
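
Since everyone with an address can write to these collections, the documents contain untrusted text. If you render them into HTML as in the examples above, it's probably a good idea to escape them first. The following is a minimal sketch; `escapeHtml` and `renderComments` are hypothetical helpers and not part of WeaveDB.

```javascript
// Replace the characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
}

// Render a comment list with every user-supplied value escaped.
function renderComments(comments) {
  const items = comments
    .map(
      ({ content, user_address, date }) =>
        `<li>"${escapeHtml(content)}" - by ${escapeHtml(user_address)} at ${new Date(date).toISOString()}</li>`
    )
    .join("")
  return `<ul>${items}</ul>`
}

console.log(renderComments([{ content: "<b>Hi</b>", user_address: "0xabc", date: 0 }]))
// <ul><li>"&lt;b&gt;Hi&lt;/b&gt;" - by 0xabc at 1970-01-01T00:00:00.000Z</li></ul>
```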



Deleting Comments

To remove data from WeaveDB, we can use the delete command.

The following code illustrates how to delete a comment:



await db.delete("comments", commentId)



Note: The delete only removes a document from the state; it doesn't delete the TX that created the document. As mentioned, SmartWeave is a CQRS system, where the state is the projection and the TXs on Arweave are the events. If someone calls the get command with a block height in the past, they could still get the document, because it hadn't been deleted at that point. Also, they can browse all TXs of the database on Arweave directly to get the data.
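The following sketch shows why a delete can't erase history in such a system. It replays a made-up event log up to a given position, which stands in for a block height here; the `replay` function and the event format are assumptions for illustration and don't reflect WeaveDB's internals.

```javascript
// Replay the first `upTo` events of a log into a document map.
function replay(events, upTo = events.length) {
  return events.slice(0, upTo).reduce((docs, { op, id, data }) => {
    const next = { ...docs }
    if (op === "set") next[id] = data
    if (op === "delete") delete next[id]
    return next
  }, {})
}

// The delete is just another event appended after the set.
const eventLog = [
  { op: "set", id: "c1", data: { content: "first comment" } },
  { op: "delete", id: "c1" },
]

console.log(replay(eventLog)) // {}
console.log(replay(eventLog, 1)) // { c1: { content: 'first comment' } }
```

So, don't write anything to the database that you might need to remove for good later.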

Summary

WeaveDB is an exciting choice for a decentralized database. Arweave guarantees all data is permanently stored, but the state machine allows the local data to grow and shrink as needed.

It comes with powerful data modeling, querying, access control features, integration for popular non-Arweave wallets, and much more that this introduction article didn't cover.

In its newest version, it even integrates with Lit and Lens Protocol, which allows you to leverage document encryption and user profiles.

Note: WeaveDB is still alpha software, so don't use it for your production data!
