Update 3/24/24

This post is out-of-date. I would not recommend this method and my package is no longer maintained. I would use the Fan-out method depending on the use case.
https://code.build/p/building-a-scalable-follower-feed-with-firestore-wCeklv

Original Post

Scalable Arrays Now Possible

In order to get to the scalable follower feed that I want to make possible in Firestore, I first need scalable arrays. I added this feature today (9/4/21) to my adv-firestore-functions npm package. Now you can have a scalable array on any document in any collection for searchable purposes.

To install the package in your firebase functions, go to your functions directory and run npm i adv-firestore-functions.

Products Purchased by a User

So, let's say you have purchased many items from a company. If you're like me, you hate Amazon for their monopolistic practices, but still buy stuff from em all the time :(

Let's say I have bought more than 10,000 items. I cannot store 10,000 items in an array (I don't know the exact limit because it depends on how big your document is, but the limit is close). How can I model this so that I can find:

All products I have purchased

This is easy. Either use a sub-collection with:

users/{userID}/products/ID --> {
  productID: '12jsk3',
  userID: '23929',
  createdAt: timestamp,
  updatedAt: timestamp
}

or a compound document ID on a root collection:

products/productID__userID --> {
  productID: '12jsk3',
  userID: '23929',
  createdAt: timestamp,
  updatedAt: timestamp
}

Note: The createdAt / updatedAt can be good for sorting in certain circumstances.
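As a rough illustration of the compound document ID (productID__userID) on the root collection, here is a minimal sketch of building and parsing such an ID. The helper names purchaseDocId and parsePurchaseDocId are hypothetical and not part of adv-firestore-functions, and the sketch assumes neither ID contains the "__" separator.

```typescript
// Hypothetical helpers, not part of adv-firestore-functions.
// Assumption: neither productID nor userID contains the separator.
const SEPARATOR = '__';

function purchaseDocId(productID: string, userID: string): string {
  if (productID.indexOf(SEPARATOR) !== -1 || userID.indexOf(SEPARATOR) !== -1) {
    throw new Error('IDs must not contain the separator ' + SEPARATOR);
  }
  return productID + SEPARATOR + userID;
}

function parsePurchaseDocId(docId: string): { productID: string; userID: string } {
  const parts = docId.split(SEPARATOR);
  const productID = parts[0];
  const userID = parts[1];
  // Reject anything that is not exactly "<productID>__<userID>"
  if (parts.length !== 2 || !productID || !userID) {
    throw new Error('Not a compound ID: ' + docId);
  }
  return { productID, userID };
}
```

Because the ID is deterministic, writing the same purchase twice would overwrite one document instead of creating a duplicate.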

Get all users who purchased a product (productID)

// Sub-collection model: searches every users/{userID}/products sub-collection
db.collectionGroup('products')
.where('productID', '==', productID);

// Root collection model
db.collection('products')
.where('productID', '==', productID);

But, you still have to grab the user document on the frontend with rxjs (we love rxjs, but hate using it):

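// At the top of the file:
import { combineLatest, Observable } from 'rxjs';
import { map, switchMap } from 'rxjs/operators';

this.afs.collection('products',
    ref => ref.where('productID', '==', productID)
  ).valueChanges({ idField: 'id' }).pipe(
    switchMap((r: any[]) => {
      // One user-document stream per purchase document
      const docs = r.map(
        (d: any) => this.afs.doc(`users/${d.userID}`).valueChanges()
      ) as Observable<any>[];
      // Note: combineLatest([]) completes without emitting when r is empty
      return combineLatest(docs).pipe(
        // Attach each user doc to the purchase at the same position
        map((users: any[]) =>
          users.map((user: any, i: number) => ({ ...r[i], user }))
        )
      );
    }),
  );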

But then we still have the docID that is not joined! We could technically do it by using more combineLatest calls (which is why we saved createdAt on the compound document), but that is a mess!
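To make the join step concrete without the rxjs plumbing, here is a minimal sketch of attaching user docs to purchase docs once both lists have been fetched. It joins by userID rather than by array position, which is a swapped-in variation on the snippet above, and the Purchase and UserDoc shapes and the joinUsers name are assumptions for illustration.

```typescript
// Hypothetical shapes for illustration only.
interface Purchase { id: string; productID: string; userID: string; }
interface UserDoc { id: string; displayName: string; }

// Attach the matching user doc to each purchase; user is null when
// no user doc with that ID was fetched.
function joinUsers(
  purchases: Purchase[],
  users: UserDoc[]
): Array<Purchase & { user: UserDoc | null }> {
  const byId: Record<string, UserDoc> = {};
  for (const u of users) {
    byId[u.id] = u;
  }
  return purchases.map(p => ({ ...p, user: byId[p.userID] ?? null }));
}
```

Joining by key means a user who bought the product twice only needs to be fetched once.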

Scalable Arrays Function

So, I created the arrayIndex function, which basically lets you have scalable arrays by spreading the array across multiple index documents. You must, however, create a Firebase Function that gets triggered onWrite for the products documents.

Example:

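import * as functions from "firebase-functions";
import { arrayIndex } from "adv-firestore-functions";

// The export name is arbitrary; a function must be exported to be deployed
export const productsIndex = functions.firestore
  .document("users/{userId}/products/{productId}")
  .onWrite(async (change: any, context: any) => {

    // Maintains the products_index collection for this user
    await arrayIndex(change, context);

  });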

This will generate a users/{userID}/products_index collection that will be searchable as:

db.collectionGroup('products_index')
.where('products', 'array-contains', productID);

There are also options for maps, pre-sorting, etc. See the docs.

The products_index collection will look like this:

users/{userID}/products_index/{products_index_id} --> {
  products: [
    '3k2lsle',
    '2k3k2l',
    '62221',
  ],
  user: {
    ... all user doc info here
  },
  createdAt: timestamp,
  updatedAt: timestamp
}

The beauty of this is that it scales automatically. It creates a new index document after 10,000 items (you can set this with max, see the docs). It automatically adds a product to the index when the product doc is created, and automatically removes it when the product doc is deleted from the collection. I have id sorting and value sorting options that I spoke about in Part 3.
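To show the general idea of an array that spills over into new index documents, here is a minimal in-memory sketch. It is not the package's actual implementation: the IndexDoc shape and the addToIndex, removeFromIndex, and indexContains names are hypothetical, and dropping empty index documents is an assumption of this sketch.

```typescript
// Hypothetical in-memory model of the index documents.
interface IndexDoc { products: string[]; }

// True if any index document holds the ID (what array-contains would match).
function indexContains(docs: IndexDoc[], productID: string): boolean {
  return docs.some(d => d.products.indexOf(productID) !== -1);
}

// Add to the first index document with room, else start a new one.
function addToIndex(docs: IndexDoc[], productID: string, max: number = 10000): IndexDoc[] {
  if (indexContains(docs, productID)) return docs; // no duplicates
  const next = docs.slice();
  for (let i = 0; i < next.length; i++) {
    const doc = next[i];
    if (doc && doc.products.length < max) {
      next[i] = { products: doc.products.concat(productID) };
      return next;
    }
  }
  next.push({ products: [productID] });
  return next;
}

// Remove the ID everywhere; assumption: empty index documents are dropped.
function removeFromIndex(docs: IndexDoc[], productID: string): IndexDoc[] {
  return docs
    .map(d => ({ products: d.products.filter(p => p !== productID) }))
    .filter(d => d.products.length > 0);
}
```

With a small max of 2, adding 'a', 'b', and 'c' yields two index documents, and a later add fills any gap left by a removal before a third document is created.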

There are no joins to get the user info. If you need the product info, you need to do a frontend join. However, since you're searching by Product ID, it assumes you have that info.

One final caveat is that you may need to update the index when the user doc gets updated. Luckily, I had already written this code with my updateJoinData function.

Usage:

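import * as functions from 'firebase-functions';
import { updateJoinData } from 'adv-firestore-functions';

// db is assumed to be an initialized admin.firestore() instance.
// The export name is arbitrary; a function must be exported to be deployed.
export const userIndexSync = functions.firestore
  .document("users/{userId}")
  .onWrite(async (change: any, context: any) => {

    const userID = context.params.userId;
    // Every index document that belongs to this user
    const queryRef = db.collectionGroup('products_index')
      .where('userId', '==', userID);
    // Only these user fields are copied into the 'user' field
    const joinFields = ['displayName', 'photoURL'];
    await updateJoinData(change, queryRef, joinFields, 'user');

  });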

This will make sure the products_index always stays up-to-date. For more info on updateJoinData see the docs. There are many options here as well.
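Conceptually, keeping the joined data fresh means copying only the chosen fields from the user doc into the user field of each index document. The sketch below shows that idea on plain objects. It is an assumption about the general technique, not the internals of updateJoinData, and the pickJoinFields and applyJoin names are hypothetical.

```typescript
type Doc = Record<string, unknown>;

// Copy only the listed fields that actually exist on the user doc.
function pickJoinFields(userDoc: Doc, joinFields: string[]): Doc {
  const out: Doc = {};
  for (const field of joinFields) {
    if (Object.prototype.hasOwnProperty.call(userDoc, field)) {
      out[field] = userDoc[field];
    }
  }
  return out;
}

// Return a copy of the index doc with the joined data replaced.
function applyJoin(
  indexDoc: Doc,
  userDoc: Doc,
  joinFields: string[],
  field: string = 'user'
): Doc {
  return { ...indexDoc, [field]: pickJoinFields(userDoc, joinFields) };
}
```

Limiting the copy to a few fields keeps the index documents small and means unrelated user updates carry no new data into the index.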

Get all products a user purchases...

Obviously you can just query the sub-collection as normal, or use a collectionGroup query.

IN SUM

There are so many options and customizations for these arrays, but this basically allows you to create scalable, automatic arrays (or maps) of any type that grow and can be searched easily.

Try it out and let me know about any bugs on GitHub. It is pretty complex code, but I wanted to handle every situation automatically.

I also realize there is now Firebase 9. You could translate these frontend functions easily. That is on you as I do not have infinite time.

You should also note that this alone makes Fireship.io's follower feed function scalable. Get those courses, as I find them invaluable.

The final touch on my scalable follower feed will be to create an index using this index. You can already get mostly there with these two functions alone, but there are some caveats I will need to address, more code to write, etc.

The biggest problem will be updating some of these documents that change...

To be continued more on the follower feed...

Firestore Many-to-Many: Part 5 - Scalable Arrays