Update 3/24/24
This post is out-of-date. Please see my latest post on Many-to-Many Arrays with the latest Firestore enhancements.
https://code.build/p/firestore-many-to-many-arrays-X9mf6s
Original Post
Not every many-to-many situation is practical in Firestore, but many of them are. I figured I would try to list all the ones I can think of, along with their limitations.
There are 3 common examples:
- Customers and Products
- Classes and Students
- Followers and Following
They each also have different use cases that can dramatically change the limitations.
Note: All of these examples use Angular Firestore in TypeScript, but the data modeling and RxJS usage are the same in other frameworks.
Let's start simple:
Classes and Students
The beauty of this example is that there is a limited number of students a class can have, and a limited number of classes a student can take. Even if that number were as high as 1,000, it will never be as high as 10,000, which is about as many items as you would want to keep in one Firestore document (documents are capped at 1 MiB).
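To put rough numbers on that, here is a back-of-the-envelope estimate. It assumes 20-character auto-generated IDs and counts each string as its UTF-8 length plus one byte; the function names are made up for illustration, and the figures ignore every other field in the document, so treat them as an upper bound rather than a guarantee.

```typescript
// Rough size estimate for an array of document IDs stored in one document.
// Assumptions (not from the original post): IDs are 20 ASCII characters,
// and each stored string costs its byte length plus 1.
const MAX_DOC_BYTES = 1048576; // Firestore's 1 MiB document size limit
const ID_LENGTH = 20;          // length of an auto-generated document ID

// Bytes used by `count` IDs of the given length.
function estimateIdArrayBytes(count: number, idLength: number = ID_LENGTH): number {
  return count * (idLength + 1);
}

// How many IDs would fit if the array were the only thing in the document.
function maxIdsPerDocument(idLength: number = ID_LENGTH): number {
  return Math.floor(MAX_DOC_BYTES / (idLength + 1));
}

console.log(estimateIdArrayBytes(1000));  // 21000
console.log(estimateIdArrayBytes(10000)); // 210000
console.log(maxIdsPerDocument());         // 49932
```

Under these assumptions, 10,000 IDs take up about a fifth of the document limit, which leaves room for the rest of the data.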
Model
Classes / ClassID: {
data...
students: [
studentID1,
studentID2,
...
]
}
Students / StudentID: {
data...
classes: [
classID1,
classID2
]
}
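The same model can be written down as TypeScript types. This is only a sketch: the name and status fields are taken from the queries later in this post, the 'inactive' value is an assumption, and isInSync is a hypothetical helper that checks whether the two arrays mirror each other.

```typescript
// Document shapes for the model above.
interface ClassDoc {
  name: string;
  status: 'active' | 'inactive'; // 'inactive' is assumed for illustration
  students: string[];            // student document IDs
}

interface StudentDoc {
  name: string;
  classes: string[];             // class document IDs
}

// Hypothetical helper: true when every enrollment is recorded on both sides.
function isInSync(
  classes: Record<string, ClassDoc>,
  students: Record<string, StudentDoc>
): boolean {
  const classSideOk = Object.keys(classes).every((classId) =>
    classes[classId].students.every((studentId) => {
      const student = students[studentId];
      return student !== undefined && student.classes.indexOf(classId) !== -1;
    })
  );
  const studentSideOk = Object.keys(students).every((studentId) =>
    students[studentId].classes.every((classId) => {
      const cls = classes[classId];
      return cls !== undefined && cls.students.indexOf(studentId) !== -1;
    })
  );
  return classSideOk && studentSideOk;
}
```

A check like this is handy in tests or a maintenance script if you decide to keep both arrays.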
You honestly don't need both arrays, and can choose either or, but you do need both collections. However, as you will see, it is much easier to query when you use both.
I suggest you do use both, and use batch to add and update both. That way you can always query in the cleanest way.
🗊 Note: Angular uses a more complex ref for queries, so I simplified them in the examples below that use db instead of this.afs. See here for Angular usage.
Add
A student takes a class
this.afs.doc('classes/' + classID).update({
students: firebase.firestore.FieldValue.arrayUnion(studentID)
});
OR
this.afs.doc('students/' + studentID).update({
classes: firebase.firestore.FieldValue.arrayUnion(classID)
});
Update
A student drops a class
this.afs.doc(`classes/${classID}`).update({
students: firebase.firestore.FieldValue.arrayRemove(studentID)
});
OR
this.afs.doc(`students/${studentID}`).update({
classes: firebase.firestore.FieldValue.arrayRemove(classID)
});
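If the behavior of arrayUnion and arrayRemove is not obvious, this in-memory sketch mimics it on plain arrays: a union only adds values that are not already present, and a remove drops every occurrence of the given values. The names unionInto and removeFrom are hypothetical and are not Firestore APIs.

```typescript
// In-memory sketch of the array semantics; nothing here talks to Firestore.

// Like arrayUnion: append each value only if it is not already present.
function unionInto<T>(current: T[], ...values: T[]): T[] {
  const result = current.slice();
  for (const value of values) {
    if (result.indexOf(value) === -1) {
      result.push(value);
    }
  }
  return result;
}

// Like arrayRemove: drop every occurrence of each given value.
function removeFrom<T>(current: T[], ...values: T[]): T[] {
  return current.filter((item) => values.indexOf(item) === -1);
}

const enrolled = ['studentID1', 'studentID2'];
console.log(unionInto(enrolled, 'studentID2', 'studentID3'));
// [ 'studentID1', 'studentID2', 'studentID3' ]
console.log(removeFrom(enrolled, 'studentID1'));
// [ 'studentID2' ]
```

This is why adding the same student twice is harmless: the second union is a no-op.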
Batch
const batch = this.afs.firestore.batch();
const studentID = 'your-student-doc-id';
const classID = 'your-class-doc-id';
const studentRef = this.afs.doc(`students/${studentID}`).ref;
// merge: true keeps the document's other fields; a plain set() would overwrite them
batch.set(studentRef, {
classes: firebase.firestore.FieldValue.arrayUnion(classID)
}, { merge: true });
const classRef = this.afs.doc(`classes/${classID}`).ref;
batch.set(classRef, {
students: firebase.firestore.FieldValue.arrayUnion(studentID)
}, { merge: true });
await batch.commit();
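One thing to watch with set(): without the merge option it replaces the whole document, while with merge it only writes the fields you pass. The sketch below simulates both behaviors on plain objects. The function names are hypothetical, and the merge is shallow here, which is a simplification of what Firestore does with nested maps.

```typescript
type DocData = Record<string, unknown>;

// Simulates a plain set(): the stored document is replaced by the new data.
function simulateSet(_existing: DocData, data: DocData): DocData {
  return { ...data };
}

// Simulates set() with { merge: true }: passed fields are written, the rest kept.
function simulateSetMerge(existing: DocData, data: DocData): DocData {
  return { ...existing, ...data };
}

const storedStudent: DocData = { name: 'Ada', classes: ['classID1'] };
const incoming: DocData = { classes: ['classID1', 'classID2'] };

console.log(Object.keys(simulateSet(storedStudent, incoming)));
// [ 'classes' ]
console.log(Object.keys(simulateSetMerge(storedStudent, incoming)));
// [ 'name', 'classes' ]
```

If the documents are guaranteed to exist already, update() is another option, since it never touches fields you do not name.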
Query
Get all classes a student is taking
db.collection('classes')
.where('students', 'array-contains', studentID);
OR
this.afs.doc(`students/${studentID}`).valueChanges().pipe(
switchMap((r: any) => {
const docs: Observable<any>[] = r.classes.map(
(id: any) => this.afs.doc(`classes/${id}`).valueChanges()
);
return combineLatest(docs);
})
);
You have a list of class IDs in the classes array, then you grab the documents one by one. In this case, you pay one extra read, for the student document itself.
Get all students taking a class
db.collection('students')
.where('classes', 'array-contains', classID);
OR
this.afs.doc(`classes/${classID}`).valueChanges().pipe(
switchMap((r: any) => {
const docs: Observable<any>[] = r.students.map(
(id: any) => this.afs.doc(`students/${id}`).valueChanges()
);
return combineLatest(docs);
})
);
Like above, you have a list of student IDs in the students array, then you grab the documents one by one. In this case, you pay one extra read as well, for the class document itself.
So, you have two different ways to add, update, and query. You don't strictly need to keep both arrays up to date, but with only one array you will have to be more creative in your queries. If you want simple queries in every direction, keep both arrays up to date by using a batch whenever you write to the database.
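To see that the two directions really are equivalent when both arrays are kept in sync, here is a small in-memory sketch with made-up data. The types and function names are hypothetical; the first function stands in for the array-contains query, the second for the join by ID.

```typescript
interface ClassRow { id: string; title: string; students: string[]; }
interface StudentRow { id: string; classes: string[]; }

// Direction 1: like where('students', 'array-contains', studentId) on classes.
function classesContaining(all: ClassRow[], studentId: string): ClassRow[] {
  return all.filter((c) => c.students.indexOf(studentId) !== -1);
}

// Direction 2: read the student's classes array, then look up each class by ID.
function classesByIds(all: ClassRow[], student: StudentRow): ClassRow[] {
  const found: ClassRow[] = [];
  for (const id of student.classes) {
    for (const c of all) {
      if (c.id === id) {
        found.push(c);
      }
    }
  }
  return found;
}

const allClasses: ClassRow[] = [
  { id: 'classID1', title: 'Algebra', students: ['studentID1', 'studentID2'] },
  { id: 'classID2', title: 'Biology', students: ['studentID2'] },
  { id: 'classID3', title: 'Chemistry', students: ['studentID1'] }
];
const student1: StudentRow = { id: 'studentID1', classes: ['classID3', 'classID1'] };

console.log(classesContaining(allClasses, 'studentID1').map((c) => c.id));
// [ 'classID1', 'classID3' ]
console.log(classesByIds(allClasses, student1).map((c) => c.id));
// [ 'classID3', 'classID1' ]
```

Both calls find the same classes; only the order differs, because the join follows the order of the student's array.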
Multiple Where Clauses
You can easily add something like:
db.collection('classes')
.where('students', 'array-contains', studentID)
.where('status', '==', 'active');
if you wanted to get all active classes a student is taking. Without orderBy(), the results are sorted by document ID by default.
Sorting
Once you want to sort the results, you need to create an index, which makes things one step more complicated. The error message in the console includes a link; clicking it builds the index automatically.
Get all classes a student is taking, sorted by class name
db.collection('classes')
.where('students', 'array-contains', studentID)
.orderBy('name');
This requires an index. This kind of index is not bad, as you know the name of the students array and the classes' name field.
Note: Each new combination of where and orderBy fields needs its own composite index, so adding a where clause to this query means creating another index.
db.collection('classes')
.where('students', 'array-contains', studentID)
.where('status', '==', 'active')
.orderBy('name');
OR
Here you avoid the index. The frontend is not as clean, but in this example you just add another map operator after switchMap:
Add a map to sort
// sort by name
map((s: any) => s.sort((a: any, b: any) => {
const f = 'name';
if (a[f] < b[f]) { return -1; }
if (b[f] < a[f]) { return 1; }
return 0;
}))
After the switchMap
this.afs.doc(`students/${studentID}`).valueChanges().pipe(
switchMap((r: any) => {
const docs: Observable<any>[] = r.classes.map(
(id: any) => this.afs.doc(`classes/${id}`).valueChanges()
);
return combineLatest(docs);
}),
// sort by name
map((s: any) => s.sort((a: any, b: any) => {
const f = 'name';
if (a[f] < b[f]) { return -1; }
if (b[f] < a[f]) { return 1; }
return 0;
}))
);
You can see the pattern here; it would be the same for getting all students taking a class, sorted by student name.
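If you sort on the frontend in more than one place, the comparator can be pulled out into a small reusable helper. This is a sketch under assumptions: byField and sortedBy are hypothetical names, the field is assumed to hold strings, and the helper sorts a copy rather than mutating the emitted array.

```typescript
// Builds a comparator for one string field, using the same < logic as above.
function byField<K extends string>(
  field: K
): (a: Record<K, string>, b: Record<K, string>) => number {
  return (a, b) => {
    if (a[field] < b[field]) { return -1; }
    if (b[field] < a[field]) { return 1; }
    return 0;
  };
}

// Returns a sorted copy so the original array is left untouched.
function sortedBy<K extends string, T extends Record<K, string>>(
  docs: T[],
  field: K
): T[] {
  return docs.slice().sort(byField(field));
}

const unsorted = [{ title: 'Chemistry' }, { title: 'Algebra' }, { title: 'Biology' }];
console.log(sortedBy(unsorted, 'title').map((d) => d.title));
// [ 'Algebra', 'Biology', 'Chemistry' ]
```

Inside the pipe this would look like map(docs => sortedBy(docs, 'name')), assuming every document has a string name field.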
Where Clause on Frontend Joins
It is tempting to think you should just get all the documents, then filter them afterward, like so:
map((a: any[]) => a.filter((f: any) => f.status === 'active'))
In context:
this.afs.doc<any[]>('students/' + studentID).valueChanges().pipe(
switchMap((r: any) => {
const docs: Observable<any>[] = r.classes.map(
(id: any) => this.afs.doc('classes/' + id).valueChanges()
);
return combineLatest(docs);
}),
map((a: any[]) => a.filter((f: any) => f.status === 'active'))
);
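While this technically works, it downloads every class document and only then throws away the ones you don't want. You can instead use a where clause on each document, filter out the empty results, then flatten the outer array. That way only the documents that match the where clause are returned. Note that Firestore's pricing documentation describes a minimum charge of one read per query even when nothing is returned, so the saving may be smaller than it looks.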
this.afs.doc<any[]>('students/' + studentID).valueChanges().pipe(
switchMap((r: any) => {
const docs: Observable<any>[] = r.classes.map(
(id: any) => this.afs.collection('classes',
ref => ref
.where(firebase.firestore.FieldPath.documentId(), '==', id)
.where('status', '==', 'active')
).valueChanges()
);
return combineLatest(docs);
}),
map((arr: any[]) => arr
.filter((f: any) => f && f[0])
.map((m: any[]) => m[0])
)
);
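The last map step is easier to follow in isolation. Each inner query emits an array holding zero or one document, so the combined value is an array of arrays. A minimal sketch of that flattening step, with a hypothetical helper name and made-up data:

```typescript
// Keeps the first document of every non-empty result and drops the rest.
function firstMatches<T>(results: Array<T[] | undefined>): T[] {
  const matches: T[] = [];
  for (const result of results) {
    if (result && result.length > 0) {
      matches.push(result[0]);
    }
  }
  return matches;
}

// One inner array per class ID; an empty array means the filter did not match.
const queryResults = [
  [{ title: 'Algebra' }],
  [],
  [{ title: 'Chemistry' }]
];
console.log(firstMatches(queryResults).map((d) => d.title));
// [ 'Algebra', 'Chemistry' ]
```

In the pipe above this would replace the filter and map pair with map(arr => firstMatches(arr)).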
So as you can see, it gets arduous to use all these RxJS joins. Keep both arrays up to date at all times, and you should be able to query from either direction without having to use these joins. However, you need to know how to use them anyway.
I will make this a series. In the next post, I will talk about the benefits of using a map type instead of an array type in Firestore.
I will eventually get to complex cases for scaling issues.
Let me know if I missed something.
J