Look at the gif below. It shows a proof-of-concept implementation of collaborative undo-redo, including "history mode undo", conflict handling, and async operations. In the process of designing and implementing it, I found myself digging down rabbit holes, questioning my own assumptions, and reading academic papers. In this post, I will share my learnings. The source code is available online.
Context and motivation
Undo-redo is a staple of software systems, a feature so ubiquitous that users just assume apps would have it. As users, we absolutely love it because:
- It saves us when we make mistakes (e.g. delete or move something unintentionally).
- It encourages us to experiment and learn by doing (let's try clicking that button and see what happens; worst case we'll undo).
- Together with redo (which is, in fact, an undo of the undo), it allows us to back-track and iterate at zero cost.
- Both undo and redo (when implemented correctly, more on that later) are non-destructive operations (!), giving us a sense of safety and comfort.
As developers, however, implementing a robust undo-redo system is a non-trivial undertaking, with implications that penetrate every part of the app. In my previous post, I listed undo/redo as one of the hard problems in web development, so naturally I wanted to see what it would take to add this feature to my little collaborative todo-mvc toy app. I approached it with respect because I had evidence of its tricky nature:
- In a previous role, I witnessed a peer team struggling to add undo/redo to a WYSIWYG editor. Although I wasn't involved at the technical level, I was aware of some of the pain and challenges they faced, saw how long it took, and noticed how many bugs they had.
- When searching for undo-redo libraries, I couldn't find any that supported the multiplayer use-case. Even the library from the company that made Replicache didn't.
- As a user, I can't think of a single collaborative app that implements undo/redo in a way that feels right (if you know any, please leave a comment). It's also very noticeable in its absence in some major, popular apps. Is it because it was to hard to implement?
- When googling for information about undo-redo, I found (and eventually read) multiple academic papers and master's theses on the subject (e.g., this) . They don't write those about straightforward concepts, do they?
- I also found blog-posts from big players in the field, specifically one from Liveblocks and a short discussion about undo/redo in a post by Figma. I later learned that Figma's UX around undo/redo leaves a lot to be desired, which shows that identifying the problems is easier than coming up with good solutions.
Undo/ redo is a strange beast
I always knew that undo/redo is a deeply "strange" feature, but it was only when I started thinking about implementing it myself that some of its weirder aspects became more salient:
- We all do this: Undo, undo, undo... (repeat N times); copy something; redo, redo, redo (repeat N times); paste.
- And we get annoyed by this: Undo, undo, type a character, realise you can't redo anymore. (Does a disabled "redo" button mean some of the editing-history is lost forever? Spoiler alert: It depends on the implementation, as we'll see in the next section).
- And get anxious over this: Closed the window? Bye bye undo history. Opened the app on another device or tab to keep editing from where you left off? No undo history for you (hopefully the original tab is still open somewhere).
- And what happens if the original tab is still open, you edit on another tab, and then go back to the original tab and hit undo? Would that unintentionally corrupt the state?
- And when working with others on a collaborative whiteboard (e.g., Figma) or similar apps, did you notice how everyone carefully stay away of other people's way and stick to their own "territory"? What if two of them edited the same entity but not at the same time and one of them hits undo?
Points 1 and 2 are intentional design decisions that make undo/redo the feature we know and love. It sacrifices flexibility for speed and simplicity.
Points 3-5 are very sus and require a deeper look.
Modelling a branching state-graph
Let's dive into that second point first: What actually happens when you undo a few times, make a change and see your redo button disabled? The answer is: it depends. I'll explain.
State changes in a system can be represented as a graph. Undo allows us to traverse the graph "backwards in time," and redo allows us to trace our steps back (so "forwards in time") to the present.
Imagine we have the following sequence of operations on an element:
- Edit "" -> "Hello".
- Edit "Hello" -> "Hello World".
- Undo (the state becomes "Hello" again)
- Edit "Hello" -> "Hello Friend"
We can build a graph in which each unique state is represented by a node (like a state machine), as follows:
In this model, step 4 creates a new branch in the graph, and we lose our ability to return to the "Hello World" state. Going backwards from "Hello Friend," which is the present state, leads to "Hello" and then to "". Going forwards from there using "redo" takes us to "Hello" and then to "Hello Friend." That's because while there is always a single path backwards from any state to the initial state, there can be multiple paths forward but "redo" can only follow a single path - the path that leads to the "present" state.
Many systems follow this model, for example: Google Slides as you can see below:
This way of implementing undo/redo leads to increased user anxiety (== bad user experience) because after I undo, I need to be super careful or else I'd lose the ability to restore some states. The browser's "back" and "forward" buttons, which are basically undo/redo for the URL bar, behave this way as well.
But wait, this is not the only way! We can do better (this is where reading academic papers pays off).
What if we built our graph such that every node represented a point in time rather than a state? Time is linear (no branching unless you live in the multiverse) and always moves forward. This approach would represent the same sequence of operations like so:
Now we can never lose any change we've made in the past. Going backwards from "Hello Friend" takes us through every change we've ever made, as shown below.
This flavour of undo is called "History Undo" (see page 4 in this excellent academic paper). It was first introduced by Emacs. It leads to a much better user experience and feels very intuitive to use.
Notice that in both cases the "redo" button gets disabled when the user edits "Hello" to "Hello Friend" (which makes sense if you remember that redo is undo of undo). The difference is in how undo behaves after that point (and as a result, any subsequent "redo").
I implemented both modes in my proof of concept. "History undo" mode is enabled by default.
If you want to get into the nitty-gritty have a look at the source code.
The question of scope
An important property of a good undo-redo implementation is that it operates in the correct scope. In most applications, the expectation is for the undo-manager's scope to be the entire session. As I continuously hit "undo," I expect all the actions I took in the session to be rolled back. If I close the session, I expect to my undo stack to be lost.
If the scope is smaller than the whole session, it can be disorienting for users. For example, if your text-editor has a separate undo stack from the rest of the app, people will call you out on Twitter:
In web applications or desktop applications that supports multiple tabs (e.g., VSCode), a session equals a single tab.
The expectations I described above carry over from single-user to collaborative, multi-user applications. Users expect to only be able to undo/redo their own actions. Users don't seem to expect the undo/redo stack to exceed session scope e.g. be shared between multiple tabs or multiple devices (on which they have the app running). I wonder if it's just because no app has done that yet and once someone does, it would become the new norm. Wouldn't it be great to have your history available on all your devices?
In my implementation I remained within session-scope, like all standard implementations. I did it because it was the easier and quicker option. I might try to extend it to user-scope in future iterations. I am sure it will present interesting technical challenges.
Memento vs. Command pattern
One of the first resources I stumbled upon when I was doing my research was an article in the official Redux docs, explaining how to implement generic undo/redo using the Memento pattern. The idea is so simple: Just save a copy of the state every time it changes and store these state copies as "past" (array), "present" and "future" (array). Whenever you want to undo, push the present state into the "future" array, pop the head of the "past" array and make it the new present, the app re-renders, voilà. No need for app-specific logic, whatever your state shape is - it just works. It sounds so alluring, so beautiful and elegant - there is only one tiny problem: it doesn't work for anything besides the simplest of apps.
Here is why:
- It doesn't deal with side-effects. Undoing and redoing actions tends to involve much more than state changes on the frontend. Apps need to create or cleanup remote resources, call APIs and do all the stuff that real apps do. These side-effects and the logic for addressing them are specific, and need to be handled on a case by case basis by the app. In other words, transition between states involves more than just replacing one state object with another.
- The idea of replacing the state object wholesale is totally incompatible with simultaneous, collaborative editing. It leads to users constantly erasing each other's changes even if they are modifying different parts of the state.
- Even in React apps that use Redux, not all the state is managed by the central store. There is a bunch of local state in the components. Don't we want the undo/redo manager to be able to account for local component state as well?
Sadly, due to of all of the above, the Memento pattern is off the table. This leave us with the much less plug-and-play Command pattern. Instead of storing states we store commands and reverse-commands and execute them whenever we need to roll the state back or forward. "Commands" is just a fancy name for functions that modify state, e.g. "() => markTodoComplete(id, true)" and its reverse "() => markTodoComplete(id, false)".
The command pattern allows us to update the state granularly with less collisions. It allows us to apply arbitrary logic and deal with side effects, and it doesn't know or care about the application state or where it lives. These advantages come at a cost: For every action our app can perform, we now need to implement a reverse-action, and register both with the undo manager. But wait, there's more...
We can still have conflicts, can't we?
Multiple users making changes to the same "document" at the same time means that conflicts can and will occur. Having undo-redo thrown into the mix increases the likelihood of conflicts by introducing the possibility of unintentional and hidden conflicts. When real-time editing, users usually try to stay out of each other's way but the dimension of time makes that more tricky. If I edited a place at an earlier time, and later someone made changes on top of my changes, what's gonna happen if I casually "undo" ten times? I can unintentionally cause someone else to lose work. This can also happen via indirect conflicts - for example: user A creates an item, user B edits the item, user A clicks undo - deleting the item and deleting user B's work as a result (again, without intending or even realising it).
This sounds bad, right? The whole point of undo/redo is to allow users to experiment and time-travel safely, without worrying about accidentally corrupting the system's state.
Sync engines, like Replicache (which our little todo app uses), have the ability to deal with "realtime" conflicts between clients via an authoritative server that can reject and revert changes. However, we don't want the user experience errors due to rejected undo operations, or have nothing happen because the element they are trying to modify no longer exists. See it happening in Figma in the gif below (taken from here). Notice how some undo operations do nothing and the user needs to keep undoing to drain all the bad operations from the undo-stack. That's poor UX. We need to come up with an elegant way to deal with these situations.
Can we simply ignore conflicts?
Some smart people don't think that conflicts are a big deal in multiplayer systems. Adam Wiggins (co-founder of Heroku) for example, dismissed it in this part of his recent talk (not in the context of undo/redo but as a general concern). He was later challenged about it by a question from the audience at this timestamp but stood his ground. To summarise his reasoning: Users stay out of each others' way - it's a social thing (true). Also, when conflicts do occur, users are smart enough to realise what happened and fix it themselves, no big deal. He does note that this is true for the app he's creating (Muse), but might not apply in all cases.
I have to respectfully disagree. It's cool that users find creative ways to work with broken systems, but we can't use that as an excuse for building sub-par apps. We can and should do better for our users.
Dealing with conflicts - undo-manager perspective (in theory)
So, how can we deal with these nasty conflicts?
This paper lays down a solid foundation. I'll do my best to summarise its main ideas for you. The paper discusses the problem of "undoing actions in collaborative systems" in the context of a distributed text editor called DistEdit. It suggests that the undo-manager takes a "Conflict(A,B)" callback from the app, with the following spec (see section 4.2.1):
The Conflict(A, B) function supplied by the application must return true if the adjacent operations A and B performed in sequence cannot be reordered, and false otherwise . The importance of the notion of conflict is that it imposes an ordering on operations A and B. If Conflict(A, B) is true, then the order of operations A and B cannot, in general, be changed without affecting the results. Furthermore, in general, operation A cannot be undone, unless the following operation B is undone
The paper then offers an insight about the users' intentions (see section 5 "Selective Undo Algorithms"):
If an operation A is undone, we assume that users want their document to go to a state that it would have gone to if operation A had never been performed, but all other operations had been performed
To achieve this, the proposed algorithm first rolls back everything that came after the operation we want to undo, makes a backup copy of the "future" stack, and tries to preform the undo by "bubbling" the operation we are trying to get rid off up the stack, kinda like bubble sort. In each step it checks whether the operation we want to undo (A) has conflict with the next adjacent operation (B). If not, they can be swapped, and we can executed a "transposed" version of B without executing A.
This "transpose(A,B)" function, that the app needs to provide, makes sense in the context of text editing, where the cursor position, for example, could be different if operation A never happened. The algorithm keeps working its way up the stack until it either reaches the top (success) or hits a conflict. If it hits a conflict, it tries to get rid of it by checking whether the "future" has the reverse operation; if yes, both can be safely removed. If the conflict cannot be removed, the algorithm determines that the operation cannot be undone (failure). When that happens, the paper offers two options (see section 8.1.4):
- Show the conflicts to the user and ask them to resolve.
- Tell the user about the conflict, ignore that undo operation, and allow the user to undo older operations.
While I like the general idea, I had some issues with this algorithm:
- For a general-purpose undo-manager, outside of collaborative text editing (which most apps use CRDTs for nowadays), it seemed excessive to expect the app to provide transpose functions, which must satisfy five mathematical properties (see section 4.2.2 in the paper).
- I don't want users to be able try to undo something and end up failing. I want to detect the conflicts ahead of time - for better UX.
- While I think it makes sense to skip bad operations, I am not sure that it's a good idea to ask the user to do something about it. If we wanted that we should have implemented features like version-history or version-control rather than undo/redo. Keeping the user informed is, generally speaking, a good idea, but we should aspire to do it in a non-disruptive manner.
Dealing with conflicts - undo-manager perspective (in practice)
With all this in mind, I ended up with the following implementation:
The undo-manager allows each entry (action) to have the attributes 'scopeName', 'hasUndoConflict', and 'hasRedoConflict'. Unlike the "Conflict(A,B)" function from the paper, which takes two adjacent operations, my functions check whether a single undo or redo operation is valid in the context of the current state of the app.
The conflict checks run on the "head" of the undo and redo stacks after every operation, and remove conflicting entries (and everything else with the same scopeName) until a non-conflicting one is found. This way, the next action the user can take is always a non-conflicting one.
The undo-manager provides a way to tell the user when conflicting entries are removed (via a "change reason"), but in the demo app I used it in a very subtle way.
All in all, here is the type definition for a single undo-redo entry:
export type UndoEntry = {
operation: () => void;
reverseOperation: () => void;
hasUndoConflict?: () => boolean | Promise<boolean>;
hasRedoConflict?: () => boolean | Promise<boolean>;
// determines what gets removed when there are conflicts
scopeName: string;
// will be returned from subscribeToCanUndoRedoChange so you can display it in a tooltip next to the buttons (e.g "un-create item")
description: string;
reverseDescription: string;
};
For the full implementation details and how it is used, have a look at the code, for example here.
Dealing with conflicts - app perspective
So the undo-manager facilitates a way to deal with conflicts, but it's up to the app to provide the actual conflict-checking logic. How should it go about that? One useful concept is "ownership".
In a single-user app, there is no question about who owns the data - there is only one user, but what about a multi-user, collaborative app? We can think about it as follows: The last user who modified a piece of data (direct modification - not via undo or redo) owns it. The owner of a piece of data can safely undo/redo their changes to it without overriding someone else's changes.
For example, if I created a todo item and you modified its description, I shouldn't be able to undo the creation of that item, but I should still be able to undo anything else that I have in my undo stack. The granularity of the ownership is determined by the app. If I modified the text of an item and you then marked it as completed, it is probably okay for me to undo my change because I own the description and you own the completeness. If I later directly modify the completeness, I should be able to undo and redo that, and you shouldn't, because I took ownership over completeness.
The gif below shows a simple example: When the user on the right edits the text of the second item, the user on the left loses the ability to undo any edits to its text or its creation, but still has the ability to undo the creation of the first item (which they still own). A sharp-eyed viewer would notice that the undo icon on the left animates when the conflicting entries are removed (when the user on the right enters the text "nope!").
If you don't agree with the specific logic I applied here, that's okay - the logic is totally flexible. The important thing is that we have an undo-manager that makes this possible.
Getting the user experience right
In the gif above, did you notice that little tooltip (just a "title" attribute in this demo) that tells the user what's going to happen when they click undo/redo when they hover over the button? Did you notice how the buttons animate when there is a change in the undo/ redo stack? To achieve these, the undo-manager provides a simple pub-sub service so that the consumer can stay up to speed.
Dealing with asynchrony
The undo-manager has to be able to handle both synchronous and asynchronous operations because all the Replicache calls are async, and in other real-world systems, any call to the backend or external APIs would be async as well. The challenge with async operations is that they can complete in different order to how they start (depending on how long it takes each promise to resolve) and query the system while it's "between states." For example, an operation starts running, and then while it's awaiting something, an "undo" or a conflict check starts running and make their own changes. To deal with that, I introduced a simple module that executes the async operations serially using a queue.
Places where my demo implementation falls short
If you read my previous post, you’d know that I was thoroughly impressed by Replicache. That's still true, but I have hit some walls (missing features) when adding undo-redo to the app. It’s important to note that the undo-manager itself is generic (agnostic about Replicache) but does pose some expectations that Replicache fails to meet (all seem fixable). Here are the main challenges I faced:
- Distinguishing between self inflicted and external updates: When Replicache informs the app about incoming changes from the server, it doesn’t indicate whether these changes originate in the current session or in some other session (external). We’d like to initiate conflict checks when there are incoming external changes, otherwise they already ran when taking the action (before sending it to the server), but how? I ended up adding some app-specific logic to detect that. Ideally, once Replicache exposes that info (relevant issue here), this kind of hack won’t be needed.
- Failure of an optimistic update: Replicache uses optimistic updates, meaning that the server can reject operations or return with a different than expected outcome. When that happens, the state is rolled back and adjusted to reflect the server state. When Replicache does that, it doesn’t notify the app, it comes in as any other state update. That makes it hard to adjust the undo stack, which still contains the original updates that the server just rejected. I could have probably worked around it but opted to leave it unimplemented in this POC. Ideally, Replicache would expose that information in the future.
- Coming back from offline mode: If you go offline, you can do anything you want (including undo/redo) and when you come back online your changes will be pushed to the server and override anything other users did while you were offline. This problem is not specific to undo/redo but a result of the Last Write Wins conflict-resolution strategy that my app uses. In theory, this could be mitigated by adding more sophisticated logic to the server’s mutators, but that would require more research to get right (maybe a good subject for another post).
- Lack of Integration with CRDT-based undo: Supporting embedded text editors in a way that is user friendly remains a challenge (I haven't attempted it yet, another idea for a future post :)).
Closing thoughts
We've covered a lot of ground in this post. While I spent most of it discussing different aspects of the undo-manager, in reality the majority of my time and effort were spent on carefully thinking through and implementing the app's reverse-operations and conflict-checking logic.
In some cases, I had to refactor the app and break down operations to make them undo/redo-friendly. For example, “completeAllItems” couldn’t remain a simple loop that calls “updateItem” with each item-id; it had to become its own thing with its own reverse logic (because maybe another user added or edited items). Some changes to the backend were required as well, such as adding an “un-delete” operation, which is different from “create” because it preserves the original sort position of the item. The database schema changed because I needed to add an "updatedBy" field on each todo, and these are just some examples. Testing is another task that grows considerably when your app has undo-redo.
In other words, undo-redo is one of those features that make every other feature in your app more complicated and time-consuming to implement and maintain. Is it worth it? I think the answer is a resounding yes for productivity apps and content-editors of any kind, but you need to know what you’re getting yourself into. It is definitely not for the faint of heart.
Thank you for reading. Feel free to leave a comment if you have any questions or insights.