In the world of distributed systems, data consistency is a challenging issue. One elegant solution to this problem is using Conflict-Free Replicated Data Types (CRDTs), among which the Last Write Wins (LWW) strategy is particularly popular. In this post, we'll explore what LWW CRDT (state based) is and how you can implement it in C#.
To make it as simple as possible and try to keep the scope at a minimum, we will create a LWW for a boolean state. We begin this adventure with a unit test to capture the properties of our LWW Bool type.
[TestFixture]
public class LWWBoolFixture
{
[Test]
public void CreateNewInstanceOfLWWBool()
{
// Arrange
var timestamp = DateTime.UtcNow.Ticks;
var lwwBool = new LWWBool("id-1", true);
// Assert
Assert.That(lwwBool.Id, Is.EqualTo("id-1"));
Assert.That(lwwBool.Value, Is.True);
Assert.That(lwwBool.Timestamp, Is.GreaterThan(timestamp));
}
}
The most obvious property is the, Value
, which represents the value (true or false in this case) of our entity. Also the, Timestamp
, property is logical, when was the value set. This property is part of the state. The, Id
, property is not as obvious why it's needed.
State based CRDTs often require unique identifiers for each replica to ensure that the merged state is accurate. However for the LWW "family", an Id
may not be as crucial as the timestamp. The timestamp often serves as the primary means of resolving conflicts. Though, if there's a possibility of simultaneous updates with the same timestamp, an additional Id property could help in further resolving such conflicts, i.e good for traceability and debugging. For the time being, I'm retaining the Id
property to ensure our awareness of it.
The implementation of our LLWBool
class could look like this to fulfill the unit test.
namespace CRDT.Register
{
public class LWWBool
{
public string Id { get;}
public bool Value { get; private set; }
public long Timestamp { get; private set; }
public LWWBool(string id, bool initialValue)
{
Id = id;
Value = initialValue;
Timestamp = DateTime.UtcNow.Ticks;
}
}
}
We only want the Id
to be able to be set at the initialization of the object and the Value
and the Timestamp
should only be able to change through a method call. Notice that the Value
can only keep the current value and this is one of the properties of a LWW register.
Lets keep testing our local instance of the LWWBool
. Next, try to change the value of the instance.
[Test]
public void UpdateValueOfTheLWWBoolToTheSameAsBefore()
{
// Arrange
var lwwBool = new LWWBool("id-1", true);
var timestamp = lwwBool.Timestamp;
// Act
lwwBool.SetValue(true);
// Assert
Assert.That(lwwBool.Value, Is.True);
Assert.That(lwwBool.Timestamp, Is.GreaterThan(timestamp));
}
[Test]
public void ChangeValueOfTheLWWBool()
{
// Arrange
var lwwBool = new LWWBool("id-1", true);
var timestamp = lwwBool.Timestamp;
// Act
lwwBool.SetValue(false);
// Assert
Assert.That(lwwBool.Value, Is.False);
Assert.That(lwwBool.Timestamp, Is.GreaterThan(timestamp));
}
These two tests are trying to document that the LWWBool
should have a mechanism for updating its value. Whenever the value is updated, even to its current value, it should also increment the Timestamp
.
To make these tests green we could update our LWWBool
class to look like this.
public class LWWBool
{
public string Id { get;}
public bool Value { get; private set; }
public long Timestamp { get; private set; }
public LWWBool(string id, bool initialValue)
{
Id = id;
Value = initialValue;
Timestamp = DateTime.UtcNow.Ticks;
}
public void SetValue(bool newValue)
{
Value = newValue;
Timestamp = DateTime.UtcNow.Ticks;
}
}
So far there hasn't been anything "distributed" about this code at all. We have only worked with one replica in a single thread. Perhaps the next natural step would be to try to merge two replicas?
As before, lets create a unit test for how it should work.
[Test]
public void MergeShouldNotChangeTheCurrentStateIfItsTheLatest()
{
// Arrange
var replicaTwo = new LWWBool("id-2", false);
var replicaOne = new LWWBool("id-1", true);
var timestamp = replicaOne.Timestamp;
// Act
replicaOne.Merge(replicaTwo);
// Assert
Assert.That(replicaOne.Value, Is.True);
Assert.That(replicaOne.Timestamp, Is.EqualTo(timestamp));
}
Here we want to make sure that the replica with the latest timestamp will be the "last winner". Since the replica one is created after replica two we must assume that the value of replica one has the latest and most correct value. Therefore we expect that the value is still, True
, and the timestamp hasn't changed on replica one.
A test that verifies that we update the replica one if the replica two has a newer state would be good.
[Test]
public void MergeShouldUpdateTheStateWhenTheOtherReplicaIsNewer()
{
// Arrange
var replicaOne = new LWWBool("id-1", true);
var replicaTwo = new LWWBool("id-2", false);
var timestamp = replicaTwo.Timestamp;
// Act
replicaOne.Merge(replicaTwo);
// Assert
Assert.That(replicaOne.Value, Is.False);
Assert.That(replicaOne.Timestamp, Is.EqualTo(timestamp));
}
This time we create the replica two after replica one, and therefore the replica two has the newest timestamp. As we can see the replica one will have the same value and timestamp as replica two after the merge. Outside of this scope, but perhaps it would be nice to implement, IEquatable<LWWBool>
, to define value equality for this class.
The update of our LWWBool
class to match the unit tests could be implemented like this.
public class LWWBool
{
public string Id { get;}
public bool Value { get; private set; }
public long Timestamp { get; private set; }
public LWWBool(string id, bool initialValue)
{
Id = id;
Value = initialValue;
Timestamp = DateTime.UtcNow.Ticks;
}
public void SetValue(bool newValue)
{
Value = newValue;
Timestamp = DateTime.UtcNow.Ticks;
}
public void Merge(LWWBool other)
{
if (Timestamp > other.Timestamp) return;
Value = other.Value;
Timestamp = other.Timestamp;
}
}
We also want to make sure that the state will updated even if both replicas has the same value.
[Test]
public void MergeShouldUpdateTheStateWhenTheOtherReplicaIsNewerEvenIfValueIsTheSame()
{
// Arrange
var replicaOne = new LWWBool("id-1", true);
var replicaTwo = new LWWBool("id-2", true);
var timestamp = replicaTwo.Timestamp;
// Act
replicaOne.Merge(replicaTwo);
// Assert
Assert.That(replicaOne.Value, Is.True);
Assert.That(replicaOne.Timestamp, Is.EqualTo(timestamp));
}
On the wikipedia page about CRDTs we have this section:
State-based CRDTs are called convergent replicated data types, or CvRDTs. In contrast to CmRDTs, CvRDTs send their full local state to other replicas, where the states are merged by a function which must be commutative, associative, and idempotent.
Lets see if we can create tests and verify that we are compatible with the requirements for the merge function.
[Test]
public void MergeShouldBeCommutative()
{
// Arrange
var replicaOne = new LWWBool("id-1", true);
var replicaTwo = new LWWBool("id-2", false);
// Act
replicaOne.Merge(replicaTwo);
replicaTwo.Merge(replicaOne);
// Assert
Assert.That(replicaOne.Value, Is.EqualTo(replicaTwo.Value));
Assert.That(replicaOne.Timestamp, Is.EqualTo(replicaTwo.Timestamp));
Assert.That(replicaTwo.Value, Is.EqualTo(replicaOne.Value));
Assert.That(replicaTwo.Timestamp, Is.EqualTo(replicaOne.Timestamp));
}
The merge (○
) should be able to be done in any order, A ○ B = B ○ A
, commutative. This test pass without any changes to our implementation.
[Test]
public void MergeShouldBeIdempotent()
{
// Arrange
var replicaOne = new LWWBool("id-1", true);
var timestamp = replicaOne.Timestamp;
// Act
replicaOne.Merge(replicaOne);
// Assert
Assert.That(replicaOne.Value, Is.True);
Assert.That(replicaOne.Timestamp, Is.EqualTo(timestamp));
}
Merging (○
) with itself should not modify its state, A ○ A = A
idempotent. This test pass without any changes to our implementation.
[Test]
public void MergeShouldBeAssociative()
{
// Arrange
var replicaOne = new LWWBool("id-1", true);
var replicaTwo = new LWWBool("id-2", false);
var replicaThree = new LWWBool("id-3", true);
// Act
replicaOne.Merge(replicaTwo);
replicaOne.Merge(replicaThree);
replicaThree.Merge(replicaOne);
replicaThree.Merge(replicaTwo);
// Assert
Assert.That(replicaOne.Value, Is.EqualTo(replicaThree.Value));
Assert.That(replicaOne.Timestamp, Is.EqualTo(replicaThree.Timestamp));
Assert.That(replicaThree.Value, Is.EqualTo(replicaOne.Value));
Assert.That(replicaThree.Timestamp, Is.EqualTo(replicaOne.Timestamp));
}
It shouldn't matter which order the replicas is merged when there are many replicas, (A ○ B) ○ C = A ○ (B ○ C)
, associativity. This test pass without any changes to our implementation.
However the way that our merge
function is implemented makes it very imperative. And the test for commutative behavior might not be good. It would be much nicer if it was implemented like an ordinary binary operator, like +
for example. Lets try to refactor our class with an binary operator for merge, and use the ^
as the binary operator for this.
public class LWWBool : IEquatable<LWWBool>
{
public bool Value { get; private set; }
public long Timestamp { get; private set; }
public LWWBool(bool initialValue)
{
Value = initialValue;
Timestamp = DateTime.UtcNow.Ticks;
}
private LWWBool(bool value, long timestamp)
{
Value = value;
Timestamp = timestamp;
}
public void SetValue(bool newValue)
{
Value = newValue;
Timestamp = DateTime.UtcNow.Ticks;
}
public override bool Equals(object? obj) => Equals(obj as LWWBool);
public bool Equals(LWWBool? b)
{
if (b is null)
{
return false;
}
// Optimization for a common success case.
if (ReferenceEquals(this, b))
{
return true;
}
// If run-time types are not exactly the same, return false.
if (GetType() != b.GetType())
{
return false;
}
// Return true if the fields match.
// Note that the base class is not invoked because it is
// System.Object, which defines Equals as reference equality.
return (Value == b.Value) && (Timestamp == b.Timestamp);
}
public override int GetHashCode() => (Value, Timestamp).GetHashCode();
public static bool operator ==(LWWBool lhs, LWWBool rhs)
{
if (lhs is null)
{
if (rhs is null)
{
return true;
}
// Only the left side is null.
return false;
}
// Equals handles case of null on right side.
return lhs.Equals(rhs);
}
public static bool operator !=(LWWBool lhs, LWWBool rhs) => !(lhs == rhs);
public static LWWBool operator ^(LWWBool a, LWWBool b)
{
if (a.Timestamp > b.Timestamp) return new LWWBool(a.Value, a.Timestamp);
return new LWWBool(b.Value, b.Timestamp);
}
}
The major changes here are that we use the IEquatable
as mentioned before to make the tests more neat when comparing replicas. I have also dropped the Id
property since it didn't give anything in this implementation and would make the IEquatable
and the ^
operator more complex. The merge
function has been replaced with the binary operator ^
.
The tests needed to be updated also to reflect the changes.
[TestFixture]
public class LWWBoolFixture
{
[Test]
public void CreateNewInstanceOfLWWBool()
{
// Arrange
var timestamp = DateTime.UtcNow.Ticks;
var lwwBool = new LWWBool(true);
// Assert
Assert.That(lwwBool.Value, Is.True);
Assert.That(lwwBool.Timestamp, Is.GreaterThan(timestamp));
}
[Test]
public void UpdateValueOfTheLWWBoolToTheSameAsBefore()
{
// Arrange
var lwwBool = new LWWBool(true);
var timestamp = lwwBool.Timestamp;
// Act
lwwBool.SetValue(true);
// Assert
Assert.That(lwwBool.Value, Is.True);
Assert.That(lwwBool.Timestamp, Is.GreaterThan(timestamp));
}
[Test]
public void ChangeValueOfTheLWWBool()
{
// Arrange
var lwwBool = new LWWBool(true);
var timestamp = lwwBool.Timestamp;
// Act
lwwBool.SetValue(false);
// Assert
Assert.That(lwwBool.Value, Is.False);
Assert.That(lwwBool.Timestamp, Is.GreaterThan(timestamp));
}
[Test]
public void MergeShouldNotChangeTheCurrentStateIfItsTheLatest()
{
// Arrange
var replicaTwo = new LWWBool(false);
var replicaOne = new LWWBool(true);
var timestamp = replicaOne.Timestamp;
// Act
var mergedReplica = replicaOne ^ replicaTwo;
// Assert
Assert.That(mergedReplica.Value, Is.True);
Assert.That(mergedReplica.Timestamp, Is.EqualTo(timestamp));
}
[Test]
public void MergeShouldUpdateTheStateWhenTheOtherReplicaIsNewer()
{
// Arrange
var replicaOne = new LWWBool(true);
var replicaTwo = new LWWBool(false);
var timestamp = replicaTwo.Timestamp;
// Act
var mergedReplica = replicaOne ^ replicaTwo;
// Assert
Assert.That(mergedReplica.Value, Is.False);
Assert.That(mergedReplica.Timestamp, Is.EqualTo(timestamp));
}
[Test]
public void MergeShouldUpdateTheStateWhenTheOtherReplicaIsNewerEvenIfValueIsTheSame()
{
// Arrange
var replicaOne = new LWWBool(true);
var replicaTwo = new LWWBool(true);
var timestamp = replicaTwo.Timestamp;
// Act
var mergedReplica = replicaOne ^ replicaTwo;
// Assert
Assert.That(mergedReplica.Value, Is.True);
Assert.That(mergedReplica.Timestamp, Is.EqualTo(timestamp));
}
[Test]
public void MergeShouldBeCommutative()
{
// Arrange
var replicaOne = new LWWBool(true);
var replicaTwo = new LWWBool(false);
// Act
var leftSide = replicaOne ^ replicaTwo;
var rightSide = replicaTwo ^ replicaOne;
// Assert
Assert.That(leftSide, Is.EqualTo(rightSide));
}
[Test]
public void MergeShouldBeIdempotent()
{
// Arrange
var replicaOne = new LWWBool(true);
var timestamp = replicaOne.Timestamp;
// Act
var mergedReplica = replicaOne ^ replicaOne;
// Assert
Assert.That(mergedReplica.Value, Is.True);
Assert.That(mergedReplica.Timestamp, Is.EqualTo(timestamp));
}
[Test]
public void MergeShouldBeAssociative()
{
// Arrange
var replicaOne = new LWWBool(true);
var replicaTwo = new LWWBool(false);
var replicaThree = new LWWBool(true);
// Act
var leftSide = replicaOne ^ replicaTwo ^ replicaThree;
var rightSide = replicaThree ^ replicaOne ^ replicaTwo;
// Assert
Assert.That(leftSide, Is.EqualTo(rightSide));
}
}
And all the tests are still passing. Next step to improve the implementation would maybe be to make the class more generic. Same logic should work for other types than bool
.
class LWW<T> : ....
But I think we will stop here for now. This simple C# implementation demonstrates the core concept of CRDT - Last Write Wins register
, which you can expand upon depending on your specific needs.
Happy merging!